**BOTTLENECK: A program for detecting recent effective population size reductions from allele frequency data**

Sylvain PIRY(1), Gordon LUIKART(2) and Jean-Marie CORNUET(1)

(1) Laboratoire de Modélisation et de Biologie Evolutive. INRA-URLB. 488 rue de la Croix-Lavit, F-34090 Montpellier, France

(2) Laboratoire de Biologie des Populations d'Altitude, CNRS UMR 5553, Université Joseph Fourier, F-38041 Grenoble Cedex 09, France

**Principle:** Populations that have experienced a recent
reduction of their effective population size exhibit a corresponding reduction
of both allele number and heterozygosity at polymorphic loci. However, allelic
diversity is reduced faster than heterozygosity, so the observed
heterozygosity becomes larger than the heterozygosity expected from the observed
allele number if the locus were at mutation-drift equilibrium. Strictly speaking,
this has been demonstrated only for loci evolving under the Infinite Allele
Model (IAM) by Maruyama and Fuerst (1985). If the locus evolves under the
strict Stepwise Mutation Model (SMM), there can be situations where this
heterozygosity excess is not observed (Cornuet and Luikart 1996). However,
few loci follow the strict SMM, and as soon as they depart slightly from
this mutation model towards the IAM, they will exhibit a heterozygosity
excess as a consequence of a genetic bottleneck.

In a population at mutation-drift equilibrium (i.e. one whose effective size has remained constant in the past), a locus is approximately equally likely to show a heterozygosity excess or a heterozygosity deficit. To determine whether a population exhibits a significant number of loci with heterozygosity excess, we proposed three tests, namely a "sign test", a "standardized differences test" (Cornuet and Luikart 1996), and a "Wilcoxon signed-rank test" (Luikart et al., 1997a). We also proposed a descriptor of the allele frequency distribution ("mode-shift" indicator) which discriminates many bottlenecked populations from stable populations (Luikart et al., 1997b).
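The idea behind the sign test can be illustrated with a minimal sketch. It assumes, as a simplification, that each locus at equilibrium shows a heterozygosity excess with probability exactly 0.5; the published test may estimate this probability differently for each locus. The function name `sign_test_excess` and its parameters are hypothetical and are not part of *BOTTLENECK*.

```python
from math import comb


def sign_test_excess(n_excess, n_loci, p_excess=0.5):
    """One-sided binomial probability of seeing at least `n_excess`
    loci with heterozygosity excess among `n_loci` loci.

    Assumption: every locus has the same probability `p_excess` of
    showing an excess at mutation-drift equilibrium.
    """
    if not 0 <= n_excess <= n_loci:
        raise ValueError("n_excess must lie between 0 and n_loci")
    return sum(
        comb(n_loci, i) * p_excess**i * (1 - p_excess) ** (n_loci - i)
        for i in range(n_excess, n_loci + 1)
    )


# Example: 8 of 10 loci show a heterozygosity excess.
p_value = sign_test_excess(8, 10)
```

A small probability suggests more loci with heterozygosity excess than expected in a stable population.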

**Description:**
The program *BOTTLENECK*
computes, for each population sample and for each locus, the distribution of the
heterozygosity expected from the observed number of alleles (*k*), given the
sample size (*n*), under the assumption of mutation-drift equilibrium. This
distribution is obtained by simulating the coalescent process of *n* genes
under three possible mutation models: the IAM, the SMM and the Two-Phase Model
(TPM), which allows multiple-step mutations. Note that, besides the default values,
choosing both a proportion of SMM in the TPM = 0.000 and a variance of the geometric
distribution for TPM = 0.36 corresponds to sensible parameter values for most
microsatellites.
The simulated distribution yields the average expected heterozygosity (Hexp),
which is compared to the observed
heterozygosity (Hobs, in the sense of Nei's *gene diversity*) to establish
whether there is a heterozygosity excess or deficit at this locus. In addition,
the standard deviation (SD) of the mutation-drift equilibrium distribution
of the heterozygosity is used to compute the standardized difference for
each locus ((Hobs-Hexp)/SD). The distribution obtained through simulation
also provides a P-value for the observed heterozygosity.
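The per-locus quantities described above can be sketched as follows. This is an illustration under stated assumptions, not the program's own code: Hobs is taken to be Nei's unbiased gene diversity computed from allele copy counts, SD is the sample standard deviation of the simulated values, and the P-value is the proportion of simulated heterozygosities at least as large as Hobs (a one-tailed definition chosen here for illustration). All function names are hypothetical.

```python
from statistics import mean, stdev


def gene_diversity(allele_counts):
    """Nei's unbiased gene diversity from the copy count of each allele."""
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("at least two gene copies are required")
    sum_sq = sum((c / n) ** 2 for c in allele_counts)
    return n / (n - 1) * (1.0 - sum_sq)


def compare_to_equilibrium(h_obs, simulated_h):
    """Compare Hobs with a simulated equilibrium distribution.

    Returns (Hexp, SD, standardized difference, one-tailed P-value).
    Assumption: the P-value is the share of simulated values >= Hobs.
    """
    h_exp = mean(simulated_h)
    sd = stdev(simulated_h)
    std_diff = (h_obs - h_exp) / sd
    p_value = sum(h >= h_obs for h in simulated_h) / len(simulated_h)
    return h_exp, sd, std_diff, p_value


# Example with made-up numbers: a locus with three alleles.
h_obs = gene_diversity([10, 6, 4])
```

A positive standardized difference corresponds to a heterozygosity excess at the locus, a negative one to a deficit.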

The way in which the coalescent process is simulated is unconventional because
it is conditioned on the observed number of alleles. The phylogeny of the
n genes is simulated as usual (Hudson, 1990). Under the IAM, a single mutation
is allocated at a time and the resulting number of alleles is computed. The
process is repeated until the latter reaches the observed number of alleles.
Under the SMM, a Bayesian approach is used as explained in Cornuet and Luikart
(1996). Briefly, the likelihood distribution of the parameter *theta*
(= 4Neµ) given the number of alleles (*k*) and the sample size
(*n*) is evaluated as the proportion of iterations (in the simulation
process) producing exactly *k* alleles for a varying set of
*thetas*. As a second step, drawing random values of *theta* according
to the likelihood distribution, the coalescent process is simulated as usual.
Only heterozygosities found in iterations producing exactly *k* alleles
are considered.
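The IAM procedure described above can be sketched as follows. This is a minimal illustration, not the program's implementation. It assumes a standard coalescent tree with time measured in coalescent units, that each mutation is placed on a branch with probability proportional to branch length, and that heterozygosity is computed as unbiased gene diversity. The function names are hypothetical.

```python
import random
from collections import Counter


def simulate_tree(n, rng):
    """Simulate a coalescent tree for n genes.

    Returns a list of (branch_length, leaves_below_branch) tuples.
    """
    active = [(0.0, frozenset([i])) for i in range(n)]
    branches = []
    t = 0.0
    while len(active) > 1:
        j = len(active)
        t += rng.expovariate(j * (j - 1) / 2)  # waiting time to next coalescence
        a, b = rng.sample(range(j), 2)         # pair of lineages that merge
        for idx in (a, b):
            start, leaves = active[idx]
            branches.append((t - start, leaves))
        merged = (t, active[a][1] | active[b][1])
        active = [lin for i, lin in enumerate(active) if i not in (a, b)]
        active.append(merged)
    return branches


def allele_counts(n, branches, mutated):
    """Under the IAM, a gene carries the allele created by the mutated
    branch closest to it (the one with the fewest leaves below it)."""
    labels = []
    for leaf in range(n):
        carriers = [b for b in mutated if leaf in branches[b][1]]
        nearest = min(carriers, key=lambda b: len(branches[b][1])) if carriers else None
        labels.append(nearest)  # None stands for the ancestral allele
    return Counter(labels)


def simulate_h_given_k(n, k, rng):
    """Add one mutation at a time until exactly k alleles are present,
    then return the unbiased gene diversity of the sample."""
    if n < 2 or not 1 <= k <= n:
        raise ValueError("need n >= 2 and 1 <= k <= n")
    branches = simulate_tree(n, rng)
    weights = [length for length, _ in branches]
    mutated = set()
    counts = allele_counts(n, branches, mutated)
    while len(counts) < k:
        mutated.add(rng.choices(range(len(branches)), weights=weights)[0])
        counts = allele_counts(n, branches, mutated)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


rng = random.Random(1)
distribution = [simulate_h_given_k(20, 5, rng) for _ in range(200)]
```

Because each added mutation increases the number of alleles by at most one, the loop stops exactly when *k* alleles are reached. Repeating the simulation gives an approximation of the equilibrium distribution of heterozygosity for the given *n* and *k*.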

Once all loci available in a population sample have been processed, the three statistical tests are performed for each mutation model as explained in Cornuet and Luikart (1996) and Luikart et al. (1997a, b). The allele frequency distribution is also established in order to see whether it is approximately L-shaped (as expected under mutation-drift equilibrium) or not (recent bottlenecks provoke a mode shift).
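The mode-shift check can be illustrated with a short sketch. It assumes that allele frequencies pooled over all loci are grouped into ten classes of equal width and that the distribution counts as L-shaped when the lowest-frequency class contains the most alleles. The number of classes, the handling of class boundaries and the function names are assumptions made for illustration.

```python
def frequency_classes(allele_frequencies, n_classes=10):
    """Count alleles in equal-width frequency classes over (0, 1]."""
    counts = [0] * n_classes
    for f in allele_frequencies:
        if not 0 < f <= 1:
            raise ValueError("allele frequencies must lie in (0, 1]")
        # A frequency of exactly 1.0 is placed in the last class.
        idx = min(int(f * n_classes), n_classes - 1)
        counts[idx] += 1
    return counts


def is_mode_shifted(allele_frequencies, n_classes=10):
    """True when the most populated class is not the lowest one.

    Ties are resolved in favour of the lowest class (no shift reported).
    """
    counts = frequency_classes(allele_frequencies, n_classes)
    return counts.index(max(counts)) != 0


# Made-up frequencies: many rare alleles, so the distribution is L-shaped.
stable = [0.05, 0.05, 0.08, 0.30, 0.52]
shifted = is_mode_shifted(stable)
```

With these made-up frequencies the lowest class holds three of the five alleles, so no mode shift is reported.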

**Data file format:** Five data file formats
are accepted and automatically recognized by *BOTTLENECK*. All are text
files. Two are the *GENEPOP* and *GENETIX* formats. The other three formats
concern single-population data. The first line is a title line. Each following
line provides the necessary data for one locus. In all three cases, the line
starts with the name of the locus followed by the number of alleles (*k*).
In the first format, the line then gives the sample size (number of gene
copies = *n*) and the unbiased gene diversity (*sensu* Nei, 1987). In the
second format, the line is completed with the number of copies of each allele.
In the third format, the line gives the sample size (*n*) and the frequency
of each allele. All data on the same line are separated by one or more spaces.
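As an illustration of the second single-population format (locus name, number of alleles, then the number of copies of each allele), a reader for such a file might look like the sketch below. It assumes that locus names contain no spaces. The sample content and the function name `parse_counts_format` are hypothetical and are not taken from the program.

```python
def parse_counts_format(text):
    """Parse a title line followed by one line per locus:
    <locus name> <k> <copies of allele 1> ... <copies of allele k>
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty data file")
    title = lines[0].strip()
    loci = {}
    for row in lines[1:]:
        fields = row.split()  # fields are separated by one or more spaces
        name, k = fields[0], int(fields[1])
        counts = [int(x) for x in fields[2:]]
        if len(counts) != k:
            raise ValueError(f"locus {name}: expected {k} allele counts")
        loci[name] = counts
    return title, loci


# Hypothetical file content with two loci.
sample = """Example population (hypothetical)
LocA 3 10 6 4
LocB 2   12 8
"""
title, loci = parse_counts_format(sample)
```

The sample size *n* of each locus is the sum of its allele copy counts.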

**References:**

Cornuet J.M. and Luikart G., 1996 Description and power analysis of two tests
for detecting recent population bottlenecks from allele frequency data.
*Genetics* 144:2001-2014.
Please cite this article if you use Bottleneck.

Hudson R.R., 1990 Gene genealogies and the coalescent process, pp. 1-42 in Oxford Surveys in Evolutionary Biology, Vol. 7, edited by D. Futuyma and J. Antonovics. Oxford University Press, Oxford.

Luikart G., Allendorf F.W., Cornuet J.M. and Sherwin W.B., 1997. Distortion of allele frequency distributions provides a test for recent population bottlenecks. Journal of Heredity (Accepted July, 1997).

Luikart G. and Cornuet J.M., 1998. Empirical evaluation of a test for identifying recently bottlenecked populations from allele frequency data. Conservation Biology 12(1):228-237.

Luikart G., 1997. Usefulness of molecular markers for detecting population bottlenecks and monitoring genetic change. Ph. D. Thesis. University of Montana, Missoula, USA.

Maruyama T. and Fuerst P.A., 1985 Population bottlenecks and non equilibrium
models in population genetics. II. Number of alleles in a small population
that was formed by a recent bottleneck. *Genetics* 111:675-689.