Abstract
A coalescent argument is used to derive the effective size in simple models with recurrent local extinctions. Several alternative methods of derivation of this result are given and compared to earlier analyses of this problem. The different methods described in this paper all give the same result, which differs from earlier ones. For two published sets of estimates of demographic parameters, metapopulation structure appears to result in a moderate reduction of effective size relative to total adult population size.
Introduction
To determine the effects of local extinctions and recolonizations on genetic diversity and effective size, Slatkin (1977) defined two models, the ‘migrant pool’ and ‘propagule pool’ models. In the former, recolonizers come from the whole metapopulation; in the latter, they preferentially come from a single population. These models were subsequently investigated by a number of works (eg Wade and McCauley, 1988; Whitlock and McCauley, 1990; Whitlock and Barton, 1997; Pannell and Charlesworth, 1999). However, there are several inaccuracies in the methods previously used to compute effective size in these models. This short paper discusses several methods for computing effective size. All of them provide a single expression (equation (7)), which has not been previously given. First, a simple coalescent argument for this result is provided. Then, alternative derivations are used to explain discrepancies with earlier works. The quantitative importance of these discrepancies is briefly discussed. A Mathematica (Wolfram, 1999) notebook performing the computations described in this paper is available on request.
A simple coalescent argument
Effective size is defined to give the asymptotic rate of coalescence of pairs of genes. For pairs of genes in different demes, this rate can be deduced by a two-step argument. First, the two ancestral lineages must gather in the same deme. Then, they may coalesce or separate again in different demes.
No local extinctions
In the island model without local extinctions, this argument develops as follows. Let $m$ be the dispersal probability. The probability that two genes in different demes come from a single deme in the previous generation is, to leading order in $1/n_d$,

$$\frac{1-(1-m)^2}{n_d},$$

where $n_d$ is the number of demes. Thus, on a time scale of $n_d$ generations, the rate at which genes in different demes come to be in a single deme is $1-(1-m)^2$. Then, the ancestral lineages coalesce immediately on this time scale if they have the same parent (probability $1/N$, where $N$ is the number of adults, here haploid, per deme), or if they coalesce in this deme in the recent past rather than separate into different demes again. The probability of the latter event may be written in terms of Wright's $F_{ST}$ measure of population structure, as $(1-1/N)F_{ST}$, since in the island model $F_{ST}$ is approximately (to $O(1/n_d)$) the probability of recent coalescence of genes within demes (Hudson, 1998; Rousset, 2002). Then, the overall rate of coalescence is

$$\left[1-(1-m)^2\right]\left[\frac{1}{N}+\left(1-\frac{1}{N}\right)F_{ST}\right]$$

per $n_d$ generations. For Wright's island model,

$$F_{ST}=\frac{(1-m)^2}{N-(N-1)(1-m)^2},$$

and the whole expression can be written as $(1-F_{ST})/N$. Hence, the effective size is $Nn_d/(1-F_{ST})$, as already given by Wright (1943).
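The island-model argument can be checked numerically. The following is a minimal sketch, not the author's Mathematica notebook: the function names are hypothetical, and it assumes the standard infinite-island expression $F_{ST}=(1-m)^2/[N-(N-1)(1-m)^2]$ for haploid demes, which is the form that makes the rate of coalescence reduce to $(1-F_{ST})/N$.

```python
def island_fst(m, N):
    """F_ST for the infinite island model (assumed standard haploid form)."""
    a = (1.0 - m) ** 2  # probability that neither of two lineages dispersed
    return a / (N - (N - 1) * a)


def island_coalescence_rate(m, N):
    """Rate of coalescence per n_d generations for genes in different demes.

    Product of the rate at which lineages gather in one deme, 1 - (1-m)^2,
    and the probability 1/N + (1 - 1/N) F_ST that they then coalesce.
    """
    a = (1.0 - m) ** 2
    f = island_fst(m, N)
    return (1.0 - a) * (1.0 / N + (1.0 - 1.0 / N) * f)


def island_effective_size(m, N, n_d):
    """Effective size N n_d / (1 - F_ST), valid to leading order in 1/n_d."""
    return N * n_d / (1.0 - island_fst(m, N))


if __name__ == "__main__":
    m, N, n_d = 0.1, 10, 100  # illustrative values only
    f = island_fst(m, N)
    print("F_ST:", f)
    print("two-step rate:", island_coalescence_rate(m, N))
    print("(1 - F_ST)/N :", (1.0 - f) / N)
    print("N_e:", island_effective_size(m, N, n_d))
```

The two printed rates should agree to rounding error, and the effective size exceeds the total number of adults $Nn_d$ whenever $F_{ST}>0$.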
This derivation of effective size is in line with other computations of effective size in ‘structured coalescents with two time scales’ (eg Nordborg, 2001; Wakeley and Aliacar, 2001; Nordborg and Krone, 2002). Here the separation of time scales is obtained for a large number of demes, and the expression for effective size is correct to leading order in 1/nd. Wakeley and Aliacar (2001) have already used such an approach (together with additional approximations) to obtain an expression for the effective size of a metapopulation. Their result is consistent with the more general one presented below.
Using a time scale of $n_d$ generations to derive the effective size is in line with such previous arguments. Alternatively, we can compute the probability that two lineages coalesce in a given ‘current’ generation as the sum over $t$ of probabilities that (1) the two ancestral lineages gathered in the same deme $t$ generations earlier (probability $[1-(1-m)^2]/n_d$), and (2) given $t$, they coalesce in the current generation rather than separate again into different demes. $1/N+(N-1)F_{ST}/N$ then appears as the sum over $t$ of probabilities of the second event.
Local extinctions
With local extinctions, the two demes where the genes are sampled may both have become extinct in the previous generation (probability $e^2$), or neither became extinct (probability $(1-e)^2$), or only one did (probability $2e(1-e)$). If neither became extinct, the probability that the two lineages come from a single deme in the previous generation is, to leading order in $1/n_d$,

$$\frac{1-(1-m)^2}{(1-e)n_d},$$

where $(1-e)n_d$ is the number of parental demes that contribute to the next generation. If one or both demes became extinct, the probability is simply $1/((1-e)n_d)$.
Taking the different cases into account, the overall rate of coalescence is

$$\frac{1}{N_e}\sim\frac{(1-e)^2\left[1-(1-m)^2\right]+1-(1-e)^2}{(1-e)n_d}\left[\frac{1}{N}+\frac{N-1}{N}F_{ST}\right],$$

where $\sim$ means that the computation is correct to leading order in $1/n_d$. This can be written as

$$\frac{1}{N_e}\sim\frac{1-(1-e)^2(1-m)^2}{(1-e)n_d}\,Q_R,$$
where QR≡1/N+(N−1)FST/N is the ‘identity by descent’ among gametes produced by adults within a deme, that is the relatedness of such gametes relative to gametes produced in different demes. QR can be computed, for example, as in Wade and McCauley (1988), or can be deduced from the recursions detailed below. One then obtains that for an infinite number of demes
where $k$ is the number of recolonizers, and where $\varphi=0$ for the migrant pool model and $\varphi=(1-m)^2$ for the propagule pool model. Hence,
This result is consistent with approximation (23) derived by Wakeley and Aliacar (2001) for $\varphi=0$, large $N$, and small $m$ and $e$. Further, equation (7) is also obtained by the following argument, which is longer but allows step-by-step comparison with earlier analyses of Pannell and Charlesworth (1999).
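The case analysis used in this section can be sketched in code. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the three extinction cases are combined exactly as described in the text (probability $[1-(1-m)^2]/((1-e)n_d)$ if neither deme became extinct, $1/((1-e)n_d)$ otherwise), and $Q_R$ is taken as an input rather than computed, since its expression depends on $k$ and $\varphi$.

```python
def gathering_rate(m, e):
    """n_d times the probability that two lineages sampled in different demes
    trace back to a single deme in the previous generation."""
    p_neither = (1.0 - e) ** 2      # neither sampled deme became extinct
    p_some = 1.0 - p_neither        # one or both became extinct
    share = 1.0 / (1.0 - e)         # only (1-e) n_d demes act as parents
    return p_neither * (1.0 - (1.0 - m) ** 2) * share + p_some * share


def inverse_effective_size(m, e, n_d, q_r):
    """1/N_e to leading order in 1/n_d, given the identity q_r among gametes
    produced within a deme (q_r is supplied by the caller)."""
    return gathering_rate(m, e) * q_r / n_d


if __name__ == "__main__":
    m, e = 0.31, 0.1  # illustrative values only
    closed_form = (1.0 - (1.0 - e) ** 2 * (1.0 - m) ** 2) / (1.0 - e)
    print("sum over cases:", gathering_rate(m, e))
    print("closed form   :", closed_form)
    print("2(m + e)      :", 2.0 * (m + e))
```

Summing over cases reproduces the compact factor $[1-(1-e)^2(1-m)^2]/(1-e)$, which reduces to $1-(1-m)^2$ when $e=0$ and approaches $2(m+e)$ when both $m$ and $e$ are small.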
Matrix formulation
The asymptotic rate of coalescence is obtained as $1-\lambda$, and the effective size as its inverse $(1-\lambda)^{-1}$, where $\lambda$ is the largest eigenvalue of the matrix which describes the decrease of gene diversities in the absence of mutation (eg Hill, 1972; Ewens, 1982; Whitlock and Barton, 1997). To construct this matrix, we first consider a system of recursions for probabilities of identity within and among demes, comparable to those of Slatkin (1977) and Pannell and Charlesworth (1999). These recursions include mutation, as do those of Slatkin, but of course they describe the decrease of gene diversities when the mutation rate is set to zero.
The life cycle considered in these models is as follows (Slatkin, 1977). In the absence of extinction, events occur in the following order: gamete production, dispersal, and population regulation, where N adult offspring survive. In each generation any deme can independently become extinct with some probability e. Extinction occurs before reproduction, so that the adults do not contribute anything to the next generation. An extinct deme is immediately recolonized by k colonizers, which reproduce immediately so that there will always be N adults in the next generations in all demes, recolonized or not. Thus, recolonizers experience two rounds of dispersal and reproduction within one ‘generation’, that is within the time only one round is considered for demes that do not become extinct. This assumption was thought to simplify the computation of FST, but can be relaxed (see Discussion). In the ‘migrant pool’ model, colonizers in the propagules are independently sampled from all other demes. In the ‘propagule pool’ model, propagules of k colonizers are formed in nonextinct demes after gamete production and dispersal, and each extinct deme is recolonized by the k members of a single propagule. As noticed by Pannell and Charlesworth (1999), it is actually not required that the extinct and colonized habitats are the same: it is only assumed that a constant number of demes become extinct and that an equal number of habitats are colonized in each generation. For conciseness, I consider N and k where previous authors considered 2N and 2k genes. Let Qi be the probability of identity of genes from different adults, within demes (Q1) and in different demes (Q2). Here we consider the probability of allelic identity in the infinite allele model. Let μ be the mutation rate. We write the recursions for next-generation identities Q1′ and Q2′ as
where
Here:
• Equations (8) and (9) are as in Slatkin (1977) and Pannell and Charlesworth (1999) (except for obvious typos in the latter). For nd → ∞, equilibrium FST is the solution Q1 of the recursion deduced from equation (8) with Q2=0:
This yields equation (6).
• C is a shorthand for a term already considered by these authors.
• Equation (10) is modified from Slatkin (1977) as suggested by Pannell and Charlesworth (1999).
• Equation (11) is modified from Pannell and Charlesworth's equation (A.3) so as to be consistent with the exact recursions in Nagylaki (1983) for $e=0$. Nevertheless, this difference affects neither $F_{ST}$ nor effective size to leading order in $1/n_d$, because equation (11) and their equation (A.3) are identical to first order in $1/n_d$: they differ only by terms of order $1/n_d^2$. Likewise, the $O(1/n_d)$ term in equation (10) affects neither $F_{ST}$ nor effective size to leading order in $1/n_d$. Thus, this term, which represents the probability that two genes immigrating into the same deme come from a common parental deme, can also be neglected in later analyses.
• In equation (12), I use the notation $\varphi$ for the probability that two colonizers have parent(s) in the same deme. If propagule pools are formed after dispersal of gametes (consistently with Slatkin's verbal description of the life cycle), then $\varphi=(1-m)^2+O(1/n_d)$. If propagule pools are formed from locally produced gametes, then $\varphi=1$. In the migrant pool model, $\varphi=O(1/n_d)$. It may be checked that the $O(1/n_d)$ terms in $\varphi$ have no bearing on the results, and they will be ignored below.
There are some inconsistencies among different analyses of the propagule pool model. Slatkin analyzed two different scenarios: ‘model I’ and ‘model II’ (respectively, an islands-mainland model without dispersal between the islands, and a finite island model with migration). As noticed by Wade and McCauley (1988), $F_{ST}$ should be the same in both models when $n_d\to\infty$. For the propagule pool model, equation (15) of Wade and McCauley (1988) is consistent with equation (6) of Slatkin (1977); they imply $\varphi=(1-m)^2$. But Slatkin's and Pannell and Charlesworth's systems of recursions for model II yield different results; they imply $\varphi=1$. Of course, both cases can be considered, provided they are distinguished.
The probability of common origin of colonizers φ was first considered by Whitlock and McCauley (1990), but I was unable to match their equations with the above ones. In particular, their equation (4) for identity among gametes implies that there are two successive reproductions at recolonization if φ≠1 (as seen from terms of order 1/(kN) in the recursions, and consistently with Slatkin's description of the life cycle), while there is only one reproduction at recolonization if φ=1 (only terms of order 1/k appear, so gametes are not produced by N adults at some stage). Thus, their equations may not correspond to a well-defined life cycle.
• The main discrepancy with earlier expressions is in equation (13), which has the factor Q1+(1−Q1)/N instead of Q1 in Slatkin (1977) and Pannell and Charlesworth (1999). This difference takes into account that when genes from different demes originate from parent(s) in a single deme, they may actually have the same parent and coalesce. This does not affect FST values, but it does affect Ne values.
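The corrected factor can be related to the quantity $Q_R$ used in the coalescent argument. The identity below is only a rearrangement, assuming that $Q_1$ plays the role of $F_{ST}$ as it does in the limit of many demes:

```latex
Q_1+\frac{1-Q_1}{N}
  \;=\; \frac{1}{N}+\frac{N-1}{N}\,Q_1
  \;=\; Q_R
  \;\ge\; Q_1 .
```

The extra term $(1-Q_1)/N$ is the probability that two gametes from the same deme share a parent and were not already identical, so using $Q_1$ alone understates the rate of coalescence for genes that have just gathered in one deme.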
From the above equations, one can derive the probabilities gij that a pair of genes of type i (i=1 for pairs within a deme and 2 for pairs in different demes) derives without coalescence from a pair of genes of type j. They are
Ne is obtained as $(1-\lambda)^{-1}$, where λ is the largest eigenvalue of G≡(gij). To leading order in 1/nd, the expression for Ne can be deduced from a perturbation approximation for λ near the limit nd → ∞. Classical expressions for perturbation approximations (eg Horn and Johnson, 1985; Charlesworth, 1994; Caswell, 2001) here take the form
where y and e are left and right eigenvectors associated with the largest eigenvalue (here 1) of the unperturbed matrix and dG are the terms of order 1/nd in
By this method, one again obtains equation (7). It differs from the result implied by the same method but with Pannell and Charlesworth's system of recursions, as expected from the difference in the definition of B1 (equation (13)). Differences between the rate of coalescence predicted from the above recursions and from those of Pannell and Charlesworth are only of order e(e+m), and the maximum differences I have found numerically were by factors of ≈1.5 to 2.2 for k=N, large m, and e from 0.25 to 0.7 (details not shown). Approximations comparable to those of their Table 2 for total diversity πT can be derived from equation (7). It appears that a factor (1+Ne/k) is missing from the denominator of their central and right-hand approximations for πT when e⩽m. This may imply some reductions in effective size.
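The perturbation step can be illustrated on a toy example. The sketch below is not the matrix G of this paper: the entries of `g0` and `dg` are hypothetical numbers chosen only so that the unperturbed matrix has largest eigenvalue 1, and the function names are invented for illustration. It uses the classical first-order result $\lambda\approx 1+(y\,\mathrm{d}G\,e)/(y\,e)$ and compares it with the exact largest eigenvalue of a 2×2 matrix.

```python
import math


def largest_eigenvalue_2x2(g):
    """Exact largest eigenvalue of a 2x2 matrix with real eigenvalues."""
    (a, b), (c, d) = g
    trace, det = a + d, a * d - b * c
    return (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0


def first_order_eigenvalue(g0, dg):
    """1 + (y dG e)/(y e) for an unperturbed matrix g0 = [[a, b], [0, 1]].

    y and e_vec are the left and right eigenvectors of g0 for eigenvalue 1.
    """
    (a, b), _ = g0
    e_vec = (b / (1.0 - a), 1.0)   # right eigenvector: g0 e = e
    y_vec = (0.0, 1.0)             # left eigenvector:  y g0 = y
    y_dg = (y_vec[0] * dg[0][0] + y_vec[1] * dg[1][0],
            y_vec[0] * dg[0][1] + y_vec[1] * dg[1][1])
    numerator = y_dg[0] * e_vec[0] + y_dg[1] * e_vec[1]
    denominator = y_vec[0] * e_vec[0] + y_vec[1] * e_vec[1]
    return 1.0 + numerator / denominator


if __name__ == "__main__":
    n_d = 100
    eps = 1.0 / n_d
    g0 = [[0.4, 0.5], [0.0, 1.0]]                # hypothetical limit n_d -> infinity
    dg = [[0.0, 0.0], [0.6 * eps, -0.7 * eps]]   # hypothetical O(1/n_d) terms
    g = [[g0[i][j] + dg[i][j] for j in range(2)] for i in range(2)]
    lam_exact = largest_eigenvalue_2x2(g)
    lam_approx = first_order_eigenvalue(g0, dg)
    print("exact  N_e:", 1.0 / (1.0 - lam_exact))
    print("approx N_e:", 1.0 / (1.0 - lam_approx))
```

In this toy case the two eigenvalues differ only by a term of order $1/n_d^2$, which is why the first-order formula is enough to obtain Ne to leading order in $1/n_d$.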
Alternative coalescent arguments
Alternative derivations of effective size also allow comparison with Whitlock and Barton's methods. First, the rate of coalescence may be obtained as the probability ξ that two different lineages that have not already coalesced are in gametes from the same deme, times the probability πc that two gametes from one deme coalesce within one generation. Here, distinguishing whether the deme has just been recolonized or not,
and ξ is the equilibrium solution of the recursion
Here ξ′ is the probability ξ considered one generation later, and the factor of ξ′ is the probability that two gametes produced in the same deme originate from two different gametes produced in a single deme one generation before (compare with the denominator of QR in equation (6)). The remainder is the probability that genes in different demes come from gametes in the same deme (see equations (4) and (5)). It is straightforward to check that ξπc is 1/Ne as given by equation (7).
A variant of this argument is to compute ξ as (1−FST)ρ where ρ is the equilibrium solution of the recursion
obtained from equation (2) by ignoring the coalescence terms 1/N and 1/k. The rationale for the computation of ρ is given in the Appendix. This approach again yields equation (7), but now 1/Ne is expressed in the form (1−FST)ρπc, which allows a comparison with an argument on p. 434 of Whitlock and Barton (1997). They derive 1/Ne from their equation (13), in the form
However, if effective size was of this form, then their ϑx should depend on k (as ρπc does). This is not the case. The resulting formula tends to overestimate effective size, possibly by a factor of 100 or more for φ=1 and e≫m (details not shown). It also conflicts with the approximation 1/Ne≈2(m+e)FST/nd given by Whitlock and Barton (1997) and further considered by Pannell and Charlesworth (1999). Much the same can be said of their equation (22), which is correct only when k → ∞ (for φ=0). A possible explanation for these discrepancies is that results are derived from their equation (3), which does not hold in Slatkin's models (see the Appendix).
The approximation 1/Ne≈2(m+e)FST/nd is valid, but in need of a general argument. This approximation can be deduced simply by expressing QR as a function of FST≡Q using equation (15), plugging the result in equation (5), and simplifying for small m and e.
Discussion
It should be a relief to everyone that effective size can be obtained by the simple coalescent argument leading to equation (4). Such arguments efficiently yield expressions for effective size in more complex metapopulations with variable deme size (Rousset, in press). However, the coalescent argument has been obscured by earlier analyses (except Wakeley and Aliacar, 2001), which conflict with the present results. Previous recursions for probabilities of identity in Slatkin (1977), Whitlock and McCauley (1990) and Pannell and Charlesworth (1999) are inconsistent with Slatkin's life cycle and do not correspond to another well-defined life cycle. These discrepancies affect expressions for FST in Whitlock and McCauley (1990) and Whitlock and Barton (1997) and for effective size in Whitlock and Barton (1997) and Pannell and Charlesworth (1999). Quantitatively, effective size differs slightly from the expression resulting from Pannell and Charlesworth's system of recursions, and may differ substantially from equation (22) of Whitlock and Barton (1997) (for φ=0) or from results based on their equation (13).
Expectedly, the present results support the intuitive conclusion that extinctions reduce the effective size, which previous works had reached. The simple coalescent argument easily yields equation (5), which shows that propagule size k and probability of common origin φ affect effective size only through their effects on QR, that is on FST. Also as expected, lower k and higher φ reduce the effective size.
The assumption that two successive reproduction events occur right after extinction when only one occurs in nonextinct demes may seem unnatural and is easily relaxed (eg Whitlock et al, 1993), but results will then depend on additional assumptions about the life cycle, that is whether demes of k colonizers produce as many juveniles as demes of N individuals. If so, equation (5) is still valid, giving Ne in terms of the identity QR among gametes produced within a deme. QR obeys a recursion of the form
A concrete illustration of the different formulas is obtained by applying equation (5) to two sets of estimates of demographic parameters from the literature. Whitlock (1992) estimated 2N=21.7 (gene copies), m=0.31, φ=0.5, e=0.1, 2k=10.6 (gene copies) in the beetle Bolitotherus cornutus. The ratio Ne/(Nnd) is 0.67 or 0.72, depending on whether an intercalary generation is assumed at recolonization or not. Ingvarsson et al (1997) estimated 2N=22.2 (gene copies), m=0.366, φ=0.5, e=0.255, 2k=8 (gene copies) in the beetle Phalacrus substriatus. The ratio Ne/(Nnd) is likewise 0.35 or 0.40. Thus, the overall effect of population structure seems to be a moderate reduction of effective size, whatever formula is used. Substantially larger reductions in effective size may occur for lower numbers k of colonizers relative to N. How often this occurs is an empirical question.
References
Caswell H (2001). Matrix Population Models. Sinauer: Sunderland, MA.
Charlesworth B (1994). Evolution in Age-structured Populations, 2nd edn. Cambridge University Press: Cambridge.
Ewens WJ (1982). On the concept of the effective population size. Theor Popul Biol 21: 373–378.
Hill WG (1972). Effective size of populations with overlapping generations. Theor Popul Biol 3: 278–289.
Horn RA, Johnson CR (1985). Matrix Analysis. Cambridge University Press: Cambridge.
Hudson RR (1998). Island models and the coalescent process. Mol Ecol 7: 413–418.
Ingvarsson PK, Olsson K, Ericson L (1997). Extinction–recolonization dynamics in the mycophagous beetle Phalacrus substriatus. Evolution 51: 187–195.
Nagylaki T (1983). The robustness of neutral models of geographical variation. Theor Popul Biol 24: 268–294.
Nordborg M (2001). Coalescent theory. In: Balding DJ, Bishop M, Cannings C (eds.) Handbook of Statistical Genetics, John Wiley & Sons: Chichester, UK. pp 179–212.
Nordborg M, Krone SM (2002). Separation of time scales and convergence to the coalescent in structured populations. In: Slatkin M, Veuille M (eds.) Modern developments in Theoretical Population Genetics, Oxford University Press: Oxford. pp 194–232.
Pannell JR, Charlesworth B (1999). Neutral genetic diversity in a metapopulation with recurrent local extinction and recolonization. Evolution 53: 664–676.
Rousset F (2002). Inbreeding and relatedness coefficients: what do they measure? Heredity 88: 371–380.
Rousset F (in press). Genetic Structure and Selection in Subdivided Populations. Princeton University Press: Princeton, NJ.
Slatkin M (1977). Gene flow and genetic drift in a species subject to frequent local extinctions. Theor Popul Biol 12: 253–262.
Wade M, McCauley D (1988). Extinction and recolonization: their effects on the genetic differentiation of local populations. Evolution 42: 995–1005.
Wakeley J, Aliacar N (2001). Gene genealogies in a metapopulation. Genetics 159: 893–905.
Whitlock MC (1992). Nonequilibrium population structure in forked fungus beetles: extinction, colonization, and the genetic variance among populations. Am Nat 139: 952–970.
Whitlock MC, Barton NH (1997). The effective size of a subdivided population. Genetics 146: 427–441.
Whitlock MC, McCauley DE (1990). Some population genetic consequences of colony formation and extinction: genetic correlations within founding groups. Evolution 44: 1717–1724.
Whitlock MC, Phillips PC, Wade MJ (1993). Gene interaction affects the additive genetic variance in subdivided populations with migration and extinction. Evolution 47: 1758–1769.
Wolfram S (1999). The Mathematica Book, 4th edn. Wolfram Media/Cambridge University Press: Cambridge.
Wright S (1943). Isolation by distance. Genetics 28: 114–138.
Acknowledgements
I thank J Pannell for very helpful comments on the manuscript. I also thank M Slatkin for comments and M Whitlock for some answers. This is paper ISEM 03-024.
Appendix
Migration matrix approach and rationale for ρ: We can write the recursions (8) and (9) in the form
where Q is the column vector (Q1,Q2). Here 1−Q1 is the gain in identity when a coalescence event occurs, hence the elements i1 of à must be the probabilities of coalescence of a pair i of genes in the previous generation; and A are the remaining factors of the Q's in recursions (8) and (9), which describe the transitions of a pair of lineages between the states ‘within the same deme’ and ‘in different demes’. However, coalescence events are ignored in the definition of A. A recursion of the form of equation (A.1) reduces to equation (3) of Whitlock and Barton (1997) if Ã=A and if A can be written as the direct product with itself of the migration matrix for single genes, but neither condition holds here.
The elements of the left eigenvector ɛ of A associated with eigenvalue 1 give the probabilities that ancestral pairs of genes are within the same deme or in different demes, given the A matrix of transition probabilities, where coalescence events are ignored. The rationale for the computation of ρ from equation (23) is to give ɛ in the form ɛ∼(ρ,1−ρ).
Premultiplying equation (26) by ɛ yields, for μ=0,
hence the second term on the right is the absolute reduction in gene diversity in one generation. Thus the relative reduction in diversity per generation is
As (ɛ·Ã)1∼ρπc, we recover the derivation of 1/Ne as ρπc(1−FST).
Rousset, F. Effective size in simple metapopulation models. Heredity 91, 107–111 (2003). https://doi.org/10.1038/sj.hdy.6800286