The questions that population biologists are currently asking, and attempting to answer using molecular genetic markers, have overstretched many of the standard analytical tools of population genetics. Computer simulation of the demographic history of populations, with a focus on the most important evolutionary forces that bear upon the questions of interest, can be a useful complementary tool. The approach is most informative when used with a specific species or population in mind, one for which there is enough basic knowledge to select a simulation model that is representative of the evolutionary forces at play without being too cumbersome. In a recent and highly informative study, Edmonds et al (2004) adopted this approach to infer the origin of neutral mutations in a population whose range has expanded from a single source, the real case in mind being the radiation of humans from a small population in East Africa to the whole world over the last 50 000 years.

What is known about the spread of humans from various types of genetic data, as well as from archeological evidence, is best represented by a model of dispersal called demic diffusion (Ammerman and Cavalli-Sforza, 1984; Cavalli-Sforza and Feldman, 2003). In this model, individuals in a population undergoing logistic growth move in no preferred direction to adjacent areas and form demes, that is, locally interbreeding subpopulations. The demes exchange migrants at a constant rate, resulting in colonization that progresses radially outwards from the point of origin, akin to an advancing wave. Its edge of advance, or wave front, is more or less a closed circle that moves at a rate that increases with both the growth rate and the migration rate (in Fisher's classical wave-of-advance formulation, in proportion to the square root of their product).
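The mechanics of demic diffusion can be illustrated with a deliberately simplified sketch. The code below is not the simulation used by Edmonds et al (2004): it is a deterministic, one-dimensional stepping-stone model, and the function name and all parameter values (r, m, K, the number of demes) are assumptions chosen only for illustration. Each generation, every deme grows logistically and then sends a fraction m/2 of its members to each neighbor; the wave front is taken to be the farthest deme that has reached half of its carrying capacity.

```python
def simulate_wave(n_demes=60, r=0.5, m=0.2, K=100.0, generations=40):
    """Deterministic 1-D stepping-stone sketch of demic diffusion.

    r: per-generation growth rate, m: migration rate, K: deme carrying
    capacity. All values are illustrative assumptions.
    """
    sizes = [0.0] * n_demes
    sizes[0] = K  # a single source deme, already at carrying capacity
    front_positions = []
    for _ in range(generations):
        # Logistic growth within each deme.
        grown = [n + r * n * (1.0 - n / K) for n in sizes]
        # Nearest-neighbor migration: m/2 of each deme moves to each side.
        new_sizes = [0.0] * n_demes
        for i, n in enumerate(grown):
            new_sizes[i] += n * (1.0 - m)
            left, right = i - 1, i + 1
            # Reflecting boundaries: would-be emigrants past an end stay put.
            new_sizes[left if left >= 0 else i] += n * m / 2.0
            new_sizes[right if right < n_demes else i] += n * m / 2.0
        sizes = new_sizes
        # Wave front: farthest deme at or above half of carrying capacity.
        front = max((i for i, n in enumerate(sizes) if n >= K / 2.0), default=0)
        front_positions.append(front)
    return sizes, front_positions


if __name__ == "__main__":
    _, front = simulate_wave()
    print("front position by generation:", front)
```

Running the sketch with a larger m or r moves the front farther in the same number of generations, which is the qualitative behavior the model description implies. A stochastic, two-dimensional version with mutation would be needed to study the fate of individual wave-front mutants.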

Edmonds et al (2004) used the demic diffusion model to consider the fate of mutations that arise at the wave front. They selected a set of demographic parameters that represent the dynamics of uniparentally transmitted genetic markers such as the non-recombining portion of the Y chromosome (NRY) and mtDNA. In those simulations where at least one wave-front mutant persisted to the end of the run, several striking trends were apparent.

The mutants either stayed around their place of origin at low frequencies or traveled along with the wave front in the direction of colonization and attained higher frequencies. In the latter case, the centroid of the spatial distribution of the traveling mutants (corresponding to the deme whose coordinates are the average latitude and longitude of all the mutants) was approximately midway between the deme of origin and the edge of expansion. This led to the proposal of a general rule for inferring the origin of a traveling wave-front mutation, referred to as the ‘twice the distance’ rule. Accordingly, the origin of a wave-front mutant lies between the centroid and the origin of expansion, and the distance from the mutant origin to the centroid is equal to the distance from the centroid to the edge of the expansion.
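Geometrically, the rule amounts to reflecting the edge of the expansion through the centroid of the mutant carriers. The sketch below shows that arithmetic under simplifying assumptions that are not part of the original study: coordinates are treated as points on a flat plane (averaging latitude and longitude is only a rough approximation over large distances), the point on the expansion edge in the direction of travel is supplied by the user, and the function names are hypothetical.

```python
def centroid(points):
    """Mean position of the mutant carriers; points are (x, y) pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def infer_origin(carriers, edge_point):
    """Apply the 'twice the distance' rule on a flat plane.

    The inferred origin is placed so that the centroid is midway between
    it and edge_point, i.e. origin = 2 * centroid - edge_point.
    """
    cx, cy = centroid(carriers)
    ex, ey = edge_point
    return (2.0 * cx - ex, 2.0 * cy - ey)


if __name__ == "__main__":
    # Hypothetical carrier locations with centroid (5, 0); edge at (8, 0).
    carriers = [(4.0, 0.0), (6.0, 0.0), (5.0, 1.0), (5.0, -1.0)]
    print(infer_origin(carriers, (8.0, 0.0)))  # (2.0, 0.0)
```

In this toy case the centroid lies 3 units behind the edge, so the inferred origin lies a further 3 units behind the centroid.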

The applicability of this inference rule to other organisms hinges on colonization proceeding as a wave front from a single source. In general, wave-front colonization occurs when the distance that dispersers travel from their point of origin is either a single step, as in the demic diffusion model, or can be represented by a dispersal function with exponentially bounded tails (Mollison, 1977). If, on the other hand, the tails of the dispersal function are thicker, implying a relatively higher rate of long-distance migration, the population tends to spread in leaps and bounds, forming colonies far enough ahead of the wave of advance to act as new foci. The intervening territory between these foci and the original wave front is then colonized from multiple directions.
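The contrast between thin and thick tails can be made concrete by comparing two dispersal functions with the same mean dispersal distance. The sketch below is illustrative only: the choice of an exponential kernel and a Lomax (Pareto type II) power-law kernel, the function names and the parameter values are all assumptions, not distributions taken from the studies cited here. With scale 1 and shape 2, the Lomax kernel has the same mean (1 unit) as the exponential kernel, yet it places far more probability on very long moves.

```python
import math


def exponential_tail(d, mean=1.0):
    """P(dispersal distance > d) for an exponential kernel (thin tail)."""
    return math.exp(-d / mean)


def power_law_tail(d, scale=1.0, alpha=2.0):
    """P(dispersal distance > d) for a Lomax kernel (thick tail).

    For alpha > 1 the mean distance is scale / (alpha - 1).
    """
    return (1.0 + d / scale) ** (-alpha)


if __name__ == "__main__":
    for d in (1, 5, 10, 20):
        thin, thick = exponential_tail(d), power_law_tail(d)
        print(f"d={d:>2}  exponential={thin:.2e}  power law={thick:.2e}")
```

At ten times the mean distance, the thick-tailed kernel sends well over a hundred times as many dispersers as the exponential one, which is the kind of rare long-distance movement that can found colonies ahead of the main front.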

Nichols and Hewitt (1994) demonstrated how this type of rare long-distance dispersal can generate patterns of introgression very different from those of wave-front colonization when two genetically distinct populations of the same species come into contact as they spread into new territory. Another simulation study of colonization from a single homogeneous source, similar to that of Edmonds et al (2004), revealed that where the distribution of dispersal distances is more leptokurtic (more sharply peaked and heavier-tailed) than the normal distribution, that is, with more short-distance, fewer intermediate-distance and more long-range dispersers than the normal, striking patchiness of alleles and genotypes can arise as a result of the establishment of bridgeheads in advance of the major invasion front (Ibrahim et al, 1996).

As with most simulation-based inferences, the ‘twice the distance’ rule is only applicable to species whose demographic history does not deviate significantly from the assumptions of the simulation model. In the case of the agriculture-mediated spread of humans, particularly into Europe, it has been amply demonstrated using genetic and archeological data that wave-front diffusion from a single source is an adequate depiction. In many other documented colonization histories, however, the impact of long-distance colonists cannot be ignored (Hengeveld, 1989; Nathan et al, 2002; Hewitt, 2004). In these cases, colonization is stratified, that is, it involves diffusion as well as jump dispersal, and the fate of mutations that arise either in the bridgehead colonists or in the main wave front is unlikely to conform to the ‘twice the distance’ rule. Exploring these alternative demographic scenarios using simulations could be informative too, particularly now that the Edmonds study has established reference trends that can be used for comparison.

A number of other insights were gained from the simulations by Edmonds et al (2004). Assuming, as is often done, that a mutation is likely to be most abundant at or near its place of origin can be erroneous. In the case of the most informative mutations (ie those that reach high frequency), the origin is closer to the beginning of the expansion than to the centroid. As an alternative to the ‘twice the distance’ rule, they proposed that an estimate of the age of a mutation from its genealogy can be used to place the origin on the line that connects the beginning and the end of an expansion passing through the centroid of the mutation. Finally, additional avenues of research that may predict more exactly the location of mutational events have been identified.
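One possible reading of the age-based alternative can be sketched as follows. This is an assumption-laden illustration, not the procedure of Edmonds et al (2004): it supposes that the front advanced at a constant speed along a straight line from the source to the present edge, that the mutation arose at the front, and that the duration of the expansion and the age of the mutation are known in the same units. The function name and the numbers in the example are hypothetical.

```python
def place_origin_by_age(source, edge, expansion_duration, mutation_age):
    """Place a mutation's origin on the source-to-edge line from its age.

    Assumes a constant front speed and a mutation that arose at the front,
    so the front had covered a fraction
    (expansion_duration - mutation_age) / expansion_duration
    of the line when the mutation appeared.
    """
    if not 0 <= mutation_age <= expansion_duration:
        raise ValueError("mutation_age must lie within the expansion period")
    fraction = (expansion_duration - mutation_age) / expansion_duration
    return (
        source[0] + fraction * (edge[0] - source[0]),
        source[1] + fraction * (edge[1] - source[1]),
    )


if __name__ == "__main__":
    # Hypothetical values: a 50 000-year expansion and a 40 000-year-old
    # mutation, on a line running from (0, 0) to (10, 0).
    print(place_origin_by_age((0.0, 0.0), (10.0, 0.0), 50000, 40000))
```

Under these assumptions an older mutation is placed closer to the source of the expansion, consistent with the observation that high-frequency mutations tend to originate early.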