Introduction

The ability to compare patterns of genetic diversity between extinct and extant populations is of great importance in conservation genetics because it allows the assessment of past population structures and documents the consequences of random genetic drift in populations that declined as a consequence of either natural or human-induced processes (Roy et al. 1994). Until recently, genetic studies of small and endangered populations have suffered from the problem that reliable estimates of the pre-exploitation diversity were usually not available. Consequently, the low levels of genetic diversity observed in many extant populations could not be used to reliably date any past population reductions that may have occurred (Fréville et al. 1998; Tessier and Bernatchez 1999; Pichler and Baker 2000).

Genetic studies of rare and endangered species can be extended by adding historical samples from museum (Thomas et al. 1990; Ellegren 1991; Nielsen et al. 1999; Rosenbaum et al. 2000) and herbarium collections (Saltonstall 2002). This progress has been made possible by the technical advancement of PCR-based methods that now allow the analysis of ancient specimens from minute tissue samples, so that sampling need not compromise the value of rare, often irreplaceable, historic herbarium specimens (Maunder et al. 1999; Drábková et al. 2002). However, DNA extracted from such historic specimens is often highly degraded as a consequence of long sample storage (Towne and Devor 1990; Roy et al. 1994; Savolainen et al. 1995; Drábková et al. 2002). Consequently, short DNA fragments from molecules that are present in high copy numbers, such as the mitochondrial (Thomas et al. 1990; Taberlet and Bouvet 1992) and plastid (cpDNA) genomes, may represent the best targets for studies on herbarium specimens (Savolainen et al. 1995; Fay and Cowan 2001).

The phylogeographic history of the rare marsh orchid Anacamptis palustris (Orchidaceae) has recently been reconstructed using a highly polymorphic plastid region with a complex molecular evolution. Cozzolino et al. (2003a) referred to the different sequence rearrangements observed as haplotypes, and to the length polymorphisms observed in some haplotypes as alleles. Patterns of haplotype and allele variation in this hypervariable plastid region were found to be very useful for documenting, at a remarkably fine geographic scale, patterns of genetic variation among extant populations located along the Italian peninsula (Cozzolino et al. 2003a). The phylogeographic scenario that has been deduced suggests that A. palustris has a center of diversification in north-eastern Italy, where the highest plastid diversity was observed and where the largest extant populations exist (Cozzolino et al. 2003b). However, comparisons of genetic diversity among north-eastern Italian populations and central and southern Italian populations revealed a striking pattern of discontinuity. While north-eastern Italian populations along the Adriatic Sea were characterized by a large number of alleles at the widespread haplotype N (according to Cozzolino et al. 2003a), the central and southern Italian populations along the coasts of the Tyrrhenian Sea mostly harbored different, exclusive haplotypes or only alleles with small repeat numbers (short alleles) at the N haplotype. This haplotype pattern may be a consequence of the relatively recent establishment of populations along the coasts of the Tyrrhenian Sea by seeds carrying short alleles, which is plausible because short alleles are common in the potential source populations in north-eastern Italy. The absence of long alleles could then be a consequence of the young age of these populations, which has not allowed long alleles to evolve in the populations from the Tyrrhenian coast.
Alternatively, the observed pattern of genetic variation in these populations may be an artifact, brought about by small population sizes as a probable consequence of human impact (Petit et al. 1998; Cozzolino et al. 2003b). Strong bottlenecks could lead to a severe loss of allelic variation and the survival of only the most common alleles (Ellegren 2000). At the same time, the exclusive haplotypes in populations along the coasts of the Tyrrhenian Sea may either be what remains after severe reductions of haplotype diversity and distribution due to genetic drift, or may have evolved locally.

Since Roman times, coastal marshes (the habitat of A. palustris) along the Italian peninsula have been extensively drained to reduce breeding grounds of the malaria vector Anopheles, and to recover suitable land for agriculture. Further massive reductions of these habitats have occurred in recent times, between the First and Second World Wars, when extensive drainage of coastal marsh habitats was promoted by the national government (Bevilacqua and Rossi-Doria 1984). Consequently, marsh habitats that formerly extended over hundreds of kilometers along the Italian coasts are now restricted to a few habitat patches that are often strongly isolated from neighboring patches. This strong anthropogenic impact has most likely caused significant reductions of A. palustris population sizes along the Tyrrhenian coasts. In the absence of data from pre-drainage A. palustris populations, it was unclear whether observed levels of genetic variation are a consequence of these most recent habitat destructions or date further back.

We therefore assessed genetic variation using historic herbarium specimens collected before the Second World War and compared the results with those previously reported for extant A. palustris populations. Specifically, we asked (a) whether levels of genetic diversity in extant populations are reduced compared to historic samples, (b) whether the rare and exclusive haplotypes found in extant populations from the Tyrrhenian coasts were once widespread, or whether they have evolved locally and failed to spread, and (c) whether long alleles at the N haplotype were once present in populations from the Tyrrhenian coasts and have been lost recently, or whether their presence in extant populations from the Tyrrhenian coast is due to recent recruitment from Adriatic populations.

Methods

DNA amplification and sequencing

Major Italian and Swiss herbarium collections were searched for historical specimens of A. palustris collected between the early nineteenth century and the Second World War (see Table 1 for the list of all historical specimens contributing data to this analysis). All herbarium specimens were visually inspected and all available information on the collection labels was recorded. For DNA analysis, a small portion of a leaf tip from the herbarium specimen (approx. 5 × 5 mm) was sampled with sterile forceps, and DNA was extracted according to Doyle and Doyle (1987) and resuspended in 10 μl of distilled water. PCR amplification of a part of the plastid tRNALEU intron was carried out using two specific primers and reaction conditions as described in Cozzolino et al. (2003c), with 5 μl of resuspended DNA as PCR template.

Table 1 List of examined herbarium A. palustris accessions

Precautions were taken to guard against contamination (Glenn et al. 1999; Miller and Waits 2003; Gilbert et al. 2005), although the use of species-specific PCR primers strongly reduced the likelihood of contamination. DNA extractions and PCR amplifications were carried out in a separate building (University of Naples) where no A. palustris DNA had ever been investigated before, and control extractions (without plant material) and control PCR reactions (without DNA) were run as negative controls. Amplification products were visualized on 2% MetaPhor gels (FMC, Rockland, ME, USA) stained with ethidium bromide, using a 50 bp ladder (FMC) as standard, and photographed using a digital camera. Amplification products that did not produce visible bands on the gel were subjected to ten additional amplification cycles after adding additional polymerase. Finally, the remaining 5 μl of resuspended DNA were used either for independent verification of the results in an independent laboratory (University of Calabria) or as material for additional experiments for samples that did not successfully amplify in the first attempt. Approximately one-third of the samples were independently examined twice.

All PCR products were sequenced to check the repeat type (haplotype) and the nature of the length polymorphism (allelic variation). Out of 110 examined herbarium specimens, a total of 89 samples (81%), including 85 samples from the coasts of the Tyrrhenian Sea, were successfully amplified and sequenced. PCR products were purified and cycle-sequenced using BigDye Terminator sequencing chemistry on an ABI Prism 310 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). Sequence Navigator software (Applied Biosystems) was used to inspect electropherograms and to align sequences.

Data analysis

Analyses were performed to determine if the pattern of genetic variation in the extant populations is different from that of the historic samples. Because herbarium specimens have typically not been collected for the purpose of conducting analyses of genetic diversity and population structure, sample sizes are often inadequate for a given locality and time period and the amount of information available for the different populations is usually limited (Nielsen et al. 1999). To partially overcome this problem in the present study we grouped individuals from geographically close populations together into regions. Thus we carried out population pooling both for historic samples and for extant samples in order to gather, wherever possible, comparable historic and extant samples for each region.

This methodological choice can be justified by the fact that most of the historic and extant populations (see Fig. 1) originated from the same, narrowly circumscribed geographic area (typically marshes along the coasts that are separated by a few kilometers). Consequently we do not expect that pooling samples from different, but geographically close, populations substantially distorts estimates of genetic diversity (HT) and population differentiation (GST), which can occur when samples from distant and isolated populations are pooled.
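The pooling step and the two summary statistics named above can be illustrated with a minimal sketch. It uses Nei's basic definitions (gene diversity as one minus the sum of squared haplotype frequencies, and GST = (HT − HS)/HT) with equal population weights and no small-sample correction, so it is a simplification and not the Pons and Petit (1995) estimators implemented in HAPLODIV. The function names and the haplotype counts are hypothetical.

```python
from collections import Counter

def pool(populations):
    """Merge haplotype counts of geographically close populations into one region."""
    total = Counter()
    for counts in populations:
        total.update(counts)
    return dict(total)

def gene_diversity(counts):
    """Nei's gene diversity, 1 - sum(p_i^2), without small-sample correction."""
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def gst(regions):
    """Return (HS, HT, GST) for a list of per-region haplotype counts.

    Simplifying assumption: regions are weighted equally.
    """
    k = len(regions)
    hs = sum(gene_diversity(r) for r in regions) / k
    haplotypes = set().union(*regions)
    # Total diversity is computed from the mean haplotype frequencies across regions.
    mean_freq = {
        h: sum(r.get(h, 0) / sum(r.values()) for r in regions) / k
        for h in haplotypes
    }
    ht = 1.0 - sum(f ** 2 for f in mean_freq.values())
    g = (ht - hs) / ht if ht > 0 else 0.0
    return hs, ht, g

# Hypothetical counts: two nearby marsh populations pooled into region 1.
region1 = pool([{"D": 5, "N": 2}, {"D": 3}])   # -> {"D": 8, "N": 2}
region2 = {"D": 10}
hs, ht, g = gst([region1, region2])
print(f"HS={hs:.3f} HT={ht:.3f} GST={g:.3f}")
```

Pooling first and estimating afterwards mirrors the order described in the text; weighting regions by sample size would be an equally defensible choice.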

Fig. 1 Geographic location of extant (filled) and historic (empty) populations from the Tyrrhenian coast

Changes in allele frequencies over time can also be explained by stochastic effects alone, such as genetic drift and sampling error. Thus, temporal differences in allelic frequencies were analyzed within regions following Waples (1989) in order to test whether the observed temporal variation in haplotype frequencies could be due entirely to stochastic processes (drift and sampling error) in a finite population. Analyses were conducted assuming 5 years per orchid generation, and selection effects on haplotype frequencies were ignored because we are concerned with neutral markers. A significant test (at the 0.05 level) implies that pure drift and sampling error are not sufficient to explain the difference in haplotype frequencies (i.e., haplotype frequencies have changed more than expected from drift alone) and that other effects, such as migration or bottlenecks, need to be considered to explain the changes.
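The logic of such a temporal test can be sketched for a single haplotype: the squared frequency change is divided by the variance expected from drift plus sampling, and the ratio is compared with a chi-square distribution with one degree of freedom. The sketch below is a simplified illustration in the spirit of Waples (1989), not a reproduction of his multi-allele statistic or of the authors' calculation. It assumes a haploid, uniparentally inherited marker, samples drawn independently of the breeding population, and a known effective size; the function name, the effective size `ne` and the number of generations are hypothetical, whereas the sample sizes (85 and 316) and the 9.4% historic frequency of allele A2 come from the text.

```python
import math

def drift_chi2(x, y, s0, st, generations, ne):
    """Chi-square (1 df) and P value for a frequency change x -> y.

    x, y        : haplotype frequency in the early and late sample
    s0, st      : number of individuals in the early and late sample
    generations : generations elapsed between samples
    ne          : effective number of haploid genomes (assumed, not from the source)
    """
    p = (x + y) / 2.0  # pooled estimate of the underlying frequency
    if not 0.0 < p < 1.0:
        raise ValueError("haplotype is absent from or fixed in both samples")
    # Haploid Wright-Fisher drift accumulated over the given number of generations.
    drift = 1.0 - (1.0 - 1.0 / ne) ** generations
    variance = p * (1.0 - p) * (1.0 / s0 + 1.0 / st + drift)
    chi2 = (x - y) ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical scenario: 100 years at 5 years per generation, ne = 500.
chi2, p_value = drift_chi2(x=0.094, y=0.0, s0=85, st=316, generations=20, ne=500)
print(f"chi2={chi2:.2f} P={p_value:.3f}")
```

Because the drift term grows as `ne` shrinks, the outcome depends strongly on the assumed effective size, which is why that value has to be justified independently.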

Patterns of diversity and population differentiation were then compared between these historic samples and a comparable set of extant populations (a subset of those sampled in Cozzolino et al. 2003a) from the Tyrrhenian coast. We estimated population subdivision (GST) and genetic diversity (HT) in the two datasets (historic and extant) and compared patterns of pairwise geographic differentiation among population pairs between historic and extant samples. Levels of population subdivision for plastid genomes using unordered alleles (GST) and genetic diversity within populations (HT) were calculated according to Pons and Petit (1995) using the software HAPLODIV, and analysis of molecular variance (AMOVA, Excoffier et al. 1992) was calculated using ARLEQUIN version 3.0 (Excoffier et al. 2005) to estimate the partitioning of variance into within- and among-population components. We tested isolation by distance using the Mantel test as implemented in the software TFPGA version 1.3 (Miller 1997), using beeline distances among pairs of populations as geographic distances and Nei’s (1987) pairwise distances, as calculated with ARLEQUIN version 3.0, as genetic distances. Rarefaction was applied to correct diversity estimates for differences in sample sizes between extant and historic samples by employing the option RAREFACTION in the software CONTRIB (Petit et al. 1998). Finally, to discern whether a haplotype found in the historic samples but not in the samples of extant populations (and vice versa) was missed because of extinction or because of insufficient sampling, we used the approach given by Rosenbaum et al. (2000). Assuming that each herbarium specimen represents a random sample from the historic populations and that no new haplotype has arisen since the herbarium specimens were taken, the probability of observing a specimen with a rare haplotype is equal to the relative frequency of this rare haplotype in the population.
The probability of detecting a given haplotype was set at 95 and 99%, respectively, and the maximal frequency of rare haplotypes in the historical (n = 85) and extant populations (n = 316) was estimated. The resulting maximal frequency values are, in essence, upper bounds on the frequency of any rare haplotype that could have remained undetected at the 95 and 99% probability levels, assuming no change since the time when the specimens were collected.
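The detection thresholds follow from a simple binomial argument: if a haplotype has frequency f and n specimens are drawn at random, it is seen at least once with probability 1 − (1 − f)^n, so the smallest detectable frequency for a given confidence is f = 1 − (1 − confidence)^(1/n). The sketch below is a minimal illustration of that argument, not the authors' own calculation; the function name is hypothetical, and the results may differ in the last digit from the published thresholds because of rounding or details of the original computation.

```python
def max_undetected_frequency(n, confidence):
    """Smallest frequency at which a haplotype is detected at least once
    in n random specimens with the given probability."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)

# Sample sizes from the text: 85 historic and 316 extant specimens.
for label, n in (("historic", 85), ("extant", 316)):
    for confidence in (0.95, 0.99):
        f = max_undetected_frequency(n, confidence)
        print(f"{label} (n={n}), {confidence:.0%}: {100 * f:.2f}%")
```

The smaller historic sample tolerates a roughly fourfold higher undetected frequency than the extant sample, which is why the two directions of comparison are not symmetric.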

Results

Three distinct haplotypes (A, D, and N) were detected among the 85 herbarium specimens from the Tyrrhenian coast (Tables 2, 3, Fig. 1). Of these, the N haplotype showed allelic variation due to changes in repeat copy numbers (N1 and N2) (Table 4). A new allele of haplotype A was found in the historic samples. This allele differs from the original A allele, which was found in the extant Orbetello population (Cozzolino et al. 2003a), in the number of repeats: five repeats in Orbetello and two repeats only in the historic samples. The alleles at this haplotype will, from now on, be referred to as A5 and A2 alleles, respectively (Table 4). Haplotype D was found in all historic A. palustris populations from the Italian peninsula and Sicily and was the most common haplotype in the historic sample (Fig. 2). In Campania and Sicily, only haplotype D was found, while in Tuscany and Latium we also found haplotypes A (allele A2) and N. Allele N1 was found in both Tuscany and Latium, whereas allele N2 was found only in Tuscany. The two haplotypes B and E found in extant populations from the Tyrrhenian coast (Cozzolino et al. 2003a) were not found in the historic sample (Table 2, Fig. 2). All haplotypes but E significantly differed in their frequency between historic and extant samples (Table 3).

Table 2 Plastid haplotypes detected in extant (bold) and historic (italic) A. palustris populations
Table 3 Numbers and frequencies of haplotypes in populations from the Tyrrhenian coast
Table 4 Numbers and frequencies of alleles at different haplotypes in populations from the Tyrrhenian coast
Fig. 2 Detected geographic distribution of plastid haplotypes in extant and historic populations of A. palustris along the Italian peninsula

Genetic differentiation among regions, as estimated by GST, was significantly lower (P = 0.006, t-test) among the historic samples from the Tyrrhenian coast (GST = 0.133) than among extant populations (GST = 0.320). Accordingly, AMOVA revealed that only 18% of the variation was found among regions for the historic samples, whereas 40% of the variation was found among regions in the extant sample. The weak genetic differentiation among regions was reflected in the analysis of isolation by distance. No significant association between geographic and genetic distances was found for the historic samples (r = 0.656, P > 0.1) whereas evidence for isolation by distance was found among extant samples (r = 0.708, P < 0.01).
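The isolation-by-distance analysis rests on a Mantel test, which correlates the off-diagonal entries of a geographic and a genetic distance matrix and assesses significance by permuting population labels. The sketch below is a generic one-sided permutation version written for illustration; it is not the TFPGA implementation, and the function names, site positions and genetic distances are hypothetical.

```python
import random

def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5

def mantel(geo, gen, permutations=999, seed=0):
    """One-sided Mantel test for two symmetric distance matrices.

    Returns the observed correlation and the permutation P value for the
    hypothesis that genetic distance increases with geographic distance.
    """
    n = len(geo)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    x = [geo[i][j] for i, j in pairs]

    def correlation(order):
        # Relabel the populations of the genetic matrix only.
        y = [gen[order[i]][order[j]] for i, j in pairs]
        return pearson(x, y)

    order = list(range(n))
    r_obs = correlation(order)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        if correlation(order) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical example: five sites along a coast (km) with genetic distance
# strictly proportional to geographic distance.
sites = [0.0, 1.0, 3.0, 6.0, 10.0]
geo = [[abs(a - b) for b in sites] for a in sites]
gen = [[0.02 * d for d in row] for row in geo]
r, p = mantel(geo, gen)
print(f"r={r:.3f} P={p:.3f}")
```

With few regions the number of distinct permutations is small, so a high correlation can still fail to reach significance, which is one way to read the historic result reported above.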

The estimate of gene diversity (based on all extant and historic sampled individuals) was significantly lower (P = 0.012, t-test) for the historic sample (HT = 0.352) compared to the sample of extant populations (HT = 0.790). However, rarefaction revealed that for the sample size achieved with the historic samples from the Tyrrhenian coast (85 individuals), only 3.6 haplotypes can be expected, which is close to the 3 haplotypes observed in the historic sample.
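Rarefaction asks how many haplotypes would be expected if the larger sample were reduced at random to the size of the smaller one. A minimal sketch using the standard hypergeometric rarefaction formula is given below; it is an illustration, not the CONTRIB implementation. The function name is hypothetical and so are the haplotype counts, apart from the total of 316 extant specimens and the four specimens carrying haplotype E, which are taken from the text.

```python
from math import comb

def rarefied_richness(counts, n):
    """Expected number of haplotypes in a random subsample of n individuals,
    drawn without replacement from a sample with the given haplotype counts."""
    total = sum(counts)
    if not 0 < n <= total:
        raise ValueError("subsample size must be between 1 and the sample size")
    # For each haplotype: 1 minus the probability that the subsample misses it.
    return sum(1.0 - comb(total - c, n) / comb(total, n) for c in counts)

# Hypothetical extant counts for five haplotypes summing to 316.
extant_counts = [150, 100, 50, 12, 4]
expected = rarefied_richness(extant_counts, 85)
print(f"expected haplotypes at n=85: {expected:.2f}")
```

Rare haplotypes contribute much less than one to the expected count at the reduced sample size, so the rarefied richness of a large sample can approach the observed richness of a small one.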

Using the approach implemented by Rosenbaum et al. (2000), we estimated that we would have been able to detect a rare haplotype in the sample of extant populations from the Tyrrhenian coast with a probability of 95% if it had occurred at a frequency as low as 0.95% in the historic sample, and with a probability of 99% if it had occurred at a frequency as low as 1.45%.

At the same time, we would have been able to detect a rare haplotype in the sample of historic populations with a probability of 95% if it had occurred at a frequency as low as 3.47% in the extant sample from the Tyrrhenian coast, and with a probability of 99% if it had occurred at a frequency as low as 5.28%.

Discussion

Our comparison of sampling locations for herbarium specimens with the distribution of extant A. palustris populations (Cozzolino et al. 2003a) confirms that the distribution of A. palustris has been drastically reduced during the twentieth century (Fig. 1). Approximately 80% of collection sites described on the labels of the investigated herbarium specimens are either no longer marsh habitats or, if marsh habitats still exist, the populations of A. palustris are extinct. Our results demonstrate that this widespread deterioration, reduction, and destruction of suitable habitat and the concomitant loss of A. palustris populations have left their signature in the gene pool of the remaining extant populations. These populations have been sampled nearly exhaustively by Cozzolino et al. (2003a), with the only exception being one small population that is located in a bird reserve and cannot be visited during the orchid’s flowering time, which overlaps with bird nesting times in this area. Additionally, most of the extant populations correspond to or are very close to the historic ones (see Fig. 1). It is thus unlikely that plastid haplotypes or alleles found in herbarium specimens remained unsampled.

At first sight, evidence for a substantial loss of genetic variation in the recent history of the species may be absent. The analysis of the extant A. palustris populations has revealed a total of five haplotypes (Cozzolino et al. 2003a), in contrast to the three haplotypes detected in the historic sample (Table 2, Fig. 3). The apparent increase of haplotype number and gene diversity in extant populations could be due to new mutations that have arisen relatively recently, to haplotypes that have been introduced into extant populations by recent immigrants, or to a sampling artifact. The number of haplotypes detected depends strongly on sample size, which differed markedly between the study of extant populations from the Tyrrhenian coast (Cozzolino et al. 2003a, 316 examined specimens) and this study (85 specimens). The problem is aggravated when haplotypes occur at unequal frequencies, because the detection of rare haplotypes becomes less likely. For instance, the rare haplotype B has been found only in two extant populations, for which no historic samples from nearby sites were available, and the E haplotype was found in only four specimens, all from Campania, and thus occurred at a frequency of only 1.27% in the extant sample. Consequently, we propose that the two haplotypes E and B, which were rare in extant populations, were not found in the much smaller historic sample because of the limited sample size.

Fig. 3 Relative haplotype frequencies in extant (empty) and historic (filled) populations from the Tyrrhenian coast

Generally, the limited sample size and the methodological bias in the collecting procedure for herbarium specimens (i.e., only a few individuals from each locality) may render herbarium collections less amenable to detecting rare haplotypes and alleles, but they are still useful for detecting variation in the distribution and frequency of common haplotypes and alleles. This is particularly true when several samples from the same (or a very close) locality are available in historic herbaria, as in the present case study of A. palustris.

The reverse pattern, i.e., the finding of a haplotype or allele in the historic sample but not in the extant populations, may again be due to insufficient sampling or could be the consequence of random genetic drift following population reductions and bottlenecks. The two alleles that were found in the historic sample, but were either absent (allele A2) or rare (allele N1) in the extant populations, were present at relatively high frequencies in the historic sample, which makes it highly unlikely that these alleles are still present at comparable frequencies in extant populations but were missed due to insufficient sampling. For instance, allele A2, which occurred with a frequency of 9.4% in the herbarium samples (Tables 3, 4), would have remained undetected in the extant sample with a probability of 2.8 × 10^−14. This indicates that its absence from extant A. palustris populations is likely due to extinction.
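The probability quoted for allele A2 can be reproduced by treating the 316 extant specimens as independent random draws and assuming that the allele had persisted at its historic frequency. The symbols f (frequency) and n (extant sample size) are introduced here only for illustration:

```latex
P(\text{A2 undetected}) = (1 - f)^{n} = (1 - 0.094)^{316} \approx 2.8 \times 10^{-14}
```

The same expression shows how quickly non-detection becomes implausible as either the frequency or the sample size increases.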

Waples’ test performed on haplotype frequencies was significant for all haplotypes but haplotype E. This implies that pure drift and sampling error are not a sufficient explanation of the observed temporal changes in haplotype frequencies, and thus that there is evidence of a greater than expected change in haplotype frequencies due to population reduction.

Population structure has equally been affected by population reduction. The much higher genetic differentiation among pools of extant populations, compared to historic populations, indicates that substantial fragmentation has occurred among populations along the Tyrrhenian coast. This effect may have been reinforced by local random genetic drift, which is expected to increase genetic differentiation among populations even when population numbers remain constant. The increase in genetic differentiation strongly suggests that the remaining A. palustris populations are not sufficiently close or large to exchange migrants at a rate that could balance genetic drift.

Phylogeographic studies often have difficulty explaining the occurrence of exclusive haplotypes in some isolated populations (Schaal et al. 1998). The haplotype pattern found in our earlier analysis of extant populations left open several possible hypotheses for the origin of these isolated populations or for the origin of the exclusive haplotypes. The addition of data from historic samples may help to distinguish between alternative scenarios. For example, the extant population from Viterbo (Fig. 1) (Cozzolino et al. 2003a), which grows in a sulfuric inland marsh and not in coastal marshes, is presently small (50 flowering individuals in 2002) and fixed for allele N1 at haplotype N. In the extant sample, this allele has been found outside of Viterbo only in two individuals from a population from the Tyrrhenian coast (Castelvolturno) and three individuals from the Adriatic Sea (Cozzolino et al. 2003a). The frequency of this allele in the historic sample is in stark contrast to its rarity in present-day populations. In the past, allele N1 also occurred in the populations Marina dei Ronchi, Palude di Mondunico and Bagni di Tivoli (Fig. 1). All these populations are now extinct. Thus, the historic samples demonstrate that allele N1 has not evolved locally in Viterbo and thus does not flag an evolutionary uniqueness of that particular sulfuric inland marsh population; instead, the current distribution of N1 is relictual. Similarly, haplotype A, which was found to occur among extant populations exclusively in the small population of Orbetello in Tuscany, has now been found, in the form of allele A2, to be relatively common in the historic sample from the Tyrrhenian coast (Table 4). This finding further allows us to reject the earlier hypothesis (Cozzolino et al. 2004) that the Orbetello population originated from a Balearic population of A. robusta, where a similar haplotype (the D n haplotype in Cozzolino et al. 2004) occurs.

The three remaining extant populations from Tuscany carry haplotypes A, B, and N, but lack haplotype D, which separates them from all other, more southern populations from the Tyrrhenian coast, where D is common. This genetic discontinuity is again the result of a loss of haplotype D from Tuscany, as evidenced by the frequency of this haplotype in the historic sample from Tuscany. Thus, populations from Tuscany are not derived from Adriatic populations, but clearly belong to a lineage that is distributed along the Tyrrhenian coast and extends into Sicily and northern Africa, as confirmed by the analyses of a few additional historic samples from Tunisia (see Table 1).

Extant populations from the Adriatic coast were found to be highly variable at haplotype N, where a large number of alleles with different repeat numbers were observed (Cozzolino et al. 2003b) and a long N allele has also been found in corresponding historic samples (Table 2, Fig. 2).

Among extant populations from the Tyrrhenian coast, the diversity at haplotype N was much lower. This pattern was confirmed by the results from the historic sample. Historic populations from the Tyrrhenian coast did not harbor, at a level that could be detected with our sampling, long alleles at haplotype N (Tables 2, 4, Fig. 4). In fact, given our sample size (85), we would have detected with a probability of 99% a long allele at haplotype N if long alleles had occurred with a frequency of at least 5.28%. When present, haplotype N occurs in the historic sample from the Tyrrhenian coast only in the form of short alleles, N1 or N2 (Table 4, Fig. 4), which suggests that the current sporadic presence of long alleles in the extant populations from the Tyrrhenian coast, Torre Lago and Castelvolturno (Fig. 1), is likely due either to the reduction of a previously larger allelic variation at this haplotype, or to long distance dispersal events from Adriatic populations, where long alleles presumably evolved and are still common. Although much can be learned from the comparison of extant and historic samples, one has to keep in mind that in a geographic area such as the Italian peninsula, which has been heavily influenced by human presence during the last 3,000 years, numerous unrecorded events have occurred prior to the collection of our historic samples, and these may already have shaped the genetic diversity of populations.

Fig. 4 Relative allele frequencies at haplotype N in extant (empty) and historic (filled) populations from the Tyrrhenian coast

The comparison of present day and historic haplotype distributions helps us not only to better understand the evolutionary history of A. palustris, but may also provide guidelines for the conservation or reintroduction of the species. Although the species has minute wind-dispersed seeds, which are typically found in orchids (Arditti and Ghani 2000), and can occasionally disperse to relatively distant sites, the low number of long alleles at haplotype N that are found outside of the Adriatic coast, where they most likely evolved, indicates that long distance recruitment events are nevertheless rare. The populations that are able to send out migrants are the only remaining large populations from the Adriatic coast. The small extant populations from the Tyrrhenian coast are no longer able to exchange sufficient migrants to prevent genetic drift and thus genetic differentiation among populations is increasing. Where suitable habitat still exists, reintroduction projects should be carried out, because a natural colonization of habitat patches is unlikely. For this purpose, the value of analyzing historic samples lies in identifying suitable source populations.

For example, in Sicily, where A. palustris is now extinct (Fig. 1), all populations harbored haplotype D (Table 2, Fig. 2). If the species were to be reintroduced to Sicily, extant populations from southern Italy that carry the D haplotype may be used as genetic resources. From these, individuals should be selected that currently grow under ecological conditions similar to those present in the reintroduction area.

Overall, the combined analysis of genetic variation in historic and extant populations of A. palustris has revealed haplotype extinctions, a low frequency of long distance dispersal events in a time window of approximately one and a half centuries, and the disappearance of a substantial number of the sites from which the plant has been reported in the past. Together, these findings strongly emphasize that A. palustris populations from the Tyrrhenian coast are under severe risk of extinction and must be protected.

Phylogeographic and population genetic studies of other plant species, especially those that have experienced population declines and range reductions, could greatly benefit from including herbarium specimens in their sampling. Otherwise, the risk of incorrectly inferring the species’ evolutionary history from observed patterns of genetic diversity is high.