Introduction

Accurate species delineation is fundamental for the practical protection and conservation of biota (Mace 2004), but the use of genetic markers sometimes disagree with traditional morphology-based taxonomy (Bickford et al. 2007; Padial et al. 2010; Mayr 2011). Conflicts between morphological and molecular data are common in temperate freshwater fishes, as populations tend to be isolated from each other due to physical barriers, present a disjunct distribution, and show high levels of phenotypic and ecological differentiation (Taylor 1999; Adams et al. 2016; Moccetti et al. 2019).

The salmonid European whitefish Coregonus lavaretus forms a large species complex spanning across the north temperate regions of the Palaearctic, displaying high levels of phenotypic variation at the inter- and intra-population levels (e.g. Østbye et al. 2005a, b; Præbel et al. 2013a; Siwertsson et al. 2013; Ozerov et al. 2015; Adams et al. 2016) and rapid evolution under changing environmental conditions (Hudson et al. 2013; Jacobs et al. 2019). From an evolutionary perspective this species is a renowned example of repeated evolution across limnetic-benthic gradients in lacustrine environments (e.g. Østbye et al. 2005a, b; Vonlanthen et al. 2009; Præbel et al. 2013a; Siwertsson et al. 2013; Hudson et al. 2017), but for fish biologists it represents a taxonomic challenge, as morphological similarities do not necessarily reflect shared genetic history (e.g. Præbel et al. 2013a). This is due to the long-standing use of certain morphological, ecological, and meristic traits that might be environmentally influenced to discriminate coregonids into different species (Etheridge et al. 2012a). The number of gill rakers on the anterior gill arch is one the main characters used (Svärdson 1952, 1979; Kottelat and Freyhof 2007), which while genetically based, has been shown to be influenced by plasticity (Lindsey 1981) and under natural selection (Ozerov et al. 2015; Häkli et al. 2018; Jacobs et al. 2019) given its important role in feeding ecology (Kahilainen et al. 2011).

European whitefish is native to seven lakes in Great Britain, clustered into three distinct geographic areas. These are lochs Eck and Lomond in west-central Scotland, where it is known as powan, Red Tarn, Haweswater, Brotherswater, and Ullswater in north-west England, known as schelly, and Llyn Tegid in north Wales, known as gwyniad (Maitland and Campbell 1992) (hereafter referred to as Scottish, English and Welsh populations respectively). Apart from the lakes of Red Tarn and Brotherswater which are connected to Ullswater through small waterways, all lakes are isolated from one another. Originally, these populations were classified as three distinct species: populations in Scotland were described as Coregonus clupeoides Lacépède 1803, populations in England were classified as Coregonus stigmaticus Regan 1908, and the single Welsh population was classified as Coregonus pennantii Valeciennes 1848. Subsequently, these populations were recognised to be part of the Coregonus lavaretus species complex and grouped into a single species (Maitland and Campbell 1992). More recently, however, Kottelat and Freyhof (2007), largely on the basis of morphological characteristics, re-classified the entire European whitefish complex recognising 59 separate species across Europe and suggesting that there are many more waiting for formal description. For the United Kingdom (UK), Kottelat and Freyhof (2007) using morphological and meristic traits, including gill raker number, reinstated the late nineteenth century species classification, splitting the seven populations into three species (C. clupeoides, C. stigmaticus, C. pennantii).

The validity of the species boundaries and the identification key by Kottelat and Freyhof (2007) for the European whitefish populations in Britain was tested by Etheridge et al. (2012a). Whilst there exists an overall morphological separation between the Scottish, English, and Welsh populations, Etheridge et al. (2012a) concluded that the observed differences between them were insufficient evidence to split these populations into three separate species. Additionally, a large overlap existed between key morphological traits of putative different species, leading Etheridge et al. (2012a) to attribute the morphological differences between groups to phenotypic plasticity, genetic drift, and/or local adaptation. However, their study did not investigate to what extent the morphological separation was supported by genetic differentiation. Phenotypic and genetic information combined is necessary to gain a better understanding of the diversity present within populations, and to assess their conservation status. In particular, assessment of genetic variation and population structure forms the basis for the delineation of evolutionarily significant units (ESUs) (Quintela et al. 2010), a concept proposed by Ryder (1986) which might be more fitting to the scenario of many freshwater fish species whose high diversity is associated with post-glacial recolonization of northern Europe (April et al. 2013). Therefore, an accurate assessment of the genetic population structure of the European whitefish populations in Britain is needed, in particular with markers such as microsatellites that have been shown to be useful in the delimitation of ESUs (e.g. Mockford et al. 2006; de Oliveira et al. 2008; Zhu et al. 2013). Developing an understanding of the genetic diversity within and between populations, in addition to phenotypic variation, that exists within these populations is of significant importance if an adequate conservation strategy for this species is to be achieved (Brodersen and Seehausen 2014).

To date, genetic research on C. lavaretus in the UK has been limited and there is no clear consensus on the relationship between populations. Early studies using electrophoretic analysis of proteins on C. lavaretus and the two other Coregonus species currently recognised in the British Isles C. albula, and C. autumnalis, (Maitland and Campbell 1992), showed that the seven C. lavaretus populations constitute a single species distinct from the other two species (Ferguson 1974). A later study, using allozymes, focused on the English and Welsh populations of C. lavaretus showing the Welsh population to be nested within the English group (Beaumont et al. 1995). However, mitochondrial restriction fragment length polymorphism (RFLPs) from another study showed the English populations to be monophyletic, and supported a closer relationship between the Scottish and Welsh populations (Hartley 1995).

European whitefish in the UK is a priority species for the UK Biodiversity Action Plan, and it is protected under Schedule 5 of the Wildlife and Countryside Act 1981 due to its rarity and vulnerability to pressures such as invasive species, water quality and habitat degradation (Maitland and Lyle 2013; Winfield et al. 2013). This has led to the establishment of several translocated populations using stocks from Scotland (Maitland and Lyle 2013; Adams et al. 2014), England (Winfield et al. 2013), and Wales (Thomas et al. 2013), as a means of mitigating potential threats (Præbel et al. 2019). Therefore a better understanding of genetic diversity and structuring is of primary importance for the appropriate conservation of European whitefish in Britain.

In the present study, we applied a population genetics analysis of all native European whitefish populations in Britain to: 1) assess their overall genetic diversity and compare it to published data on other European populations; and 2) determine the level of genetic structuring between populations. The outputs of the study are then viewed in a conservation policy context.

Methods

Sampling

Sampling was conducted between 2005 and 2009. Samples were collected from six native populations in Britain (Fig. 1); Loch Lomond (LLA, n = 41) and Loch Eck (ECK, n = 36) in Scotland, Llyn Tegid (LTE, n = 44) in Wales, Red Tarn (RTA, n = 22), Brotherswater (BWA, n = 19) and Ullswater (UWA, n = 29) in the Lake District of north-west England. Samples from Haweswater (HAW, n = 46) were collected in 1986. Samples were collected using Norden benthic gill nets as described in Etheridge et al. (2012a). Details about lake characteristics are reported in Table S1.

Fig. 1
figure 1

Locations of Loch Eck, Loch Lomond, Llyn Tegid and the English Lake District (containing Brotherswater, Haweswater, Red Tarn and Ullswater) in the U.K. Note that all locations except Haweswater and Ullswater are not visible at the scales of the figure. Partly redrawn with permission from Ramsbottom (1976). The three geographic clusters of populations are shown. Lochs Eck and Lomond are in west-central Scotland (SCT), Llyn Tegid in north central Wales (WLS), and the other four lakes in the north west of England (ENG)

Laboratory procedures

Genomic DNA was extracted from gill filaments, fin clips or scales using E-Z96 Tissue DNA Kit (OMEGA Bio-tek) following the manufacturer instructions. A total of 15 microsatellite loci (Table S2, Patton et al. 1997; Susnik et al. 1999; Turgeon et al. 1999; Rogers et al. 2004; Winkler and Weiss 2008) were amplified in four polymerase chain reaction (PCR) multiplexes in 2.5 µl reactions following the protocol by Præbel et al. (2013b). Briefly, the PCR profiles for all multiplex assays (mplx) consisted of an initial denaturation step at 95 °C for 15 min and a final elongation step of 30 min at 60 °C. The amplification cycles consisted of: mplx (I) 25 cycles of 95 °C for 30 s, 57 °C for 3 min, and 72 °C for 1 min; mplx (II) 25 cycles of 95 °C for 30 s, 60 °C for 3 min, and 72 °C for 1 min; mplx (III) 26 cycles of 95 °C for 30 s, 61 °C for 3 min, and 72 °C for 1 min; mplx (IV) 27 cycles of 95 °C for 30 s, 60 °C for 3 min, and 72 °C for 1 min. The fluorescent-labeled PCR products were separated on an ABI 3130 XL Automated Genetic Analyzer (Applied Biosystems) and sized according to the GeneScan LIZ 500 internal size standard (Applied Biosystems).

Genotyping, validation and quality control of genotypic data

Alleles were automatically binned in predefined allelic bins by the GeneMapper 3.7 software (Applied Biosystems) and verified twice by visual inspection. After the first validation of the genotypes, 3–9% of individuals per population were re-extracted and re-run at all 15 loci to ensure properly amplified alleles, i.e. reducing the possibility for methodological errors such as large allele drop-out and stutter scoring. The genotypes resulting from the initial run and the rerun were manually compared for all individuals to rule out miss-scoring of alleles. If any doubt occurred in this comparison the sample was re-extracted and re-run at all loci to obtain a consensus genotype. No mismatch was identified in the dataset. The samples were finally screened for abnormalities (null alleles, scoring errors, etc.) in the software MICRO-CHECKER 2.2.3 (Van Oosterhout et al. 2004), using 1,000 bootstrap iterations over loci to generate the expected homozygote and heterozygote frequencies.

Genetic diversity and variation

Expected and observed heterozygosity, allele and effective allele numbers, and FIS were calculated in GenoDive v. 2.0b27 (Meirmans and Van Tienderen 2004) per population and per locus. Allelic and private allelic richness per population and per locus were determined using a rarefaction procedure, to account for sample size differences, for the smallest sample size (38 genes) implemented in HP-RARE v. 1.0 (Kalinowski 2005). Tests for linkage disequilibrium (LD) between pairs of loci were conducted in the R package Genepop v. 1.0.5 (Rousset 2008) with the following settings: 100,000 dememorization, 1000 batches, 10,000 iterations. Deviations from Hardy–Weinberg equilibrium (HWE) for loci within populations were tested in GenoDive using 1000 permutations of alleles among individuals. Pairwise p-values from the LD and HWE tests were corrected for multiple comparisons by sequential Bonferroni corrections (Rice, 1989). We also tested for a correlation between latitude and observed heterozygosity and allelic richness using Spearman correlation in the R environment (R Core Team 2019).

Population structure

FST (Weir and Cockerham 1984) was calculated at the global level and pairwise between populations using the R package hierfstat v. 0.04-22 (Goudet 2005) and GenoDive respectively, and tested for significance using 10,000 permutations of genotypes among populations.

Discriminant analysis of principal components (DAPC) (Jombart et al. 2010) implemented in the R package adegenet v.2.1.1 (Jombart 2008) was used to identify clusters of genetically related individuals. DAPC is a method that relies on transforming the allele frequency data into principal components that are then passed onto the discriminant analysis (DA), which maximises variation between groups and minimises it within groups (Jombart et al. 2010). The DAPC method works in two ways: (1) if prior knowledge of the genetic groups that individuals belong to exists, then these genetic groups will be used as priors in the analysis for assignment to group membership; (2) when no such information exists, genetic groups can be identified using k-means clustering sequentially with increasing values of k, with the best k value being the one scoring the lowest Bayesian information criterion (BIC), and individuals assigned to such groups. We ran the DAPC both with lake of origin as prior information for group membership, and without prior information (testing k values from 1 to 10), and compared the results to assess if populations show evidence of structure by lake. We then visualised the genetic clusters using the complot function in adegenet, which produces a barplot of group assignment probabilities, and using scatterplots. The xvalDAPC function in adegenet was used to determine the number of principal components (PCs) to be retained by the DAPC analysis.

In addition to DAPC, which maximises between-group variation, we also ran a principal component analysis (PCA), which summarise overall genetic variation, in adegenet. For the DAPC and PCA analyses, 15 individuals (two from Lomond, Eck and Ullswater, three from Llyn Tegid, and six from Haweswater) that had missing data at one or more loci were excluded.

Analysis of molecular variance (AMOVA) using FST was employed to test the effect of hierarchical groupings on the genetic variance, by testing north (Lomond, Eck) and south (Llyn Tegid, Haweswater, Brotherswater, Ullswater, Red Tarn) grouping, and by geographic region (Lomond, Eck), (Llyn Tegid), (Haweswater, Brotherswater, Ullswater, Red Tarn), following the Kottelat and Freyhoff (2007) taxonomic designation. AMOVA was carried out using the R package poppr v.2.8.1 (Kamvar et al. 2014), using 1000 permutations of individuals among groupings to assess significant difference from a null hypothesis of panmixia.

A distance-based phylogenetic relationship among all the populations was constructed using unrooted Neighbor-joining (NJ) clustering analysis of Cavalli-Sforza and Edwards (1967) Chords genetic distances (Table 2). The tree was constructed in poppr with 1,000 bootstrap replicates with the function aboot and was rooted using mid-point rooting due to lack of a specific outgroup.

All the programs used for the genetic analyses, apart from HP-RARE and GenoDive, are part of the R environment (R Core Team 2019).

Results

Genotyping, validation and quality control of genotypic data

We did not identify any mismatch between the original individual multi-locus genotypes and the re-extracted replicates within the present dataset. Heterozygote deficits were indicated by MICRO-CHECKER at four loci in three populations (BWF2 (Ullswater), Cla-Tet1 (Llyn Tegid), Cla-Tet13 (Haweswater, Llyn Tegid), and Cla-Tet18 (Haweswater)). In two out of the four cases, the heterozygote deficit appeared to reflect the presence of null alleles. In the remaining two cases the heterozygote deficit was also associated with more than 50% of the alleles being in one allele class but seemed to result from stuttering. As this effect is not pervasive in the dataset, we suggest that the heterozygote deficit reflects genetic effects associated with the population history, e.g. low diversity, rather than the occurrence of null alleles and other locus specific abnormalities (but see discussion). This is further supported by the careful validation and quality control of the genotypic data where all poorly amplified samples were re-extracted. Consequently, all loci were retained in further analysis.

Genetic diversity and variation

Summary statistics for the microsatellite loci are reported in Table S3. The number of alleles per locus ranged from three to 29, with an average of 9.6 alleles per locus. Observed heterozygosity ranged from 0.17 to 0.72, with an average of 0.42 across loci with all populations combined. Six loci (BWF1, BWF2, Cocl_lav04, Cocl-lav10, BFRO_018, and Cocl_lav49) were monomorphic in one or several populations (Table S3). Deviations from linkage equilibrium were identified for five of 105 locus pair tests but all were non-significant after sequential Bonferroni corrections. Deviations from HWE were found in four populations at four loci, for a total of eight comparisons, but were not significant after Bonferroni corrections.

The Scottish populations showed slightly lower genetic variation and allelic richness compared to the populations from Wales and England (Table 1), with the Scottish population of Loch Lomond (HO = 0.35, AR = 3.04) having higher diversity than Loch Eck (HO = 0.28, AR = 2.56). Llyn Tegid in Wales was the most genetically diverse population (HO = 0.5, AR = 5.29), with the highest heterozygosity, allelic and private allelic richness (Table 1). Among the English populations, Haweswater had the highest heterozygosity and allelic richness (HO = 0.5, AR = 3.88) (Table 1) followed by Red Tarn (HO = 0.44, AR = 3.27), Brotherswater (HO = 0.46, AR = 3.00), and Ullswater (HO = 0.41, AR = 2.83). We found a significant negative correlation between latitude and observed heterozygosity (Spearman’s ρ = -0.96, p-value = 0.003), but no correlation between latitude and allelic richness (Spearman’s ρ = − 0.71, p-value = 0.088).

Table 1 Lakes sampled, location, sample abbreviations (Code), year of sampling, sample size (N), number of alleles (NA), effective number of alleles (NAE), allelic (AR) and private allelic (APR) richness, observed heterozygosity (HO), expected heterozygosity (HE), Inbreeding coefficient (GIS) estimates

Population structure

Mean global FST was 0.388 (95% CI 0.308–0.478), and all pairwise FST values between populations were significant after 10,000 permutations (Table 2). The populations in Scotland showed high FST differentiation from the rest of the British populations, with values between 0.44 and 0.58. The lowest differentiation was observed between the English populations of Brotherswater and Ullswater (FST = 0.038), followed by the Scottish populations of Lomond and Eck (FST = 0.063). Genetic differentiation for the remaining English populations ranged from FST of 0.120 between Haweswater and Ullswater to FST of 0.290 between Red Tarn and Brotherswater. Genetic differentiation between Llyn Tegid and the English populations ranged from a FST value of 0.260 to a FST value of 0.315.

Table 2 Weir and Cockerham FST between populations (below diagonal). All P-values were significant after Bonferroni correction. Cavalli-Sorza and Edwards chords distances (above diagonal) calculated in phytools. LLA Lock Lomond, ECK Loch Eck, LTE Llyn Tegid, RTA Red Tarn, HAW Haweswater, BWA Brotherswater, UWA  Ullswater. Geographic area of origin is indicated in the first column, Sct Scotland, Wls Wales, Eng England

The DAPC without prior information identified six genetic clusters as most likely (Fig. 2a), after retaining 20 PCs as suggested by the xvalDAPC function. Individuals from Loch Lomond and Loch Eck were assigned to two separate clusters, although few individuals of each lake were assigned to the opposite cluster; in contrast, Llyn Tegid, Red Tarn, and Haweswater individuals formed separate clusters which did not include individuals originating from another site. Brotherswater and Ullswater samples were assigned to cluster 6 (Fig. 2a). The two clusters formed by individuals from Lomond and Eck were separated from the others along discriminant function 1 (DF1), which explained 72.7% of the variation, while DF2 explained 16.3% of the total variation (Fig. 2b), and separated Llyn Tegid from the English populations of Haweswater, Brotherswater and Ullswater, with Red Tarn in an intermediate distance between these two groups.

Fig. 2
figure 2

Analyses of population structure using DAPC and PCA. a Barplot showing assignment of individuals to the six clusters recovered by the DAPC without prior information. b Scatterplot of the DAPC without prior information, displaying clusters 1–6 along discriminant function (DF) 1 and 2. c Barplot showing the assignment of individuals to lakes of origin as recovered by the DAPC with prior information; d Scatterplot of the DAPC with prior information, displaying the genetic clusters corresponding to lakes along DF1 and DF2. Ellipses on 2B and 2D are inertia ellipses. E) Results of the principal component analysis, displaying PC1, PC2, and PC3. Colour codes are the same throughout the figure components

The DAPC with prior information had an overall proportion of correct assignment of individuals to populations (lake) of origin of 95.5%. Five individuals could not be assigned to either Lomond or Eck with more than 80% accuracy, and five individuals could not be assigned to either Brotherswater or Ullswater with more than 80% accuracy (Fig. 2c). DF1 explained 71.6% of the total variation, and separated Lomond and Eck from the other populations; DF2 explained 16.8% of the total variance and separated the remaining populations in a similar fashion as the DAPC without prior information (Fig. 2d).

Similar results were recovered by the PCA (Fig. 2e). The Scottish populations were separated from the rest along PC1, which explained 12.4% of the total variance, while PC2 separated the Welsh population from the English ones, explaining 6.6% of the variance. PC3, which explained 2.9% of the variance, separated Red Tarn and Haweswater from Brotherswater and Ullswater, and PC4, with 2.6% of the variance, separated Haweswater from Red Tarn (not shown).

We tested two hierarchical levels in the AMOVA (Table 3), to assess how variation was partitioned across the landscape. In the north-to-south grouping (Scotland)(Wales,England), variation between groups and variation between lakes within groups explained 33% and 14% of the total variance respectively, while in the grouping based on geographic origin (Scotland), (Wales), (England), variation between groups and variation between lakes within groups explained 35% and 9% of the total variance respectively, indicating the latter grouping as a more realistic representation of the landscape wide grouping.

Table 3 Results of the analysis of molecular variance (AMOVA) using as two grouping factors: north/south grouping or grouping by geographic area. Significance was assessed with 1000 permutations, and p-values were adjusted for multiple comparison using Bonferroni correction

In a mid-point rooted NJ tree, Lomond and Eck formed a cluster sister to all the other populations, and Llyn Tegid was then separated as a sister to a clade of all English populations (Fig. 3).

Fig. 3
figure 3

Neighbor-Joining tree calculated with Cavalli-Sforza and Edwards chord distances for the seven population of European whitefish in Britain. Mid-point rooting was applied to due to lack of outgroup. Support was calculated using 1000 bootstrap replicates

Discussion

Here we describe microsatellite genetic diversity and population structure in the seven native populations of European whitefish Coregonus lavaretus in Britain. These populations are of high conservation concern (Winfield et al. 2013), and our results provide important information for future conservation plans. We find a latitudinal gradient of genetic diversity, with the Welsh population having the highest genetic diversity and the Scottish populations the lowest. We also found evidence of population structuring, both between and within geographic areas.

Genetic diversity of British populations

Although comparison of genetic diversity estimates obtained from different microsatellite loci can only be qualitative, our findings suggest that the European whitefish populations in Britain, with Scottish ones in particular, tend to harbour lower genetic diversity compared to other European populations. Populations in Scotland had the highest number of monomorphic loci, with Loch Lomond and Loch Eck having three and four monomorphic loci respectively, Llyn Tegid, Red Tarn, and Ullswater had one monomorphic locus each, and Haweswater and Brotherswater had none despite having smaller sample sizes than Loch Lomond and Loch Eck. As a point of comparison, Häkli et al. (2018) sampled 27 European whitefish populations from Fennoscandia using 21 microsatellites, including the 15 we used, and found only five monomorphic loci. Furthermore, no monomorphic loci were observed in a dataset of 15 populations and 11 microsatellites, including eight loci present in our dataset, from Denmark (Hansen et al. 2008). Allelic richness across all British populations for 11 loci out of 15 analysed in this study was also lower (AR = 1.0–5.8) compared to whitefish populations from Fennoscandia (AR = 2.0–13.6) (Säisä et al. 2008; Præbel et al. 2013a; Häkli et al. 2018). Heterozygosity in the British populations ranged from 0.28 in Eck to 0.50 in Llyn Tegid, but a study looking at European whitefish in the Baltic region (Estonia to Sweden) with 12 microsatellites, seven of which overlapped with ours, found heterozygosity levels between 0.64 and 0.77 (Wennerström et al. 2013). Another study based on 529 individuals and 11 microsatellites, six of which overlapped with ours, found heterozygosity between 0.68 and 0.80 (Ozerov et al. 2015). A more similar level of heterozygosity, 0.47 to 0.58, was found in Alpine populations of whitefish (Hudson et al. 2017). Additionally, there appears to be a latitudinal gradient in heterozygosity within Britain, with the Welsh population in the south having the highest heterozygosity, followed by the English populations, and then the Scottish populations in the north. Private allelic richness was three times higher in Llyn Tegid compared to the second richest populations of Haweswater in England and Lomond in Scotland. We speculate that the latitudinal gradient in heterozygosity could have been generated during the postglacial recolonization of Britain as glaciers retreated northward. Similar patterns have been observed for mammals (Pope et al. 2006; Edwards et al. 2012) and plants (Sutherland et al. 2010). Lower genetic diversity at the range edge of species distributions, as European whitefish is in Britain at the extreme western reaches of the distribution, is a known phenomenon (Eckert et al. 2008), and in Europe there is a documented inverse correlation between distance from glacial refugium and contemporary genetic variation (e.g. Schmitt and Seitz 2002; Ficetola et al. 2007; Schmitt 2007; Dufresnes et al. 2014). Many populations in the British Isles have shown lower genetic diversity and/or high genetic differentiation as a result of postglacial recolonization events, including amphibians (Zeisset and Beebee 2001; Rowe et al. 2006), mammals (Dool et al. 2013; Montgomery et al. 2014), and fish (Bracken et al. 2015). A study using mitochondrial fragments found the Scottish Loch Lomond population to be part of the northern clade of European whitefish (Bernatchez and Dodson 1994), which has been hypothesised to have persisted during the most recent glaciation in a refugium west of the Ural mountain ridge (Østbye et al. 2005a, b), thus possibly explaining the observed lower genetic variation in the British populations. Furthermore, many continental whitefish populations exist in connected river systems (e.g. Winkler et al. 2011) and in the Baltic sea, possibly allowing for gene flow and potentially maintaining higher genetic diversity. An additional factor which could be contributing to the lower genetic diversity in the British populations is the lack of stocking or fisheries related translocations that have instead characterized many mainland European populations of whitefish (Eckman et al. 2007; Säisä et al. 2008; Dierking et al. 2014).

Evidence of genetic structuring among geographic areas

Analysis of population structure and molecular variance revealed three major genetic groups, which correspond to the populations in Scotland, England, and Wales. Our population differentiation estimate between Lochs Lomond and Eck in Scotland agrees with previous results (our FST = 0.063 vs FST = 0.056 of Adams et al. [2016]) and is much lower compared to the global FST of 0.388. The PCA supported this result of weak genetic structure between these two populations, and the DAPC analyses, with or without prior information on lake of origin, could not accurately assign several individuals to one of these lakes. Whitefish from these two lochs are quite distinct from each other in their morphology and ecology (Brown and Scott 1994; Pomeroy 1991; Etheridge et al. 2012b), and are believed to have been separated for thousands of years based on landscape features (Maitland 1970). The weak FST divergence between these two populations compared to the global FST could indicate a more recent divergence of the Loch Lomond and Loch Eck populations, compared to the divergence among most other British populations, or possibly indicates past introgression and gene flow.

The Welsh population of Llyn Tegid was more closely related to the English populations but formed its own genetic group, as indicated by the PCA and DAPC, and possessed many unique alleles. Coupling the higher genetic diversity of the Llyn Tegid population with the plausible south to north colonization route of whitefish after the ice retreat, it is reasonable to suggest the Welsh population was the first to settle in Britain.

In England we observed inter-lacustrine genetic differentiation that is hierarchically structured. The population of Red Tarn was the most divergent of the English group, separating on the NJ tree as sister to the other populations. In the DAPC analyses, individuals from Red Tarn formed a group that was clearly separated from the other English populations, and was somewhat close to the Welsh population. It has been suggested that the Red Tarn European whitefish population originated from human introduction (Macpherson 1892), given the high elevation of the lake, and founder effects might explain the high differentiation with the remaining English populations (Beaumont et al. 1995). However, there is no evidence of this introduction (Maitland et al. 1990), nor did we observe the bottleneck-associated reduced genetic diversity (Frankham et al. 2002) in the Red Tarn population when comparing it with the other English populations. Therefore, our analyses suggest Red Tarn is a natural and distinct population.

The next most divergent population was Haweswater, which also possesses the highest genetic diversity in England, and it is surpassed only by Llyn Tegid. The whitefish population in Haweswater has experienced severe declines due to anthropogenic water level fluctuations (the lake is used as water reservoir) and predation by cormorants (Winfield et al. 2013). Fortunately, our results suggest that this demographic crash is not reflected in the genetic diversity of the population, which remains relatively high. This suggests recovery will be possible if appropriate measures are put in place to prevent further reductions in overall population size.

The Brotherswater and Ullswater populations had the lowest inter-lacustrine differentiation of any two British populations, and DAPC analysis without prior information assigned individuals from both lakes to a single genetic cluster. These two lakes are connected by a short (length ~ 4 km) river, which despite no contemporary reports of sightings of whitefish along its length (Beaumont et al. 1995), could represent an ancestral corridor of gene flow for these populations. However, further studies are needed to discern the demographic events causing this pattern.

Our analysis of nuclear data supports the finding of previous research conducted with mitochondrial RFLPs (Hartley, 1995), in that the Welsh population has the highest genetic diversity, and that the seven British populations of whitefish cluster by geographic region. However, contrary to Hartley (1995) and previous studies based on electrophoretic analysis (Ferguson 1974; Ferguson et al. 1978) we find the Welsh population to be more closely related to, but not nested within, the English populations, as had been recorded by allozyme analysis (Beaumont et al. 1995). Finally, while previous studies recovered very low inter-population variability (Hartley 1995) in British European whitefish, our novel findings on hierarchical population genetic structuring, which were facilitated by the use of microsatellite loci, indicate high inter-population variation.

It is worth pointing out that all the lakes that contain European whitefish in Britain drain to the west of Britain into the Irish sea, where the post-glacial Lough Hibernia was located, and which Maitland (1970) believed to be the source of the British populations of C. lavaretus. If this were the case, it would support a monophyletic relationship of the British whitefish populations, but further studies including other European populations are needed to validate this hypothesis.

Evolutionary units of British Coregonus lavaretus

Our genetic analyses recover high genetic divergence between Scottish, English, and Welsh populations of European whitefish. Although there was some evidence of similar geographic clustering in phenotype observed by Etheridge et al. (2012a) based on the morphological traits proposed by Kottelat and Freyhof (2007) to distinguish the whitefish populations into the species C. clupeoides, C. stigmaticus, and C. pennantii, there was very considerable phenotypic overlap between individuals from these groups. Therefore, we argue that there is still not enough evidence to support the re-classification of European whitefish populations into three distinct species as was proposed in Kottelat and Freyhof (2007).

First, our inter-lacustrine genetic differentiation estimates within geographic areas contrast with the morphological results of Etheridge et al. (2012a). Those authors found the highest inter-population differentiation to be between Lomond and Eck in Scotland, while we observed low genetic differentiation between these two populations. Furthermore, they find Brotherswater individuals more morphologically similar to Red Tarn and Haweswater within England, while we found the lowest genetic differentiation in the dataset to be between Brotherswater and Ullswater. Our and Etheridge et al. (2012a) findings indicate that the morphological traits used to classify the three Coregonus species are too variable, perhaps influenced by local adaptation and/or phenotypic plasticity, and therefore not suitable for taxonomic purposes in these populations.

Second, while the level of genetic differentiation between whitefish populations in Britain is high, especially between Scottish and English lakes, it is within the levels of intra-population genetic differentiation observed for other temperate freshwater fish species as measured with microsatellite loci (e.g. Hänfling et al. 2002; Campos et al. 2006; Harris and Taylor 2010; Gordeeva et al. 2010; Taylor et al. 2011).

Therefore, while at this stage we cannot support the distinction of British C. lavaretus populations into three separate species, we recommend the delimitation of management unit (MU) status for each population, as these populations are demographically independent units, show restricted gene flow, high genetic differentiation and structuring (Funk et al. 2012). Furthermore, we propose to separate the Scottish, English, and Welsh populations into separate ESUs sensu Fraser and Bernatchez (2001) given their separate evolutionary histories.