Retention of agronomically important variation in germplasm core collections: implications for allele mining

  • Original Paper
Theoretical and Applied Genetics

Abstract

The primary targets of allele mining efforts are loci of agronomic importance. Such loci typically exhibit patterns of allelic diversity consistent with a history of natural or artificial selection, which causes the distribution of genetic diversity at these loci to deviate substantially from the pattern found at neutral loci. The germplasm used for allele mining should contain maximum allelic variation at loci of interest in the smallest possible number of samples. We show that the popular core collection assembly procedure “M” (marker allele richness), which leverages variation at neutral loci, performs worse than random assembly at retaining variation at a locus of agronomic importance under selection in sugar beet (Beta vulgaris L. subsp. vulgaris). We present a corrected procedure (“M+”) that outperforms M. An extensive coalescent simulation was performed to demonstrate more generally the retention of neutral versus selected allelic variation in core subsets assembled with M+. A negative correlation in the level of allelic diversity between neutral and selected loci was observed in 42% of simulated data sets. When core collection assembly is guided by neutral marker loci, as is the current common practice, enhanced allelic variation at agronomically important loci should not necessarily be expected.
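
The idea of assembling a core subset by maximizing marker allele richness, and then checking how much variation is retained at a separate target locus, can be illustrated with a minimal sketch. This is not the authors' M or M+ procedure: it is a simple greedy heuristic, and the accession names, locus names, and genotypes below are entirely hypothetical toy values chosen only to show the mechanics.

```python
# Greedy allele-richness core assembly: an illustrative sketch only.
# All accessions, loci, and genotypes are hypothetical toy data, not
# values from the study, and the greedy search is an assumption; it
# stands in for, and is not, the published M or M+ procedure.

def allele_count(accessions, subset, loci):
    """Count distinct (locus, allele) pairs captured by `subset` at `loci`."""
    seen = set()
    for name in subset:
        for locus in loci:
            for allele in accessions[name][locus]:
                seen.add((locus, allele))
    return len(seen)


def greedy_core(accessions, loci, size):
    """Add, one at a time, the accession that most increases allele count."""
    core = []
    remaining = sorted(accessions)  # sorted for a reproducible tie-break
    while len(core) < size and remaining:
        best = max(
            remaining,
            key=lambda name: allele_count(accessions, core + [name], loci),
        )
        core.append(best)
        remaining.remove(best)
    return core


# Diploid genotypes at two "neutral" marker loci (n1, n2) and one
# "target" locus (t). Accession D is unremarkable at the markers but
# carries the only copies of target alleles y and z.
toy_accessions = {
    "A": {"n1": ("1", "1"), "n2": ("1", "2"), "t": ("x", "x")},
    "B": {"n1": ("1", "2"), "n2": ("1", "1"), "t": ("x", "x")},
    "C": {"n1": ("3", "4"), "n2": ("3", "4"), "t": ("x", "x")},
    "D": {"n1": ("1", "1"), "n2": ("1", "1"), "t": ("y", "z")},
}
neutral_loci = ["n1", "n2"]
target_loci = ["t"]

core = greedy_core(toy_accessions, neutral_loci, size=2)
neutral_retained = allele_count(toy_accessions, core, neutral_loci)
target_retained = allele_count(toy_accessions, core, target_loci)
print(core, neutral_retained, target_retained)
```

In this contrived example the marker-guided core captures most of the marker alleles while skipping the one accession that holds the rare target alleles. It shows only that such an outcome is possible and says nothing about how often it occurs in real collections.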

Acknowledgments

We thank Ann Fenwick for genotyping the reference loci, and Gayle Volk and Dale Lockwood for comments on the manuscript.

Author information

Corresponding author

Correspondence to Patrick A. Reeves.

Additional information

Communicated by A. Graner.

Electronic supplementary material

Below is the link to the electronic supplementary material.

122_2011_1776_MOESM1_ESM.tif

Supplementary Figure 1. Correlation between degree of population subdivision and improvement in allelic retention at target loci. The standardized metric of population differentiation G′ST, estimated using neutral reference loci, is plotted on the X-axis. The number of additional alleles present in core subsets assembled using M+, relative to random assembly, is plotted on the Y-axis. M+ performed better for highly subdivided data sets when neutral loci were targeted, but not when selected loci were targeted. (TIFF 127 kb)
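
For readers unfamiliar with the standardized differentiation metric on the X-axis, the sketch below computes a single-locus G′ST from subpopulation allele frequencies. It assumes the commonly cited standardization G′ST = GST (k − 1 + HS) / ((k − 1)(1 − HS)), where k is the number of subpopulations and HS the mean within-subpopulation expected heterozygosity. The function names are hypothetical, no sample-size bias correction is applied, and this is not necessarily how the authors obtained their estimates.

```python
# Single-locus G'ST from allele frequencies: an illustrative sketch.
# Function names are hypothetical; no small-sample correction is applied.

def expected_het(freqs):
    """Expected heterozygosity, 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def standardized_gst(pop_freqs):
    """pop_freqs: list of {allele: frequency} dicts, one per subpopulation."""
    k = len(pop_freqs)
    alleles = set().union(*pop_freqs)
    # Mean within-subpopulation heterozygosity.
    h_s = sum(expected_het(p.values()) for p in pop_freqs) / k
    # Total heterozygosity from unweighted mean allele frequencies.
    mean_freqs = [sum(p.get(a, 0.0) for p in pop_freqs) / k for a in alleles]
    h_t = expected_het(mean_freqs)
    if h_t == 0.0:
        return 0.0  # locus is monomorphic overall, so no differentiation
    g_st = (h_t - h_s) / h_t
    return g_st * (k - 1 + h_s) / ((k - 1) * (1.0 - h_s))


# Two subpopulations sharing no alleles, versus two identical ones.
no_shared = standardized_gst([{"a": 0.5, "b": 0.5}, {"c": 0.5, "d": 0.5}])
identical = standardized_gst([{"a": 0.5, "b": 0.5}, {"a": 0.5, "b": 0.5}])
```

The standardization rescales GST by its maximum attainable value given the observed within-subpopulation diversity, so subpopulations that share no alleles score 1 regardless of how variable each one is.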

About this article

Cite this article

Reeves, P.A., Panella, L.W. & Richards, C.M. Retention of agronomically important variation in germplasm core collections: implications for allele mining. Theor Appl Genet 124, 1155–1171 (2012). https://doi.org/10.1007/s00122-011-1776-4

  • DOI: https://doi.org/10.1007/s00122-011-1776-4
