Abstract
The primary targets of allele mining efforts are loci of agronomic importance. Such loci typically show patterns of allelic diversity consistent with a history of natural or artificial selection, which causes the distribution of genetic diversity to deviate substantially from the pattern found at neutral loci. Germplasm used for allele mining should contain the maximum allelic variation at loci of interest in the smallest possible number of samples. We show that the popular core collection assembly procedure “M” (marker allele richness), which leverages variation at neutral loci, performs worse than random assembly at retaining variation at a locus of agronomic importance under selection in sugar beet (Beta vulgaris L. subsp. vulgaris). We present a corrected procedure (“M+”) that outperforms M. To examine more generally how well core subsets assembled with M+ retain neutral versus selected allelic variation, we performed an extensive coalescent simulation. A negative correlation in the level of allelic diversity between neutral and selected loci was observed in 42% of simulated data sets. When core collection assembly is guided by neutral marker loci, as is current common practice, enhanced allelic variation at agronomically important loci should not necessarily be expected.
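The idea of assembling a core subset by marker allele richness, and then asking how much variation it retains at a separate target locus, can be illustrated with a small sketch. This is not the authors' M or M+ implementation: a simple greedy heuristic is swapped in for the search step, and the accession names, locus names, and genotypes are made-up toy values chosen only to show how a marker-guided core can retain fewer target alleles than random assembly.

```python
from itertools import combinations

# Hypothetical toy data: accession -> locus -> diploid genotype.
# "ref1"/"ref2" stand in for neutral reference loci, "target" for a selected locus.
genotypes = {
    "acc1": {"ref1": (1, 2), "ref2": (1, 1), "target": ("A", "A")},
    "acc2": {"ref1": (3, 4), "ref2": (2, 3), "target": ("A", "A")},
    "acc3": {"ref1": (1, 1), "ref2": (1, 1), "target": ("B", "B")},
    "acc4": {"ref1": (1, 2), "ref2": (1, 2), "target": ("C", "C")},
}


def allele_count(accessions, genotypes, loci):
    """Count distinct alleles, summed over loci, in a set of accessions."""
    return sum(
        len({allele for acc in accessions for allele in genotypes[acc][locus]})
        for locus in loci
    )


def greedy_core(genotypes, reference_loci, core_size):
    """Greedy stand-in for richness-based assembly (an assumption, not M or M+).

    At each step, add the accession that most increases allele richness
    at the reference loci; ties go to the first accession in sorted order.
    """
    core = []
    remaining = sorted(genotypes)
    while len(core) < core_size and remaining:
        best = max(
            remaining,
            key=lambda acc: allele_count(core + [acc], genotypes, reference_loci),
        )
        core.append(best)
        remaining.remove(best)
    return core


def mean_random_retention(genotypes, target_loci, core_size):
    """Average target-allele count over every possible core of the given size."""
    subsets = list(combinations(sorted(genotypes), core_size))
    return sum(allele_count(s, genotypes, target_loci) for s in subsets) / len(subsets)


core = greedy_core(genotypes, ["ref1", "ref2"], core_size=2)
guided = allele_count(core, genotypes, ["target"])
random_mean = mean_random_retention(genotypes, ["target"], core_size=2)
print(core, guided, round(random_mean, 2))
```

In this toy collection the core chosen on the reference loci holds the two accessions richest in marker alleles, both of which carry the same target allele, whereas most randomly drawn pairs capture two target alleles. The example only illustrates the mechanism described in the abstract; it says nothing about how often the effect occurs in real or simulated data.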
Acknowledgments
We thank Ann Fenwick for genotyping the reference loci, and Gayle Volk and Dale Lockwood for comments on the manuscript.
Additional information
Communicated by A. Graner.
Electronic supplementary material
122_2011_1776_MOESM1_ESM.tif
Supplementary Figure 1. Correlation between degree of population subdivision and improvement in allelic retention at target loci. The standardized metric of population differentiation G′ST, estimated using neutral reference loci, is plotted on the X-axis. The number of additional alleles present in core subsets assembled using M+, relative to random assembly, is plotted on the Y-axis. M+ performed better for highly subdivided data sets when neutral loci were targeted, but not when selected loci were targeted. (TIFF 127 kb)
Cite this article
Reeves, P.A., Panella, L.W. & Richards, C.M. Retention of agronomically important variation in germplasm core collections: implications for allele mining. Theor Appl Genet 124, 1155–1171 (2012). https://doi.org/10.1007/s00122-011-1776-4
DOI: https://doi.org/10.1007/s00122-011-1776-4