Abstract
Next-generation sequencing technologies have markedly increased the throughput of genetic studies, allowing the identification of several thousand SNPs within a single experiment. Even though sequencing costs are rapidly decreasing, whole-genome re-sequencing of a large number of individuals is still expensive, especially in plants with large and highly redundant genomes. In recent years, several reduced representation library approaches have been developed to reduce the sequencing cost per individual. Among them, genotyping-by-sequencing (GBS) represents a simple, cost-effective, and highly multiplexed alternative for species with or without an available reference genome. However, this technology requires specific optimization for each species, especially in the choice of restriction enzyme (RE). Here we report on the application of GBS in a test experiment with 18 genotypes of wild and domesticated Phaseolus vulgaris. After an in silico digestion of the P. vulgaris reference genome sequence with different REs, we selected CviAII as the most suitable RE for GBS in common bean based on the high frequency and even distribution of its restriction sites. A total of 44,875 SNPs, 1,940 deletions, and 1,693 insertions were identified, with 50% of the variants located in genic sequences and tagging 11,027 genes. SNP and InDel distributions were positively correlated with gene density across the genome. In addition, we were also able to identify putative copy number variations of genomic segments between different genotypes. In conclusion, GBS with the CviAII enzyme yields thousands of evenly spaced markers and provides a reliable, high-throughput, and cost-effective approach for genotyping both wild and domesticated common beans.
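The in silico digestion step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the enzyme panel, the function names (`site_positions`, `summarize`), the bin size, and the toy sequence are all assumptions chosen for illustration. It counts recognition sites for each candidate enzyme and reports what fraction of fixed-size bins contain at least one site, a rough proxy for how evenly the sites are spread. All three recognition sequences are palindromic, so scanning one strand is sufficient.

```python
import re

# Recognition sequences (5'->3'). The panel is illustrative only;
# the abstract does not list which enzymes were compared with CviAII.
ENZYMES = {
    "CviAII": "CATG",
    "ApeKI": "GCWGC",
    "MseI": "TTAA",
}

# IUPAC codes needed for the panel above, mapped to regex classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}


def site_positions(sequence, recognition):
    """Return 0-based start positions of every recognition site."""
    pattern = "".join(IUPAC[base] for base in recognition)
    # A lookahead reports overlapping sites as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence.upper())]


def summarize(sequence, recognition, bin_size):
    """Site count, sites per kb, and fraction of bins holding >= 1 site.

    Assumes a non-empty sequence.
    """
    positions = site_positions(sequence, recognition)
    n_bins = -(-len(sequence) // bin_size)  # ceiling division
    occupied = {pos // bin_size for pos in positions}
    return {
        "sites": len(positions),
        "sites_per_kb": 1000 * len(positions) / len(sequence),
        "bins_with_site": len(occupied) / n_bins,
    }


if __name__ == "__main__":
    # Toy sequence standing in for a chromosome read from a FASTA file.
    toy = "ACATGCATGTTAAGCAGCTGCCATGTTAACATG"
    for name, recognition in ENZYMES.items():
        print(name, summarize(toy, recognition, bin_size=10))
```

On a real genome the same summary would be computed per chromosome, and an enzyme with many sites and a high fraction of occupied bins would be favored, in line with the selection criteria stated above.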
Acknowledgments
This work used the Vincent J. Coates Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 Instrumentation Grants S10RR029668 and S10RR027303. This project was supported by Agriculture and Food Research Initiative (AFRI) Competitive Grant No. 2013-67013-21224 from the USDA National Institute of Food and Agriculture.
Electronic supplementary material
Supplementary File S1
Bean genotypes analyzed in this study with the barcodes used for multiplexed sequencing (PDF 32 kb)
Supplementary File S2
Correlation of SNP distribution (Total SNPs) and SNP density in 1 Mb non-overlapping bins (SNPs/Mb) with chromosome length. Regression lines and Pearson correlation coefficients (r) are shown (PDF 138 kb)
Supplementary File S3
Distribution of variants and genes with the relative density in 1 Mb non-overlapping bins in the 11 P. vulgaris chromosomes (PDF 12663 kb)
Supplementary File S4
Read coverage in 1 Mb non-overlapping bins across the 11 chromosomes for the G19833 reference genotype (PDF 107 kb)
Supplementary File S5
RRC in the analyzed genotypes (PDF 75 kb)
Supplementary File S6
Regions harboring putative CNVs in the different genotypes. The coordinates of the genomic bins in the different chromosomes are reported in BED format (PDF 35 kb)
Supplementary File S7
Significant GO terms (FDR < 0.05) enriched in the genes located in putative CNVs. Test Set is the set of the up-regulated genes, Reference Set is the background of the P. vulgaris GO terms mapping (PDF 5762 kb)
Supplementary File S8
Annotation, together with the best Arabidopsis hit, of the genes located in putative CNVs. When available the best Arabidopsis hit common name was used (PDF 62 kb)
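The 1 Mb non-overlapping binning used throughout the supplementary files, and the correlation between variant and gene density reported in the abstract, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`bin_counts`, `pearson_r`) are hypothetical, positions are assumed to be 0-based, and the numbers in the demo are made up.

```python
from math import sqrt


def bin_counts(positions, chrom_length, bin_size=1_000_000):
    """Count features (e.g. SNPs or gene starts) per non-overlapping bin.

    Positions are assumed to be 0-based and smaller than chrom_length.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        counts[pos // bin_size] += 1
    return counts


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes both inputs have non-zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up positions on a hypothetical 3 Mb chromosome.
    snps = [10, 500_000, 1_200_000, 2_100_000, 2_300_000, 2_900_000]
    genes = [100_000, 1_500_000, 2_000_000, 2_600_000]
    snp_bins = bin_counts(snps, 3_000_000)
    gene_bins = bin_counts(genes, 3_000_000)
    print(snp_bins, gene_bins, pearson_r(snp_bins, gene_bins))
```

With real data, `positions` would come from the variant calls and the gene annotation of each chromosome, and the per-bin counts would be compared across all 11 chromosomes.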
Cite this article
Ariani, A., Berny Mier y Teran, J.C. & Gepts, P. Genome-wide identification of SNPs and copy number variation in common bean (Phaseolus vulgaris L.) using genotyping-by-sequencing (GBS). Mol Breeding 36, 87 (2016). https://doi.org/10.1007/s11032-016-0512-9
DOI: https://doi.org/10.1007/s11032-016-0512-9