Abstract
Banana and plantain (Musa spp.) are grown in more than 120 countries in tropical and subtropical regions and constitute an important staple food for millions of people. A Musa acuminata ssp. malaccensis DH Pahang bacterial artificial chromosome (BAC) library (MAMB) was submitted for BAC-end sequencing. MAMB consists of 23,040 clones with an average insert size of 140 kbp, corresponding to fivefold coverage of the banana genome. A total of 46,080 reads were generated, and 42,750 (92.8%) high-quality sequences were obtained after trimming for vector and quality. Analysis of these data shows a GC content of 41.39%, with interspersed repeats comprising 32.3% of the sequence. The most common repeated sequences show homology to ribosomal RNA genes, particularly 18S rRNA, while the Ty3/gypsy-type monkey retrotransposon is the most common retroelement. The sequence data were used to generate a banana-specific repeat library containing 54 new repetitive elements, which accounted for 11.86% of the total nucleotides. Simple sequence repeats represent 0.7% of the sequence data and allowed the identification of 2,455 potentially useful marker sites. Functional annotation identified 2,705 sequences that could code for proteins of known function. Microsynteny analysis shows more co-linear matches to Oryza sativa than to Arabidopsis thaliana. This database of BAC-end sequences is useful for the assembly of the complete banana genome sequence and supports identification in functional genomics experiments.
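The library statistics quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' pipeline: the clone, insert, and read counts come from the abstract, whereas the genome size of about 600 Mbp is an assumption made here only for illustration, and the function names `library_coverage` and `gc_content` are hypothetical.

```python
# Figures quoted in the abstract.
CLONES = 23_040
MEAN_INSERT_BP = 140_000
TOTAL_READS = 46_080
HIGH_QUALITY_READS = 42_750

# ASSUMPTION (not stated in the abstract): haploid genome size of ~600 Mbp.
ASSUMED_GENOME_BP = 600_000_000


def library_coverage(clones: int, mean_insert_bp: int, genome_bp: int) -> float:
    """Genome equivalents represented by the library: clones * insert / genome."""
    return clones * mean_insert_bp / genome_bp


def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


# Two end reads per clone are expected for BAC-end sequencing.
reads_per_clone = TOTAL_READS / CLONES
hq_percent = 100.0 * HIGH_QUALITY_READS / TOTAL_READS
coverage = library_coverage(CLONES, MEAN_INSERT_BP, ASSUMED_GENOME_BP)

print(f"reads per clone: {reads_per_clone:.1f}")
print(f"high-quality reads: {hq_percent:.1f}%")
print(f"coverage under the assumed genome size: {coverage:.2f}x")
```

Under the assumed genome size the coverage comes out slightly above five genome equivalents, which is consistent with the fivefold figure in the abstract. A different genome size estimate shifts the result proportionally.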
Acknowledgments
This project was funded by a grant from the Stichting Het Groene Woudt and was partially supported by the Dioraphte Foundation. We thank Dr. Jane Grimwood (HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA) for the BAC-end sequencing and Stefaan Vandamme and Dr. Kris Laukens (CEPROMA, Antwerp, Belgium) for technical assistance with the protein searching.
Additional information
Communicated by F. Gmitter
Cite this article
Arango, R.E., Togawa, R.C., Carpentier, S.C. et al. Genome-wide BAC-end sequencing of Musa acuminata DH Pahang reveals further insights into the genome organization of banana. Tree Genetics & Genomes 7, 933–940 (2011). https://doi.org/10.1007/s11295-011-0385-3
DOI: https://doi.org/10.1007/s11295-011-0385-3