Genome-wide BAC-end sequencing of Musa acuminata DH Pahang reveals further insights into the genome organization of banana

  • Original Paper
  • Tree Genetics & Genomes

Abstract

Banana and plantain (Musa spp.) are grown in more than 120 countries in tropical and subtropical regions and constitute an important staple food for millions of people. A Musa acuminata ssp. malaccensis DH Pahang bacterial artificial chromosome (BAC) library (MAMB) was submitted for BAC-end sequencing. MAMB consists of 23,040 clones with an average insert size of 140 kbp, representing approximately fivefold coverage of the banana genome. A total of 46,080 reads were generated, and 42,750 (92.8%) high-quality sequences were obtained after vector and quality trimming. Analysis of these data shows a GC content of 41.39%, with interspersed repeats comprising 32.3% of the sequence data. The most common repeated sequences show homology to ribosomal RNA genes, particularly 18S rRNA, while the Ty3/gypsy-type monkey retrotransposon is the most common retroelement. The sequence data were used to generate a banana-specific repeat library containing 54 new repetitive elements, which accounted for 11.86% of the total nucleotides. Simple sequence repeats represent 0.7% of the sequence data and allowed the identification of 2,455 potentially useful marker sites. Functional annotation identified 2,705 sequences that could code for proteins of known function. Microsynteny analysis shows more co-linear matches to Oryza sativa than to Arabidopsis thaliana. This database of BAC-end sequences is useful for the assembly of the complete banana genome sequence and is important for identification in functional genomics experiments.
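
The library figures quoted in the abstract can be cross-checked with simple arithmetic. The Python sketch below recomputes the total insert length, the genome size implied by the stated fivefold coverage, and the share of reads retained after trimming. The function and constant names are illustrative assumptions, and the implied genome size is only a back-calculation from the abstract's numbers, not a measured value.

```python
# Figures taken from the abstract.
CLONES = 23_040             # clones in the MAMB library
AVG_INSERT_KBP = 140        # average insert size, in kbp
READS_TOTAL = 46_080        # BAC-end reads generated (two per clone)
READS_HIGH_QUALITY = 42_750 # reads kept after vector and quality trimming
STATED_COVERAGE = 5         # fold coverage stated in the abstract


def total_insert_mbp(clones: int, avg_insert_kbp: float) -> float:
    """Total cloned DNA in Mbp (clones x average insert size)."""
    return clones * avg_insert_kbp / 1000


def implied_genome_mbp(clones: int, avg_insert_kbp: float, coverage: float) -> float:
    """Genome size in Mbp implied by a given fold coverage (back-calculation only)."""
    return total_insert_mbp(clones, avg_insert_kbp) / coverage


def pass_rate_percent(kept: int, total: int) -> float:
    """Percentage of reads retained after trimming."""
    return 100 * kept / total


if __name__ == "__main__":
    print(f"Total insert length: {total_insert_mbp(CLONES, AVG_INSERT_KBP):.1f} Mbp")
    print(
        "Genome size implied by fivefold coverage: "
        f"{implied_genome_mbp(CLONES, AVG_INSERT_KBP, STATED_COVERAGE):.1f} Mbp"
    )
    print(f"Reads retained: {pass_rate_percent(READS_HIGH_QUALITY, READS_TOTAL):.1f}%")
```

The read count is exactly twice the clone count, which is consistent with one read from each end of every clone, and the retained fraction rounds to the 92.8% reported.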

Acknowledgments

This project was funded through a grant from the Stichting Het Groene Woudt and was partially supported by the Dioraphte Foundation. We thank Dr. Jane Grimwood (HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA) for the BAC-end sequencing and Stefaan Vandamme and Dr. Kris Laukens (CEPROMA, Antwerp, Belgium) for technical assistance with the protein searching.

Author information

Corresponding author

Correspondence to Manoel T. Souza Jr.

Additional information

Communicated by F. Gmitter

Electronic supplementary material

  • ESM 1 (PDF 1467 kb)
  • Table 1a (XLS 37 kb)
  • Table 1b (XLS 22 kb)

About this article

Cite this article

Arango, R.E., Togawa, R.C., Carpentier, S.C. et al. Genome-wide BAC-end sequencing of Musa acuminata DH Pahang reveals further insights into the genome organization of banana. Tree Genetics & Genomes 7, 933–940 (2011). https://doi.org/10.1007/s11295-011-0385-3

  • DOI: https://doi.org/10.1007/s11295-011-0385-3
