Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data

  • Protocol
The Nucleolus

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1455))

Abstract

The ribosomal RNA genes (rDNA) encode the major rRNA species of the ribosome and are thus essential across life. In most eukaryotes these genes are highly repetitive, organized into blocks of tandem repeats that form the core of nucleoli. The primary role of the rDNA in encoding rRNA has long been understood, but more recently the rDNA has been implicated in a number of other important biological phenomena, including genome stability, the cell cycle, and epigenetic silencing. Noncoding elements, primarily located in the intergenic spacer region, appear to mediate many of these phenomena. Although sequence information is available for the genomes of many organisms, in almost all cases rDNA repeat sequences are lacking, primarily because these regions are difficult to assemble during whole genome assembly. Here, we present a method to obtain complete rDNA repeat unit sequences from whole genome assemblies. Limitations of next-generation sequencing (NGS) data make them unsuitable for assembling complete rDNA unit sequences; the method we present therefore relies on Sanger whole genome sequence data. Our method uses the Arachne assembler, which can assemble highly repetitive regions such as the rDNA in a memory-efficient way. We provide a detailed step-by-step protocol for generating rDNA sequences from whole genome Sanger sequence data using Arachne, for refining complete rDNA unit sequences, and for validating the sequences obtained. In principle, our method will work for any species in which the rDNA is organized into tandem repeats. It should help researchers working on species without a complete rDNA sequence, those working on evolutionary aspects of the rDNA, and those interested in conducting phylogenetic footprinting studies with the rDNA.

Author information

Corresponding authors

Correspondence to Saumya Agrawal or Austen R. D. Ganley.

Copyright information

© 2016 Springer Science+Business Media New York

About this protocol

Cite this protocol

Agrawal, S., Ganley, A.R.D. (2016). Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data. In: Németh, A. (eds) The Nucleolus. Methods in Molecular Biology, vol 1455. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3792-9_13

  • DOI: https://doi.org/10.1007/978-1-4939-3792-9_13

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3790-5

  • Online ISBN: 978-1-4939-3792-9

  • eBook Packages: Springer Protocols
