Structured RNAs in the ENCODE selected regions of the human genome

  1. Stefan Washietl1,14,
  2. Jakob S. Pedersen2,
  3. Jan O. Korbel3,4,
  4. Claudia Stocsits5,
  5. Andreas R. Gruber1,
  6. Jörg Hackermüller6,
  7. Jana Hertel5,
  8. Manja Lindemeyer5,
  9. Kristin Reiche5,
  10. Andrea Tanzer1,5,13,
  11. Catherine Ucla10,
  12. Carine Wyss10,
  13. Stylianos E. Antonarakis10,
  14. France Denoeud7,
  15. Julien Lagarde7,
  16. Jorg Drenkow8,
  17. Philipp Kapranov8,
  18. Thomas R. Gingeras8,
  19. Roderic Guigó7,
  20. Michael Snyder11,
  21. Mark B. Gerstein3,
  22. Alexandre Reymond9,10,
  23. Ivo L. Hofacker1, and
  24. Peter F. Stadler1,5,6,12
  1. Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria;
  2. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, Santa Cruz, California 95064, USA;
  3. Molecular Biophysics and Biochemistry Department, Yale University, New Haven, Connecticut 06520-8114, USA;
  4. European Molecular Biology Laboratory, 69117 Heidelberg, Germany;
  5. Bioinformatics Group, Department of Computer Science, University of Leipzig, D-04107 Leipzig, Germany;
  6. Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany;
  7. Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Catalonia, Spain;
  8. Affymetrix, Inc., Santa Clara, California 95051, USA;
  9. Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland;
  10. Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland;
  11. Molecular, Cellular and Developmental Biology Department, Yale University, New Haven, Connecticut 06520-8114, USA;
  12. Santa Fe Institute, Santa Fe, New Mexico 87501, USA;
  13. Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut 06520-8106, USA

Abstract

Functional RNA structures play an important role both in noncoding RNA transcripts and as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Because structural RNAs generally lack characteristic signals in their primary sequence, comparative approaches that evaluate the evolutionary conservation of structures are the most promising. We used three recently introduced programs based on either phylogenetic–stochastic context-free grammars (EvoFold) or energy-directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to ∼2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between the methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3′-UTRs. Although we estimate a substantial false discovery rate of ∼50%–70%, many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that also show signs of selection pressure at the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).
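
As a rough illustration of how the supporting counts above fit together, the following minimal sketch counts predictions from one method that overlap predictions from another and rechecks the arithmetic quoted in the abstract. It is not the authors' pipeline: the function name `count_overlapping`, the half-open interval convention, and the toy coordinates are all assumptions made for this example. Only the figures 248, 239, 43, and 175 come from the text.

```python
def count_overlapping(query, reference):
    """Count intervals in `query` that overlap at least one interval in `reference`.

    Intervals are half-open (start, end) tuples on a single sequence.
    This is a simple quadratic scan, adequate for small illustrative inputs.
    """
    hits = 0
    for q_start, q_end in query:
        if any(r_start < q_end and q_start < r_end for r_start, r_end in reference):
            hits += 1
    return hits


# Toy coordinates invented for illustration only (not real predictions).
rnaz_hits = [(100, 220), (400, 520), (900, 1020)]
evofold_hits = [(180, 240), (1500, 1560)]
shared = count_overlapping(rnaz_hits, evofold_hits)

# Figures quoted in the abstract.
validated, tested = 43, 175
validation_rate = round(100 * validated / tested, 1)  # percent of tested candidates

# Loci predicted by both RNAz and EvoFold, plus those supported by AlifoldZ.
multi_support = 248 + 239

print(shared, validation_rate, multi_support)
```

With real data, the two lists would hold genomic coordinates of RNAz and EvoFold predictions per chromosome, and a sorted sweep or an interval tree would replace the quadratic scan.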

Footnotes

  • 14 Corresponding author.

    14 E-mail wash{at}tbi.univie.ac.at; fax 43-1-4277-52793.

  • [The sequenced fragments of verified ncRNA predictions and TEC were deposited to GenBank under accession nos. EF212232–EF212281 and EF212282–EF212289, respectively.]

  • Article is online at http://www.genome.org/cgi/doi/10.1101/gr.5650707

  • 15 This approach is also similar in spirit to QRNA, a program that detects conserved RNA structures in pairwise alignments by comparing an SCFG-based RNA model to a background model (Rivas and Eddy 2001).

  • 16 MIRN483 does not overlap with TARs/Transfrags. Its expression might be specific to fetal liver tissue, which is not among the 11 tissues tested.

    • Received June 16, 2006.
    • Accepted December 12, 2006.
  • Freely available online through the Genome Research Open Access option.
