Informatic Resources for Identifying and Annotating Structural RNA Motifs

  • Review
Molecular Biotechnology

Abstract

Post-transcriptional regulation of genes and transcripts is a vital aspect of cellular processes and, unlike transcriptional regulation, remains a largely unexplored domain. One of the most obvious and important tasks in this domain is the discovery of functional RNA elements. Many RNA elements have been characterized to date, ranging from cis-regulatory motifs within mRNAs to large families of non-coding RNAs. As with protein-coding genes, the functional motifs of these RNA elements are highly conserved; unlike in protein-coding genes, however, it is most often the structure and not the sequence that is conserved. Proper characterization of these structural RNA motifs is both the key and the limiting step to understanding the post-transcriptional aspects of the genomic world. Here, we focus on the task of structural motif discovery and provide a survey of the informatics resources geared towards this task.

Acknowledgments

We wish to thank the members of the Tenenbaum Lab for helpful suggestions and discussion, especially Chris Zaleski and Frank Doyle. This work was supported in part by NIH grant U01HG004571 to SAT from the NHGRI.

Author information

Corresponding author

Correspondence to Scott A. Tenenbaum.

About this article

Cite this article

George, A.D., Tenenbaum, S.A. Informatic Resources for Identifying and Annotating Structural RNA Motifs. Mol Biotechnol 41, 180–193 (2009). https://doi.org/10.1007/s12033-008-9114-z

  • DOI: https://doi.org/10.1007/s12033-008-9114-z
