Abstract
Post-transcriptional regulation of genes and transcripts is a vital aspect of cellular processes and, unlike transcriptional regulation, remains largely unexplored. One of the most important open problems in this domain is the discovery of functional RNA elements. Many RNA elements have been characterized to date, ranging from cis-regulatory motifs within mRNAs to large families of non-coding RNAs. Like protein-coding genes, the functional motifs of these RNA elements are highly conserved, but unlike protein-coding genes, it is most often the structure and not the sequence that is conserved. Proper characterization of these structural RNA motifs is both the key and the limiting step to understanding the post-transcriptional aspects of the genomic world. Here, we focus on the task of structural motif discovery and provide a survey of the informatics resources geared towards this task.
Acknowledgments
We wish to thank the members of the Tenenbaum Lab for helpful suggestions and discussion, especially Chris Zaleski and Frank Doyle. This work was supported in part by NIH grant U01HG004571 to SAT from the NHGRI.
Cite this article
George, A.D., Tenenbaum, S.A. Informatic Resources for Identifying and Annotating Structural RNA Motifs. Mol Biotechnol 41, 180–193 (2009). https://doi.org/10.1007/s12033-008-9114-z
DOI: https://doi.org/10.1007/s12033-008-9114-z