Abstract
RNA alignment is an important step in the annotation and characterization of unknown RNAs, and several methods have been developed to meet the need for fast and accurate alignments. Because the performance of alignment methods depends on the features of the input RNAs, finding the most suitable method is not trivial. Indeed, no available method clearly outperforms the others. Here we present a simple workflow to help choose the most suitable method for pairwise RNA alignment. We tested the performance of six algorithms, based on different approaches, on datasets created by merging publicly available datasets of known or curated RNA secondary structure annotations with datasets of curated RNA alignments. We then simulated the frequent case in which the secondary structure is unknown by using the same alignment datasets but predicting the structure instead of using the known one. In conclusion, the proposed workflow for pairwise RNA alignment depends on the primary sequence identity of the input RNAs and on the availability of reliable secondary structures.
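Since the workflow hinges on the primary sequence identity of the two input RNAs, a minimal sketch of how that quantity might be computed from an existing gapped pairwise alignment is given here. The function name `percent_identity`, the gap characters, and the choice of denominator (all columns except those where both sequences have a gap) are illustrative assumptions, not the definition used in the protocol; the abstract does not state identity thresholds, so none are encoded.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap_chars: str = "-.") -> float:
    """Percent identity of two gapped, equal-length aligned sequences.

    Assumption (not taken from the protocol): identity is the number of
    columns with identical non-gap residues divided by the number of
    columns in which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        a_gap = a in gap_chars
        b_gap = b in gap_chars
        if a_gap and b_gap:
            # Columns that are gaps in both sequences carry no information.
            continue
        compared += 1
        if not a_gap and not b_gap and a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

For example, `percent_identity("GGCA-U", "GGCAAU")` counts five identical columns out of six compared. Other conventions (for instance, dividing by the length of the shorter sequence) give different values, so the definition should match the one used when benchmarking the alignment methods.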
Acknowledgements
This work was supported by the EPIGEN flagship project and PRIN 2010 (prot. 20108XYHJS_006) to MHC.
Copyright information
© 2015 Springer Science+Business Media New York
About this protocol
Cite this protocol
Mattei, E., Helmer-Citterich, M., Ferrè, F. (2015). A Simple Protocol for the Inference of RNA Global Pairwise Alignments. In: Picardi, E. (eds) RNA Bioinformatics. Methods in Molecular Biology, vol 1269. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-2291-8_3
DOI: https://doi.org/10.1007/978-1-4939-2291-8_3
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-2290-1
Online ISBN: 978-1-4939-2291-8
eBook Packages: Springer Protocols