Homology Search with Fragmented Nucleic Acid Sequence Patterns

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4645)

Abstract

The comprehensive annotation of non-coding RNAs in newly sequenced genomes is still a largely unsolved problem because many functional RNAs exhibit not only poorly conserved sequences but also large variability in structure. In many cases, such as Y RNAs, vault RNAs, or telomerase RNAs, sequences differ by large insertions or deletions and have only a few small sequence patterns in common.

Here we present fragrep2, a purely sequence-based approach to detect such patterns in complete genomes. A fragrep2 pattern consists of an ordered list of position-specific weight matrices (PWMs) that describe short, approximately conserved sequence elements and are separated by non-conserved regions of bounded length. The program uses a fractional programming approach to align the PWMs to genomic DNA, which allows for a bounded number of insertions and deletions in the patterns. The resulting matches are then combined into significant combinations of PWMs. At this step, a subset of PWMs may be deleted, i.e., have no match in the current region of the genome. The program furthermore estimates p- and E-values for the matches.
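The notion of a fragmented pattern can be illustrated with a minimal Python sketch. It is not the fragrep2 implementation: it omits the fractional programming alignment, insertions and deletions within a PWM, deletion of whole PWMs, and the p- and E-value estimates. All names (`consensus_pwm`, `pwm_hits`, `find_pattern`), the toy +1/-1 scoring, and the thresholds are assumptions made for illustration only. The sketch scans a sequence for each PWM and then chains the hits in order, requiring the gap between consecutive hits to lie within given bounds.

```python
def consensus_pwm(consensus, match=1.0, mismatch=-1.0):
    """Build a toy PWM (one dict per column) from a consensus string.

    Hypothetical helper: real PWMs would hold log-odds scores
    estimated from an alignment, not a fixed +1/-1 scheme.
    """
    return [{b: (match if b == c else mismatch) for b in "ACGT"}
            for c in consensus]


def pwm_score(pwm, window):
    """Sum the per-column scores of `window`; unknown bases score -1."""
    return sum(col.get(b, -1.0) for col, b in zip(pwm, window))


def pwm_hits(pwm, seq, threshold):
    """Return all start positions where the PWM scores >= threshold."""
    w = len(pwm)
    return [i for i in range(len(seq) - w + 1)
            if pwm_score(pwm, seq[i:i + w]) >= threshold]


def find_pattern(seq, pwms, thresholds, gaps):
    """Chain hits of consecutive PWMs, in order, under gap bounds.

    gaps[k] = (lo, hi) bounds the number of bases between the end of
    the hit for PWM k and the start of the hit for PWM k + 1.
    Returns a list of chains, each a list of start positions.
    """
    hits = [pwm_hits(p, seq, t) for p, t in zip(pwms, thresholds)]
    chains = [[s] for s in hits[0]]
    for k in range(1, len(pwms)):
        lo, hi = gaps[k - 1]
        prev_len = len(pwms[k - 1])
        extended = []
        for chain in chains:
            prev_end = chain[-1] + prev_len
            for s in hits[k]:
                if lo <= s - prev_end <= hi:
                    extended.append(chain + [s])
        chains = extended
    return chains


# Two conserved elements, GGC and TCG, separated by 2 to 6 bases.
pwms = [consensus_pwm("GGC"), consensus_pwm("TCG")]
print(find_pattern("TTGGCAAAATCGTT", pwms, [3.0, 3.0], [(2, 6)]))
# → [[2, 9]]
```

Lowering a threshold below the PWM length admits approximate matches (for example, 1.0 tolerates one mismatch in a three-column PWM under this toy scoring), which is the simplest stand-in for the "approximately conserved" elements described above.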

We apply fragrep2 to homology searches for RNase MRP, unveiling two previously unidentified matches as well as reproducing the results of two previous surveys. Furthermore, we complement the picture of vertebrate vault RNAs, a class of ncRNAs that has not received much attention so far.

Editor information

Raffaele Giancarlo, Sridhar Hannenhalli

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mosig, A., Chen, J.J.L., Stadler, P.F. (2007). Homology Search with Fragmented Nucleic Acid Sequence Patterns. In: Giancarlo, R., Hannenhalli, S. (eds) Algorithms in Bioinformatics. WABI 2007. Lecture Notes in Computer Science, vol 4645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74126-8_31

  • DOI: https://doi.org/10.1007/978-3-540-74126-8_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74125-1

  • Online ISBN: 978-3-540-74126-8

  • eBook Packages: Computer Science, Computer Science (R0)
