Towards a More Efficient Discovery of Biologically Significant DNA Motifs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9043)

Abstract

DNA motifs are short recurring patterns that are assumed to have some biological function. Most algorithms for discovering them are computationally prohibitive. In this paper we extend recent work on discovering identical string motifs. Our algorithm has three phases: the first reports all string motifs of all sizes, the second filters out motifs that fail to meet our constraints, and the third ranks the remaining motifs using a combination of stochastic techniques and p-values. On benchmark data suites, our method outperforms other motif discovery algorithms, including well-known ones such as MEME and Weeder.
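
The three-phase structure described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the enumeration algorithm, the constraints, or the ranking statistic, so everything concrete here is an assumption made for illustration. Phase 1 uses naive substring enumeration (not an efficient method), phase 2 assumes a minimum length and a minimum number of supporting sequences as the constraints, and phase 3 assumes a simple binomial-tail p-value under a uniform background model. All function and parameter names are hypothetical.

```python
from collections import defaultdict
from math import comb


def enumerate_motifs(seqs, min_len=2):
    """Phase 1 (naive stand-in): map each substring to the set of
    sequence indices in which it occurs as an identical string."""
    occ = defaultdict(set)
    for i, s in enumerate(seqs):
        for k in range(min_len, len(s) + 1):
            for j in range(len(s) - k + 1):
                occ[s[j:j + k]].add(i)
    return occ


def filter_motifs(occ, min_len, min_support):
    """Phase 2 (assumed constraints): keep motifs that are long enough
    and occur in at least `min_support` distinct sequences."""
    return {m: idx for m, idx in occ.items()
            if len(m) >= min_len and len(idx) >= min_support}


def p_value(motif, support, seqs):
    """Phase 3 (assumed statistic): probability that at least `support`
    sequences contain the motif by chance, with uniform i.i.d. bases
    and start positions treated as independent (an approximation)."""
    n = len(seqs)
    p_site = 0.25 ** len(motif)
    per_seq = [1 - (1 - p_site) ** max(len(s) - len(motif) + 1, 0)
               for s in seqs]
    q = sum(per_seq) / n  # average chance that one sequence has a hit
    return sum(comb(n, k) * q ** k * (1 - q) ** (n - k)
               for k in range(support, n + 1))


def rank_motifs(seqs, min_len=4, min_support=2):
    """Run the three phases and return motifs, most significant first."""
    occ = enumerate_motifs(seqs, min_len=min_len)
    kept = filter_motifs(occ, min_len, min_support)
    scored = [(p_value(m, len(idx), seqs), m) for m, idx in kept.items()]
    return [m for _, m in sorted(scored)]


if __name__ == "__main__":
    toy = ["ACGTTGCA", "TTACGTTA", "GGACGTTC"]
    print(rank_motifs(toy)[:3])
```

On the toy input, the shared 5-mer `ACGTT` ranks first, because it is the longest string present in all three sequences and therefore the least likely to occur by chance under the assumed background model.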

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Al-Ssulami, A.M., Azmi, A.M. (2015). Towards a More Efficient Discovery of Biologically Significant DNA Motifs. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9043. Springer, Cham. https://doi.org/10.1007/978-3-319-16483-0_37

  • DOI: https://doi.org/10.1007/978-3-319-16483-0_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16482-3

  • Online ISBN: 978-3-319-16483-0

  • eBook Packages: Computer Science, Computer Science (R0)