Finding Consensus Patterns in Very Scarce Biosequence Samples from Their Minimal Multiple Generalizations

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3918)

Abstract

In this paper we examine the issues involved in finding consensus patterns from biosequence data of very small sample sizes by searching for a so-called minimal multiple generalization (mmg), that is, a set of syntactically minimal patterns that accounts for all the samples. The data we use are the sigma regulons of the bacterium B. subtilis that have more conserved consensus patterns. By comparing the mmgs found over different search spaces, we found that it is possible to derive patterns close to the known consensus patterns simply by placing some reasonable requirements on the kinds of patterns to obtain. We also propose some simple measures to evaluate the patterns in an mmg.
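
As a rough illustration of what it means for a set of patterns to account for all the samples, the following minimal sketch checks coverage for a toy pattern set. It is not the search procedure of the paper. It assumes patterns are written as strings in which `*` marks a variable that matches any non-empty substring and every other character is a constant; the function names, the `*` notation, and the sample strings are all hypothetical and chosen only for illustration.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    # Assumption: each '*' is one variable matching a non-empty string;
    # every other character is a constant that must match literally.
    parts = [".+" if ch == "*" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))

def matches(pattern: str, sequence: str) -> bool:
    # A sequence is in the pattern's language if the whole string matches.
    return pattern_to_regex(pattern).fullmatch(sequence) is not None

def uncovered(patterns, samples):
    # Samples that no pattern in the set accounts for.
    return [s for s in samples if not any(matches(p, s) for p in patterns)]

def covers(patterns, samples) -> bool:
    # The pattern set accounts for the samples if nothing is left uncovered.
    return not uncovered(patterns, samples)

# Made-up toy strings, not data from the paper.
samples = ["TTGACAGG", "CTTGACAT", "GGTATAATC"]
patterns = ["*TGACA*", "*TATAAT*"]

print(covers(patterns, samples))          # both patterns together
print(uncovered(["*TGACA*"], samples))    # one pattern alone
```

Coverage is only one half of the definition: an mmg additionally requires that the pattern set be minimal, so that no strictly more specific set within the chosen search space still covers every sample. That minimality check is not attempted here.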

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ng, Y.K., Shinohara, T. (2006). Finding Consensus Patterns in Very Scarce Biosequence Samples from Their Minimal Multiple Generalizations. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_63

  • DOI: https://doi.org/10.1007/11731139_63

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33206-0

  • Online ISBN: 978-3-540-33207-7

  • eBook Packages: Computer Science, Computer Science (R0)
