Abstract
The post-genomic era has raised a wide range of challenging new problems for knowledge discovery and intelligent information management. Among them, the discovery of complex pattern repetitions in string databases plays an important role, particularly in contexts where it is not even known in advance which pattern classes should be considered interesting. This paper contributes to this setting by proposing a novel approach, based on disjunctive logic programming extended with several advanced features, for discovering interesting pattern classes from a given data set.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Palopoli, L., Rombo, S., Terracina, G. (2005). Flexible Pattern Discovery with (Extended) Disjunctive Logic Programming. In: Hacid, MS., Murray, N.V., Raś, Z.W., Tsumoto, S. (eds) Foundations of Intelligent Systems. ISMIS 2005. Lecture Notes in Computer Science, vol 3488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11425274_52
DOI: https://doi.org/10.1007/11425274_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25878-0
Online ISBN: 978-3-540-31949-8
eBook Packages: Computer Science (R0)