Sequences Classification by Least General Generalisations

Tantini, Frédéric; Terlutte, Alain; Torre, Fabien

doi:10.1007/978-3-642-15488-1_16

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6339)

Abstract

In this paper, we present a general framework for supervised classification. This framework provides methods like boosting and only needs the definition of a generalisation operator called lgg. For sequence classification tasks, lgg is a learner that only uses positive examples. We show that grammatical inference has already defined such learners for automata classes like reversible automata or k-TSS automata. Then we propose a generalisation algorithm for the class of balls of words. Finally, we show through experiments that our method efficiently resolves sequence classification tasks.
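To make the notion of a ball of words concrete, the following is a minimal sketch and not the generalisation algorithm proposed in the paper. It assumes the usual definition from the grammatical inference literature, where a ball is the set of words within edit (Levenshtein) distance r of a centre word. The function names (`edit_distance`, `in_ball`, `naive_ball_generalisation`) and the centre-selection heuristic are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row (Wagner-Fischer scheme)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def in_ball(centre: str, radius: int, word: str) -> bool:
    """True if word lies in the ball of the given centre and radius."""
    return edit_distance(centre, word) <= radius


def naive_ball_generalisation(positives: Iterable[str]) -> Tuple[str, int]:
    """Illustrative heuristic (an assumption, not the paper's method):
    pick as centre the positive example whose farthest positive example
    is closest, and use that farthest distance as the radius. The result
    covers every positive example but is not guaranteed to be a least
    general ball."""
    words = list(positives)
    if not words:
        raise ValueError("at least one positive example is required")
    radius, centre = min(
        (max(edit_distance(c, w) for w in words), c) for c in words
    )
    return centre, radius


if __name__ == "__main__":
    centre, radius = naive_ball_generalisation(["abc", "abd", "abcd"])
    print(centre, radius, in_ball(centre, radius, "ab"))
```

Restricting candidate centres to the positive examples keeps the sketch polynomial; the set of all possible centres is infinite, so this choice trades generality for simplicity.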

Author information

Authors and Affiliations

Parole, CNRS/LORIA Nancy,
Frédéric Tantini
Mostrare (INRIA Lille Nord Europe et CNRS LIFL), Université Lille Nord de France
Alain Terlutte & Fabien Torre

Editor information

Editors and Affiliations

Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Camino de Vera s/n, 46022, Valencia, Spain
José M. Sempere & Pedro García

Cite this paper

Tantini, F., Terlutte, A., Torre, F. (2010). Sequences Classification by Least General Generalisations. In: Sempere, J.M., García, P. (eds) Grammatical Inference: Theoretical Results and Applications. ICGI 2010. Lecture Notes in Computer Science, vol 6339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15488-1_16

DOI: https://doi.org/10.1007/978-3-642-15488-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15487-4
Online ISBN: 978-3-642-15488-1
eBook Packages: Computer Science, Computer Science (R0)
