Abstract
In this paper, we present a general framework for supervised classification. The framework provides methods such as boosting and requires only the definition of a generalisation operator called lgg (least general generalisation). For sequence classification tasks, lgg is a learner that uses only positive examples. We show that grammatical inference has already defined such learners for automata classes such as reversible automata and k-TSS automata. We then propose a generalisation algorithm for the class of balls of words. Finally, we show through experiments that our method efficiently solves sequence classification tasks.
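To make the notion of a ball of words concrete, the following minimal sketch assumes the usual definition: a ball with centre o and radius r is the set of words whose edit (Levenshtein) distance to o is at most r. The helper `covering_ball` is a hypothetical illustration of generalising from positive examples only; it restricts the centre to one of the examples, which is our simplifying assumption and not the generalisation algorithm proposed in the paper.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def in_ball(centre: str, radius: int, word: str) -> bool:
    """Membership test for the ball of words with the given centre and radius."""
    return edit_distance(centre, word) <= radius


def covering_ball(positives: List[str]) -> Tuple[str, int]:
    """Hypothetical helper (assumption, not the paper's algorithm):
    choose the centre among the positive examples so that the radius
    needed to cover all of them is as small as possible."""
    best_centre, best_radius = positives[0], None
    for candidate in positives:
        radius = max(edit_distance(candidate, w) for w in positives)
        if best_radius is None or radius < best_radius:
            best_centre, best_radius = candidate, radius
    return best_centre, best_radius


centre, radius = covering_ball(["abc", "abd", "ab"])
print(centre, radius, in_ball(centre, radius, "abcd"))
```

A ball found this way covers every positive example by construction, but it is not guaranteed to be the least general one, since the best centre need not be one of the examples.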
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tantini, F., Terlutte, A., Torre, F. (2010). Sequences Classification by Least General Generalisations. In: Sempere, J.M., García, P. (eds) Grammatical Inference: Theoretical Results and Applications. ICGI 2010. Lecture Notes in Computer Science, vol 6339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15488-1_16
DOI: https://doi.org/10.1007/978-3-642-15488-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15487-4
Online ISBN: 978-3-642-15488-1
eBook Packages: Computer Science (R0)