Abstract
Because Kazakh is an agglutinative language, both recognizing correct words and generating candidate corrections for misspelled words are difficult. In this paper we describe a spelling correction method for Kazakh that combines morphological analysis with a noisy channel model. Our method outperforms both open-source and commercial analogues in overall accuracy. We also present a comparative analysis of the spelling correction tools and point out some problems of spelling correction for agglutinative languages in general and for Kazakh in particular.
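To make the general idea concrete, here is a minimal Python sketch of noisy-channel correction with a morphological validity check. It is not the authors' implementation: the toy paradigm table, the word counts, the uniform single-edit channel probability, and all function names are assumptions made purely for illustration. A word is accepted if the toy analyzer can split it into a known stem and a permitted suffix; otherwise candidates within one Damerau-style edit are generated, filtered by the same analyzer, and ranked by log P(candidate) + log P(typo | candidate).

```python
import math

# Hypothetical toy "morphological analyzer": each stem lists the suffixes it accepts.
# A real analyzer would model Kazakh morphotactics and vowel harmony instead.
TOY_PARADIGMS = {"кітап": {"", "тар"}, "бала": {"", "лар"}}

# Hypothetical surface-form counts standing in for a corpus-based language model.
TOY_COUNTS = {"кітап": 20, "кітаптар": 5, "бала": 30, "балалар": 10}

# Letters used when generating insertions and substitutions (derived from the toy data).
ALPHABET = sorted({ch for stem, sufs in TOY_PARADIGMS.items()
                   for ch in stem + "".join(sufs)})


def is_valid(word):
    """Accept a word if it splits into a known stem plus a permitted suffix."""
    return any(word.startswith(stem) and word[len(stem):] in sufs
               for stem, sufs in TOY_PARADIGMS.items())


def edits1(word):
    """All strings one deletion, transposition, substitution or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    substitutes = [l + c + r[1:] for l, r in splits if r for c in ALPHABET]
    inserts = [l + c + r for l, r in splits for c in ALPHABET]
    return set(deletes + transposes + substitutes + inserts)


def correct(word, channel_logprob=math.log(0.01)):
    """Return the best-scoring valid candidate, or the word itself if none is found."""
    if is_valid(word):
        return word
    candidates = sorted(c for c in edits1(word) if is_valid(c))
    if not candidates:
        return word
    total = sum(TOY_COUNTS.values())
    vocab = len(TOY_COUNTS)

    def score(candidate):
        # Add-one smoothed unigram prior plus a uniform channel term (an assumption).
        prior = math.log((TOY_COUNTS.get(candidate, 0) + 1) / (total + vocab))
        return prior + channel_logprob

    return max(candidates, key=score)


if __name__ == "__main__":
    print(correct("кітаптра"))  # transposed letters in "кітаптар" (books)
```

Because the channel term is uniform in this sketch, ranking is driven entirely by the language-model prior; a trained error model would replace the constant with per-edit probabilities.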
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Makazhanov, A., Makhambetov, O., Sabyrgaliyev, I., Yessenbayev, Z. (2014). Spelling Correction for Kazakh. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8404. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54903-8_44
DOI: https://doi.org/10.1007/978-3-642-54903-8_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54902-1
Online ISBN: 978-3-642-54903-8
eBook Packages: Computer Science, Computer Science (R0)