Abstract
ParaMor automatically learns morphological paradigms from unlabelled text and uses them to annotate word forms with morpheme boundaries. ParaMor competed in the English and German tracks of Morpho Challenge 2007 (Kurimo et al., 2008). In English, ParaMor’s balanced precision and recall give it a higher F1 than Morfessor (Creutz, 2006), an already sophisticated baseline induction algorithm. In German, ParaMor suffers from low morpheme recall. However, combining ParaMor’s analyses with those from Morfessor yields a set of analyses that outperforms either algorithm alone and places first in F1 among all algorithms submitted to Morpho Challenge 2007.
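The abstract's emphasis on balanced precision and recall follows from how F1 is defined: it is the harmonic mean of the two, so a large gap between them lowers the score. The sketch below illustrates this with the standard F1 formula. The function name `f1_score` and the numeric values are hypothetical illustrations, not figures reported for ParaMor or Morfessor.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores chosen only to illustrate the effect of balance.
# Both pairs have the same arithmetic mean (0.50).
balanced = f1_score(0.50, 0.50)  # precision and recall are equal
skewed = f1_score(0.80, 0.20)    # high precision, low recall

print(f"balanced F1 = {balanced:.2f}")
print(f"skewed   F1 = {skewed:.2f}")
```

With equal arithmetic means, the balanced pair scores higher on F1 than the skewed pair, which is why low morpheme recall alone can pull F1 down.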
Categories and Subject Descriptors: I.2 [Artificial Intelligence]: I.2.7 Natural Language Processing.
The research reported in this paper was funded in part by NSF grant number IIS-0121631.
References
Altun, Y., Johnson, M.: Inducing SFA with ε-Transitions Using Minimum Description Length. In: Finite State Methods in Natural Language Processing Workshop at ESSLLI, Helsinki, Finland (2001)
Brent, M.R., Murthy, S.K., Lundberg, A.: Discovering Morphemic Suffixes: A Case Study in MDL Induction. In: The Fifth International Workshop on Artificial Intelligence and Statistics, Fort Lauderdale, Florida (1995)
Burnage, G.: Celex—A Guide for Users. Springer, Centre for Lexical information, Nijmegen, The Netherlands (1990)
Creutz, M.: Morpho project (May 31, 2007), http://www.cis.hut.fi/projects/morpho/
Creutz, M.: Induction of the Morphology of Natural Language: Unsupervised Morpheme Segmentation with Application to Automatic Speech Recognition. Ph.D. Thesis. Computer and Information Science, Report D13. Helsinki University of Technology, Espoo, Finland (2006)
Demberg, V.: A Language-Independent Unsupervised Model for Morphological Segmentation. Association for Computational Linguistics, Prague (2007)
Goldsmith, J.: Unsupervised Learning of the Morphology of a Natural Language. Computational Linguistics 27(2), 153–198 (2001)
Hafer, M.A., Weiss, S.F.: Word Segmentation by Letter Successor Varieties. Information Storage and Retrieval 10(11/12), 371–385 (1974)
Harris, Z.: From Phoneme to Morpheme. Language 31(2), 190–222 (1955); Reprinted in Harris (1970)
Harris, Z.: Papers in Structural and Transformational Linguistics. D. Reidel, Dordrecht (1970)
Johnson, H., Martin, J.: Unsupervised Learning of Morphology for English and Inuktitut. In: Human Language Technology Conference / North American Chapter of the Association for Computational Linguistics, Edmonton, Canada (2003)
Kurimo, M., Creutz, M., Varjokallio, M.: Morpho Challenge Evaluation Using a Linguistic Gold Standard. In: Proceedings of the CLEF 2007 Workshop. Springer, Heidelberg (2008)
Monson, C., Carbonell, J., Lavie, A., Levin, L.: ParaMor: Minimally Supervised Induction of Paradigm Structure and Morphological Analysis. In: Computing and Historical Phonology: The Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology, Prague, Czech Republic (2007)
Snover, M.G.: An Unsupervised Knowledge Free Algorithm for the Learning of Morphology in Natural Languages. M.S. Thesis. Computer Science, Sever Institute of Technology. Washington University, Saint Louis, Missouri (2002)
Stump, G.T.: Inflectional Morphology: A Theory of Paradigm Structure. Cambridge University Press, Cambridge (2001)
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Monson, C., Carbonell, J., Lavie, A., Levin, L. (2008). ParaMor: Finding Paradigms across Morphology. In: Peters, C., et al. Advances in Multilingual and Multimodal Information Retrieval. CLEF 2007. Lecture Notes in Computer Science, vol 5152. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85760-0_115
DOI: https://doi.org/10.1007/978-3-540-85760-0_115
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85759-4
Online ISBN: 978-3-540-85760-0
eBook Packages: Computer Science, Computer Science (R0)