ABSTRACT
To date, there are no word sense disambiguation (WSD) systems for Arabic. In this paper we present and evaluate a novel unsupervised approach, SALAAM, which exploits translational correspondences between words in a parallel Arabic-English corpus to annotate Arabic text using an English WordNet taxonomy. We show that our approach is highly accurate in ≤ 90.1% of the evaluated data items, based on ratings and annotations by native Arabic speakers. Moreover, the results are competitive with state-of-the-art unsupervised English WSD systems when evaluated on English data.
- An unsupervised approach for bootstrapping Arabic sense tagging