DOI: 10.5555/1621804.1621818
research-article

An unsupervised approach for bootstrapping Arabic sense tagging

Published: 28 August 2004

ABSTRACT

To date, there are no word sense disambiguation (WSD) systems for Arabic. In this paper, we present and evaluate a novel unsupervised approach, SALAAM, which exploits translational correspondences between words in a parallel Arabic-English corpus to annotate Arabic text using an English WordNet taxonomy. We illustrate that our approach is highly accurate in ≤ 90.1% of the evaluated data items based on Arabic native judgement ratings and annotations. Moreover, the obtained results are competitive with state-of-the-art unsupervised English WSD systems when evaluated on English data.

Published in

Semitic '04: Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages
August 2004
98 pages

Publisher

Association for Computational Linguistics, United States

Acceptance Rates

Overall acceptance rate: 12 of 21 submissions, 57%
