An unsupervised approach for bootstrapping Arabic sense tagging

Author:
Mona T. Diab

Stanford University, Stanford, CA

Semitic '04: Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages, August 2004, Pages 43–50

Published: 28 August 2004

ABSTRACT

To date, there are no word sense disambiguation (WSD) systems for Arabic. In this paper we present and evaluate a novel unsupervised approach, SALAAM, which exploits translational correspondences between words in a parallel Arabic-English corpus to annotate Arabic text using an English WordNet taxonomy. We show that the approach is highly accurate in ≤ 90.1% of the evaluated data items, based on judgement ratings and annotations by native Arabic speakers. Moreover, the results obtained are competitive with state-of-the-art unsupervised English WSD systems when evaluated on English data.
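
The general idea of projecting senses through translational correspondences can be illustrated with a minimal sketch. It is not the SALAAM implementation: the abstract does not give the algorithm's details, so the grouping step, the overlap-based similarity score, the toy sense inventory `TOY_SENSES`, and the placeholder token `ar_token_1` are all assumptions made for illustration. A real system would draw senses and similarity from WordNet rather than a hand-made dictionary.

```python
from collections import defaultdict

# Hypothetical toy sense inventory (an assumption, not WordNet data):
# English word -> {sense id: set of ancestor concepts in a tiny taxonomy}.
TOY_SENSES = {
    "bank": {
        "bank#finance": {"institution", "organization"},
        "bank#river": {"slope", "landform"},
    },
    "shore": {"shore#land": {"landform"}},
    "coast": {"coast#land": {"landform"}},
}


def group_translations(aligned_pairs):
    """Group the English words aligned to each Arabic token."""
    groups = defaultdict(set)
    for arabic, english in aligned_pairs:
        groups[arabic].add(english)
    return groups


def best_sense(word, others):
    """Pick the sense of `word` whose ancestors overlap most with the
    senses of the other English words in the same translation group."""
    best, best_score = None, -1
    for sense, ancestors in sorted(TOY_SENSES.get(word, {}).items()):
        score = sum(
            len(ancestors & other_ancestors)
            for other in others
            for other_ancestors in TOY_SENSES.get(other, {}).values()
        )
        if score > best_score:
            best, best_score = sense, score
    return best


def project_senses(aligned_pairs):
    """Tag each Arabic token with the senses chosen for its English translations."""
    tags = {}
    for arabic, english_words in group_translations(aligned_pairs).items():
        tags[arabic] = {
            english: best_sense(english, english_words - {english})
            for english in sorted(english_words)
        }
    return tags


if __name__ == "__main__":
    # Placeholder alignments; a real run would use a word-aligned parallel corpus.
    pairs = [
        ("ar_token_1", "bank"),
        ("ar_token_1", "shore"),
        ("ar_token_1", "coast"),
    ]
    print(project_senses(pairs))
```

In this toy run, the other translations of the same Arabic token ("shore", "coast") pull "bank" toward its river sense, and that English sense tag is what gets projected onto the Arabic token.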


Published in
Semitic '04: Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages, August 2004, 98 pages
Program Chairs: Ali Farghaly (SYSTRAN Software, Inc.), Karine Megerdoomian (Inxight Software, Inc. and University of California, San Diego)
Publisher: Association for Computational Linguistics, United States
Overall Acceptance Rate: 12 of 21 submissions, 57%
