Supervised Learning and Distributional Semantic Models for Super-Sense Tagging

Basile, Pierpaolo; Caputo, Annalina; Semeraro, Giovanni

doi:10.1007/978-3-319-03524-6_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8249)

Included in the following conference series:

Congress of the Italian Association for Artificial Intelligence

Abstract

Super-sense tagging is the task of annotating each word in a text with a super-sense, i.e. a general concept such as animal, food or person, drawn from the general semantic taxonomy defined by the WordNet lexicographer classes. Because the set of concepts is small, the task is simpler than Word Sense Disambiguation, which identifies a specific meaning for each word, and machine learning algorithms can achieve good tagging performance. However, machine learning algorithms suffer from data sparseness. The problem is most evident when lexical features are involved, because test data can contain words that are rare in, or completely absent from, the training data. To overcome the sparseness problem, this paper proposes a supervised method for super-sense tagging that incorporates information from a distributional space of words built on a large corpus. Results obtained on two standard datasets, SemCor and SensEval-3, show the effectiveness of our approach.
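
The sparseness argument in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a toy unlabeled corpus, a plain co-occurrence word space, and a nearest-centroid rule standing in for the supervised classifier. The function names (`build_word_space`, `cosine`, `tag_by_similarity`), the sentences, and the labeled words are all hypothetical. The point it shows is that a word never seen with a label ("wolf") still has a distributional vector close to labeled words used in similar contexts.

```python
import math
from collections import Counter, defaultdict


def build_word_space(corpus, window=2):
    """Map each word to counts of the words occurring within +/- `window` tokens."""
    space = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    space[word][tokens[j]] += 1
    return space


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def tag_by_similarity(word, space, labeled):
    """Return the super-sense whose centroid is closest to `word` in the word space."""
    centroids = defaultdict(Counter)
    for seen_word, sense in labeled.items():
        centroids[sense].update(space.get(seen_word, Counter()))
    vector = space.get(word, Counter())
    return max(centroids, key=lambda sense: cosine(vector, centroids[sense]))


# Hypothetical unlabeled corpus used only to build the distributional space.
corpus = [
    "the dog eats the bone",
    "the cat eats the fish",
    "the wolf eats the meat",
    "she cooked the pasta with sauce",
    "she cooked the rice with sauce",
    "she cooked the soup with sauce",
]
space = build_word_space(corpus)

# Hypothetical labeled training words; "wolf" and "soup" have no label.
labeled = {
    "dog": "noun.animal",
    "cat": "noun.animal",
    "pasta": "noun.food",
    "rice": "noun.food",
}

print(tag_by_similarity("wolf", space, labeled))  # noun.animal
print(tag_by_similarity("soup", space, labeled))  # noun.food
```

A purely lexical feature (the word's identity) carries no signal for "wolf" or "soup", since neither appears in the labeled data. The distributional vectors supply that signal from unlabeled text, which is the intuition the abstract describes.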

Author information

Authors and Affiliations

Dept. of Computer Science, University of Bari Aldo Moro, Via E. Orabona, 4, 70125, Bari, Italy
Pierpaolo Basile, Annalina Caputo & Giovanni Semeraro

Editor information

Editors and Affiliations

Dipartimento di Informatica, Università degli Studi di Torino, via Pessinetto 12, 10149, Torino, Italy
Matteo Baldoni, Cristina Baroglio, Guido Boella & Roberto Micalizio

About this paper

Cite this paper

Basile, P., Caputo, A., Semeraro, G. (2013). Supervised Learning and Distributional Semantic Models for Super-Sense Tagging. In: Baldoni, M., Baroglio, C., Boella, G., Micalizio, R. (eds) AI*IA 2013: Advances in Artificial Intelligence. AI*IA 2013. Lecture Notes in Computer Science, vol 8249. Springer, Cham. https://doi.org/10.1007/978-3-319-03524-6_9

DOI: https://doi.org/10.1007/978-3-319-03524-6_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03523-9
Online ISBN: 978-3-319-03524-6
eBook Packages: Computer Science (R0)
