Unsupervised Learning of Verb Argument Structures

Pardo, Thiago Alexandre Salgueiro; Marcu, Daniel; Nunes, Maria das Graças Volpe

doi:10.1007/11671299_7

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3878)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

We present a statistical generative model for unsupervised learning of verb argument structures. The model was used to automatically induce the argument structures for the 1,500 most frequent verbs of English. In an evaluation carried out for a representative sample of verbs, more than 90% of the induced argument structures were judged correct by human subjects. The induced structures also overlap significantly with those in PropBank, exhibiting some correct patterns of usage that are not present in this manually developed semantic resource.


Author information

Authors and Affiliations

Núcleo Interinstitucional de Lingüística Computacional (NILC), CP 668 – ICMC-USP, 13.560-970, São Carlos, SP, Brasil
Thiago Alexandre Salgueiro Pardo & Maria das Graças Volpe Nunes
Information Sciences Institute (ISI), 4676 Admiralty Way, Suite 1001, Marina del Rey, CA, 90292, USA
Daniel Marcu

Editor information

Editors and Affiliations

National Polytechnic Institute, Center for Computing Research, 07738, Mexico City, México
Alexander Gelbukh

About this paper

Cite this paper

Pardo, T.A.S., Marcu, D., Nunes, M.d.G.V. (2006). Unsupervised Learning of Verb Argument Structures. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2006. Lecture Notes in Computer Science, vol 3878. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11671299_7

DOI: https://doi.org/10.1007/11671299_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32205-4
Online ISBN: 978-3-540-32206-1
eBook Packages: Computer Science, Computer Science (R0)
