Context Sensitive Paraphrasing with a Global Unsupervised Classifier

Connor, Michael; Roth, Dan

doi:10.1007/978-3-540-74958-5_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4701)

Included in the following conference series:

European Conference on Machine Learning

Abstract

Lexical paraphrasing is an inherently context sensitive problem because a word’s meaning depends on context. Most paraphrasing work finds patterns and templates that can replace other patterns or templates in some context, but we are attempting to make decisions for a specific context. In this paper we develop a global classifier that takes a word v and its context, along with a candidate word u, and determines whether u can replace v in the given context while maintaining the original meaning.

We develop an unsupervised, bootstrapped learning approach to this problem. Key to our approach is the use of a very large amount of unlabeled data to derive a reliable supervision signal that is then used to train a supervised learning algorithm. We demonstrate that our approach performs significantly better than state-of-the-art paraphrasing approaches, and generalizes well to unseen pairs of words.
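To make the shape of the decision concrete, the following is a minimal sketch of the interface described above: a function that takes a word v, its context, and a candidate u, and returns a yes/no answer using only statistics gathered from unlabeled text. It is not the classifier developed in the paper. The context-overlap score, the window size, the threshold, and all function names (`context_features`, `build_context_profiles`, `can_replace`) are assumptions chosen for illustration, and the simple heuristic stands in for the trained learner.

```python
from collections import Counter, defaultdict


def context_features(tokens, index, window=2):
    """Set of words within `window` positions of tokens[index], excluding the target."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    return {tokens[i] for i in range(lo, hi) if i != index}


def build_context_profiles(corpus, window=2):
    """For every word in the unlabeled corpus, count the context words seen around it."""
    profiles = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            profiles[word].update(context_features(tokens, i, window))
    return profiles


def can_replace(profiles, v, u, tokens, index, window=2, threshold=0.5):
    """Toy decision: may u replace v = tokens[index] in this context?

    Assumption (not from the paper): answer yes when at least `threshold`
    of the words surrounding v have also been observed around u.
    """
    if tokens[index] != v:
        raise ValueError("tokens[index] must be the target word v")
    context = context_features(tokens, index, window)
    if not context:
        return False
    u_profile = profiles.get(u, Counter())
    seen = sum(1 for w in context if u_profile[w] > 0)
    return seen / len(context) >= threshold


# Hypothetical unlabeled corpus, for illustration only.
corpus = [
    "she will command the army",
    "she will lead the army",
    "he will lead the army",
    "pipes made of lead are heavy",
]
profiles = build_context_profiles(corpus)
tokens = "she will command the army".split()
print(can_replace(profiles, "command", "lead", tokens, 2))
print(can_replace(profiles, "command", "heavy", tokens, 2))
```

The same function handles any (v, u) pair, which mirrors the "global" aspect of the problem statement: one decision procedure for all word pairs rather than a separate model per word.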

Author information

Authors and Affiliations

Department of Computer Science, University of Illinois at Urbana-Champaign
Michael Connor & Dan Roth

Editor information

Joost N. Kok, Jacek Koronacki, Ramon Lopez de Mantaras, Stan Matwin, Dunja Mladenič, Andrzej Skowron

About this paper

Cite this paper

Connor, M., Roth, D. (2007). Context Sensitive Paraphrasing with a Global Unsupervised Classifier. In: Kok, J.N., Koronacki, J., Mantaras, R.L.d., Matwin, S., Mladenič, D., Skowron, A. (eds) Machine Learning: ECML 2007. ECML 2007. Lecture Notes in Computer Science, vol 4701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74958-5_13

DOI: https://doi.org/10.1007/978-3-540-74958-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74957-8
Online ISBN: 978-3-540-74958-5
eBook Packages: Computer Science, Computer Science (R0)
