The components of paraphrase evaluations

McCarthy, Philip M.; Guess, Rebekah H.; McNamara, Danielle S.

doi:10.3758/BRM.41.3.682

Society for Computers in Psychology
Published: 01 August 2009

Volume 41, pages 682–690, (2009)
Behavior Research Methods

Abstract

Two sentences are paraphrases if their meanings are equivalent but their words and syntax are different. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in writing-skills development. As such, paraphrasing is a feature of fields as diverse as discourse psychology, composition, and computer science. Although automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., is or is not a paraphrase). In this study, we use an extensive database (N=1,998) of natural paraphrases generated by high school students that have been assessed along 10 dimensions (e.g., semantic completeness, lexical similarity, syntactical similarity). This study investigates the components of paraphrase quality emerging from these dimensions and examines whether computational approaches can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as latent semantic analysis (semantics) and minimal edit distances (syntax) present promising approaches to simulating human evaluations of paraphrases.
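The abstract names minimal edit distances as a computationally light proxy for syntactic similarity. As an illustration only, the sketch below computes a word-level Levenshtein distance between a target sentence and a candidate paraphrase. It is not the authors' implementation: the whitespace tokenization, the lowercasing, the unit edit costs, and the function names `word_edit_distance` and `normalized_distance` are all assumptions made for this example.

```python
def word_edit_distance(source: str, target: str) -> int:
    """Levenshtein distance over lowercased, whitespace-separated tokens.

    Assumption: each insertion, deletion, or substitution of a word costs 1.
    """
    a = source.lower().split()
    b = target.lower().split()
    # previous[j] = distance between the tokens of `a` seen so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        current = [i]  # turning i tokens into an empty sentence costs i deletions
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            current.append(min(
                previous[j] + 1,         # delete tok_a
                current[j - 1] + 1,      # insert tok_b
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def normalized_distance(source: str, target: str) -> float:
    """Edit distance divided by the longer sentence's token count (0.0 if both are empty)."""
    longest = max(len(source.split()), len(target.split()))
    return word_edit_distance(source, target) / longest if longest else 0.0
```

A low normalized distance indicates that the two sentences share most of their words in the same order. Under the definition in the abstract, that would count against a paraphrase being syntactically and lexically different from its target, even when the meaning is preserved.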

Author information

Authors and Affiliations

FedEx Institute of Technology, Institute for Intelligent Systems, University of Memphis, 4th Floor, Room 410, 38152, Memphis, TN
Philip M. McCarthy, Rebekah H. Guess & Danielle S. McNamara

Corresponding author

Correspondence to Philip M. McCarthy.

Additional information

This research was supported in part by the Institute for Education Sciences (Grants R305GA080589, R305G020018-02, and R305G040046), and in part by the National Science Foundation (Grant IIS-0735682). The views expressed in this article do not necessarily reflect the views of the IES or the NSF.

About this article

Cite this article

McCarthy, P.M., Guess, R.H. & McNamara, D.S. The components of paraphrase evaluations. Behavior Research Methods 41, 682–690 (2009). https://doi.org/10.3758/BRM.41.3.682

Received: 11 November 2008
Accepted: 19 February 2009
Published: 01 August 2009
Issue Date: August 2009
DOI: https://doi.org/10.3758/BRM.41.3.682
