Abstract
Two sentences are paraphrases if their meanings are equivalent but their words and syntax are different. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in writing-skills development. As such, paraphrasing is a feature of fields as diverse as discourse psychology, composition, and computer science. Although automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., is or is not a paraphrase). In this study, we use an extensive database (N=1,998) of natural paraphrases generated by high school students that have been assessed along 10 dimensions (e.g., semantic completeness, lexical similarity, syntactical similarity). This study investigates the components of paraphrase quality emerging from these dimensions and examines whether computational approaches can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as latent semantic analysis (semantics) and minimal edit distances (syntax) present promising approaches to simulating human evaluations of paraphrases.
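As a rough illustration of the two computationally light measures named above, the following sketch compares a target sentence with a paraphrase. It makes several assumptions that the abstract does not confirm: the edit distance is computed over word tokens (the unit is not specified here), the distance is normalized by the longer sentence, and a plain word-count cosine stands in for latent semantic analysis, which would instead compare vectors in a reduced semantic space trained on a large corpus. All function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def tokenize(sentence: str) -> list[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?\"'()") for word in sentence.lower().split())
    return [t for t in tokens if t]


def edit_distance(source: list[str], target: list[str]) -> int:
    """Levenshtein distance over tokens (insert, delete, substitute cost 1)."""
    previous = list(range(len(target) + 1))
    for i, s_tok in enumerate(source, start=1):
        current = [i]
        for j, t_tok in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_tok != t_tok)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Assumed normalization: distance divided by the longer token count."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    longest = max(len(tokens_a), len(tokens_b))
    return edit_distance(tokens_a, tokens_b) / longest if longest else 0.0


def cosine_similarity(a: str, b: str) -> float:
    """Cosine over raw word counts: a stand-in for LSA, not LSA itself."""
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a)
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    target = "The heart pumps blood through the body."
    paraphrase = "Blood is pushed around the body by the heart."
    print(normalized_edit_distance(target, paraphrase))
    print(cosine_similarity(target, paraphrase))
```

A low normalized edit distance suggests the paraphrase stays close to the original word order, while a high cosine suggests overlapping content. Neither score alone indicates paraphrase quality, which is consistent with the study treating semantic and syntactic evaluations as separate components.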
Additional information
This research was supported in part by the Institute of Education Sciences (Grants R305GA080589, R305G020018-02, and R305G040046) and in part by the National Science Foundation (Grant IIS-0735682). The views expressed in this article do not necessarily reflect the views of the IES or the NSF.
Cite this article
McCarthy, P.M., Guess, R.H. & McNamara, D.S. The components of paraphrase evaluations. Behavior Research Methods 41, 682–690 (2009). https://doi.org/10.3758/BRM.41.3.682