Wide-Coverage Probabilistic Sentence Processing

Journal of Psycholinguistic Research

Abstract

This paper describes a fully implemented, broad-coverage model of human syntactic processing. The model uses probabilistic parsing techniques, which combine phrase structure, lexical category, and limited subcategory probabilities with an incremental, left-to-right “pruning” mechanism based on cascaded Markov models. The parameters of the system are established through a uniform training algorithm, which determines maximum-likelihood estimates from a parsed corpus. The probabilistic parsing mechanism enables the system to achieve good accuracy on typical, “garden-variety” language (i.e., when tested on corpora). Furthermore, the incremental probabilistic ranking of the preferred analyses during parsing also naturally explains observed human behavior for a range of garden-path structures. We do not make strong psychological claims about the specific probabilistic mechanism discussed here, which is limited by a number of practical considerations. Rather, we argue that incremental probabilistic parsing models are, in general, extremely well suited to explaining this dual nature, generally good and occasionally pathological, of human linguistic performance.
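
As a rough illustration of two ideas named in the abstract, the sketch below estimates rule probabilities by relative frequency (the maximum-likelihood estimate) from a parsed corpus and then prunes low-ranked analyses. It is not the cascaded Markov model of the paper: it assumes a plain context-free rule model, a three-tree toy treebank invented here, and a hypothetical relative-probability threshold (`beam`) for pruning.

```python
from collections import Counter

# Hypothetical toy treebank (an assumption for illustration only).
# A tree is a tuple (label, child, ...); a leaf child is a plain string.
TREEBANK = [
    ("S", ("NP", "dogs"), ("VP", ("V", "bark"))),
    ("S", ("NP", "dogs"), ("VP", ("V", "chase"), ("NP", "cats"))),
    ("S", ("NP", "cats"), ("VP", ("V", "sleep"))),
]


def rules(tree):
    """Yield (lhs, rhs) for every local tree; rhs holds child labels or words."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def mle_rule_probs(treebank):
    """Maximum-likelihood estimate: count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter()
    lhs_counts = Counter()
    for tree in treebank:
        for lhs, rhs in rules(tree):
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


def prune(analyses, beam=0.1):
    """Keep analyses whose probability is at least beam * best probability.

    `analyses` is a list of (analysis, probability) pairs. The threshold
    scheme is an assumption, not the mechanism described in the paper.
    """
    best = max(p for _, p in analyses)
    return [(a, p) for a, p in analyses if p >= beam * best]


probs = mle_rule_probs(TREEBANK)
survivors = prune([("preferred", 0.5), ("dispreferred", 0.01)], beam=0.1)
```

In this toy setting, an analysis that falls below the threshold is discarded as each word is processed. If later input turns out to require the discarded analysis, the parser has no way to recover it, which is one simple way to picture a garden-path effect.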

Cite this article

Crocker, M.W., Brants, T. Wide-Coverage Probabilistic Sentence Processing. J Psycholinguist Res 29, 647–669 (2000). https://doi.org/10.1023/A:1026560822390

  • DOI: https://doi.org/10.1023/A:1026560822390
