Abstract
Understanding low-intelligibility speech is effortful. In three experiments, we examined the effects of intelligibility on working memory (WM) demands imposed by perception of synthetic speech. In all three experiments, a primary speeded word recognition task was paired with a secondary WM-load task designed to vary the availability of WM capacity during speech perception. Speech intelligibility was varied either by training listeners to use available acoustic cues in a more diagnostic manner (as in Experiment 1) or by providing listeners with more informative acoustic cues (i.e., better speech quality, as in Experiments 2 and 3). In the first experiment, training significantly improved intelligibility and recognition speed; increasing WM load significantly slowed recognition. A significant interaction between training and load indicated that the benefit of training on recognition speed was observed only under low memory load. In subsequent experiments, listeners received no training; intelligibility was manipulated by changing synthesizers. Improving intelligibility without training improved recognition accuracy, and increasing memory load still decreased it, but more intelligible speech did not produce more efficient use of available WM capacity. This suggests that perceptual learning modifies the way available capacity is used, perhaps by increasing the use of more phonetically informative features and/or by decreasing use of less informative ones.
Cite this article
Francis, A.L., Nusbaum, H.C. Effects of intelligibility on working memory demand for speech perception. Attention, Perception, & Psychophysics 71, 1360–1374 (2009). https://doi.org/10.3758/APP.71.6.1360