Abstract
Studies of the McGurk effect have shown that when discrepant phonetic information is delivered to the auditory and visual modalities, the information is combined into a new percept not originally presented to either modality. In typical experiments, the auditory and visual speech signals are generated by the same talker. The present experiment examined whether a discrepancy in the gender of the talker between the auditory and visual signals would influence the magnitude of the McGurk effect. A male talker’s voice was dubbed onto a videotape containing a female talker’s face, and vice versa. The gender-incongruent videotapes were compared with gender-congruent videotapes, in which a male talker’s voice was dubbed onto a male face and a female talker’s voice was dubbed onto a female face. Even though there was a clear incompatibility in talker characteristics between the auditory and visual signals on the incongruent videotapes, the resulting magnitude of the McGurk effect was not significantly different for the incongruent as opposed to the congruent videotapes. The results indicate that the mechanism for integrating speech information from the auditory and the visual modalities is not disrupted by a gender incompatibility even when it is perceptually apparent. The findings are compatible with the theoretical notion that information about voice characteristics of the talker is extracted and used to normalize the speech signal at an early stage of phonetic processing, prior to the integration of the auditory and the visual information.
Additional information
This research was supported by National Institutes of Health Grant NS-26475 to Kerry P. Green and National Institutes of Health Grant HD-18286 to Patricia K. Kuhl.
About this article
Cite this article
Green, K.P., Kuhl, P.K., Meltzoff, A.N. et al. Integrating speech information across talkers, gender, and sensory modality: Female faces and male voices in the McGurk effect. Perception & Psychophysics 50, 524–536 (1991). https://doi.org/10.3758/BF03207536
DOI: https://doi.org/10.3758/BF03207536