Audio-visual integration in schizophrenia
Introduction
In a natural environment the organism receives information through multiple sensory channels. Yet the world is not perceived as a juxtaposition of independent sensory experiences in which each modality is preserved intact. Rather, inputs from different sensory channels are often combined, providing an information-rich environment that in turn presents the organism with a complex target for adaptive behaviour. Adaptive responses are thus the product of perceptual integration between different sensory modalities. This situation is found, for example, in spatial orientation, which is often based on a combination of auditory and visual cues, but also in audio-visual speech perception and in perceiving emotional cues provided by the face, the voice and gestures. These cases undoubtedly differ considerably from one another, yet it is likely that in all of them continuous integration of multi-sensory inputs underlies successful communicative behaviour. Since schizophrenic patients are known to experience many problems in interacting with the social environment, our goal was to investigate whether some of these difficulties might stem from impairments in processing cues with communicative value and in establishing inter-sensory integration between them.
There are many domains of perception in which inputs from different sensory modalities are combined. One type of inter-sensory integration is that between audition and vision. Audio-visual integration has predominantly been studied in two domains of perception: space and speech. A well-known example of audio-visual spatial integration is ventriloquism (see Bertelson, 1998, for an overview). The ventriloquist produces speech without moving his lips, while his puppet seems to be talking. The ventriloquist illusion is very robust and consists in the fact that perceivers attribute the location of a sound to the apparent visual source (the puppet) rather than to its actual source (the ventriloquist). It is a prime example of the general perceptual principle that when two or more sensory events occur in close temporal proximity but at slightly different spatial locations, they are generally perceived as emanating from a common source. A well-known experimental paradigm for investigating the ventriloquist effect is immediate cross-modal bias: participants are asked to point to the location of an auditory stimulus, and their pointing is attracted towards the location of a visual stimulus presented at a different position from the sound. The effect is automatic and mandatory; it does not depend on attention to the visual input and occurs even when the viewer does not explicitly notice the presence of a visual stimulus, as is the case, for example, in patients with unilateral neglect (Bertelson et al., 2000a, 2000b; Vroomen et al., 2001; Vroomen et al., in press).
Audio-visual integration in the domain of speech perception is illustrated most dramatically by an experiment of McGurk and MacDonald (1976). They paired a video of a face articulating /ga/ in synchrony with a soundtrack of /ba/. This resulted in a ‘heard’ /da/. In the combined audio-visual condition, there were two types of responses: a “fused” response, where information from the two modalities was transformed into something new (e.g., visual /ga/ + auditory /ba/ perceived as /da/), and a “combination” response representing a composite solution (e.g., visual /ba/ + auditory /ga/ perceived as /bga/). Not only behavioural but also electro-physiological studies have found evidence for cross-modal effects in the perception of speech. Sams et al. (1991) demonstrated, by means of magneto-encephalography (MEG), that the characteristic response of the auditory cortex to heard speech could be modified by the inclusion of visible speech information. This shows that visual information from articulatory movements can have a cross-modal effect on auditory cortex (see de Gelder, 2000, for a model of cross-modal effects). It is also possible to investigate the brain regions involved in multi-modal processing by means of functional magnetic resonance imaging (fMRI). Calvert et al. (1997) used this technique to investigate the brain regions involved in silent lipreading in normal-hearing subjects by comparing them with those activated during auditory speech perception in the same individuals. Silent lipreading activated not only visual cortex but also primary auditory cortex, and activation in the latter region overlapped considerably with the region activated during heard speech.
Only a few studies have examined audio-visual integration in schizophrenia. A number of neurological studies found impairments in audio-visual integration using cross-modal matching. Heinrichs and Buchanan (1988) reviewed several studies of abnormal signs on clinical neurological examinations and found that basic mechanisms of sensory input appear to be intact in schizophrenic patients. Impairments were found in three higher-order functional areas: the co-ordination of motor activity, the sequencing of motor patterns, and the integration of more complex sensory units. Later results were based on studies of cross-modal matching. Ismail et al. (1998) investigated the prevalence and type of neurological abnormalities in schizophrenic patients and their nonpsychotic siblings; both patients and siblings scored worse than comparison subjects on the integration of higher sensory functions. Ross et al. (1998) found an association between poor sensory integration and eye-tracking disorder, and proposed that a circuit including the posterior parietal cortex subserves both smooth-pursuit eye movements and audio-visual integration. de Gelder et al. (1997) reported a study of audio-visual integration in schizophrenia. They measured the effect of a vocal expression (happy or sad) on a face-expression categorisation task using a morphed happy-to-sad face continuum. In normal controls, the recognition of a facial expression was biased towards the direction of the vocal expression, but the schizophrenic group did not show this bias.
The main objectives of our study were to explore whether schizophrenic patients show deficits in the cross-modal integration of audio-visual stimuli, and whether any such deficits appear in more than one domain where audio-visual integration is critical. Two tasks were administered, each focusing on one domain of audio-visual integration. The first task was intended to assess audio-visual interactions in the spatial localisation of sounds. The second task was a variant of the McGurk phenomenon (McGurk and MacDonald, 1976). If schizophrenic patients have a general deficit in audio-visual integration, performance on both tasks should be impaired and the two sets of results should be correlated. But if the integration deficit is more specific, performance might be impaired on one task but not the other.
Experiment 1
Experiment 1 examined whether schizophrenic patients, like normal perceivers, are biased by a concurrently presented visual stimulus when pointing towards the origin of a sound.
Experiment 2
Experiment 2 examined whether schizophrenic patients show an effect of lipreading on the recognition of auditorily presented syllables. Comparing these results with those of Experiment 1 might also allow us to address whether a deficit in audio-visual integration is general or domain-specific.
General discussion
The goal of this study was to investigate whether schizophrenic patients are impaired in their integration of auditory and visual stimuli and whether a possible integration deficit existed for audio-visual spatial localisation, for audio-visual speech perception or both. In Experiment 1, schizophrenic patients showed a normal pattern of performance, indicating that they have no impairments integrating auditory and visual information when performing a spatial localisation task. In Experiment 2,
Acknowledgements
We thank an anonymous reviewer for insightful comments. This research was partly supported by a grant from Lundbeck.
References
- et al. (2000). Ventriloquism in patients with unilateral visual neglect. Neuropsychologia.
- et al. (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Current Biology.
- et al. (1999). The combined perception of emotion from voice and face: early interaction revealed by electric brain responses. Neuroscience Letters.
- et al. (1991). Seeing speech: visual information from lip movements modifies activity in human auditory cortex. Neuroscience Letters.
- Starting from the ventriloquist: the perception of multimodal events.
- et al. Exploring the relation between McGurk interference and ventriloquism.
- et al. (2000). The ventriloquist effect does not depend on the direction of deliberate visual attention. Perception and Psychophysics.
- et al. (1997). Activation of auditory cortex during silent lipreading. Science.
- et al. (1999). Dependence of impaired eye tracking on deficient velocity discrimination in schizophrenia. Archives of General Psychiatry.
- et al. Crossmodal binding of fear in voice and face.
- More to seeing than meets the eye. Science.