Synchronizing with auditory and visual rhythms: An fMRI assessment of modality differences and modality appropriateness
Highlights
► fMRI study of finger tapping with visual flashes, moving bar, auditory beeps, siren.
► Visuo-motor synchrony improved with moving bar; audio-motor degraded with siren.
► Putamen activation reflected stability of sensorimotor synchrony, not modality.
► No modality difference in tapping or timing circuit with modality-appropriate stimuli.
Introduction
Precise temporal coordination between action and different perceptual systems is crucial for interacting with a dynamic environment. Precise visuo-motor integration is needed to catch a ball (or catch dinner) and audio-motor integration is needed to synchronize movements with music. Empirically, the temporal integration of action and perception is commonly examined in tasks requiring finger tapping to an isochronous pacing sequence. Previous neuroimaging and behavioral studies of tapping have established strong modality differences between audio-motor and visuo-motor synchronization. However, the vast majority of previous studies used only flashing visual stimuli, and flashes are known to yield poorer synchronization performance than auditory stimuli (Repp, 2005). Visuo-motor synchronization improves significantly with moving stimuli (Hove and Keller, 2010, Hove et al., 2010, Iversen et al., submitted for publication). In the present study, we tested whether previously observed activation differences reflect modality per se (i.e., audio-motor vs. visuo-motor integration), or differences in synchronization performance.
Neuroimaging studies have uncovered divergent neural activation patterns for visuo-motor versus audio-motor synchronization. Differences extend well beyond primary sensory areas into regions implicated in the brain's timing networks, including the basal ganglia, supplementary motor areas (SMA), and cerebellum (e.g., Buhusi and Meck, 2005, Coull et al., 2011, Macar et al., 2002, Schwartze et al., 2012). Direct comparisons of audio-motor and visuo-motor synchronization reported activation in different areas of the cerebellum (Jäncke et al., 2000, Penhune et al., 1998). Additionally, audio-motor, but not visuo-motor synchronization, yielded significant activation in the SMA (Jäncke et al., 2000, Penhune et al., 1998). In a meta-analysis on 38 neuroimaging studies of finger-tapping, striking differences between audio- and visuo-motor synchronization were uncovered in the putamen of the basal ganglia; synchronization with auditory, but not visual stimuli, consistently activated the putamen (Witt et al., 2008). The putamen is a key area for beat and rhythm processing (Coull et al., 2011, Grahn and Rowe, 2009, Kotz et al., 2009, Teki et al., 2011, Wiener et al., 2009). A recent study comparing audio and visual beat perception showed more putamen activation and more sensitive beat perception for auditory than for visual stimuli; nevertheless, within the visual condition the degree of putamen activation predicted beat sensitivity (Grahn et al., 2011).
Taken together, neural activation differences observed between auditory and visual modalities in synchronization and beat perception have important implications. For example, it has been argued that auditory rhythms induce an internal rhythm that guides movement, whereas visual rhythms do not generate an internal rhythm (Jäncke et al., 2000). Additionally, modality differences have provided evidence that time is represented in a distributed network rooted in sensorimotor processes, rather than subserved by a centralized clock mechanism (Jantzen et al., 2005). Furthermore, differences in neural activations could support an auditory specialization for encoding temporal information (e.g., Welch and Warren, 1980).
Given these activation differences in timing circuits, it is perhaps unsurprising that a strong behavioral advantage has also been observed for audio-motor over visuo-motor synchronization. Rhythmic finger tapping is much more accurate with auditory stimuli than with flashing visual stimuli (e.g., Chen et al., 2002, Dunlap, 1910, Kolers and Brewster, 1985). Stable synchronization is possible at much faster rates with auditory than with visual sequences (Repp, 2003). In a target-distracter paradigm, when auditory beeps and visual flashes are presented in competition with each other, participants' movement timing is dictated by the auditory stimuli, regardless of volition (Repp and Penel, 2004). Finally, the serial dependence between inter-tap intervals, which intimates underlying timing processes (e.g., Vorberg and Wing, 1996), differs between audio- and visuo-motor synchronization: Inter-tap intervals in audio-motor synchronization typically alternate between short and long intervals (a negative lag1 autocorrelation), which suggests active error correction (Semjen et al., 2000); whereas synchronization with flashing visual stimuli typically has a positive or non-negative lag1 autocorrelation, which suggests weak (or absent) tap-to-tap error correction (Chen et al., 2002, Hove and Keller, 2010, Hove et al., 2010). Together these results suggest different underlying processes for synchronizing with audio- versus flashing visual sequences.
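The lag-1 autocorrelation contrast described above can be illustrated with a minimal sketch. The function name and the two interval series are hypothetical illustrations, not data or analysis code from the study; the estimator is the standard sample autocorrelation at lag 1.

```python
def lag1_autocorrelation(intervals):
    """Sample lag-1 autocorrelation of a series of inter-tap intervals (ms)."""
    n = len(intervals)
    mean = sum(intervals) / n
    dev = [x - mean for x in intervals]
    # Sum of products of each deviation with the next one ...
    num = sum(a * b for a, b in zip(dev, dev[1:]))
    # ... normalized by the total sum of squared deviations.
    den = sum(d * d for d in dev)
    return num / den


# Hypothetical series (invented for illustration only).
# Short-long alternation, as expected under tap-to-tap error correction:
alternating = [480, 520, 480, 520, 480, 520, 480, 520]
# Slow drift, as expected when correction is weak or absent:
drifting = [480, 490, 500, 510, 520, 510, 500, 490]

print(lag1_autocorrelation(alternating))  # negative
print(lag1_autocorrelation(drifting))     # positive
```

An alternating series yields a negative value because each deviation from the mean is followed by one of opposite sign, whereas a slowly drifting series yields a positive value.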
However, nearly all imaging and behavioral evidence for differences between visual and auditory synchronization used flashing visual stimuli. While flashes may offer the most similar control for auditory beeps in terms of temporal onset/offset (and no additional confounding factors), they lack ecological validity in that the visual system rarely processes or acts upon purely temporal information devoid of spatial translation. The visual system has considerably lower temporal resolution than audition (e.g., Holcombe, 2009), and thus is severely handicapped in synchronizing with discrete temporal stimuli. Vision excels at processing spatial, rather than temporal information (e.g., Bertelson and Radeau, 1981, Welch and Warren, 1980). When a visual stimulus contains spatiotemporal information (rather than purely temporal information), action timing to intercept that moving stimulus can be very precise (Bootsma and van Wieringen, 1990).
In a series of recent finger-tapping studies, we have shown that synchronization timing improves dramatically with spatiotemporal visual stimuli, compared with purely temporal flashing stimuli. In one study, participants tapped along with flashing visual stimuli and with visual images that alternated between a high and low position, creating apparent motion. Synchronization was considerably more stable with the apparent motion stimuli than with the flashes (Hove and Keller, 2010). In another study, participants tapped along with visual flashes, fading stimuli, and stimuli that moved frame-by-frame at a linear velocity, as well as an auditory metronome. Synchronization with the moving stimuli was much better than with flashing or fading stimuli; however, an auditory advantage was still observed, especially at very fast tempi (inter-onset intervals of 300 and 240 ms; Hove et al., 2010). In both these studies, the moving visual stimuli also yielded negative lag1 autocorrelations, suggesting that error correction was occurring. Additionally, in a target-distracter study that presented moving visual stimuli in competition with auditory beeps, the moving visual stimuli attracted movement timing as much as auditory stimuli, thus erasing the auditory dominance previously observed over flashes (Hove et al., in press). Together, these studies demonstrate that motion increases the temporal reliability of visual encoding (cf. Ernst and Bülthoff, 2004) and thus facilitates precise visuo-motor integration.
The foregoing suggests that the ‘modality differences’ observed between auditory and flashing visual stimuli should be interpreted cautiously: It is unclear whether the previously reported differences in brain activation truly reflect differences between modalities (i.e., audio-motor vs. visuo-motor integration) or simply the poor performance with flashing visual stimuli due to their less precise temporal encoding. The significant improvement in visual synchronization with moving stimuli encourages the re-examination of established modality differences. Are differences in neural timing circuits substantially reduced with improved visuo-motor synchronization (or degraded audio-motor synchronization)?
In the present fMRI study, participants synchronized their finger taps with four types of visual and auditory pacing sequences: visual stimuli were flashes and a moving bar, and auditory stimuli were beeps and frequency modulated ‘sirens’. Within the visual modality, synchronization should improve for the moving bar stimuli compared to the flashes due to the spatial processing advantage. Within the auditory modality, synchronization should degrade for the siren compared to the discrete beeps, since the siren's continuous presentation should reduce or blur the neural encoding of its target compared to the beep's discrete target (cf. Barsz et al., 2002). Thus, these stimuli can disentangle synchronization performance from modality. Critically, neural activation in key timing areas such as the putamen should vary with the stability of synchronization performance, rather than being dictated by modality. We expect to replicate previously observed modality differences only for the discrete stimuli (beeps versus flashes), whereas the modality differences should be substantially less pronounced with the continuous stimuli (siren versus moving bar). Accordingly, we anticipate an interaction between modality and discrete/continuous stimulus structure for both behavioral synchronization performance and neural activation in time-sensitive areas such as the putamen.
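The predicted modality × temporal structure interaction amounts to a difference of differences across the four stimulus types. The sketch below makes that arithmetic explicit; the function name and the cell means are invented placeholders chosen only to show the predicted pattern, and they are not results from this study.

```python
def interaction_contrast(cell_means):
    """Difference of modality differences: (beep - flash) - (siren - bar)."""
    discrete_gap = (cell_means[("auditory", "discrete")]
                    - cell_means[("visual", "discrete")])
    continuous_gap = (cell_means[("auditory", "continuous")]
                      - cell_means[("visual", "continuous")])
    return discrete_gap - continuous_gap


# Hypothetical stability scores (made-up numbers, for illustration only).
hypothetical_means = {
    ("auditory", "discrete"): 0.95,    # beeps
    ("visual", "discrete"): 0.70,      # flashes
    ("auditory", "continuous"): 0.85,  # siren
    ("visual", "continuous"): 0.85,    # moving bar
}

# A positive contrast means the auditory advantage is larger for
# discrete stimuli than for continuous stimuli.
print(interaction_contrast(hypothetical_means))
```

A contrast near zero would instead indicate that the modality difference is the same for discrete and continuous stimuli, that is, no interaction.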
Participants
Fourteen right-handed volunteers (7 women) aged 24 to 34 years (M = 27.7 ± 3.0 years) participated in the experiment. Participants were paid for their participation and gave informed consent. Participants had a range of musical training (M = 7.9 years; SD = 9.5); this did not
Behavioral results
Tap-to-target synchronization stability (R̄) was analyzed in a 2 (modality: auditory, visual) × 2 (temporal structure: discrete, continuous) × 2 (tempo: slow, fast) repeated-measures analysis of variance (ANOVA); see Table 1. Overall, participants synchronized better with the auditory than the visual sequences, as indicated by a main effect of modality, F(1,13) = 53.72, p < .001, ηp² = .805. There was no main effect of discrete vs. continuous temporal structure, F(1,13) = 2.04, p = .18, ηp² = .135.
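The stability measure R̄ is not defined in this excerpt. The following sketch assumes it is the mean resultant length from circular statistics, computed on tap-to-target asynchronies expressed as phase angles within the pacing period. That reading, the function name, and the example values are all assumptions made for illustration.

```python
import math


def mean_resultant_length(asynchronies_ms, period_ms):
    """Mean resultant length of asynchronies mapped onto the unit circle.

    Assumed definition: each asynchrony becomes an angle 2*pi*a/period;
    the result is the length of the average unit vector (0 to 1).
    """
    phases = [2 * math.pi * a / period_ms for a in asynchronies_ms]
    mean_cos = sum(math.cos(p) for p in phases) / len(phases)
    mean_sin = sum(math.sin(p) for p in phases) / len(phases)
    return math.hypot(mean_cos, mean_sin)


# Hypothetical asynchronies (ms) for a 500 ms pacing period.
tight = [-30, -25, -35, -28, -32]   # taps cluster at one phase
scattered = [0, 125, 250, 375]      # taps spread evenly over the cycle

print(mean_resultant_length(tight, 500))      # close to 1
print(mean_resultant_length(scattered, 500))  # close to 0
```

Under this definition, values near 1 indicate taps locked to a consistent phase of the pacing sequence, and values near 0 indicate taps distributed across the whole cycle.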
In the
Behavioral effects
Visuo-motor synchronization was more stable with continuous moving targets than with discrete flashes, whereas audio-motor synchronization was more stable with discrete beeps than with continuous pitch-modulated sirens. A modality difference in synchronization was observed for the discrete stimuli (beeps over flashes), but not for the continuous stimuli (siren versus moving bar). These results indicate that synchronization is not dictated by modality, but instead depends on the nature of the stimulus. Stable
Conclusion
In conclusion, this study presents evidence that sensorimotor synchronization is largely contingent upon the stimuli's suitability to the processing style of each modality. In the auditory modality, a discrete beep enables a clear encoding of the target timing; and in the visual modality, a continuously moving target can be more clearly encoded due to its spatiotemporal dynamics. After this early encoding, it can serve to coordinate action timing via thalamo–cortical–striatal loops, and
Acknowledgments
This research was supported by The Max Planck Society. We thank Jan Bergmann for technical assistance and Maria Bader for help in data collection.
References (73)
- et al. (2004). The ventriloquist effect results from near-optimal bimodal integration. Curr. Biol.
- et al. (2002). Behavioral and neural measures of auditory temporal acuity in aging humans and mice. Neurobiol. Aging
- et al. (2000). The synchronization of human arm movements to external events. Neurosci. Lett.
- et al. (2002). Spectral decomposition of variability in synchronization and continuation tapping: comparisons between auditory and visual pacing and feedback conditions. Hum. Mov. Sci.
- et al. (2004). Merging the senses into a robust percept. Trends Cogn. Sci.
- et al. (1992). Separate visual pathways for perception and action. Trends Neurosci.
- et al. (2011). FMRI investigation of cross-modal interactions in beat perception: audition primes vision, but not vice versa. NeuroImage
- et al. (2000). Cortical activations during paced finger-tapping applying visual and auditory pacing stimuli. Cogn. Brain Res.
- et al. (2005). Functional MRI reveals the existence of modality and coordination-dependent timing networks. NeuroImage
- et al. (2001). A global optimisation method for robust affine registration of brain images. Med. Image Anal.
Non-motor basal ganglia functions: a review and proposal for a model of sensory predictability in auditory language. Cortex
Brain activity correlates differentially with increasing temporal complexity of rhythms during initialisation, synchronisation, and continuation phases of paced finger tapping. Neuropsychologia
Time perception: manipulation of task difficulty dissociates clock functions from other cognitive demands. Neuropsychologia
Cortico-striatal circuits and interval timing: coincidence detection of oscillatory processes. Cogn. Brain Res.
Timing of finger tapping to frequency modulated acoustic stimuli. Acta Psychol.
Cortico-striatal representation of time in animals and humans. Curr. Opin. Neurobiol.
The effect of external rhythmic cues (auditory and visual) on walking during a functional task in homes of people with Parkinson's disease. Arch. Phys. Med. Rehabil.
Time perception and motor timing: a common cortical and subcortical basis revealed by fMRI. NeuroImage
Temporal aspects of prediction in audition: cortical and subcortical neural mechanisms. Int. J. Psychophysiol.
Synchronization in repetitive smooth movement requires perceptible events. Acta Psychol.
Modeling variability and dependence in timing
Functional neuroimaging correlates of finger-tapping task variations: an ALE meta-analysis. NeuroImage
Temporal autocorrelation in univariate linear modeling of FMRI data. NeuroImage
Effects of rhythmic sensory stimulation (auditory, visual) on gait in Parkinson's disease patients. Exp. Brain Res.
Perceptual synchrony of audiovisual streams for natural and artificial motion sequences. J. Vis.
A computational model of how the basal ganglia produce sequences. J. Cogn. Neurosci.
Cross-modal bias and perceptual fusion with auditory-visual spatial discordance. Percept. Psychophys.
Timing an attacking forehand drive in table tennis. J. Exp. Psychol. Hum. Percept. Perform.
What makes us tick? Functional and neural mechanisms of interval timing. Nat. Rev. Neurosci.
Moving on time: brain network for auditory–motor synchronization is modulated by rhythm complexity and musical training. J. Cogn. Neurosci.
Neuroanatomical and neurochemical substrates of timing. Neuropsychopharmacol. Rev.
Functional anatomy of the attentional modulation of time estimation. Science
Role of the cerebellum in externally paced rhythmic finger movements. J. Neurophysiol.
Reactions to rhythmic stimuli, with attempt to synchronize. Psychol. Rev.
Statistical Analysis of Circular Data