
Brain and Language

Volume 120, Issue 3, March 2012, Pages 401-405

Short Communication
Hemispheric differences in the effects of context on vowel perception

https://doi.org/10.1016/j.bandl.2011.12.012

Abstract

Listeners perceive speech sounds relative to context. Contextual influences might differ across hemispheres if different types of auditory processing are lateralized. Hemispheric differences in contextual influences on vowel perception were investigated by presenting speech targets and both speech and non-speech contexts to listeners’ right or left ears (contexts and targets either to the same or to opposite ears). Listeners performed a discrimination task. Vowel perception was influenced by acoustic properties of the context signals. The strength of this influence depended on the laterality of target presentation, and on the speech/non-speech status of the context signal. We conclude that contrastive contextual influences on vowel perception are stronger when targets are processed predominantly by the right hemisphere. In the left hemisphere, contrastive effects are smaller and largely restricted to speech contexts.

Highlights

► Laterality differences in processes underlying spoken vowel perception were examined.
► Vowels were discriminated in speech and non-speech contexts presented to the right or left ear.
► Speech stimuli had the strongest contrastive context effect on perception.
► Left- vs. right-hemisphere processes induce different amounts of contrastive effects.

Introduction

One of the most well-established findings in cognitive neuroscience is that strokes to the left perisylvian region lead to stronger language impairments than strokes to the right perisylvian region (Ingram, 2007). The imaging literature on speech perception, however, consistently implicates both hemispheres (Poeppel, 2003). This apparent contradiction is addressed by the Asymmetric Sampling in Time (AST) hypothesis (Hickok and Poeppel, 2007, Poeppel, 2003), which suggests that there is bilateral processing of spoken language, but with a functional asymmetry: It is proposed that the left hemisphere integrates information over shorter time windows (i.e., ∼20–50 ms) than the right hemisphere (i.e., ∼150–300 ms). This would make the left hemisphere well equipped to deal with the fast temporal changes that are necessary for identifying speech sounds, while the right hemisphere would be better equipped for the fine-grained spectral analysis needed for the perception of music and intonation contours. Scott and Wise (2004), however, have argued that there is no convincing evidence that left auditory cortex has a preference for fast transitions. They also conclude that “It is simply not meaningful to consider ‘temporal’ and ‘spectral’ in the auditory system as delineating the ends of a dimension which affords rapid temporal resolution at one end and pitch processing at the other” (p. 38).

It is thus not clear whether the hemispheric difference in integration window – if it even exists – is there to support a division of labor between ‘temporal’ and ‘spectral’ processes. But there may be a different reason why the proposed short window of integration in the left hemisphere may be useful for speech processing. Short windows are necessary to account for contrastive context effects, such as those first reported by Ladefoged and Broadbent (1957). For instance, when participants categorized targets on a continuum ranging from “itch” to “etch”, they categorized more stimuli as “itch” when a context sentence was processed by a filter that suppressed the frequencies that are more dominant in /ɪ/ than in /ɛ/ (Watkins, 1991). Similar effects have been observed with non-speech contexts and over relatively long silent intervals between contexts and targets (Holt, 2005, Sjerps et al., 2011a, Sjerps et al., 2011b). The common denominator in all these studies is “contrast”: A given stimulus is perceived relative to context, so that a “high” context makes “low” percepts more likely, and vice versa (Kluender, Coady, & Kiefte, 2003). In the case of vowel perception, for example, more vowels on a 1st formant (F1) continuum are identified as the low-F1 endpoint vowel in a context with a high F1 than in a context with a low F1.

Contrast effects can obviously only arise if target and context are perceived as separate entities. If information that is processed in the left hemisphere is integrated over shorter time windows, such that context and target are processed in separate windows, contrastive effects should arise (“high” contexts should make “low” percepts more likely). If the right hemisphere, however, uses larger windows of integration, context and target information are more likely to be integrated because they are more likely to fall in the same analysis window (“high” contexts should make “high” percepts more likely). The need to be able to perceive separate acoustic events as separate, a feature that might be especially useful in speech perception, thus constitutes a new raison d’être for the AST hypothesis. This explanation is independent of the motivation based on the distinction between spectral and temporal properties in auditory processing. If this reasoning is correct, we should find contrastive effects for stimuli that are processed primarily by the left hemisphere, but integrative effects for stimuli that are processed primarily by the right hemisphere.

Different contrastive and integrative effects across the hemispheres could also shed light on some puzzling contradictory findings. As it turns out, the size and direction of context effects have differed across materials. For instance, Watkins (1991) found no effect of contralaterally presented noise contexts on the perception of speech targets, but speech analogs of these stimuli did elicit contrastive effects. Moreover, integrative effects have been reported in the spectral domain (Aravamudhan et al., 2008, Mitterer, 2006) and with respect to durational distinctions (Fowler, 1992, van Dommelen, 1999). These inconsistencies between contrastive and integrative effects could reflect differences in the relative involvement of the two hemispheres with speech and non-speech stimuli. The present study was thus set up to test whether hemispheric differences influence extrinsic normalization of vowels. To test this, we made use of two manipulations. First, we used both speech and non-speech stimuli. Second, we presented these stimuli either to participants’ right ears or to their left ears.

Monaural input is transferred more strongly to the hemisphere contralateral to the ear of presentation, in both primary and non-primary auditory cortex (Jäncke et al., 2002, Loveless et al., 1994, Stefanatos et al., 2008, Suzuki et al., 2002). Activation levels are two to three times as large in the contralateral as in the ipsilateral hemisphere (Jäncke et al., 2002, Suzuki et al., 2002), although with speech stimuli the contralateral dominance effect has been reported to be larger for the right than for the left ear (Stefanatos et al., 2008). We manipulated dominance of hemispheric processing by presenting stimuli monaurally to the left or the right ear.

There is, however, a caveat to consider. Signals that are close together in time influence each other at peripheral stages in auditory pathways when presented to the same ear. These influences are contrastive (Summerfield, Haggard, Foster, & Gray, 1984). Such influences would obscure our investigation because we are interested in central (cortical) levels of processing. Preceding context was therefore separated from targets by a 500 ms silent interval. Moreover, across conditions, contexts and targets were presented either to the same ears or to opposite ears. These precautions allow us to reduce and control the influence of peripheral adaptation (Summerfield et al., 1984).

We investigated context effects in a 4I-oddity discrimination design. In this task, listeners are asked to detect whether a deviant (D), presented among a set of standards (S), occurred in either second or third position (e.g., SDSS or SSDS). The use of this task reduces influences from response strategies (such as balancing the number of responses between each of the two labels). This is mainly because the 4I-oddity task does not require the use of category labels, and thus encourages listeners to focus on auditory aspects of target stimuli (Gerrits & Schouten, 2004).
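For illustration, here is a minimal sketch in R (the language used for the analyses below) of how such trials could be constructed and scored; the helper names are hypothetical, not the authors' experiment code:

    # Sketch of 4I-oddity trial construction and scoring. Each trial
    # contains three standards (S) and one deviant (D) in position 2 or 3.
    make_trial <- function() {
      pos <- sample(2:3, 1)                   # deviant occurs second or third
      seq4 <- rep("S", 4)
      seq4[pos] <- "D"
      list(sequence = seq4, deviant_pos = pos)
    }

    # A response is correct if it names the deviant's position.
    score <- function(trial, response) as.integer(response == trial$deviant_pos)

    trial <- make_trial()
    trial$sequence   # e.g., "S" "D" "S" "S"
    score(trial, 2)  # 1 if the deviant was indeed in position 2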

A continuum of target stimuli was created between the Dutch vowels /ɛ/ and /ɪ/ (which is mainly an F1 distinction). Vowels were presented in a non-word context (/papu/) that was manipulated to have a high- or a low-F1 contour. A context effect should result in a difference in discriminability between an ambiguous sound and the [ɛ] and [ɪ] endpoints. To exemplify, consider a categorization experiment: in a low-F1 context, listeners categorize ambiguous vowels more often as /ɛ/ (Watkins, 1991). The perceptual distance between the ambiguous sound [Iε] and [ɛ] is thus smaller in this condition than the distance between [Iε] and [ɪ]. This pattern reverses for vowels that are presented in a high-F1 context. In our 4I-oddity discrimination task, context effects should then lead to reduced discriminability between [Iε] and [ɛ] in a low-F1 context (and between [Iε] and [ɪ] in a high-F1 context).

Listeners heard sets of three ambiguous standards ([Iε]) and one unambiguous deviant (either [ɪ] or [ɛ]). The bisyllable [papu] was manipulated to have a high or a low average F1 and thereby provided listeners with information about the speaker’s vocal tract properties. The context was spliced onto the target vowels such that listeners heard nonsense words like [Iεpapu] (standards) and [ɪpapu] or [ɛpapu] (deviants). In one group of listeners the target vowels and contexts were always presented contralaterally. For another group of listeners the target vowels and contexts were always presented to the same ear. The stimuli were presented in sets of four, with the [papu] part identical in all four non-words in a set. Fig. 1 displays an example trial for participants in the group that were presented with targets and contexts contralaterally, for a trial in which the targets were presented to the left ear, and with the deviant vowel ([ɛ]) in second position.
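To make the presentation scheme concrete, the following sketch (in R, with hypothetical function and variable names; vowel and context are assumed to be vectors of audio samples) assembles one such nonword interval for the contralateral group:

    # Sketch: one nonword interval for the contralateral group. The target
    # vowel plays in one channel; the [papu] context follows immediately
    # in the opposite channel. Names and representation are hypothetical.
    make_interval <- function(vowel, context, target_ear = c("left", "right")) {
      target_ear <- match.arg(target_ear)
      target_ch  <- c(vowel, numeric(length(context)))  # vowel, then silence
      context_ch <- c(numeric(length(vowel)), context)  # silence, then [papu]
      if (target_ear == "left") {
        cbind(left = target_ch, right = context_ch)     # stereo matrix
      } else {
        cbind(left = context_ch, right = target_ch)
      }
    }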

In a further condition, the contexts were non-speech stimuli. The [papu] parts now consisted of noise that had the same amplitude envelope as the original [papu] parts. Two non-speech versions of the noise precursor were made, one with the same Long-Term Average Spectrum (LTAS) as the low-F1 [papu] part and one with the same LTAS as the high-F1 [papu] part. This is important because the LTAS of context signals has been argued to be the main cause of contrast effects (Watkins, 1991).
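The synthesis procedure is not spelled out here; a standard recipe for noise with a given LTAS and amplitude envelope is phase randomization followed by envelope imposition. The sketch below (base R) illustrates that recipe; it is not necessarily the authors' actual method:

    # Sketch: noise matched to a speech token x (a numeric vector of
    # samples) in LTAS and amplitude envelope. Randomizing the FFT phases
    # preserves the magnitude spectrum (hence the LTAS); the speech
    # token's RMS envelope is then imposed on the resulting noise.
    match_noise <- function(x, win = 160) {    # win ~ 10 ms at 16 kHz
      n  <- length(x)
      X  <- fft(x)
      ph <- runif(floor((n - 1) / 2), 0, 2 * pi)
      # Conjugate-symmetric phases so that the inverse FFT is real-valued.
      phase <- c(0, ph, if (n %% 2 == 0) 0, -rev(ph))
      noise <- Re(fft(Mod(X) * exp(1i * phase), inverse = TRUE)) / n
      rms   <- function(s) sqrt(stats::filter(s^2, rep(1 / win, win), sides = 2))
      out   <- noise * (rms(x) / pmax(rms(noise), 1e-12))  # impose envelope
      out[is.na(out)] <- 0                     # edges where the moving RMS is undefined
      as.numeric(out)
    }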

To summarize, we tested whether contextual influences on vowel perception differ in the two hemispheres. Target vowels were presented in two types of context: a speaker with a high F1 or a speaker with a low F1. Effects were tested in a discrimination task. Context effects were expected to surface as a difference in the discriminability of the two deviant vowels across F1 contexts. Targets were presented to the right or to the left ear (and contexts were, across two groups of listeners, presented to the same or to the opposite ears). Furthermore, context stimuli consisted either of speech (the bisyllable [papu]) or of a non-speech version of this sequence that had the same amplitude envelope and LTAS as the speech version. According to the predictions of the AST hypothesis, we should find that contrastive context effects are stronger when stimuli are presented primarily to the left hemisphere (i.e., to the right ear) than when they are presented primarily to the right hemisphere (i.e., to the left ear).


Analysis and results

The results were analyzed using linear mixed-effects models in R (version 2.10.0; R Foundation for Statistical Computing) as provided in the lme4 package (Bates & Sarkar, 2007). For the dichotomous dependent variable of correct responses (i.e., correct = 1 vs. incorrect = 0), a logit linking function was used. Responses were analyzed by fitting models with participants as a random factor. All fixed factors were centered around zero. These were Context (with the levels low-F1 = −1 vs. high-F1 = 1),
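For illustration, such a model could be specified along the following lines. This is a sketch with a hypothetical data frame d and hypothetical column names; only the Context coding is given in the text above, so the other fixed factors are assumptions about the design. (In the lme4 version of that era, the call would have been lmer(..., family = binomial); current lme4 uses glmer.)

    # Sketch of the reported analysis: a logistic mixed-effects model of
    # correct (1) vs. incorrect (0) responses, with participants as a
    # random factor and centered fixed factors.
    library(lme4)
    d$Context   <- ifelse(d$Context == "high-F1", 1, -1)  # low-F1 = -1, high-F1 = 1 (from the text)
    d$TargetEar <- ifelse(d$TargetEar == "right", 1, -1)  # assumed factor
    d$Deviant   <- ifelse(d$Deviant == "I", 1, -1)        # assumed factor
    m <- glmer(Correct ~ Context * TargetEar * Deviant + (1 | Participant),
               data = d, family = binomial)  # binomial's default link is the logit
    summary(m)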

Discussion

This study was set up to investigate a prediction derived from AST (Poeppel, 2003) with respect to contextual influences on vowel perception. AST proposes that the right hemisphere integrates information over longer time windows than the left hemisphere. This led to the prediction that processes in the right hemisphere would lead to more integrative effects than those in the left hemisphere. Combined with the fact that biological systems are naturally more sensitive to contrast (Kluender &

Participants

Thirty-two native Dutch participants were tested. Participants were invited if they indicated that their right hand was dominant (in response to the question: “Indicate whether you are right- or left-handed”). Seven were employees of the Max Planck Institute for Psycholinguistics (MPI) and 25 were selected from the MPI participant database (all were uninformed about the purpose of the study). None of the participants reported hearing impairment. All participants can be considered bilingual

Acknowledgment

This work was funded by the Max Planck Gesellschaft and forms part of the first author’s PhD dissertation.

References (26)

  • M. Suzuki et al. Cortical and subcortical activation with monaural monosyllabic stimulation by functional MRI. Hearing Research (2002).
  • W.A. van Dommelen. Auditory accounts of temporal factors in the perception of Norwegian disyllables and speech analogs. Journal of Phonetics (1999).
  • R. Aravamudhan et al. Perceptual context effects of speech and nonspeech sounds: The role of auditory categories. Journal of the Acoustical Society of America (2008).