Elsevier

Brain and Language

Volume 162, November 2016, Pages 46-59
Prosodic grouping at birth

https://doi.org/10.1016/j.bandl.2016.08.002

Highlights

  • Newborns exhibit perceptual biases to group sounds varying in duration or pitch.

  • These biases are only found for acoustic cues in their native language prosody.

  • Prenatal monolingual and bilingual language exposure modulates these biases.

Abstract

Experience with spoken language starts prenatally, as hearing becomes operational during the second half of gestation. While maternal tissues filter out many aspects of speech, they readily transmit speech prosody and rhythm. These properties of the speech signal then play a central role in early language acquisition. In this study, we ask how the newborn brain uses variation in duration, pitch and intensity (the three acoustic cues that carry prosodic information in speech) to group sounds. In four near-infrared spectroscopy (NIRS) studies, we demonstrate that perceptual biases governing how sound sequences are perceived and organized are present in newborns from monolingual and bilingual language backgrounds. Importantly, however, these prosodic biases are present only for acoustic patterns found in the prosody of their native languages. These findings advance our understanding of how prenatal language experience lays the foundations for language development.

Introduction

Learning about language depends critically on a complex interplay between neurobiologically constrained processing mechanisms, perceptual biases and linguistic input. At birth, infants possess many language-general abilities. They can discriminate between most speech sounds (Cheour-Luhtanen et al., 1995, Werker and Gervain, 2013) and between rhythmically different languages they have never heard before (Nazzi, Bertoncini, et al., 1998, Nazzi and Ramus, 2003). Moreover, they prefer speech over a variety of non-linguistic sounds (Decasper and Spence, 1986, Vouloumanos and Werker, 2007) and infant- over adult-directed speech (Fernald & Kuhl, 1987). However, as hearing is operational from the 24th to the 28th week of gestation (Hepper & Shahidullah, 1994), experience with spoken language starts in the womb, and some evidence of prenatal learning is found at birth. Indeed, newborns prefer their mother’s voice over other female voices (Decasper & Fifer, 1980) and their native language over a rhythmically different unfamiliar language (Mehler et al., 1988, Moon et al., 1993), and their communicative cries reflect the prosody of the language they heard in utero (Mampe, Friederici, Christophe, & Wermke, 2009). Moreover, it has been shown that newborns who received bilingual prenatal exposure recognize both languages as familiar and can discriminate them from a rhythmically different unfamiliar language (Byers-Heinlein, Burns, & Werker, 2010). Additionally, newborns are able to recognize stories heard during pregnancy (Decasper & Spence, 1986) or melodies to which they were exposed prenatally (DeCasper, 1994, Granier-Deferre et al., 2011). Taken together, these findings constitute evidence that infants start learning about language while still in the womb, and that speech heard in utero has a greater impact on the development of speech perception and language learning than hitherto believed.

Speech experienced in utero, however, is different from broadcast speech transmitted through the air. Maternal tissues act as a low-pass filter, mainly transmitting sounds below 300–400 Hz (Gerhardt et al., 1992, Querleu et al., 1988). As a consequence, prosody, the global melody and rhythm of speech, is relatively well preserved and transmitted to the fetal inner ear, whereas more detailed, phonetic aspects are disrupted (Querleu et al., 1988). Importantly, prosody is a powerful cue that infants have been shown to make use of during language acquisition. For instance, newborns rely on prosody to discriminate languages (Nazzi, Bertoncini, et al., 1998, Nazzi, Floccia, et al., 1998, Nazzi and Ramus, 2003), to detect boundaries in speech (Christophe, Dupoux, Bertoncini, & Mehler, 1994), to detect differences in the pitch contour or lexical stress pattern of words (Nazzi, Floccia, et al., 1998, Sansavini et al., 1997), or even to distinguish function words from content words (Shi, Werker, & Morgan, 1999) on the basis of their different acoustic characteristics. They also use prosody to segment words out of the continuous speech stream (Johnson and Jusczyk, 2001, Jusczyk et al., 1999, Kooijman et al., 2009, Mattys et al., 1999, Nazzi et al., 2006, Nishibayashi et al., 2015) or to learn about the syntactic features of their native language (Hirsh-Pasek et al., 1987), such as its basic word order (Gervain and Werker, 2013, Nespor et al., 2008) or argument structure (Christophe, Gout, Peperkamp, & Morgan, 2003).
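The low-pass filtering described above can be illustrated with a toy numerical sketch. It assumes a simple first-order (one-pole) filter, which is not a model of maternal tissue; the sample rate, the two test frequencies and all function names are illustrative choices, not taken from the cited studies. It only shows that a tone near typical voice pitch passes a 400 Hz cutoff almost unchanged, while a 3000 Hz component is strongly attenuated.

```python
import math

SAMPLE_RATE = 16000  # Hz, assumed for illustration
CUTOFF_HZ = 400.0    # upper end of the 300-400 Hz range mentioned in the text


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """First-order low-pass filter (toy stand-in, not a tissue model)."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # RC time constant for the cutoff
    alpha = dt / (rc + dt)                  # smoothing coefficient
    out, prev = [], 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def tone(freq_hz, duration_s, sample_rate):
    """Unit-amplitude sine wave."""
    n = int(duration_s * sample_rate)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def rms(samples):
    """Root-mean-square amplitude."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def gain(freq_hz):
    """Ratio of filtered to unfiltered RMS amplitude for a pure tone."""
    signal = tone(freq_hz, 0.5, SAMPLE_RATE)
    return rms(one_pole_lowpass(signal, CUTOFF_HZ, SAMPLE_RATE)) / rms(signal)


low_gain = gain(150.0)    # roughly in the range of voice pitch
high_gain = gain(3000.0)  # illustrative high-frequency spectral detail
print(f"gain at 150 Hz: {low_gain:.2f}, gain at 3000 Hz: {high_gain:.2f}")
```

Under these assumptions the low-frequency tone keeps most of its amplitude while the high-frequency tone loses most of it, which is the qualitative asymmetry the paragraph attributes to transmission in utero.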

Thus, the variations in pitch, intensity or duration that carry prosody in the speech signal serve as robust and particularly important cues to language learning. Yet, how infants perceive these three acoustic dimensions at birth has remained largely unexplored, and whether language experience shapes the perception of these acoustic cues is currently heatedly debated. One issue at stake is the origin and developmental trajectory of the prosodic grouping bias known as the Iambic-Trochaic Law (ITL). Some authors have argued that the ITL is language-independent. Specifically, it has been claimed that the auditory system automatically groups sequences of sounds that differ in duration with the longest element in final position (i.e., prominence-final or iambic grouping), and sequences of sounds that differ in intensity or pitch with the loudest or highest element in initial position (i.e., prominence-initial or trochaic grouping). The ITL was initially proposed to explain the grouping of musical or non-linguistic sequences (Bolton, 1894, Cooper and Meyer, 1960, Woodrow, 1951). As a well-known example, people tend to perceive the fire truck siren as a sequence of two paired sounds, the first one being higher than the second one. This grouping principle was later extended to account for regularities in speech production and biases in speech perception in adults (Bion et al., 2011, Hay and Diehl, 2007, Hayes, 1995, Nespor et al., 2008). The proposal that the ITL is language-general is supported by studies showing that adult speakers of prosodically and rhythmically different languages such as English and French show similar grouping preferences (Hay & Diehl, 2007). 
Moreover, trochaic grouping on the basis of a pitch contrast was found in Italian adults and in Italian and French infants, whose native languages make little use of pitch cues in their prosody (Abboub et al., 2016, Bion et al., 2011), as well as in rats (de la Mora, Nespor, & Toro, 2013), suggesting not only that prosodic grouping preferences might exist in the absence of language experience, but also that they might be shared by humans and other mammals.
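The grouping rule stated by the ITL can be written out as a short sketch. The function names and the numeric values are hypothetical and serve only to make the rule concrete: duration contrasts are grouped with the prominent element last, intensity and pitch contrasts with the prominent element first.

```python
def itl_predicted_grouping(cue):
    """Grouping pattern the ITL predicts for a given acoustic cue."""
    if cue == "duration":
        return "iambic"    # longest element in final position
    if cue in ("intensity", "pitch"):
        return "trochaic"  # loudest or highest element in initial position
    raise ValueError(f"unknown cue: {cue}")


def group_alternating(values, cue):
    """Split a strictly alternating sequence into the pairs the ITL predicts.

    `values` are hypothetical magnitudes of the varying cue (for example
    durations in ms). Elements that cannot be paired at the edges are dropped.
    """
    pattern = itl_predicted_grouping(cue)
    first_peak = values.index(max(values))  # 0 or 1 for an alternating stream
    # Trochaic pairs start on a prominent element, iambic pairs end on one.
    start = first_peak % 2 if pattern == "trochaic" else (first_peak + 1) % 2
    return [tuple(values[i:i + 2]) for i in range(start, len(values) - 1, 2)]


# The same short-long stream, treated as a duration contrast: short-long pairs.
print(group_alternating([200, 400, 200, 400, 200, 400], "duration"))
# A siren-like high-low stream, treated as a pitch contrast: high-low pairs.
print(group_alternating([880, 660, 880, 660], "pitch"))
```

The sketch encodes only the language-independent version of the law; the cross-linguistic findings discussed next suggest that such a fixed mapping may not hold for every cue in every population.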

However, a recent alternative hypothesis has emerged, according to which prosodic grouping biases might, at least in part, be influenced by language experience. Supporting this view, recent cross-linguistic research has shown that although English and Japanese adults group sequences varying in intensity trochaically, only English, but not Japanese, adults group sequences varying in duration iambically (Iversen, Patel, & Ohgushi, 2008). The two languages differ at the phrasal level, since Japanese has a trochaic rhythm (^Tokyo ni, Tokyo to, ‘to Tokyo’, with prosodic prominence marked by higher pitch on the content word ‘Tokyo’ in initial position; Gervain & Werker, 2013), whereas English has an iambic rhythm (to Ro:me, with prosodic prominence marked by lengthened duration on the content word ‘Rome’ in final position). Relatedly, while both German and French adults follow the ITL when presented with complex linguistic stimuli varying in intensity or duration, they nevertheless exhibit language-specific differences, German adults showing stronger ITL effects; moreover, effects based on pitch were found for German but not French adults (Bhatara, Boll-Avetisyan, Unger, Nazzi, & Höhle, 2013). Similar results were obtained using complex non-linguistic stimuli (Bhatara, Boll-Avetisyan, Agus, Höhle, & Nazzi, 2015). The authors argue that these cross-linguistic differences reflect the fact that German has a predominantly trochaic word-level stress pattern, while French does not. Additionally, French is iambic at the phrasal level, whereas German can have both rhythmic patterns. In infants, Japanese- and English-learning 7–8-month-olds (Yoshida et al., 2010) showed a pattern of results similar to the one found in adults (Iversen et al., 2008), and bilingual Spanish and Basque 9–10-month-olds (Molnar, Lallier, & Carreiras, 2014) also showed consistent grouping for intensity, but not for duration.
These early cross-linguistic differences were found essentially for duration, suggesting first, that the language environment might influence grouping preferences early on, and second, that the three acoustic cues are not affected in the same way by this cross-linguistic modulation. This raises the question of how and when during development language experience starts modulating perceptual grouping biases.

Contributing to this debate, the current study will explore whether newborns already possess general perceptual mechanisms to group sounds according to prosodic cues, and whether these abilities are already modulated by the native language(s) heard in utero. If such perceptual biases are present early in development, they have the potential to help infants break into language.

A related point concerns the cerebral basis of prosodic grouping, which remains, to a large extent, unexplored. In adults, language comprehension, including morphosyntactic and semantic processing, is predominantly lateralized to the left hemisphere, while prosodic processing typically recruits a more dynamic network in the right hemisphere (Friederici, 2012, Hickok and Poeppel, 2007), although the lateralization of prosodic processing also depends on the functional relevance of prosody in the language studied and on the context. Left dominance may be observed if the prosodic cue used is lexically or morphosyntactically relevant, such as lexical tone in adults who speak a tonal language (Gandour et al., 2004, Kreitewolf et al., 2014, Sato et al., 2007, Sato et al., 2010). In infants, few studies have investigated the neural basis of prosodic processing in general, and none have specifically looked at how prosodic grouping is processed across these three acoustic dimensions. The few existing optical imaging studies investigating prosodic processing in general reported that sleeping neonates and 3-month-olds showed a right hemispheric specialization (Homae et al., 2007, Sato et al., 2010, Telkemeyer et al., 2009), as do 4-year-olds (Wartenburger et al., 2007). Nevertheless, in these studies, sentential prosody was tested in its full acoustic complexity. Thus, it is still unclear how and where in the brain prosodic cues in isolation (i.e. variations in duration only, intensity only or pitch only) are processed and grouped. More specifically, we do not know whether, and if so, how grouping on the basis of a single acoustic cue is perceived and processed in the developing brain.

The current study therefore sought to answer two questions. First, we explored the earliest foundations of the crucial ability to detect and process prosodic patterns. In particular, no study has as yet tested newborns’ prosodic grouping biases and their neural correlates, a gap that the present study intends to fill. Accordingly, we tested prosodic patterns that vary along one of the three acoustic dimensions characterizing speech prosody: duration, intensity, and pitch. To do so, we used near-infrared spectroscopy (NIRS), an optical imaging technique ideally suited to test the youngest developmental populations (high motion tolerance, easy application, no carrier substance or magnetic field, no noise, etc.). NIRS has the advantage of providing good spatial localization, allowing us to identify the brain areas responsible for prosodic grouping. This technique has been widely used to explore the neural correlates of speech perception and language acquisition in newborns and young infants (Gervain et al., 2012, Gervain, Macagno, et al., 2008, Gomez et al., 2014, May et al., 2011, Peña et al., 2003, Telkemeyer et al., 2009). French, the language that our monolingual participants were exposed to prenatally, uses mainly durational contrasts in its prosody, in particular final lengthening (Dell et al., 1984, Nespor et al., 2008). Second, we investigated how these abilities are modulated by language exposure, comparing prosodic grouping biases for the pitch condition in monolingual French-exposed newborns and newborns exposed to French and another language making more systematic use of pitch in its prosody.

To address these two questions, we conducted four NIRS experiments. First, we explored the origins of prosodic grouping in French-exposed monolingual newborns in each of the three relevant acoustic dimensions (duration: Exp 1; intensity: Exp 2; pitch: Exp 3). Second, we tested French-other language bilingual newborns in the pitch contrast condition (Exp 4), the dimension for which even non-linguistic animals show a grouping preference. In all four experiments, newborns listened to sequences of pure tone pairs pertaining to three conditions: a consistent grouping condition in which the pairs were consistent with the grouping predicted by the universal bias and/or by native language prosody (short-long, strong-weak and high-low), an inconsistent grouping condition which presented the opposite grouping pattern (long-short, weak-strong, low-high) and a no-contrast condition, whereby tones in the pair were identical (equal duration, intensity or pitch).
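The three-condition design can be summarized in a brief sketch. All numeric values below (baseline tone, contrast sizes) are hypothetical placeholders, since this excerpt does not give the stimulus parameters, and the names `Tone` and `make_pair` are illustrative only. The choice of baseline value for the no-contrast pairs is likewise an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Tone:
    duration_ms: float
    intensity_db: float
    pitch_hz: float


# Hypothetical baseline tone and hypothetical prominent value for each cue.
BASE = Tone(duration_ms=200.0, intensity_db=70.0, pitch_hz=400.0)
PROMINENT = {"duration_ms": 400.0, "intensity_db": 76.0, "pitch_hz": 500.0}

# Position of the prominent tone in the consistent pair:
# short-long (final), strong-weak (initial), high-low (initial).
CONSISTENT_POSITION = {"duration_ms": 1, "intensity_db": 0, "pitch_hz": 0}


def make_pair(cue, condition):
    """Return a (first, second) tone pair for one cue and one condition."""
    if condition == "no_contrast":
        return (BASE, BASE)  # identical tones
    if condition not in ("consistent", "inconsistent"):
        raise ValueError(f"unknown condition: {condition}")
    position = CONSISTENT_POSITION[cue]
    if condition == "inconsistent":
        position = 1 - position  # opposite ordering
    pair = [BASE, BASE]
    pair[position] = replace(BASE, **{cue: PROMINENT[cue]})
    return tuple(pair)


for condition in ("consistent", "inconsistent", "no_contrast"):
    first, second = make_pair("duration_ms", condition)
    print(condition, first.duration_ms, second.duration_ms)
```

Only one dimension differs within a pair, which mirrors the logic of testing each acoustic cue in isolation.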

We predicted that if infants use variation in a given prosodic dimension to group sounds, differences in response amplitudes or localization should be found for the consistent and inconsistent conditions. Since no previous study has tested the cerebral basis of the prosodic grouping bias in any developmental population, we had no clear predictions regarding the localization of the response. Sentential prosody has been found to be processed in the right hemisphere early on in development (Homae et al., 2007, Sato et al., 2010, Telkemeyer et al., 2009). However, these studies all compared fully complex sentential prosody to a flattened, non-prosodic condition, whereas in our study, the two critical conditions (consistent and inconsistent grouping) both contain prosodic information. The crucial difference is in the well-formedness and sequential ordering of this information, i.e. in structural and sequential ordering aspects, which are typically processed in the left hemisphere (Dehaene-Lambertz et al., 2002, Gervain et al., 2012, Gervain, Macagno, et al., 2008). Accordingly, we might either find an involvement of both hemispheres, or a left hemispheric advantage.

Given the importance of duration variation in French, Experiment 1 presented newborns from monolingual French families with duration variations.

Section snippets

Participants

Eighteen healthy, full-term newborns (13 females; mean age: 2.05 days; range: 1–4 days; Apgar score ⩾8) born to French-speaking families contributed data to the final analyses. An additional 7 newborns were tested but were excluded from data analysis due to the infant becoming awake or fussy during the experiment (3), having thick hair (2) or failing to complete the procedure (2). All parents gave informed consent before the experiment. The present experiment was approved by The Conseil

Participants

A new group of eighteen healthy, full term newborns (11 females; mean age: 2.28 days; range: 1–4 days; Apgar score ⩾8) born to French-speaking families contributed data to the final analyses. An additional 8 newborns were tested, but were excluded from data analysis due to the infant becoming awake or fussy. All parents gave informed consent before the experiment. The present experiment was approved by The Conseil d’évaluation éthique pour les recherches en santé (CERES) ethics board (Université

Participants

Eighteen healthy, full term newborns (12 females; mean age: 2.86 days; range: 2–5 days; Apgar score ⩾8) born to French-speaking families contributed data to the final analyses. An additional 18 infants were tested, but were excluded from data analysis due to the infant becoming awake or fussy (9), having thick hair (4), failing to complete the procedure (2) or due to equipment failure (3). All parents gave informed consent before the experiment. The present experiment was approved by The Conseil

Participants

Eighteen healthy, full term newborns (12 females; mean age: 2.17 days; range: 1–4 days; Apgar score ⩾8) born to bilingual families speaking French and one other language contributed data to the final analyses. An additional 24 infants were tested, but were excluded from data analysis due to the infant becoming awake or fussy (13), having thick hair (8), or due to equipment failure (3). All parents gave informed consent before the experiment. The present experiment was approved by The Conseil

General discussion

We conducted four NIRS experiments testing whether newborns from different language backgrounds have prosodic grouping preferences at birth. To our knowledge, the present study is the first to show prosodic grouping at such a young age and in infants from monolingual and bilingual backgrounds. It thus extends previous studies that had tested infants’ sensitivity to the Iambic-Trochaic Law (ITL) from about 6 months onwards and typically from monolingual language environments (Abboub et al., 2016,

Acknowledgements

We thank the parents who graciously agreed to have their infants participate in the study and all of the personnel of Hôpital Robert Debré, Paris, France. We also thank Maria Clemencia Ortiz Barajas for her help with the statistical analysis. This work was supported by LABEX EFL (ANR-10-LABX-0083) to NA, TN and JG, a Fyssen Foundation Startup Grant, an Emergence(s) Programme Grant from the City of Paris and a Human Frontiers Science Program Young Investigator Grant (RGY-0073-2014) to JG.

References (90)

  • J. Gervain et al. Bootstrapping word order in prelexical infants: A Japanese-Italian cross-linguistic study. Cognitive Psychology (2008)
  • K. Hirsh-Pasek et al. Clauses are perceptual units for young infants. Cognition (1987)
  • F. Homae et al. The right hemisphere of sleeping infant perceives sentential prosody. Neuroscience Research (2006)
  • F. Homae et al. Prosodic processing in the developing brain. Neuroscience Research (2007)
  • E.K. Johnson et al. Word segmentation by 8-month-olds: When speech cues count more than statistics. Journal of Memory and Language (2001)
  • P.W. Jusczyk et al. The beginnings of word segmentation in English-learning infants. Cognitive Psychology (1999)
  • J. Kreitewolf et al. Hemispheric lateralization of linguistic prosody recognition in comparison to speech and speaker recognition. NeuroImage (2014)
  • B. Mampe et al. Newborns’ cry melody is shaped by their native language. Current Biology: CB (2009)
  • E. Maris et al. Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods (2007)
  • G. Matsuda et al. Sustained decrease in oxygenated hemoglobin during video games in the dorsal prefrontal cortex: A NIRS study of children. NeuroImage (2006)
  • S.L. Mattys et al. Phonotactic and prosodic effects on word segmentation in infants. Cognitive Psychology (1999)
  • J. Mehler et al. A precursor of language acquisition in young infants. Cognition (1988)
  • C. Moon et al. Two-day-olds prefer their native language. Infant Behavior & Development (1993)
  • T. Nazzi et al. Discrimination of pitch contours by neonates. Infant Behavior & Development (1998)
  • T. Nazzi et al. Early segmentation of fluent speech by infants acquiring French: Emerging evidence for crosslinguistic differences. Journal of Memory and Language (2006)
  • T. Nazzi et al. Perception and acquisition of linguistic rhythm by infants. Speech Communication (2003)
  • Y. Otsuka et al. Neural activation to upright and inverted faces in infants measured by near infrared spectroscopy. NeuroImage (2007)
  • D. Querleu et al. Fetal hearing. European Journal of Obstetrics and Gynecology and Reproductive Biology (1988)
  • R. Shi et al. Newborn infants’ sensitivity to perceptual cues to lexical and grammatical words. Cognition (1999)
  • M. Sundara et al. Development of coronal stop perception: Bilingual infants keep pace with their monolingual peers. Cognition (2008)
  • P. Vannasing et al. Distinct hemispheric specializations for native and non-native languages in one-day-old newborns identified by fNIRS. Neuropsychologia (2016)
  • I. Wartenburger et al. The processing of prosody: Evidence of interhemispheric specialization at the age of four. NeuroImage (2007)
  • K.A. Yoshida et al. The development of perceptual grouping biases in infancy: A Japanese-English cross-linguistic study. Cognition (2010)
  • N. Abboub et al. An exploration of rhythmic grouping of speech sequences by French- and German-learning infants. Frontiers in Human Neuroscience (2016)
  • J.P. Angoujard. Les hiérarchies prosodiques en arabe. Revue Québécoise de Linguistique (1986)
  • Y. Benjamini et al. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) (1995)
  • A. Bhatara et al. Language experience affects grouping of musical instrument sounds. Psychological Science (2015)
  • A. Bhatara et al. Native language affects rhythmic grouping of speech. The Journal of the Acoustical Society of America (2013)
  • R. Bijeljac-Babic et al. Effect of bilingualism on lexical stress pattern discrimination in French-learning infants. PLoS One (2012)
  • R.A.H. Bion et al. Acoustic markers of prominence influence infants’ and adults’ segmentation of speech sequences. Language and Speech (2011)
  • T.L. Bolton. Rhythm. The American Journal of Psychology (1894)
  • K. Byers-Heinlein et al. The roots of bilingualism in newborns. Psychological Science (2010)
  • S. Chaker. Données exploratoires en prosodie berbère: I. L’accent en kabyle. Comptes Rendus Du GLECS (1995)
  • A. Christophe et al. Do infants perceive word boundaries? An empirical study of the bootstrapping of lexical acquisition. Journal of the Acoustical Society of America (1994)
  • M.X. Cohen. Analyzing neural time series data: Theory and practice (2014)