Prosodic grouping at birth
Introduction
Learning about language depends critically on a complex interplay between neurobiologically constrained processing mechanisms, perceptual biases and linguistic input. At birth, infants possess many language-general abilities. They can discriminate between most speech sounds (Cheour-Luhtanen et al., 1995, Werker and Gervain, 2013) and between rhythmically different languages they have never heard before (Nazzi, Bertoncini, et al., 1998, Nazzi and Ramus, 2003). Moreover, they prefer speech over a variety of non-linguistic sounds (DeCasper and Spence, 1986, Vouloumanos and Werker, 2007) and infant- over adult-directed speech (Fernald & Kuhl, 1987). However, as hearing is operational from the 24th to the 28th week of gestation (Hepper & Shahidullah, 1994), experience with spoken language starts in the womb, and some evidence of prenatal learning is found at birth. Indeed, newborns prefer their mother’s voice over other female voices (DeCasper & Fifer, 1980) and their native language over a rhythmically different unfamiliar language (Mehler et al., 1988, Moon et al., 1993), and their communicative cries reflect the prosody of the language they heard in utero (Mampe, Friederici, Christophe, & Wermke, 2009). Moreover, it has been shown that newborns who received bilingual prenatal exposure recognize both languages as familiar and can discriminate them from a rhythmically different unfamiliar language (Byers-Heinlein, Burns, & Werker, 2010). Additionally, newborns are able to recognize stories heard during pregnancy (DeCasper & Spence, 1986) or melodies to which they were exposed prenatally (DeCasper, 1994, Granier-Deferre et al., 2011). Taken together, these findings constitute evidence that infants start learning about language while still in the womb, and that speech heard in utero has a more important impact on the development of speech perception and language learning than hitherto believed.
Speech experienced in utero, however, is different from broadcast speech transmitted through the air. Maternal tissues act as a low-pass filter, mainly transmitting sounds below 300–400 Hz (Gerhardt et al., 1992, Querleu et al., 1988). As a consequence, prosody, the global melody and rhythm of speech, is relatively well preserved and transmitted to the fetal inner ear, whereas more detailed, phonetic aspects are disrupted (Querleu et al., 1988). Importantly, prosody is a powerful cue that infants have been shown to make use of during language acquisition. For instance, newborns rely on prosody to discriminate languages (Nazzi, Bertoncini, et al., 1998, Nazzi, Floccia, et al., 1998, Nazzi and Ramus, 2003), to detect boundaries in speech (Christophe, Dupoux, Bertoncini, & Mehler, 1994), to detect differences in the pitch contour or lexical stress pattern of words (Nazzi, Floccia, et al., 1998, Sansavini et al., 1997), or even to distinguish function words from content words on the basis of their different acoustic characteristics (Shi, Werker, & Morgan, 1999). Infants also use prosody to segment words out of the continuous speech stream (Johnson and Jusczyk, 2001, Jusczyk et al., 1999, Kooijman et al., 2009, Mattys et al., 1999, Nazzi et al., 2006, Nishibayashi et al., 2015) or to learn about the syntactic features of their native language (Hirsh-Pasek et al., 1987), such as its basic word order (Gervain and Werker, 2013, Nespor et al., 2008) or argument structure (Christophe, Gout, Peperkamp, & Morgan, 2003).
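To make the low-pass idea concrete, the following is a minimal sketch, not a model of the actual transfer characteristics of maternal tissue. It assumes a simple first-order (one-pole) filter with a 400 Hz cutoff taken from the upper end of the range cited above; the function names, the 16 kHz sample rate and the probe frequencies are all hypothetical choices made for illustration. It compares how strongly pure tones at different frequencies are attenuated.

```python
import math


def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """Apply a first-order (one-pole) low-pass filter sample by sample."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # time constant for the cutoff
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)                  # smoothing coefficient in (0, 1)
    out, prev = [], 0.0
    for x in signal:
        prev = prev + alpha * (x - prev)    # y[n] = y[n-1] + alpha * (x[n] - y[n-1])
        out.append(prev)
    return out


def rms(signal):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def attenuation_db(freq_hz, cutoff_hz=400.0, sample_rate=16000, dur_s=0.5):
    """Level change (in dB) of a pure tone after filtering; values are assumed."""
    n = int(dur_s * sample_rate)
    tone = [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    filtered = one_pole_lowpass(tone, cutoff_hz, sample_rate)
    return 20.0 * math.log10(rms(filtered) / rms(tone))


if __name__ == "__main__":
    # Low frequencies (where pitch and rhythm cues live) pass almost unchanged,
    # while higher frequencies carrying phonetic detail are reduced.
    for f in (100, 200, 1000, 3000):
        print(f, "Hz:", round(attenuation_db(f), 1), "dB")
```

Under these assumptions, tones well below the cutoff lose very little energy, whereas tones in the kilohertz range are markedly reduced, which is the qualitative pattern the paragraph describes.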
Thus, the variations in pitch, intensity or duration that carry prosody in the speech signal serve as robust and particularly important cues to language learning. Yet, how infants perceive these three acoustic dimensions at birth has remained largely unexplored, and whether language experience shapes the perception of these acoustic cues is currently heatedly debated. One issue at stake is the origin and developmental trajectory of the prosodic grouping bias known as the Iambic-Trochaic Law (ITL). Some authors have argued that the ITL is language-independent. Specifically, it has been claimed that the auditory system automatically groups sequences of sounds that differ in duration with the longest element in final position (i.e., prominence-final or iambic grouping), and sequences of sounds that differ in intensity or pitch with the loudest or highest element in initial position (i.e., prominence-initial or trochaic grouping). The ITL was initially proposed to explain the grouping of musical or non-linguistic sequences (Bolton, 1894, Cooper and Meyer, 1960, Woodrow, 1951). As a well-known example, people tend to perceive the fire truck siren as a sequence of two paired sounds, the first one being higher than the second one. This grouping principle was later extended to account for regularities in speech production and biases in speech perception in adults (Bion et al., 2011, Hay and Diehl, 2007, Hayes, 1995, Nespor et al., 2008). The proposal that the ITL is language-general is supported by studies showing that adult speakers of prosodically and rhythmically different languages such as English and French show similar grouping preferences (Hay & Diehl, 2007). 
Moreover, trochaic grouping on the basis of a pitch contrast was found in Italian adults and in Italian and French infants, whose native languages make little use of pitch cues in their prosody (Abboub et al., 2016, Bion et al., 2011), as well as in rats (de la Mora, Nespor, & Toro, 2013), suggesting not only that prosodic grouping preferences might exist in the absence of language experience, but also that they might be shared by humans and other mammals.
However, an alternative hypothesis has recently emerged, according to which prosodic grouping biases might, at least in part, be influenced by language experience. Supporting this view, recent cross-linguistic research has shown that although English and Japanese adults group sequences varying in intensity trochaically, English, but not Japanese, adults group sequences varying in duration iambically (Iversen, Patel, & Ohgushi, 2008). The two languages differ at the phrasal level, since Japanese has a trochaic rhythm (^Tokyo ni, Tokyo to, ‘to Tokyo’, with prosodic prominence marked by higher pitch on the content word ‘Tokyo’ in initial position; Gervain & Werker, 2013), whereas English has an iambic rhythm (to Ro:me, with prosodic prominence marked by lengthened duration on the content word ‘Rome’ in final position). Relatedly, while both German and French adults follow the ITL when presented with complex linguistic stimuli varying in intensity or duration, they nevertheless exhibit language-specific differences, with German adults showing stronger ITL effects; moreover, effects based on pitch were found for German but not French adults (Bhatara, Boll-Avetisyan, Unger, Nazzi, & Höhle, 2013). Similar results were obtained using complex non-linguistic stimuli (Bhatara, Boll-Avetisyan, Agus, Höhle, & Nazzi, 2015). The authors argue that these cross-linguistic differences reflect the fact that German has a predominantly trochaic word-level stress pattern, while French does not. Additionally, French is iambic at the phrasal level, whereas German can have both rhythmic patterns. In infants, Japanese- and English-learning 7–8-month-olds (Yoshida et al., 2010) showed a pattern of results similar to the one found in adults (Iversen et al., 2008), and bilingual Spanish and Basque 9–10-month-olds (Molnar, Lallier, & Carreiras, 2014) also showed consistent grouping for intensity, but not for duration.
These early cross-linguistic differences were found essentially for duration, suggesting first, that the language environment might influence grouping preferences early on, and second, that the three acoustic cues are not affected in the same way by this cross-linguistic modulation. This raises the question of how and when during development language experience starts modulating perceptual grouping biases.
Contributing to this debate, the current study explores whether newborns already possess general perceptual mechanisms to group sounds according to prosodic cues, and whether these abilities are already modulated by the native language(s) heard in utero. If such perceptual biases are present early in development, they have the potential to help infants break into language.
A related point concerns the cerebral basis of prosodic grouping, which remains, to a large extent, unexplored. In adults, language comprehension, including morphosyntactic and semantic processing, is predominantly lateralized to the left hemisphere, while prosodic processing typically recruits a more dynamic network in the right hemisphere (Friederici, 2012, Hickok and Poeppel, 2007), although the lateralization of prosodic processing also depends on the functional relevance of prosody in the language studied and on the context. Left dominance may be observed if the prosodic cue used is lexically or morphosyntactically relevant, such as lexical tone in adults who speak a tonal language (Gandour et al., 2004, Kreitewolf et al., 2014, Sato et al., 2007, Sato et al., 2010). In infants, few studies have investigated the neural basis of prosodic processing in general, and none have specifically looked at how prosodic grouping is processed across these three acoustic dimensions. The few existing optical imaging studies investigating prosodic processing in general reported that sleeping neonates and 3-month-olds showed a right hemispheric specialization (Homae et al., 2007, Sato et al., 2010, Telkemeyer et al., 2009), as do 4-year-olds (Wartenburger et al., 2007). Nevertheless, in these studies, sentential prosody was tested in its full acoustic complexity. Thus, it is still unclear how and where in the brain prosodic cues in isolation (i.e. variations in duration only, intensity only or pitch only) are processed and grouped. More specifically, we do not know whether, and if so, how grouping on the basis of a single acoustic cue is perceived and processed in the developing brain.
The current study therefore sought to answer two questions. First, we explored the earliest foundations of the crucial ability to detect and process prosodic patterns. In particular, no study has as yet tested newborns’ prosodic grouping biases and their neural correlates, a gap that the present study intends to fill. Accordingly, we tested prosodic patterns that vary along one of the three acoustic dimensions characterizing speech prosody: duration, intensity, and pitch. To do so, we used near-infrared spectroscopy (NIRS), an optical imaging technique ideally suited to test the youngest developmental populations (high motion tolerance, easy application, no carrier substance or magnetic field, no noise, etc.). NIRS has the advantage of providing good spatial localization, allowing us to identify the brain areas responsible for prosodic grouping. This technique has been widely used to explore the neural correlates of speech perception and language acquisition in newborns and young infants (Gervain et al., 2012, Gervain, Macagno, et al., 2008, Gomez et al., 2014, May et al., 2011, Peña et al., 2003, Telkemeyer et al., 2009). French, the language that our monolingual participants were exposed to prenatally, uses mainly durational contrasts in its prosody, in particular final lengthening (Dell et al., 1984, Nespor et al., 2008). Second, we investigated how these abilities are modulated by language exposure, comparing prosodic grouping biases for the pitch condition in monolingual French-exposed newborns and newborns exposed to French and another language making more systematic use of pitch in its prosody.
To address these two questions, we conducted four NIRS experiments. First, we explored the origins of prosodic grouping in French-exposed monolingual newborns in each of the three relevant acoustic dimensions (duration: Exp 1; intensity: Exp 2; pitch: Exp 3). Second, we tested French-other language bilingual newborns in the pitch contrast condition (Exp 4), the dimension for which even non-linguistic animals show a grouping preference. In all four experiments, newborns listened to sequences of pure tone pairs pertaining to three conditions: a consistent grouping condition in which the pairs were consistent with the grouping predicted by the universal bias and/or by native language prosody (short-long, strong-weak and high-low), an inconsistent grouping condition which presented the opposite grouping pattern (long-short, weak-strong, low-high) and a no-contrast condition, whereby tones in the pair were identical (equal duration, intensity or pitch).
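The three-condition design described above can be sketched in code. This is an illustrative sketch only: the text states the ordering of tones in each condition (short-long, strong-weak and high-low for consistent grouping, the reverse for inconsistent grouping, identical tones for no contrast), but every numeric value below (base frequency, durations, amplitudes, gap, sample rate) and every function name is a hypothetical placeholder, not the parameters used in the experiments.

```python
import math

SAMPLE_RATE = 16000  # Hz; assumed for illustration

# Hypothetical baseline tone and hypothetical "prominent" value per dimension.
BASE = {"freq_hz": 400.0, "dur_s": 0.2, "amp": 0.5}
PROMINENT = {
    "duration": ("dur_s", 0.4),     # longer tone
    "intensity": ("amp", 1.0),      # louder tone
    "pitch": ("freq_hz", 600.0),    # higher tone
}
CONDITIONS = ("consistent", "inconsistent", "no_contrast")


def tone_pair(dimension, condition):
    """Return (first, second) tone specifications for one pair."""
    if condition not in CONDITIONS:
        raise ValueError("unknown condition: " + condition)
    key, value = PROMINENT[dimension]
    plain = dict(BASE)
    marked = dict(BASE, **{key: value})
    if condition == "no_contrast":
        return plain, dict(plain)
    # Consistent grouping is prominence-final for duration (short-long)
    # and prominence-initial for intensity and pitch (strong-weak, high-low).
    prominent_final = (dimension == "duration") == (condition == "consistent")
    return (plain, marked) if prominent_final else (marked, plain)


def pure_tone(freq_hz, dur_s, amp):
    """Synthesize a sine tone as a list of float samples."""
    n = int(round(dur_s * SAMPLE_RATE))
    return [amp * math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def synthesize(pair, gap_s=0.1):
    """Concatenate the two tones of a pair with a silent gap (gap is assumed)."""
    first, second = pair
    silence = [0.0] * int(round(gap_s * SAMPLE_RATE))
    return pure_tone(**first) + silence + pure_tone(**second)


if __name__ == "__main__":
    for dim in PROMINENT:
        for cond in CONDITIONS:
            first, second = tone_pair(dim, cond)
            print(dim, cond, first, second)
```

Keeping the tone specification separate from the synthesis step makes it easy to check that only one acoustic dimension differs within a pair, which is the property the design relies on.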
We predicted that if infants use variation in a given prosodic dimension to group sounds, differences in response amplitudes or localization should be found for the consistent and inconsistent conditions. Since no previous study has tested the cerebral basis of the prosodic grouping bias in any developmental population, we had no clear predictions regarding the localization of the response. Sentential prosody has been found to be processed in the right hemisphere early on in development (Homae et al., 2007, Sato et al., 2010, Telkemeyer et al., 2009). However, these studies all compared fully complex sentential prosody to a flattened, non-prosodic condition, whereas in our study, the two critical conditions (consistent and inconsistent grouping) both contain prosodic information. The crucial difference lies in the well-formedness and sequential ordering of this information, i.e. in structural aspects, which are typically processed in the left hemisphere (Dehaene-Lambertz et al., 2002, Gervain et al., 2012, Gervain, Macagno, et al., 2008). Accordingly, we might either find an involvement of both hemispheres, or a left hemispheric advantage.
Given the importance of duration variation in French, Experiment 1 presented newborns from monolingual French families with duration variations.
Participants
Eighteen healthy, full-term newborns (13 females; mean age: 2.05 days; range: 1–4 days; Apgar score ⩾ 8) born to French-speaking families contributed data to the final analyses. An additional 7 newborns were tested but were excluded from data analysis due to the infant becoming awake or fussy during the experiment (3), having thick hair (2) or failing to complete the procedure (2). All parents gave informed consent before the experiment. The present experiment was approved by The Conseil
Participants
A new group of eighteen healthy, full-term newborns (11 females; mean age: 2.28 days; range: 1–4 days; Apgar score ⩾ 8) born to French-speaking families contributed data to the final analyses. An additional 8 newborns were tested, but were excluded from data analysis due to the infant becoming awake or fussy. All parents gave informed consent before the experiment. The present experiment was approved by The Conseil d’évaluation éthique pour les recherches en santé (CERES) ethics board (Université
Participants
Eighteen healthy, full-term newborns (12 females; mean age: 2.86 days; range: 2–5 days; Apgar score ⩾ 8) born to French-speaking families contributed data to the final analyses. An additional 18 infants were tested, but were excluded from data analysis due to the infant becoming awake or fussy (9), having thick hair (4), failing to complete the procedure (2) or due to equipment failure (3). All parents gave informed consent before the experiment. The present experiment was approved by The Conseil
Participants
Eighteen healthy, full-term newborns (12 females; mean age: 2.17 days; range: 1–4 days; Apgar score ⩾ 8) born to bilingual families speaking French and one other language contributed data to the final analyses. An additional 24 infants were tested, but were excluded from data analysis due to the infant becoming awake or fussy (13), having thick hair (8), or due to equipment failure (3). All parents gave informed consent before the experiment. The present experiment was approved by The Conseil
General discussion
We conducted four NIRS experiments testing whether newborns from different language backgrounds have prosodic grouping preferences at birth. To our knowledge, the present study is the first to show prosodic grouping at such a young age and in infants from monolingual and bilingual backgrounds. It thus extends previous studies that had tested infants’ sensitivity to the Iambic-Trochaic Law (ITL) from about 6 months onwards and typically from monolingual language environments (Abboub et al., 2016,
Acknowledgements
We thank the parents who graciously agreed to have their infants participate in the study and all of the personnel of Hôpital Robert Debré, Paris, France. We also thank Maria Clemencia Ortiz Barajas for her help with the statistical analysis. This work was supported by LABEX EFL (ANR-10-LABX-0083) to NA, TN and JG, a Fyssen Foundation Startup Grant, an Emergence(s) Programme Grant from the City of Paris and a Human Frontiers Science Program Young Investigator Grant (RGY-0073-2014) to JG.
References (90)
On the importance of being bilingual: Word stress processing in a context of segmental variability
Journal of Experimental Child Psychology (2015)
Mismatch negativity indicates vowel discrimination in newborns
Hearing Research (1995)
Discovering words in the continuous speech stream: The role of prosody
Journal of Phonetics (2003)
Fetal reactions to recurrent maternal speech
Infant Behavior and Development (1994)
Prenatal maternal speech influences newborns’ perception of speech sounds
Infant and Child Development (1986)
Neural correlates of own- and other-race face recognition in children: A functional near-infrared spectroscopy study
NeuroImage (2014)
Acoustic determinants of infant preference for motherese speech
Infant Behavior and Development (1987)
The cortical language circuit: From auditory perception to sentence comprehension
Trends in Cognitive Sciences (2012)
Hemispheric roles in the perception of speech prosody
NeuroImage (2004)
Cochlear microphonics recorded from fetal and newborn sheep
American Journal of Otolaryngology (1992)
Bootstrapping word order in prelexical infants: A Japanese-Italian cross-linguistic study
Cognitive Psychology
Clauses are perceptual units for young infants
Cognition
The right hemisphere of sleeping infant perceives sentential prosody
Neuroscience Research
Prosodic processing in the developing brain
Neuroscience Research
Word segmentation by 8-month-olds: When speech cues count more than statistics
Journal of Memory and Language
The beginnings of word segmentation in English-learning infants
Cognitive Psychology
Hemispheric lateralization of linguistic prosody recognition in comparison to speech and speaker recognition
NeuroImage
Newborns’ cry melody is shaped by their native language
Current Biology: CB
Nonparametric statistical testing of EEG- and MEG-data
Journal of Neuroscience Methods
Sustained decrease in oxygenated hemoglobin during video games in the dorsal prefrontal cortex: A NIRS study of children
NeuroImage
Phonotactic and prosodic effects on word segmentation in infants
Cognitive Psychology
A precursor of language acquisition in young infants
Cognition
Two-day-olds prefer their native language
Infant Behavior & Development
Discrimination of pitch contours by neonates
Infant Behavior & Development
Early segmentation of fluent speech by infants acquiring French: Emerging evidence for crosslinguistic differences
Journal of Memory and Language
Perception and acquisition of linguistic rhythm by infants
Speech Communication
Neural activation to upright and inverted faces in infants measured by near infrared spectroscopy
NeuroImage
Fetal hearing
European Journal of Obstetrics and Gynecology and Reproductive Biology
Newborn infants’ sensitivity to perceptual cues to lexical and grammatical words
Cognition
Development of coronal stop perception: Bilingual infants keep pace with their monolingual peers
Cognition
Distinct hemispheric specializations for native and non-native languages in one-day-old newborns identified by fNIRS
Neuropsychologia
The processing of prosody: Evidence of interhemispheric specialization at the age of four
NeuroImage
The development of perceptual grouping biases in infancy: A Japanese-English cross-linguistic study
Cognition
An exploration of rhythmic grouping of speech sequences by French- and German-learning infants
Frontiers in Human Neuroscience
Les hiérarchies prosodiques en arabe
Revue Québécoise de Linguistique
Controlling the false discovery rate: A practical and powerful approach to multiple testing
Journal of the Royal Statistical Society. Series B (Methodological)
Language experience affects grouping of musical instrument sounds
Psychological Science
Native language affects rhythmic grouping of speech
The Journal of the Acoustical Society of America
Effect of bilingualism on lexical stress pattern discrimination in French-learning infants
PLoS One
Acoustic markers of prominence influence infants’ and adults’ segmentation of speech sequences
Language and Speech
Rhythm
The American Journal of Psychology
The roots of bilingualism in newborns
Psychological Science
Données exploratoires en prosodie berbère: I. L’accent en kabyle
Comptes Rendus Du GLECS
Do infants perceive word boundaries? An empirical study of the bootstrapping of lexical acquisition
Journal of the Acoustical Society of America
Analyzing neural time series data: Theory and practice