Research Article
Language and cluster-specific effects in the timing of onset consonant sequences in seven languages

https://doi.org/10.1016/j.wocn.2022.101153

Highlights

  • Language-specific variation in consonant timing occurs for some, but not all, clusters.

  • All languages make use of a higher consonant-consonant overlap pattern; only some languages also show a lower overlap pattern.

  • Relative effects of manner, place, and voicing work in the same direction across languages.

Abstract

In this paper, we draw on available data from previous experiments to explore cross-linguistic variation in articulatory overlap in CC onset clusters, taking into account the role of cluster composition. Our sample includes articulography recordings of eleven clusters for seven languages. We find that cross-linguistic variability is conditional on cluster composition. Previous suggestions that languages may have individual global articulatory timing profiles for consonant clusters in terms of an overall relatively lower or higher degree of overlap are not confirmed for our sample. All included languages converge on a relatively higher degree of overlap for some of the clusters, whereas only some of the languages additionally extend into the lower overlap range, particularly for stop-sonorant sequences. Manner and voicing are further identified as factors conditioning variation in consonantal overlap. Overall, languages differ in their degree of overlap in multi-faceted ways, but the relative effects of cluster composition work in the same direction across languages.

Introduction

Consonant clusters are considered a source of complexity in speech production and perception, and they are known to be subject to synchronic and diachronic change in the form of assimilations, blending, or deletion (for an overview see e.g., Recasens (2018)). Their phonetic properties have therefore received a considerable amount of attention in recent years. A particular focus has been on articulatory timing, i.e., how the articulatory gestures of the cluster are coordinated with respect to each other in time. Some of these studies have pursued the question as to what extent syllable structure is expressed in articulatory timing relations (among others, Bombien, 2011, Hermes et al., 2013, Marin, 2013, Pastätter and Pouplier, 2017, Shaw et al., 2009), while others have focused on questions relating to phonetic motivations for diachronic change, or phonetic motivations for (un)common cluster phonotactics (Chitoran, 2016, Chitoran et al., 2002, Colantoni and Steele, 2011, Hoole et al., 2013, Pouplier et al., 2017, Wright, 1996). The relationship between segment- and language-specific effects thereby remains a particular challenge to our understanding of consonant cluster timing, and it is on this issue that the current paper seeks to make a contribution. Teasing apart language-specific from cluster composition effects is far from trivial, and there are comparatively few studies that directly compare two or more languages (exceptions are for instance Bombien and Hoole, 2013, Hoole and Bombien, 2017). Meta-comparisons between studies are often limited by differences in the selection of clusters studied, data types, and measurement heuristics, making it difficult to compare results other than relative differences between conditions which may, however, not be shared between studies.

The goal of the current paper is to advance our understanding of language- and cluster-specific effects in the articulation of consonant clusters by comparing the articulatory timing for CC onset clusters across seven languages. A number of studies over the past 15 years conducted at the Institute of Phonetics and Speech Processing in Munich have investigated articulatory timing in consonant clusters as a function of cluster composition. For our present study, we have assembled the data from as many of these past studies as possible and present a direct comparison of consonant-consonant timing for the same clusters. All datasets included in our paper were originally collected to study consonant timing in onset clusters, but the individual experiments were conducted independently of each other. Consequently they do not share the exact same details of experimental design (a detailed account is given in the Methods section). While this limits our comparative study to a certain extent, we ensured methodological consistency for data segmentation and analyses across all datasets. This allows us to explore language and segmental composition effects in cluster timing in a broader fashion than has to our knowledge been done before, providing a basis for future hypothesis testing in confirmatory studies.

Articulatory timing in tautosyllabic consonant sequences has received some attention in the context of the question of why some clusters are cross-linguistically more frequent than others. Especially for languages which have a large cluster inventory that also includes typologically unusual clusters such as obstruent-obstruent or sonorant-obstruent clusters, Wright (1996) conjectured that there is a principled connection between degrees of articulatory overlap between successive consonants and the existence of these typologically unusual clusters. Expanding on Wright's work, Henke, Kaisse, and Wright (2012) proposed that gestural timing was a key factor in understanding how languages may achieve some level of cue robustness for clusters with perceptually sub-optimal signal modulation properties (such as sonorant-sonorant or stop-stop sequences, which are also the ones that are typologically rare). Relatedly, Pouplier and Beňuš (2011) argued in the context of Slovak syllabic consonants that the range of consonant clusters permitted in a language may be related to the language-specific degree of overlap between consonants, with a greater range of cluster types being attested in languages in which consonants in a cluster are timed further apart, i.e., with greater consonant-consonant lags. Chitoran (2016), finally, proposed the sonority hierarchy to be fundamentally related to articulatory timing of consonant sequences, and the timing between gestures to be linked to typological cluster frequency. Similarly to Wright (1996), she predicts that sonority-violating sequences would predominantly occur in languages which show relatively low overlap in consonant sequences, such as Georgian or Tsou. This arguably implies that a low degree of overlap may be a kind of 'coarticulatory setting' for consonant sequences in a given language, permeating the entire phonotactic inventory, not just the sonority-violating sequences. Direct empirical tests of these ideas are rare. 
Somewhat relatedly, previous work on Russian by Pouplier et al. (2017) tested the hypothesis that clusters which in Henke et al.'s (2012) proposal are perceptually suboptimal should be comparatively immune to changes in overlap under speaking rate changes since their existence may be conditional on a certain, low-overlap timing pattern, as was originally proposed by Wright (1996). Pouplier et al. (2017) found no support for such a scenario: lexical frequency, but not auditory cue recoverability predicted differences in the flexibility of cluster pronunciation under speech rate variation.

The current dataset affords some further insights into a possible implicational relationship between the existence of typologically unusual, sonority-violating onset clusters in a language's phonotactic inventory and that language's general consonant timing as argued for by Wright (1996) or Chitoran (2016). Importantly, our corpus consists of the same clusters for all languages, a group of stop-sonorant clusters plus a group of sibilant-stop clusters. Our sample comprises languages with a broad phonotactic cluster inventory that includes sonority-violating clusters (Georgian, Polish, Russian), along with languages which have a relatively more restricted cluster inventory (English, German, French, Romanian). Our study will thus speak to the question of whether languages differ in the degree of articulatory overlap for the same clusters. The languages and clusters included are a convenience sample of data available to us, and their selection was not hypothesis-driven. Nonetheless, our results may serve as a useful exploratory basis for the possibility of a language specific, global lower or higher overlap setting. To foreshadow one of our results, we will see that all of our languages use a relatively higher degree of overlap at least for certain clusters, while only some but not all languages additionally make use of relatively low degrees of overlap. Cross-linguistic variation in onset cluster timing thus seems to occur predominantly for certain cluster types.

Previous studies on cluster timing have observed that cluster timing varies as a function of various factors related to cluster composition, such as place and manner of articulation, as well as voicing, which we will now consider in turn. Throughout the paper, we use C1 to refer to the initial and C2 to refer to the vowel-adjacent consonant of a C1C2V onset cluster.

In terms of place of articulation effects, Bombien et al. (2013) reported for German that C1 labials conditioned a higher degree of overlap than C1 velars; a parallel result was obtained by Gibson et al. (2019) for Spanish. Bombien et al. discussed their result in the context of perceptual recoverability aspects, in particular the so-called place order effect, first reported by Chitoran et al. (2002) for Georgian. In that study, Chitoran and colleagues found that Georgian stop-stop clusters differ in their degree of articulatory overlap depending on whether C1 has a more anterior or a more posterior constriction in the vocal tract compared to C2. For front-to-back (FB) clusters such as /bg/, more overlap was found compared to back-to-front (BF) clusters such as /gb/. Chitoran and colleagues interpreted this as evidence for perceptual recoverability conditioning cluster timing in stop-stop clusters: In a back-to-front cluster a high degree of overlap would result in hiding the acoustic-auditory cues to C1 due to the more anterior constriction of C2. Note though that a syllable-internal place order effect can only be investigated for languages like Georgian which allow for reversal of segment order within the same syllable position. For languages which do not permit this, overlap differences as a function of place order can only be tested for cross-boundary clusters or for clusters that also differ in manner or place of C1. For instance, Kühnert, Hoole, and Mooshammer (2006) report French /fl/ (front-to-back) to be more overlapped than /kl/ (back-to-front), but of course there is a confounding manner difference in C1 which may cause the observed overlap differences. It is thus unclear to what extent the place order effect generalizes, first, across languages and, second, beyond stop-stop clusters. In our study we can address place of articulation effects by comparing, for example, /pl, bl/ with /kl, gl/ clusters across all languages.
Following the work of Bombien and colleagues one might thereby conjecture that labial-initial clusters should show within each language the highest degree of overlap, since there are no conflicting demands on the articulators, and labials are known to be least resistant to coarticulation (Recasens, Pallarès, & Fontdevila, 1997). From such considerations, one might also predict that labial-initial clusters will vary the least across languages, since the relative coarticulatory independence of labial and lingual articulations is rooted in the biomechanics of the speech apparatus.

Previous studies have also reported a manner effect for C2. It has been observed that nasals in C2 position are much less overlapped with the preceding C1 compared to laterals. Bombien and colleagues (Bombien and Hoole, 2013, Bombien et al., 2013) found that /kn/ features a much lower degree of overlap compared to /kl/ in German and French. Hoole et al. (2013) interpreted this, again, as evidence for perception acting as a shaping force on cluster timing: Nasals and stops are in an aerodynamic conflict and velum lowering too close to the stop would obscure the velar burst, which is, however, crucial for its identification given its initial position. This aerodynamic conflict is therefore hypothesized to push the nasal articulation away from the stop. To our knowledge, there have been no further studies investigating to what extent these findings generalize beyond German and French. In our present paper we take the opportunity to look for corresponding effects in /kn, gn/ across our language sample. For one, if the relatively low overlap observed for stop-nasal clusters by Hoole and colleagues is due to aerodynamic conflict, a similar effect should hold across languages. Yet if it should be the case, as discussed above, that some languages generally have a relatively lower degree of overlap than others, this aerodynamic conflict might not apply to languages in which the consonants are timed relatively far apart to begin with. That is, an asymmetry between /kl, gl/ and /kn, gn/ may not be found for Georgian if consonant sequences in Georgian generally have a low degree of overlap. We can further ask whether this effect of aerodynamic conflict generalizes to sibilant+nasal clusters, which have a similar aerodynamic incompatibility in terms of intraoral pressure, by comparing clusters like /sm/ to clusters like /sp/ in which the consonants are aerodynamically compatible.
Sibilancy and velic opening are well known to be in conflict (Shosted, 2006, Solé, 2007, Solé and Ohala, 2010). For instance, Solé (2007) points out that diachronic loss of fricatives before nasals may be linked to anticipatory velum lowering during fricatives leading to defricativization. On the other hand, the details of articulatory timing that are under scrutiny here may still differ between stops and sibilants: Sibilants are highly perceptible throughout their duration in contrast to stops, which during occlusion signal no information about place of articulation. Sibilant identification, in contrast to stops, is not dependent on the perception of a short transient. Nasal vs. non-nasal manner of articulation may thus not have the same effect on timing when following a sibilant as opposed to a stop.

As to voicing, several studies on consonant clusters reported an interaction between the laryngeal specifications of the cluster members and the timing of the supralaryngeal gestures. In a direct comparison of French and German, Bombien and colleagues (Bombien and Hoole, 2013, Hoole et al., 2009), investigated the possibility that only aspirating languages might show sensitivity to voicing in consonant cluster timing. The stop voicing contrast in French is characteristic of a true voicing language with prevoiced vs. voiceless unaspirated, whereas in German, the contrast is one of voiceless unaspirated vs. aspirated stops. Bombien and Hoole (2013) found in their articulography (EMA) study that in German, but not in French, the overlap of the consonantal target plateaus varied as a function of C1 voicing in stop-lateral clusters. German clusters in which the stop was voiceless showed less overlap compared to clusters in which the stop was phonologically voiced. The authors hypothesized that this was due to the long aspiration phase of stops in German 'pushing' C2 away from C1. This gave rise to the expectation that French stop-lateral timing should overall be similar to voiced clusters in German, since there is no long aspiration phase in French. Yet contrary to the authors' initial expectations, the French clusters, no matter whether C1 was voiceless unaspirated or prevoiced, patterned with the German C1 voiceless aspirated case in terms of a relatively lower degree of overlap. The authors take this as an indication that the glottal gesture in German may be a property of the entire onset, meaning that the glottal gesture might be coordinated with all supralaryngeal consonantal gestures of the onset (assuming the syllable model of Articulatory Phonology in which all consonantal gestures in an onset are coordinated with respect to each other and the vowel in pairwise, local coupling relations, Browman and Goldstein (2000)). 
In contrast, the glottal opening gesture in French may be timed relative to C1 only (see also Hoole (2006), and Hoole and Bombien (2017) on oral-laryngeal coordination in onsets).
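To make the notion of plateau overlap discussed above concrete, the following is a minimal sketch of how a lag between two consonantal target plateaus could be computed from landmark times. It is an illustration under stated assumptions, not the measure used in this paper or in the studies cited: the names `Plateau`, `plateau_lag`, and `normalized_plateau_lag` are hypothetical, the normalization by the span from C1 plateau onset to C2 plateau offset is one possible choice among several, and the landmark times are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plateau:
    """Target plateau of one consonant gesture, in seconds (hypothetical landmarks)."""
    onset: float
    offset: float


def plateau_lag(c1: Plateau, c2: Plateau) -> float:
    """Lag from the end of the C1 plateau to the start of the C2 plateau.

    Negative values mean the two plateaus overlap in time; positive values
    mean there is a gap between them (i.e., lower overlap).
    """
    return c2.onset - c1.offset


def normalized_plateau_lag(c1: Plateau, c2: Plateau) -> float:
    """Plateau lag divided by the span from C1 plateau onset to C2 plateau offset.

    The normalization is an assumption made for this sketch, intended to
    reduce the influence of overall cluster duration.
    """
    span = c2.offset - c1.onset
    if span <= 0:
        raise ValueError("C2 plateau must end after the C1 plateau begins")
    return plateau_lag(c1, c2) / span


# Invented landmark times for illustration only (not measured data).
c1 = Plateau(onset=0.100, offset=0.160)
c2 = Plateau(onset=0.180, offset=0.250)
lag = plateau_lag(c1, c2)
norm = normalized_plateau_lag(c1, c2)
```

Under this sign convention, a cluster described in the text as having "less overlap" would show a larger (more positive) lag than a cluster described as having "more overlap".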

Lateral devoicing may serve as another source of information on the factors that influence cluster timing, in particular whether certain cues (VOT, lateral voicing) prevail over others, and whether laryngeal specifications in clusters are a property belonging to syllable constituents rather than individual segments. In the Bombien and Hoole (2013) study it was apparent that German and French differ in their degree of lateral devoicing, feeding their argument that the laryngeal gesture has different affiliations in the two languages. In their data, in German, the long VOT led to a (nearly) complete devoicing of the lateral. Bombien and Hoole (2013, p. 549) observed the French lateral to be significantly less devoiced compared to German and identified across their dataset 78 ± 29% more voicing during the lateral interval compared to their German data. Judging from their average durations (Fig. 6, p. 548), in the voiceless-stop clusters /kl, pl/ about 50% of the articulatory plateau of the lateral is devoiced in French. Note that this differs from the results obtained by Colantoni and Steele (2011) for the same clusters in Québec French, which did not show lateral devoicing; this difference supports Bombien and Hoole's argument that glottal timing in onset clusters is language- and/or dialect-specific.

Indeed, whether a language uses long- or short-lag VOT for [-voice] does not by itself necessarily predict variation in consonant timing as a function of stop phonation, as further underscored by a study of Gibson and colleagues (Gibson, Sotiropoulou, Tobin, & Gafos, 2019; see also Sotiropoulou, Gibson, & Gafos, 2020). Following up on Bombien and Hoole (2013), they conducted an articulography study of stop-lateral and stop-rhotic clusters in Peninsular Spanish, which is a true voicing language like French. Yet Spanish – maybe surprisingly – turned out to be similar to what Bombien and Hoole had reported for German rather than to what they had reported for French: Clusters with a voiceless stop featured less overlap than clusters with a voiced stop. The Gibson et al. results highlight that historically very close languages like Spanish and French with a superficially similar voicing contrast may still show language-specific interactions of consonant timing and voicing (see also Torreira and Ernestus (2011) on differences in stop lenition and vowel devoicing in spontaneous speech in French and Spanish). The languages of our current study allow us to extend our knowledge in this respect: We have data from four true voicing languages – Russian, French, Polish, Romanian – and three aspirating languages: English, German and Georgian (Table 1). Our grouping of Georgian with German and English is based on reports of prevoicing being weak to absent in Georgian (Butskhrikidze, 2002, Chitoran et al., 2002, Shosted and Chikovani, 2006). Our own previous work on the Georgian and German data included here also pointed to a basic phonetic similarity of Georgian and German with respect to phonologically voiced stops (Pouplier, Lentz, Chitoran, & Hoole, 2020).

Thus overall, the effect of VOT on cluster timing remains an open issue. The cited work underscores that two true voicing languages – Spanish and French – may behave differently, but we have as of yet no data that would allow us to judge whether the result of Bombien and Hoole (2013) for German – a modulation of consonant timing by phonation – generalizes to other aspirating languages, and whether systematic effects on consonant timing can be observed as a function of language voicing category at all.

We now briefly give information on syllable structure and stop phonation for each language, as well as information on the type of lateral, but we emphasize again that the languages and clusters which form part of our current study were sampled based on data availability grounds and were not chosen for hypothesis testing.

German (Wiese, 2000), English (Giegerich, 2005), French (Pustka, 2016), and Romanian (Chitoran, 2001) allow for up to three consonants to precede a vowel. Except for Romanian, sonority plateaus and reversals are phonotactically illegal with the exception of sibilant-obstruent clusters (e.g., German /ʃpɔʁt/, French /ski/). In word-initial onsets, Romanian allows for a few cases of sonority plateaus (/ktitor, mre̯anʌ, mlʌdijos/), but no sonority reversals other than sibilant-obstruent clusters. In German resyllabification to onset is often blocked by glottal stop insertion or glottalization in the onset (Moosmüller, 2015). Stop voicing in German and English is typical of aspirating languages with voiced stops being produced with either contextually conditioned passive voicing or as voiceless unaspirated (Beckman, Jessen, & Ringen, 2013). Voiceless stops have a long VOT. French and Romanian on the other hand are considered true voicing languages with prevoicing for phonologically voiced stops and short-lag VOT for phonologically voiceless stops. German, French, and Romanian have a clear /l/ in all positions, while the American English lateral can be categorized as dark (Recasens, 2012a).

Georgian (Butskhrikidze, 2002), Russian (Cubberley, 2002), and Polish (Rochon, 2000) are languages with almost free onset phonotactics. Georgian in particular has essentially free onset phonotactics for two-consonant clusters. All three languages allow for the obstruent-sonorant clusters also found in the other languages, as well as for sonorant-obstruent, sonorant-sonorant, and obstruent-obstruent clusters; these latter types of clusters are not included in our present study. Stop voicing in Georgian is characterized by a 3-way contrast: voiced, voiceless aspirated, and laryngealized. Voiceless stops are aspirated, while voiced stops, as noted above, have been described as weakly prevoiced or voiceless unaspirated (Butskhrikidze, 2002, Chitoran et al., 2002, Shosted and Chikovani, 2006). Laryngealized stops are not included in our dataset. Russian (Ringen & Kulikov, 2012) and Polish (Konopska & Sawicki, 2013) voiced stops are prevoiced; voiceless stops are unaspirated (short-lag VOT). For both Georgian and Russian, so-called open transitions in the sense of Catford (1985) have been reported (Chitoran et al., 2002, Davidson and Roon, 2008, Zsiga, 2000). An open transition arises between two consonantal constrictions if there is sufficient lag between the release of the first constriction and the constriction formation of the following consonant. Depending on aerodynamic conditions and context, the transition may be voiced or voiceless. The nature of these open transitions, their (lack of) control, and their function has been vigorously debated. We therefore also use the opportunity to present an analysis of the phonation of the interval between the two consonantal constrictions in the current paper for all of our languages. In terms of the lateral, Russian has two lateral phonemes, a dark, velarized /l/ and a clear, palatalized /lʲ/ (Yanushevskaya & Bunčić, 2015). The stimuli used in this study contained only the dark /l/. Polish /l/ is clear.
For Georgian, Robins and Waterson (1952) describe /l/ impressionistically based on a single informant as being clear before /e, i/, “dark” before /a/ and “very dark” (p. 63) before /u, o/. This intermediate quality of /l/ before /a/ also corresponds to our auditory impression: Our stimuli are in an /a/ context and the /l/ is darker than e.g., German /l/ in our data but not as dark as American English /l/.

In sum, the goal of our study is to probe our corpus for language- and cluster-specific effects in consonant timing in #C1C2 onset clusters. Previous independent studies on a variety of languages suggest that languages may differ in where along a continuum of relatively higher or lower degrees of overlap they fall, possibly as a function of onset phonotactics. Languages with particularly large onset cluster inventories have been hypothesized to be overall characterized by a lower degree of consonant overlap compared to languages with smaller cluster inventories. We further ask whether place and manner of articulation as well as phonation have comparable effects on cluster timing across languages. Based on our literature review, we might expect labial-initial clusters to allow for the greatest temporal extent of anticipatory coarticulation across languages; labial-initial clusters may thus be least variable across languages. Clusters with a nasal consonant in C2 position should generally show a relatively low degree of overlap compared to clusters with a non-nasal C2. Finally, we contribute to our understanding of whether consonant phonation systematically affects timing of the supralaryngeal constrictions in a fashion consistent with the classification of languages as 'aspirating' versus 'true voicing'.


Methods

We assembled a corpus of articulography data from existing datasets which were available at the Institute of Phonetics in Munich from past projects. Our goal was to construct a coherent dataset in which there would be a rough balance in terms of the number of consonant sequences available per language, and vice versa to have a certain range of languages available for each consonant sequence included. Our final corpus contained 11 different C1C2 onset clusters from 7 languages. Any language

Results

Fig. 4 gives an initial overview of the two articulatory lag measures by language and cluster in the form of 95% confidence ellipse plots. Each ellipse shows the position of all the datapoints of the corresponding language (a) or cluster (b), respectively. Languages differ in the size of the ellipses, and also somewhat in their orientation. Noticeably, Polish covers a small region compared to e.g., French and German, which are similarly sized datasets (see Methods). Georgian extends further out than

Discussion

The goal of our study was to compare consonant overlap in a range of onset clusters in a relatively large sample of languages, using existing data. Our overall research question thereby concerned the presence of systematic differences in consonant timing between languages, possibly in interaction with cluster composition effects. Place, manner, and voicing effects were targeted by our analyses. Based on previous work we expected some languages to display an overall higher, others a lower degree

Concluding remarks

We presented an exploratory cross-linguistic comparison of consonant cluster timing in seven languages, comparing the same clusters across the dataset. Our data revealed clear cluster composition effects in that sibilant-initial clusters were, across all languages, more overlapped than stop-initial clusters. Mixed voicing and nasality in clusters are overall associated with a 'repeller' effect on timing for stop-initial clusters. Aerodynamic conflict does not affect sibilant-initial clusters in

Acknowledgements

We thank Lasse Bombien for generously sharing the French data.

Ethics statement

The current paper draws on existing datasets. The original studies from which the data for this paper were assembled all obtained approval of the relevant ethics review boards.

Conflict of interest statement

The authors declare no conflict of interest.

References (71)

  • E. Zsiga

    Phonetic alignment constraints: Consonant overlap and palatalization in English and Russian

    Journal of Phonetics

    (2000)
  • D.M. Bates et al.

    Fitting linear mixed-effects models using lme4

    Journal of Statistical Software

    (2015)
  • J. Beckman et al.

    Empirical evidence for laryngeal features: Aspirating vs. true voice languages

    Journal of Linguistics

    (2013)
  • P.S. Beddor

    A coarticulatory path to sound change

    Language

    (2009)
  • Bombien, L. (2011). Segmental and prosodic aspects in the production of consonant clusters. PhD thesis,...
  • L. Bombien et al.

    Articulatory overlap as a function of voicing in French and German consonant clusters

    Journal of the Acoustical Society of America

    (2013)
  • C. Browman et al.

    Competing constraints on intergestural coordination and self-organization of phonological structures

    Bulletin de la Communication Parlée

    (2000)
  • Butskhrikidze, M. (2002). The Consonant Phonotactics of Georgian. Utrecht: LOT Dissertation...
  • C. Carignan et al.

    Planting the seed for sound change: Evidence from real-time MRI of velum kinematics in German

    Language

    (2021)
  • J.C. Catford

    'Rest' and 'open transition' in a systemic phonology of English

  • M. Charrad et al.

    NbClust: An R package for determining the relevant number of clusters in a data set

    Journal of Statistical Software

    (2014)
  • I. Chitoran

    The phonology of Romanian: A constraint-based approach

    (2001)
  • I. Chitoran

    Relating the sonority hierarchy to articulatory timing patterns. A cross-linguistic perspective

  • I. Chitoran et al.

    Gestural overlap and recoverability: Articulatory evidence from Georgian

  • L. Colantoni et al.

    Synchronic evidence of a diachronic change: Voicing and duration in French and Spanish stop-liquid clusters

    Canadian Journal of Linguistics

    (2011)
  • P. Cubberley

    Russian. A linguistic introduction

    (2002)
  • L. Davidson et al.

    Durational correlates for differentiating consonant sequences in Russian

    Journal of the International Phonetic Association

    (2008)
  • C. Fougeron et al.

    French

    Journal of the International Phonetic Association

    (1993)
  • A. Gafos

    A grammar of gestural coordination

    Natural Language & Linguistic Theory

    (2002)
  • M. Gibson et al.

    Temporal aspects of word initial single consonants and consonants in clusters in Spanish

    Phonetica

    (2019)
  • H.J. Giegerich

    English phonology

    (2005)
  • E. Henke et al.

    Is the sonority sequencing principle an epiphenomenon?

  • A. Hermes et al.

    Gestural coordination of Italian word initial clusters - the case of 'impure s'

    Phonology

    (2013)
  • Hoole, P. (2006). Experimental studies of laryngeal articulation - Part II: Laryngeal-oral coordination in consonant...