Abstract
Since the seminal work of Bernstein (The coordination and regulation of movements. Pergamon Press, Oxford, 1967), several authors have supported the idea that, to produce a goal-oriented movement in general, and a movement of the organs responsible for the production of speech sounds in particular, individuals activate a set of coupling relations that coordinate the behavior of the elements of the motor system involved in producing the target movement or sound. To characterize the configurations of the coupling relations underlying the movements of the speech articulators, we introduce an original method based on recurrence analysis. The method is validated on simulated dynamical systems adapted to reproduce the features of speech gesture kinematics, and it is applied to speech articulator movements recorded from five German speakers during the production of labial and coronal plosive and fricative consonants at variable speech rates. We show that the underlying coupling relations change systematically between labial and coronal consonants but are not affected by speech rate, despite qualitative changes observed in the trajectory of the jaw at fast speech rate.
Notes
This step was not present in the original version of the algorithm as described in Lancia et al. (2016).
References
Abbs JH, Gracco VL (1984) Control of complex motor gestures: orofacial muscle responses to load perturbations of lip during speech. J Neurophysiol 51(4):705–723
Balasubramaniam R (2013) On the control of unstable objects: the dynamics of human stick balancing. In: Richardson MJ, Riley MA, Shockley K (eds) Progress in motor control. Springer, New York, pp 149–168
Bernstein N (1967) The coordination and regulation of movements. Pergamon Press, Oxford
Browman CP, Goldstein L (1989) Articulatory gestures as phonological units. Phonology 6(02):201–251
Fitch H, Tuller B, Turvey MT (1982) The Bernstein perspective III. Tuning of coordinative structures with special reference to perception. In: Kelso JAS (ed) Understanding human motor control. Human Kinetics, Champaign, pp 271–287
Folkins JW, Abbs JH (1975) Lip and jaw motor control during speech: responses to resistive loading of the jaw. J Speech Lang Hear Res 18(1):207–220
Frankel J, King S (2001) ASR–Articulatory speech recognition. Proc Eurospeech 1:599–602
Fuchs S, Perrier P, Hartinger M (2011) A critical evaluation of gestural stiffness estimations in speech production based on a linear second-order model. J Speech Lang Hear Res 54(4):1067–1076
Geumann A, Kroos C, Tillmann HG (1999) Are there compensatory effects in natural speech? In: Proceedings of the 14th international congress of phonetic sciences, San Francisco, pp 399–402
Gracco VL, Abbs JH (1985) Dynamic control of the perioral system during speech: kinematic analyses of autogenic and nonautogenic sensorimotor processes. J Neurophysiol 54(2):418–432
Grimme B, Fuchs S, Perrier P, Schöner G (2011) Limb versus speech motor control: a conceptual review. Mot Control 15(1):5–33
Guenther FH (1994) A neural network model of speech acquisition and motor equivalent speech production. Biol Cybern 72(1):43–53
Hoole P (1996) Issues in the acquisition, processing, reduction and parameterization of articulographic data. Forschungsberichte des Instituts für Phonetik und Sprachliche Kommunikation der Universität München 34:158–173
Ishwaran H, Rao JS (2005) Spike and slab variable selection: frequentist and Bayesian strategies. Ann Stat 33:730–773
Iskarous K, Mooshammer C, Hoole P, Recasens D, Shadle CH, Saltzman E, Whalen DH (2013) The coarticulation/invariance scale: mutual information as a measure of coarticulation resistance, motor synergy, and articulatory invariance. J Acoust Soc Am 134(2):1271–1282
Ito T, Gomi H, Honda M (2004) Dynamical simulation of speech cooperative articulation by muscle linkages. Biol Cybern 91(5):275–282
Iwanski JS, Bradley E (1998) Recurrence plots of experimental data: to embed or not to embed? Chaos Interdiscip J Nonlinear Sci 8(4):861–871
Jackson PJ, Singampalli VD (2009) Statistical identification of articulation constraints in the production of speech. Speech Commun 51(8):695–710
Keating PA, Lindblom B, Lubker J, Kreiman J (1994) Variability in jaw height for segments in English and Swedish VCVs. J Phon 22(4):407–422
Kelso JS, Tuller B, Vatikiotis-Bateson E, Fowler CA (1984) Functionally specific articulatory cooperation following jaw perturbations during speech: evidence for coordinative structures. J Exp Psychol Hum Percept Perform 10(6):812
Kennel MB, Brown R, Abarbanel HD (1992) Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys Rev A 45(6):3403
Kinsella-Shaw JM, Harrison SJ, Turvey MT (2011) Interleg coordination in quiet standing: influence of age and visual environment on noise and stability. J Mot Behav 43(4):285–294
Koenig L, Lucero J, Löfqvist A, Palethorpe S, Tabain M (2003) Studying articulatory variability using functional data analysis. In: Proceedings of the 15th international congress of phonetic sciences, pp 269–272
Krivobokova T, Kneib T, Claeskens G (2012) Simultaneous confidence bands for penalized spline estimators. J Am Stat Assoc 105(490):852–863
Lancia L, Fuchs S (2011) The labial coronal effect revisited. In: Laprie Y (ed) Proceedings of the 8th international seminar on speech production, Montreal, Canada, pp 187–194
Lancia L, Fuchs S, Tiede M (2014) Cross-recurrence analysis in speech production: an overview and a comparison to other nonlinear methods. J Speech Lang Hear Res 57(3):718–733
Lancia L, Voigt D, Krasovitskiy G (2016) Characterization of laryngealization as irregular vocal fold vibration and interaction with prosodic prominence. J Phon 54:80–97
Latash ML, Scholz JP, Schöner G (2007) Toward a new theory of motor synergies. Mot Control 11(3):276–308
Lucero JC (2005) Comparison of measures of variability of speech movement trajectories using synthetic records. J Speech Lang Hear Res 48(2):336–344
Marwan N, Romano MC, Thiel M, Kurths J (2007) Recurrence plots for the analysis of complex systems. Phys Rep 438(5):237–329
Marwan N, Kurths J (2002) Nonlinear analysis of bivariate data with cross recurrence plots. Phys Lett A 302(5):299–307
McFarland DH, Baum SR (1995) Incomplete compensation to articulatory perturbation. J Acoust Soc Am 97(3):1865–1873
McFarland DH, Baum SR, Chabot C (1996) Speech compensation to structural modifications of the oral cavity. J Acoust Soc Am 100(2):1093–1104
Mooshammer C, Hoole P, Geumann A (2007) Jaw and order. Lang Speech 50(2):145–176
Morris JS, Carroll RJ (2006) Wavelet-based functional mixed models. J R Stat Soc Ser B 68(2):179–199
Olsen MA, Hartung D, Busch C, Larsen R (2011) Convolution approach for feature detection in topological skeletons obtained from vascular patterns. In: 2011 IEEE workshop on computational intelligence in biometrics and identity management (CIBIM), pp 163–167
Papcun G, Hochberg J, Thomas T, Laroche F, Zacks J, Levy S (1992) Inferring articulation and recognizing gestures from acoustics with a neural network trained on x-ray microbeam data. J Acoust Soc Am 92(2):688–700
Perkell JS (2012) Movement goals and feedback and feedforward control mechanisms in speech production. J Neurolinguistics 25(5):382–407
Ramsay JO (2006) Functional data analysis. Wiley, New York
Rochet-Capellan A, Schwartz JL (2007) An articulatory basis for the labial-to-coronal effect: /pata/ seems a more stable articulatory pattern than /tapa/. J Acoust Soc Am 121(6):3740–3754
Romano MC, Thiel M, Kurths J, Grebogi C (2007) Estimation of the direction of the coupling by conditional probabilities of recurrence. Phys Rev E 76(3):036211
Romano MC, Thiel M, Kurths J, Mergenthaler K, Engbert R (2009) Hypothesis test for synchronization: twin surrogates revisited. Chaos Interdiscip J Nonlinear Sci 19(1):015108
Rulkov NF, Sushchik MM, Tsimring LS, Abarbanel HD (1995) Generalized synchronization of chaos in directionally coupled chaotic systems. Phys Rev E 51(2):980–994
Saltzman E, Kelso JA (1987) Skilled actions: a task-dynamic approach. Psychol Rev 94(1):84
Saltzman EL, Munhall KG (1989) A dynamical approach to gestural patterning in speech production. Ecol Psychol 1(4):333–382
Schöner G, Martin V, Reimann H, Scholz JP (2008) Motor equivalence and the uncontrolled manifold. In: Proceedings of the international seminar on speech production (ISSP 2008), Strasbourg, France, pp 23–28
Sugihara G, May R, Ye H, Hsieh CH, Deyle E, Fogarty M, Munch S (2012) Detecting causality in complex ecosystems. Science 338(6106):496–500
Thiel M, Romano MC, Read PL, Kurths J (2004) Estimation of dynamical invariants without embedding by recurrence plots. Chaos Interdiscip J Nonlinear Sci 14(2):234–243
Tourville JA, Guenther FH (2011) The DIVA model: a neural theory of speech acquisition and production. Lang Cogn Process 26(7):952–981
Turvey MT (1977) Preliminaries to a theory of action with reference to vision. In: Shaw RE, Bransford J (eds) Perceiving, acting and knowing. Lawrence Erlbaum Associates, pp 211–265
Weirich M, Lancia L, Brunner J (2013) Inter-speaker articulatory variability during vowel-consonant-vowel sequences in twins and unrelated speakers. J Acoust Soc Am 134(5):3766–3780
Zou Y, Romano MC, Thiel M, Marwan N, Kurths J (2011) Inferring indirect coupling by means of recurrences. Int J Bifurcat Chaos 21(04):1099–1111
Acknowledgements
Leonardo Lancia’s work, carried out within the Labex BLRI (ANR-11-LABX-0036) and EFL (ANR-10-LABX-0083), has benefited from support from the French government, managed by the French National Agency for Research (ANR), under the program “Investissements d’Avenir”.
Appendix: Simulations details
We validate methods for detecting coupling in simulation experiments where we can control the direction and the level of coupling, as well as amplitude or temporal variability. Throughout this study, we are only considering unidirectional coupling. We are interested in the generalized synchronization of two dynamical systems X and Y, i.e., we do not use coupling scenarios like \(\dot{x}=F(x)+\mu (y-x)\), which would easily lead to complete synchronization in our setups.
1.1 Uninterrupted simulations of structurally different systems
To obtain results comparable to those of previous works based on joint recurrence analysis (e.g. Romano et al., 2007), we investigate a Van der Pol oscillator X and a Rössler system Y. In a first scenario, we let X drive Y. The systems are described by the equations:
and
Here, \(\xi\) is Gaussian white noise with a standard deviation of 1, which is added to the driving system. The driven system features a coupling term. The amount of noise is given by \(\nu\) and the coupling strength by \(\mu\). \(\sigma\) is the standard deviation of the respective components, evaluated for a typical realization of the trajectories without noise or coupling (\(\nu=0\), \(\mu=0\)) and used for normalization. \(\omega\) is a frequency parameter; for \(\omega=1\) both systems have approximately the same frequency, depending on the levels of noise and coupling. Since values around 1.0 (e.g., \(\omega\in[0.7,1.3]\)) quickly produce phase synchronization even for low levels of coupling, we set \(\omega=0.5\) in the setup of long trajectories, so that X has half of Y's frequency. X then completes half a cycle on average in the temporal interval \([0,2\pi]\) while Y completes one. In this way, the influence of coupling is rather local and leads to generalized synchronization. The systems are numerically integrated using the forward Euler method with a constant step size of \(dt=0.01\) in the interval \([0,200]\). In this interval, X completes around 16 cycles and Y around twice as many. The length of the interval is chosen so that the number of oscillations of the slower X trajectory roughly matches the number of gestures found in the real data investigated in this study (Fig. 11).
A “warm-up” interval of the same length is simulated first and then discarded, which guarantees that the systems are not in a transient state. The remaining 20,000 time steps are downsampled to \(N=2000\) points for further analysis.
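The integration scheme described above (forward Euler, a discarded warm-up interval of equal length, downsampling to 2000 points) can be illustrated with a minimal sketch. It assumes the textbook forms of the Van der Pol and Rössler systems; the parameter values `eps`, `a`, `b`, `c`, the additive coupling of \(x_1\) into the first Rössler component, the `sqrt(dt)` scaling of the noise, the placeholder `sigma_x` and the function name `simulate` are all hypothetical choices made for illustration and are not taken from the study.

```python
import math
import random


def simulate(mu=0.1, nu=0.0, omega=0.5, dt=0.01, t_end=200.0, keep=2000,
             sigma_x=1.0, seed=0):
    """Forward-Euler sketch of a Van der Pol driver X and a Rössler system Y.

    All equation forms and parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))       # 20,000 steps in [0, 200]
    eps, a, b, c = 1.0, 0.15, 0.2, 10.0    # assumed system parameters
    x1, x2 = 2.0, 0.0                      # Van der Pol state (driver)
    y1, y2, y3 = 1.0, 1.0, 0.0             # Rössler state (driven)
    xs, ys = [], []
    for step in range(2 * n_steps):        # warm-up + analysed interval
        xi = rng.gauss(0.0, 1.0)           # white noise, standard deviation 1
        # Derivatives are evaluated at the current state (forward Euler).
        dx1 = omega * x2
        dx2 = omega * (eps * (1.0 - x1 * x1) * x2 - x1)
        dy1 = -y2 - y3 + mu * x1 / sigma_x   # assumed coupling term
        dy2 = y1 + a * y2
        dy3 = b + y3 * (y1 - c)
        x1 += dt * dx1
        x2 += dt * dx2 + nu * sigma_x * math.sqrt(dt) * xi  # assumed noise scaling
        y1, y2, y3 = y1 + dt * dy1, y2 + dt * dy2, y3 + dt * dy3
        if step >= n_steps:                # discard the warm-up interval
            xs.append(x1)
            ys.append(y1)
    stride = n_steps // keep               # 20,000 -> 2,000 points
    return xs[::stride], ys[::stride]


xs, ys = simulate(mu=0.1)
```

With the default arguments the call returns two lists of 2000 samples each, one for the first component of the driver and one for the first component of the driven system.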
To investigate whether the methods depend on the specific structure of the driver and driven systems, we also consider the case of X driven by Y. The equations read
and
As above, a stochastic component \(\xi \) is added to the driver and a coupling term is added to the driven system.
1.2 Multiple-runs simulations
To simulate the conditions found in speech data, we also evaluate the equations repeatedly over short time intervals. For a fixed number of repetitions (here: 30), we generate short trajectories of approximate length \(2\pi\) under different initial conditions (cf. Fig. 12). For this setup, we can choose \(\omega=1\) (same frequency), because the systems are not run long enough to become phase locked and we still obtain generalized synchronization. The initial conditions are drawn randomly from a set of typical realizations of the long trajectories, subject to the constraint that the difference in phase across repetitions and between the two systems is smaller than \(\pi/2\), so that a certain temporal alignment is guaranteed. The systems are numerically integrated in the interval \([0,6\pi]\) (using \(dt=0.01\) as above), where they complete approximately three cycles. The portion of time corresponding to the second cycle is then extracted (the beginning of the cycle is set at the time point where \(x_{1}\) is closest to 0), and each of the 30 trajectories obtained is downsampled to 80 points (Fig. 12).
1.3 Sources of variability in the simulations
We add synthetic dynamical noise and temporal variability to the systems (Lancia et al. 2014). We distinguish between internal and observational variability. Internal noise is applied to the driver before coupling, so that it also affects the dynamics of the driven system. Observational variability is applied to both systems afterwards and therefore does not interfere with the coupling relation, but makes it even harder to identify.
1.4 Internal dynamical noise
Internal dynamical noise is introduced to the driver by the additive white noise term \(\xi\). The effect of the noise term is modulated by the parameter \(\nu\). Because of the normalization, for \(\nu=0.3\), for example, the additive noise has a standard deviation equal to 30% of the standard deviation \(\sigma\) of the respective original signal. Through the coupling term, this internal noise is propagated to the driven system.
1.5 Internal temporal variability
First we describe the general case of a continuous time-deformed signal, following Lucero (2005). The methods for the discretized signals are introduced subsequently. The signal \(\tilde{x}(t)\), \(t\in[0,T]\), is obtained from the original signal \(x(t)\) via a mapping function \(H(t)\) introducing the required amount of temporal variability:
\[\tilde{x}(t) = x\left(H(t)\right).\]
First, consider a mapping function which is the identity \(H(t)=t\) or, equivalently, a signal \(x(H(t))\) moving at constant speed \(dH(t)/dt=1\). Now, let the signal speed be perturbed by temporal noise
\[\frac{dH(t)}{dt} = 1 + \gamma\,\phi(t),\]
where \(\phi(t)\) is a Gaussian process with zero mean, a standard deviation of 1 and an autocorrelation function \(\exp\left(-(\theta t)^{2}\right)\). The correlation length parameter \(\theta\) is chosen so that \(\phi\) features approximately five extrema per cycle. \(\gamma\) controls the amount of temporal variation. \(\gamma\phi(t)>-1\) is required for a monotonic mapping \(H(t)\). Otherwise, locally negative slopes occur, which means that the new trajectory contains time-reversed portions of the original trajectory.
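One way to obtain a process with this autocorrelation is to smooth white noise with a Gaussian kernel. This is a sketch of a standard construction, not necessarily the procedure used in the study. A kernel of width \(s=1/(2\theta)\) yields an autocorrelation proportional to \(\exp(-(\theta\tau)^2)\), and normalizing the kernel weights to unit sum of squares gives unit variance. The function names, the truncation of the kernel at \(4s\) and the example values of `theta` and `dt` are assumptions.

```python
import math
import random


def gaussian_kernel(theta, dt):
    """Kernel weights whose self-correlation is exp(-(theta * tau)**2)."""
    s = 1.0 / (2.0 * theta)                    # kernel standard deviation
    half = int(math.ceil(4.0 * s / dt))        # truncate at four widths (assumption)
    w = [math.exp(-((k * dt) ** 2) / (2.0 * s * s))
         for k in range(-half, half + 1)]
    norm = math.sqrt(sum(v * v for v in w))    # unit sum of squares -> unit variance
    return [v / norm for v in w]


def smooth_gaussian_noise(n, theta, dt, rng):
    """Return n samples of zero-mean, unit-variance smooth Gaussian noise."""
    w = gaussian_kernel(theta, dt)
    white = [rng.gauss(0.0, 1.0) for _ in range(n + len(w) - 1)]
    # Weighted moving sum over full windows only, so every sample has the same variance.
    return [sum(wk * white[i + k] for k, wk in enumerate(w)) for i in range(n)]


phi = smooth_gaussian_noise(200, theta=1.0, dt=0.1, rng=random.Random(1))
gamma = 0.3
# Condition for a monotonic mapping: gamma * phi(t) must stay above -1.
is_monotonic = all(gamma * p > -1.0 for p in phi)
```

Because the process is unbounded, the monotonicity condition can fail for a given realization; in that case a new realization can be drawn or \(\gamma\) reduced.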
The mapping function \(H(t)\) is obtained by integration:
\[H(t) = \int_{0}^{t}\left(1+\gamma\,\phi(s)\right)ds = t + \gamma\int_{0}^{t}\phi(s)\,ds.\]
For a discrete time series \(x(n)\) with dimensionless time scaling (\(n=0,1,\ldots,N\)), temporal variability is introduced as follows. A discrete Gaussian process \(\phi(n)\), satisfying \(\phi(0)=0\), is generated and the mapping function
\[H(n) = n + \gamma\sum_{k=0}^{n}\phi(k)\]
is computed, satisfying \(H(0)=0\). In a second step, the mapping is normalized by
\[\tilde{H}(n) = N\,\frac{H(n)}{H(N)},\]
which leads to \(\tilde{H}(N)=N\). Since the mapping is generally not integer-valued (\(\tilde{H}(n)\notin\mathbb{N}\)), an interpolation of \(x(0),x(1),\ldots,x(N)\) has to be used to compute the values of the time-deformed signal
\[\tilde{x}(n) = x\left(\tilde{H}(n)\right)\]
between the original trajectory's discretization points \(0,1,\ldots,N\). For the multiple-runs simulations, the temporal deformation is applied to each run separately. The effect of temporal deformation is illustrated in Fig. 13.
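The discrete procedure can be sketched as follows. The sketch assumes that the mapping accumulates the perturbed speed as \(H(n)=n+\gamma\sum_{k\le n}\phi(k)\) and that linear interpolation is used; the text specifies neither the interpolation scheme nor these function names, so both are illustrative choices. A sinusoidal \(\phi\) stands in for the Gaussian process so that the example is deterministic.

```python
import math


def warp_mapping(phi, gamma):
    """Normalized mapping H~(n), n = 0..N, from noise samples phi with phi[0] == 0."""
    if any(1.0 + gamma * p <= 0.0 for p in phi):
        raise ValueError("gamma * phi(n) must stay above -1 for a monotonic mapping")
    n_max = len(phi) - 1
    h, cum = [], 0.0
    for n, p in enumerate(phi):
        cum += p
        h.append(n + gamma * cum)              # H(n) = n + gamma * sum_{k<=n} phi(k)
    return [n_max * v / h[-1] for v in h]      # rescale so that H~(N) = N


def deform(x, h_tilde):
    """Evaluate x at the non-integer positions h_tilde by linear interpolation."""
    last = len(x) - 1
    out = []
    for t in h_tilde:
        i = min(int(math.floor(t)), last - 1)  # left neighbour, clamped at the end
        frac = t - i
        out.append((1.0 - frac) * x[i] + frac * x[i + 1])
    return out


N = 100
# Deterministic stand-in for the Gaussian process (assumption), with phi(0) = 0.
phi = [math.sin(2.0 * math.pi * 5.0 * n / N) for n in range(N + 1)]
x = [math.sin(2.0 * math.pi * n / N) for n in range(N + 1)]
h_tilde = warp_mapping(phi, gamma=0.3)
x_tilde = deform(x, h_tilde)
```

The deformed signal has the same number of samples and the same end points as the original; only the timing of the intermediate samples changes. With `gamma=0` the mapping reduces to the identity and `deform` returns the original signal.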
To introduce internal temporal variability, we do not let \(Y\left( \cdot \right) \) be driven by \(X\left( \cdot \right) \), but by a temporally deformed signal \(X\left( {H\left( \cdot \right) } \right) \), where the amount of variability is given by \(\gamma \). The analysis of coupling relations is then performed on \(X\left( {H\left( \cdot \right) } \right) \) and \(Y\left( \cdot \right) \).
1.6 Observational temporal variability
We let X drive Y without any temporal deformation. Afterwards, both signals are deformed by different mappings, \(H\left( \cdot \right) \) and \(G\left( \cdot \right) \). The amount of temporal variability is given by \(\gamma _x \) and \(\gamma _y \), respectively. In this way we can also allow different amounts of noise in \(X\left( {H\left( \cdot \right) } \right) \) and \(Y\left( {G\left( \cdot \right) } \right) \). Then we compare the time-deformed signals \(X\left( {H\left( \cdot \right) } \right) \) to \(Y\left( {G\left( \cdot \right) } \right) \) for detecting the coupling relation.
1.7 Simulations parameters
Purely deterministic simulations: (\(\nu=0\), \(\gamma=0\), \(\gamma_X=0\), \(\gamma_Y=0\)); simulations with intrinsic noise and temporal variability: (\(\nu=0.3\), \(\gamma=0.3\), \(\gamma_X=0\), \(\gamma_Y=0\)); uninterrupted simulations with intrinsic noise and variability and observational temporal variability: (\(\nu=0.3\), \(\gamma=0.3\), \(\gamma_X=0.3\), \(\gamma_Y=0.3\)).
Cite this article
Lancia, L., Rosenbaum, B. Coupling relations underlying the production of speech articulator movements and their invariance to speech rate. Biol Cybern 112, 253–276 (2018). https://doi.org/10.1007/s00422-018-0749-y