Phasegram Analysis of Vocal Fold Vibration Documented With Laryngeal High-speed Video Endoscopy
Introduction
The behavior of a vibratory system is periodic if the observed oscillatory pattern continuously repeats itself after a constant time interval. Periodicity abiding by this strict definition is hardly ever observed in empirical data of biomechanical systems such as the voice. Rather, voice production is at best a nearly periodic1 phenomenon under nonpathological conditions. In the presence of a voice disorder, vocal fold vibration and thus the generated acoustical output is likely to be more or less perturbed,2 often caused by highly irregular vibratory regimes of the vocal folds.3, 4, 5, 6
Deviations from periodicity can be quantified in a variety of ways. Apart from time domain-based and frequency domain-based approaches such as calculation of jitter7 or the harmonics-to-noise ratio,8 methods from nonlinear systems analysis have received growing interest in the past decades.9, 10, 11, 12 In nonlinear dynamics methods, the voice is considered to be a dynamical system13 that is able to exhibit a wide variety of oscillatory behavior “on the way to chaos.”14, 15
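As a concrete illustration of a time domain perturbation measure, the sketch below computes one common variant of jitter (often called local jitter): the mean absolute difference between consecutive glottal cycle durations, divided by the mean cycle duration. This is only one of several definitions in use and is not necessarily the one used in the cited work. The function name `local_jitter` and the example period values are hypothetical.

```python
def local_jitter(periods):
    """Mean absolute difference of consecutive cycle durations,
    normalized by the mean cycle duration (dimensionless)."""
    if len(periods) < 2:
        raise ValueError("need at least two cycle durations")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_abs_diff / mean_period


# Hypothetical cycle durations in milliseconds (assumed values, not measured data).
steady = [5.0, 5.0, 5.0, 5.0]        # perfectly periodic
alternating = [4.0, 6.0, 4.0, 6.0]   # strong cycle-to-cycle alternation

jitter_steady = local_jitter(steady)
jitter_alternating = local_jitter(alternating)
```

Note that a measure of this kind presupposes that individual cycle durations have already been extracted, which is difficult for strongly irregular signals.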
Several quantitative methods for assessing the complexity of the temporal behavior of nonlinear systems have been introduced in the domain of mathematics and physics, for instance the correlation dimension,16 Lyapunov exponents,17 or Tokuda et al's low-dimensional nonlinearity measure.18 These methods have been successfully applied to the analysis of biosignals from both healthy and pathological voices, such as the acoustical waveform,12, 19, 20, 21, 22 electroglottography,23, 24, 25 or data derived from high-speed video (HSV) recordings of vocal fold vibration.26, 27, 28, 29
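To give a rough idea of how a correlation dimension is estimated from a time series, the following sketch embeds a signal with time delays, computes the correlation sum C(r) (the fraction of point pairs closer than r), and takes the slope of log C(r) against log r between two radii. It is a minimal illustration under stated assumptions, not the procedure of any cited study: the function names, the embedding parameters (`dim`, `tau`), the radii, and the synthetic sine input are all chosen here for demonstration only.

```python
import math


def delay_embed(x, dim, tau):
    """Return delay vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n = len(x) - (dim - 1) * tau
    return [tuple(x[i + k * tau] for k in range(dim)) for i in range(n)]


def correlation_sum(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < r
    )
    return 2.0 * close / (n * (n - 1))


def dimension_estimate(points, r1, r2):
    """Two-point slope of log C(r) versus log r (a crude dimension estimate)."""
    c1 = correlation_sum(points, r1)
    c2 = correlation_sum(points, r2)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))


# Synthetic test signal (assumption): a sine sampled at 0.13 rad per sample.
# Its two-dimensional embedding is a closed curve, so the estimate should be
# close to 1.
signal = [math.sin(0.13 * i) for i in range(600)]
points = delay_embed(signal, dim=2, tau=12)
estimate = dimension_estimate(points, 0.1, 0.4)
```

In practice the slope is read from a scaling region spanning many radii and several embedding dimensions; the two-point slope used here only conveys the principle.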
The detailed interpretation of available quantitative methods for analyzing the dynamics of irregular voice often requires expert background knowledge in mathematics and physics. In contrast, visualization methods are typically easier for nonexperts to understand. Such visualization methods, applied to nonperiodic voice production, include, for example, spectrograms12, 30, 31 or local maxima displays.32
Recently, a novel visualization method of system dynamics has been introduced: the phasegram.33 In a phasegram, time is mapped onto the x-axis, and various vibratory regimes, such as periodic oscillation, subharmonics, or chaos, are identified within the generated graph by the number and the stability of horizontal lines. Phasegrams can be interpreted as bifurcation diagrams in time. They are particularly suited for nonstationary signals. The benefits of sliding window analysis are combined with the visualization potential of phase space embedding.34, 35 In contrast to other nonlinear analysis techniques (eg, bifurcation maps), phasegrams can be automatically constructed from a time domain signal alone; no additional system parameter needs to be known. In contrast to conventional voice perturbation measures (eg, jitter), no information about glottal cycle duration or fundamental frequency is required.
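The combination of sliding window analysis and phase space embedding can be sketched as follows. This is a deliberately simplified illustration, not the published phasegram algorithm: each window is embedded in two dimensions as (x[i], x[i+tau]), the trajectory is intersected with one fixed section line through the centroid, and the positions of the crossings form one column of the display. All function names, the parameters `window_len`, `hop`, `tau`, and `tol`, and the synthetic signals are assumptions made for this example.

```python
import math


def poincare_crossings(window, tau):
    """Embed the window as (a, b) = (x[i], x[i+tau]) and return the centered
    b-values where the trajectory crosses the line a = mean(a)."""
    n = len(window) - tau
    a = window[:n]
    b = window[tau:tau + n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    crossings = []
    for i in range(n - 1):
        a0 = a[i] - mean_a
        a1 = a[i + 1] - mean_a
        if (a0 < 0.0) != (a1 < 0.0):          # sign change between samples
            frac = -a0 / (a1 - a0)            # linear interpolation weight
            b_cross = b[i] + frac * (b[i + 1] - b[i])
            crossings.append(b_cross - mean_b)
    return crossings


def count_lines(values, tol):
    """Count clusters of crossing positions separated by gaps larger than tol."""
    if not values:
        return 0
    ordered = sorted(values)
    return 1 + sum(1 for lo, hi in zip(ordered, ordered[1:]) if hi - lo > tol)


def phasegram_columns(signal, window_len, hop, tau):
    """One (start_index, crossings) pair per sliding window; time runs
    along the list, crossing positions along the vertical axis."""
    return [
        (start, poincare_crossings(signal[start:start + window_len], tau))
        for start in range(0, len(signal) - window_len + 1, hop)
    ]


# Synthetic signals (assumptions): a sine, and a sine with an added component
# at half its frequency, mimicking a period-doubled (subharmonic) regime.
periodic = [math.sin(0.13 * i) for i in range(900)]
subharmonic = [math.sin(0.13 * i) + 0.5 * math.sin(0.065 * i) for i in range(900)]

periodic_cols = phasegram_columns(periodic, window_len=300, hop=150, tau=12)
subharmonic_cols = phasegram_columns(subharmonic, window_len=300, hop=150, tau=12)
```

Plotting the crossing positions of each column against its start index would yield a small number of stable horizontal lines for the periodic signal and additional lines for the subharmonic one; an irregular signal would instead scatter the crossings without forming continuous lines.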
Phasegrams have thus far been utilized for the visualization33 and the manual classification36 of electroglottographic voice signals. Here, their application to the analysis of time series data derived from HSV recordings is introduced using the example of simulated vocal fold vibrations generated with a lumped element biomechanical model. The concept is further extended to healthy and pathological phonations, considering both stationary and nonstationary signals. The analysis is complemented by spatiotemporal visualization37, 38 and by Fourier analysis of vocal fold vibration and of simultaneously acquired acoustical signals. It will be shown that sequences of aberrant vocal fold vibratory behavior can be easily located in phasegrams, thus earmarking the method as a promising candidate for detection of clinically relevant passages within HSV recordings. To facilitate automated objective analysis of vocal fold vibratory behavior (as seen in HSV recordings), two novel quantitative analysis parameters derived from the phasegram visualization are introduced in this paper. The performance of these quantitative parameters is assessed through analysis of a database containing HSV recordings of healthy and pathological phonations.
Section snippets
Participants and phonatory tasks
A total of 73 female participants were included in the study. Before data acquisition, all participants underwent a standard clinical evaluation. Forty-two of these were considered to be normophonic (ie, healthy) speakers. Another 15 participants were diagnosed with functional dysphonia, and the remaining 16 were diagnosed with unilateral vocal fold paralysis. The average (±standard deviation) age of these clinical groups was 40.2 ± 15.8 years (healthy), 46.2 ± 16.1 years (functional
Qualitative analysis—typical examples
Three stereotypical vocal fold vibratory regimes (periodic, subharmonic, and irregular), generated through attempts at stable phonation by a normophonic female and two females with vocal fold paralysis, respectively, are illustrated in Figure 3.
Discussion
In a recent publication, the phasegram has been introduced as an intuitive visualization tool for various oscillatory phenomena in physics and in biology, demonstrated with analysis of the human voice.33 Here, a more specialized investigation is performed, showing that phasegrams are useful in analyzing signals derived from HSV recordings documenting healthy and pathological phonations. The feasibility of the approach was demonstrated by creating a phasegram of the GAW from a synthesized vocal
Conclusion
In this work, the phasegram visualization method has been extended to the analysis of GAW data derived from HSV recordings of both normophonic and pathological voice production. Qualitative analysis showed that the phasegram is a valuable complement to existing analysis methods, as it provides direct insights into the time-dependent complexity of vocal fold vibration. Because of the phasegram's potential to condense information about the vocal fold dynamics of an entire phonation into a single
Acknowledgments
This research was supported by the institutional fund of Palacký University Olomouc, Czech Republic (to C.T.H.), by the Technology Agency of the Czech Republic project no. TA04010877 (to C.T.H. and J.G.Š.), by the state budget of the Czech Republic OPVK CZ.1.07/2.3.00/20.0057 (to J.G.Š.), and by grant no. LO1413/2-2 from Deutsche Forschungsgemeinschaft (to J.U. and J.L.).
References (56)
- et al. A method for analyzing vocal jitter in sustained phonation. J Phon (1973)
- et al. Chaos in voice, from modeling to measurement. J Voice (2006)
- et al. Calls out of chaos: the adaptive significance of nonlinear phenomena in mammalian vocal production. Anim Behav (2002)
- et al. Measuring the strangeness of strange attractors. Physica D (1983)
- et al. Acoustic analyses of sustained and running voices from patients with laryngeal pathologies. J Voice (2008)
- et al. Perturbation and nonlinear dynamic analyses of voices from patients with unilateral laryngeal paralysis. J Voice (2005)
- Irregularity of vocal period and amplitude: a first approach to the fractal analysis of voice. J Voice (1990)
- et al. Observation of a strange attractor. Physica D (1983)
- et al. Clinically evaluated procedure for the reconstruction of vocal fold vibrations from endoscopic digital high-speed videos. Med Image Anal (2007)
- et al. Automatic glottal segmentation using local-based active contours and application to glottovibrography. Speech Commun (2012)
- Workshop on acoustic voice analysis
- Clinical Measurement of Speech and Voice
- Spatiotemporal analysis of high-speed videolaryngoscopic imaging of organic pathologies in males. J Speech Lang Hear Res
- Voice production mechanisms following phonosurgical treatment of early glottic cancer. Ann Otol Rhinol Laryngol
- Videokymography in voice disorders: what to look for? Ann Otol Rhinol Laryngol
- Interpretation of biomechanical simulations of normal and chaotic vocal fold oscillations with empirical eigenfunctions. J Acoust Soc Am
- Harmonics-to-noise ratio as an index of the degree of hoarseness. J Acoust Soc Am
- Methods of chaos physics and their application to acoustics. J Acoust Soc Am
- Bifurcations and chaos in voice signals. Appl Mech Rev
- Evidence of chaos in vocal fold vibration
- Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering
- Subharmonic routes to chaos observed in acoustics. Phys Rev Lett
- Liapunov exponents from time series. Phys Rev A
- Nonlinear analysis of irregular animal vocalizations. J Acoust Soc Am
- Surrogate analysis for detecting nonlinear dynamics in normal vowels. J Acoust Soc Am
- Detecting bifurcations in voice signals
- Correlation dimension of electroglottographic data from healthy and pathologic subjects. J Acoust Soc Am
- Global and local dimensions of vocal dynamics. J Acoust Soc Am