Abstract
This paper proposes a speech synthesis framework, based on a physiological articulatory model, that reproduces the human speech production process. The framework starts from given articulatory targets. Muscle activation patterns are then estimated from the targets by accounting for both the equilibrium characteristics and the muscle dynamics. Contracting the corresponding muscles drives the articulatory model to generate a time-varying vocal tract shape that corresponds to the targets. A transmission line model is then applied to the time-varying vocal tract to produce the speech sound. Finally, a preliminary experiment synthesizes the single vowels and diphthongs of Chinese with the resulting synthesizer. The results show that the spectra of the synthetic single vowels are consistent with those of real speech, and that proper acoustic characteristics are obtained for diphthongs in most cases.
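The acoustic stage can be illustrated with a much simpler stand-in than the synthesizer described above. The sketch below is not the authors' implementation: it assumes a static (not time-varying), lossless vocal tract split into equal-length tube sections, an ideal open end at the lips, and a frequency-domain chain-matrix formulation of the transmission line. The function names `chain_d` and `formants`, the constants, and the area values are all hypothetical choices made for illustration.

```python
import math

C = 350.0    # assumed speed of sound in the vocal tract, m/s
RHO = 1.14   # assumed air density, kg/m^3


def chain_d(areas_m2, section_len_m, freq_hz):
    """Return the D element of the glottis-to-lips chain matrix.

    Each lossless section has the matrix [[cos, j*Z*sin], [j*sin/Z, cos]].
    A product of such matrices keeps the form [[a, j*b], [j*c, d]] with
    real a, b, c, d, so only real arithmetic is needed.
    """
    k = 2.0 * math.pi * freq_hz / C
    cs = math.cos(k * section_len_m)
    sn = math.sin(k * section_len_m)
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for area in areas_m2:  # ordered from glottis to lips
        z = RHO * C / area  # characteristic impedance of the section
        a, b, c, d = (a * cs - b * sn / z,
                      a * z * sn + b * cs,
                      c * cs + d * sn / z,
                      d * cs - c * z * sn)
    return d


def formants(areas_m2, section_len_m, f_max=4000.0, step=5.0):
    """Find resonances as zeros of D (lip pressure assumed to be zero).

    With zero pressure at the lips, U_glottis = D * U_lips, so the
    volume-velocity transfer function 1/D peaks where D crosses zero.
    """
    found = []
    f_lo = 0.5 * step
    d_lo = chain_d(areas_m2, section_len_m, f_lo)
    while f_lo < f_max:
        f_hi = f_lo + step
        d_hi = chain_d(areas_m2, section_len_m, f_hi)
        if d_lo * d_hi < 0.0:
            # Sign change: refine the zero crossing by bisection.
            lo, hi, d_at_lo = f_lo, f_hi, d_lo
            for _ in range(40):
                mid = 0.5 * (lo + hi)
                d_mid = chain_d(areas_m2, section_len_m, mid)
                if d_at_lo * d_mid <= 0.0:
                    hi = mid
                else:
                    lo, d_at_lo = mid, d_mid
            found.append(0.5 * (lo + hi))
        f_lo, d_lo = f_hi, d_hi
    return found
```

For a uniform tube of 17.5 cm, for example `formants([4e-4] * 35, 0.005)`, the resonances fall close to 500, 1500, 2500 and 3500 Hz, the familiar quarter-wavelength values. A shape with a narrow back cavity and a wide front cavity, such as `[1e-4] * 17 + [8e-4] * 17`, raises the first resonance, as expected for an /a/-like vowel. A time-varying model with losses, as used in the paper, would need a time-domain simulation instead.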
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fang, Q., Dang, J. (2006). Speech Synthesis Based on a Physiological Articulatory Model. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_25
DOI: https://doi.org/10.1007/11939993_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science