Speech Synthesis Based on a Physiological Articulatory Model

Fang, Qiang; Dang, Jianwu

doi:10.1007/11939993_25


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Included in the following conference series:

International Symposium on Chinese Spoken Language Processing

Abstract

This paper proposes a speech synthesis framework, based on a physiological articulatory model, that reproduces the process of human speech production. The framework starts from given articulatory targets. Muscle activation patterns are then estimated from the targets by accounting for both the equilibrium characteristics and the muscle dynamics. The articulatory model is driven by contracting the corresponding muscles, generating a time-varying vocal tract shape that corresponds to the targets. A transmission line model is then applied to the time-varying vocal tract to produce the speech sound. Finally, a preliminary experiment synthesizes the single vowels and diphthongs of Chinese with the synthesizer based on the physiological articulatory model. The results show that the spectra of the synthetic single vowels are consistent with those of real speech, and that proper acoustic characteristics are obtained for diphthongs in most cases.
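
The acoustic stage described in the abstract can be illustrated with a heavily simplified sketch. The code below is not the authors' implementation. It assumes a static, lossless vocal tract split into equal-length tube sections, an ideal open end at the lips (zero pressure, no radiation load), and illustrative values for the speed of sound and air density. Each section is represented by a transmission-line chain matrix, and resonances are located where the volume-velocity transfer function from glottis to lips has its poles. The names `chain_d_element` and `formants` are hypothetical.

```python
import math

SPEED_OF_SOUND = 350.0  # m/s, assumed value for warm, moist air
AIR_DENSITY = 1.14      # kg/m^3, assumed value


def chain_d_element(areas, section_length, freq):
    """Return the D element of the glottis-to-lips chain matrix.

    Each lossless section has the chain matrix
    [[cos(kl), j*Z0*sin(kl)], [j*sin(kl)/Z0, cos(kl)]].
    A product of such matrices keeps real diagonal and imaginary
    off-diagonal entries, so it is stored as real numbers (a, b, c, d)
    standing for [[a, j*b], [j*c, d]].
    """
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND
    cs, sn = math.cos(k * section_length), math.sin(k * section_length)
    a, b, c, d = 1.0, 0.0, 0.0, 1.0  # identity matrix
    for area in areas:  # ordered from glottis to lips
        z0 = AIR_DENSITY * SPEED_OF_SOUND / area  # characteristic impedance
        a2, b2, c2, d2 = cs, z0 * sn, sn / z0, cs
        a, b, c, d = (a * a2 - b * c2, a * b2 + b * d2,
                      c * a2 + d * c2, d * d2 - c * b2)
    return d


def formants(areas, section_length, f_max=4000.0, step=5.0):
    """Find resonances as zeros of D(f), where U_lips / U_glottis = 1 / D."""
    resonances = []
    d_prev = chain_d_element(areas, section_length, step)
    for i in range(1, int(f_max / step)):
        f_lo, f_hi = i * step, (i + 1) * step
        d_next = chain_d_element(areas, section_length, f_hi)
        if d_prev * d_next < 0 or d_next == 0.0:
            # Refine the bracketed zero crossing by bisection.
            lo, hi, d_lo = f_lo, f_hi, d_prev
            for _ in range(40):
                mid = 0.5 * (lo + hi)
                d_mid = chain_d_element(areas, section_length, mid)
                if d_lo * d_mid <= 0:
                    hi = mid
                else:
                    lo, d_lo = mid, d_mid
            resonances.append(0.5 * (lo + hi))
        d_prev = d_next
    return resonances


if __name__ == "__main__":
    # Uniform 17.5 cm tube: 35 sections of 0.5 cm, each 3 cm^2 in area.
    uniform_tract = [3.0e-4] * 35
    print(formants(uniform_tract, 0.005))
```

For the uniform tube in the usage example, the resonances should fall near the odd multiples of c/(4L), that is, about 500, 1500, 2500 and 3500 Hz under the assumed constants. A practical synthesizer of the kind the abstract describes would additionally need wall and viscous losses, a radiation load at the lips, and a time-varying area function, none of which this sketch models.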

Author information

Authors and Affiliations

IIPL, School of Information Science, Japan Advanced Institute of Science and Technology
Qiang Fang & Jianwu Dang
Phonetics Lab., Institute of Linguistics, Chinese Academy of Social Sciences
Qiang Fang

Editor information

Editors and Affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong
Qiang Huo
Human Language Technology Department, Institute for Infocomm Research (I2R), 119613, Singapore
Bin Ma
School of Computer Engineering, Nanyang Technological University (NTU), 639798, Singapore
Eng-Siong Chng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Haizhou Li

Cite this paper

Fang, Q., Dang, J. (2006). Speech Synthesis Based on a Physiological Articulatory Model. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_25

DOI: https://doi.org/10.1007/11939993_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)
