Abstract
Both human-robot interaction and human-computer interaction increasingly need emotional speech synthesis systems that deliver information in a more natural and effective manner. To identify and understand the characteristics of basic emotions and their effects, we propose a series of user evaluation experiments on an emotional prosody modification system. The system works as an independent module for a general-purpose speech synthesis system and can express either perceivable or slightly exaggerated emotions, classified into anger, joy, and sadness. In this paper, we propose two experiments that evaluate the emotional prosody modification module with different types of initial input speech. We also provide a supplementary experiment to understand joy, an emotion that appears to be prosody-independent, by replacing the resynthesized joy speech with an original human voice recorded in the emotional state of joy.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, HJ., Park, J.C. (2009). Interpretation of User Evaluation for Emotional Speech Synthesis System. In: Jacko, J.A. (eds) Human-Computer Interaction. New Trends. HCI 2009. Lecture Notes in Computer Science, vol 5610. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02574-7_33
DOI: https://doi.org/10.1007/978-3-642-02574-7_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02573-0
Online ISBN: 978-3-642-02574-7
eBook Packages: Computer Science, Computer Science (R0)