Expressive Face Animation Synthesis Based on Dynamic Mapping Method

Yin, Panrong; Zhao, Liyue; Huang, Lixing; Tao, Jianhua

doi:10.1007/978-3-540-74889-2_1


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4738)

Abstract

We present a framework for a speech-driven face animation system that produces expressive output. The framework systematically addresses audio-visual data acquisition, expressive trajectory analysis, and audio-visual mapping. Within this framework, we use a Gaussian Mixture Model (GMM) to learn the correlation between neutral and expressive facial deformation. A hierarchical structure is proposed to map acoustic parameters to lip facial animation parameters (FAPs). The synthesized neutral FAP streams are then extended with expressive variations according to the prosody of the input speech. Quantitative evaluation of the experimental results is encouraging, and the synthesized face appears realistic.
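One common way to use a joint GMM as a mapping is to predict the target as the conditional expectation of the target given the source. The sketch below illustrates that general idea for a single scalar "neutral" value and a single scalar "expressive" value; it is not taken from the paper and is not the authors' implementation. The function names, the dictionary layout, and every parameter value are hypothetical, and a real system would fit the mixture to multi-dimensional FAP data rather than hard-code it.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_conditional_mean(x, components):
    """Return E[y | x] under a joint 2-D GMM over (x, y).

    Each component is a dict with keys:
      weight, mean_x, mean_y, var_x, cov_xy
    (hypothetical layout chosen for this illustration).
    """
    # Weighted marginal likelihood of x under each component.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    if total == 0.0:
        raise ValueError("x is too far from every component to be mapped")

    y_hat = 0.0
    for lik, c in zip(likelihoods, components):
        responsibility = lik / total  # p(component | x)
        # Per-component linear regression of y on x.
        local_mean = c["mean_y"] + (c["cov_xy"] / c["var_x"]) * (x - c["mean_x"])
        y_hat += responsibility * local_mean
    return y_hat


# Hypothetical, hand-picked mixture; a real one would be trained on data.
EXAMPLE_COMPONENTS = [
    {"weight": 0.5, "mean_x": -1.0, "mean_y": 0.0, "var_x": 1.0, "cov_xy": 0.0},
    {"weight": 0.5, "mean_x": 1.0, "mean_y": 2.0, "var_x": 1.0, "cov_xy": 0.0},
]

if __name__ == "__main__":
    for neutral_value in (-1.0, 0.0, 1.0):
        print(neutral_value, gmm_conditional_mean(neutral_value, EXAMPLE_COMPONENTS))
```

Because each component contributes its own local linear prediction, weighted by how likely it is to have generated the input, the overall mapping is smooth and nonlinear even though every component is linear.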

This work was supported by the National Natural Science Foundation of China (No. 60575032) and the 863 Program (No. 2006AA01Z138).



Author information

Authors and Affiliations

National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, China
Panrong Yin, Liyue Zhao, Lixing Huang & Jianhua Tao


Editor information

Ana C. R. Paiva, Rui Prada, Rosalind W. Picard

About this paper

Cite this paper

Yin, P., Zhao, L., Huang, L., Tao, J. (2007). Expressive Face Animation Synthesis Based on Dynamic Mapping Method. In: Paiva, A.C.R., Prada, R., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2007. Lecture Notes in Computer Science, vol 4738. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74889-2_1


DOI: https://doi.org/10.1007/978-3-540-74889-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74888-5
Online ISBN: 978-3-540-74889-2
eBook Packages: Computer Science, Computer Science (R0)
