
Temporal Organization in Listeners’ Perception of the Speakers’ Emotions and Characteristics: A Way to Improve the Automatic Recognition of Emotion-Related States in Human Voice

  • Conference paper
Affective Computing and Intelligent Interaction (ACII 2007)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 4738)

Abstract

We propose to improve the automatic detection and characterization of emotion-related expressions in human voice through an approach based on human auditory perception. To determine the temporal hierarchical organization of how listeners perceive speakers’ emotions and characteristics, a listening test was conducted with seventy-two listeners. The corpus consisted of eighteen voice messages extracted from a real-life application. Listeners heard message segments of different temporal lengths and were asked to verbalize their perceptions. Fourteen meta-categories were obtained, relating to age, gender, regional accent, timbre, personality, emotion, sound quality, expression style, and so on. The temporal listening windows that listeners need in order to perceive and verbalize these categories are defined; they could underlie the building of sub-models relevant to the automatic recognition of emotion-related expressions.
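As a rough illustration of how category-specific temporal windows could feed sub-models in an automatic recognizer, the sketch below maps each perceptual meta-category to the signal prefix a sub-model would analyze. All window durations, category names, and function names here are assumptions for illustration, not values reported in the paper:

```python
# Hypothetical sketch: each perceptual meta-category is assumed to require a
# minimal duration of listening before it can be recognized. A recognizer
# could mirror this by handing each category's sub-model only the prefix of
# the message long enough for that category, clipped to the message length.

def windows_for_categories(n_samples, sample_rate, min_window_s):
    """Map each category to the prefix length (in samples) that its
    sub-model would analyze, clipped to the full message length."""
    return {
        category: min(n_samples, int(seconds * sample_rate))
        for category, seconds in min_window_s.items()
    }

# Illustrative minimal windows (seconds) per meta-category (assumed values).
MIN_WINDOW_S = {"gender": 0.5, "regional accent": 2.0, "emotion": 4.0}

# A 3-second message sampled at 8 kHz: the "emotion" window exceeds the
# message, so it is clipped to the whole message.
prefix_lengths = windows_for_categories(
    n_samples=3 * 8000, sample_rate=8000, min_window_s=MIN_WINDOW_S
)
```

This reflects the paper's idea that short windows suffice for some characteristics (e.g. gender) while others (e.g. emotion) need longer listening, so sub-models can be triggered hierarchically as more signal becomes available.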





Editor information

Ana C. R. Paiva, Rui Prada, Rosalind W. Picard


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maffiolo, V., Chateau, N., Le Chenadec, G. (2007). Temporal Organization in Listeners’ Perception of the Speakers’ Emotions and Characteristics: A Way to Improve the Automatic Recognition of Emotion-Related States in Human Voice. In: Paiva, A.C.R., Prada, R., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2007. Lecture Notes in Computer Science, vol 4738. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74889-2_16


  • DOI: https://doi.org/10.1007/978-3-540-74889-2_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74888-5

  • Online ISBN: 978-3-540-74889-2

  • eBook Packages: Computer Science (R0)
