Abstract
The present study aimed to clarify how listeners decode emotions from human nonverbal vocalizations, exploring unbiased recognition accuracy of vocal emotions selected from the Montreal Affective Voices (MAV) (Belin et al. in Behav Res Methods 40:531–539, 2008. doi:10.3758/BRM.40.2.531). The MAV battery includes 90 nonverbal vocalizations expressing anger, disgust, fear, pain, sadness, surprise, happiness, sensual pleasure, as well as neutral expressions, uttered by female and male actors. Using a forced-choice recognition task, 156 native speakers of Portuguese were asked to identify the emotion category underlying each MAV sound, and additionally to rate the valence, arousal and dominance of these sounds. The analysis focused on unbiased hit rates (Hu Score; Wagner in J Nonverbal Behav 17(1):3–28, 1993. doi:10.1007/BF00987006), as well as on the dimensional ratings for each discrete emotion. Further, we examined the relationship between categorical and dimensional ratings, as well as the effects of speaker’s and listener’s sex on these two types of assessment. Surprise vocalizations were associated with the poorest accuracy, whereas happy vocalizations were the most accurately recognized, contrary to previous studies. Happiness was associated with the highest valence and dominance ratings, whereas fear elicited the highest arousal ratings. Recognition accuracy and dimensional ratings of vocal expressions were dependent both on speaker’s sex and listener’s sex. Further, discrete vocal emotions were not consistently predicted by dimensional ratings. Using a large sample size, the present study provides, for the first time, unbiased recognition accuracy rates for a widely used battery of nonverbal vocalizations. The results demonstrated a dynamic interplay between listener’s and speaker’s variables (e.g., sex) in the recognition of emotion from nonverbal vocalizations. Further, they support the use of both categorical and dimensional accounts of emotion when probing how emotional meaning is decoded from nonverbal vocal cues.
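
For readers unfamiliar with the unbiased hit rate, the following is a minimal sketch of how Hu can be computed from a confusion matrix, following the definition in Wagner (1993): for each category, the squared number of correct responses divided by the product of the number of stimuli presented in that category and the number of times that category was chosen. The function name, the three-category matrix and its counts are hypothetical illustrations, not data or code from the present study, and returning 0 for a category that was never presented or never chosen is an assumed convention.

```python
def unbiased_hit_rates(confusion):
    """Return Wagner's (1993) Hu for each category.

    confusion[i][j] holds how often stimuli of category i received
    response j (square matrix, same category order on both axes).
    """
    n = len(confusion)
    stimulus_totals = [sum(row) for row in confusion]
    response_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    scores = []
    for k in range(n):
        denominator = stimulus_totals[k] * response_totals[k]
        # Assumed convention: Hu is set to 0 when the category was never
        # presented or never chosen, since the ratio is undefined there.
        scores.append(confusion[k][k] ** 2 / denominator if denominator else 0.0)
    return scores


# Hypothetical counts for illustration only (rows: stimuli, columns: responses).
labels = ["happiness", "fear", "surprise"]
toy_confusion = [
    [9, 1, 0],
    [0, 5, 5],
    [0, 2, 8],
]

hu_scores = unbiased_hit_rates(toy_confusion)
for k, label in enumerate(labels):
    raw_hit_rate = toy_confusion[k][k] / sum(toy_confusion[k])
    print(f"{label}: raw = {raw_hit_rate:.3f}, Hu = {hu_scores[k]:.3f}")
```

In this toy matrix "surprise" is chosen more often than it is presented, so its Hu falls well below its raw hit rate, whereas "happiness" is rarely confused in either direction and keeps the same value on both measures. This is the sense in which Hu corrects raw accuracy for response bias.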

Notes
Differences in the duration of the MAV vocalizations could represent a confounding factor. However, in naturalistic contexts, vocal emotions rely on both acoustic and temporal differences to ensure they are accurately communicated. Therefore, the duration of the MAV stimuli was left unmanipulated, so that the sounds remained closer to the vocal expressions typically found in real-life social communication.
References
Bachorowski, J. A. (1999). Vocal expression and perception of emotion. Current Directions in Psychological Science, 8, 53–57. doi:10.1111/1467-8721.00013.
Banse, R., & Scherer, K. R. (1996). Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology, 70, 614–636. doi:10.1037/0022-3514.70.3.614.
Belin, P., Fecteau, S., & Bedard, C. (2004). Thinking the voice: Neural correlates of voice perception. Trends in Cognitive Sciences, 8, 129–135. doi:10.1016/j.tics.2004.01.008.
Belin, P., Fillion-Bilodeau, S., & Gosselin, F. (2008). The Montreal Affective Voices: A validated set of nonverbal affect bursts for research on auditory affective processing. Behavior Research Methods, 40, 531–539. doi:10.3758/BRM.40.2.531.
Besson, M., Magne, C., & Schön, D. (2002). Emotional prosody: Sex differences in sensitivity to speech melody. Trends in Cognitive Sciences, 6, 405–407. doi:10.1016/S1364-6613(02)01975-7.
Bradley, M. M., Codispoti, M., Cuthbert, B. N., & Lang, P. J. (2001a). Emotion and motivation I: Defensive and appetitive reactions in picture processing. Emotion, 1(3), 276. doi:10.1037/1528-3542.1.3.276.
Bradley, M. M., Codispoti, M., Sabatinelli, D., & Lang, P. J. (2001b). Emotion and motivation II: Sex differences in picture processing. Emotion, 1, 300–319. doi:10.1037/1528-3542.1.3.300.
Bradley, M. M., & Lang, P. J. (1994). Measuring emotion: The self-assessment manikin and the semantic differential. Journal of Behavior Therapy and Experimental Psychiatry, 25, 49–59. doi:10.1016/0005-7916(94)90063-9.
Bradley, M. M., & Lang, P. J. (1999). The International Affective Digitized Sounds (IADS): Stimuli, instruction manual and affective ratings. Gainesville, FL: The Center for Research in Psychophysiology, University of Florida.
Bradley, M. M., & Lang, P. J. (2007). The International Affective Digitized Sounds (IADS-2): Affective ratings of sounds and instruction manual (2nd ed.). Gainesville: NIMH Center for the Study of Emotion and Attention, University of Florida.
Darwin, C. (1998). The expression of the emotions in man and animals (3rd ed.). London: Harper-Collins. (Original work published in 1872).
Ekman, P. (1992). An argument for basic emotions. Cognition and Emotion, 6, 169–200. doi:10.1080/02699939208411068.
Ekman, P. (1994). Strong evidence for universals in facial expressions: A reply to Russell’s mistaken critique. Psychological Bulletin, 115, 268–287.
Ekman, P., Friesen, W. V., & Ellsworth, P. (1972). Emotion in the human face: Guidelines for research and an integration of findings. New York, NY: Pergamon.
Fecteau, S., Belin, P., Joanette, Y., & Armony, J. L. (2007). Amygdala responses to nonlinguistic emotional vocalizations. NeuroImage, 36(2), 480–487. doi:10.1016/j.neuroimage.2007.02.043.
Gohier, B., Senior, C., Brittain, P. J., Lounes, N., El-Hage, W., Law, V., et al. (2013). Gender differences in the sensitivity to negative stimuli: Cross-modal affective priming study. European Psychiatry, 28, 74–80. doi:10.1016/j.eurpsy.2011.06.007.
Hall, J. A. (1978). Gender effects in decoding nonverbal cues. Psychological Bulletin, 85, 845–857. doi:10.1037/0033-2909.85.4.845.
Hall, J. A. (1984). Nonverbal sex differences: Communication accuracy and expressive style. Baltimore, MD: Johns Hopkins University Press.
Hall, J. A., Andrzejewski, S. A., Murphy, N. A., Mast, M. S., & Feinstein, B. A. (2008). Accuracy of judging others’ traits and states: Comparing mean levels across tests. Journal of Research in Personality, 42(6), 1476–1489. doi:10.1016/j.jrp.2008.06.013.
Hall, J. A., Gunnery, S. D., & Horgan, T. G. (2016). Gender differences in interpersonal accuracy. In J. A. Hall, M. Schmid Mast, & T. V. West (Eds.), The social psychology of perceiving others accurately (pp. 309–327). Cambridge: Cambridge University Press.
Hall, J. A., & Matsumoto, D. (2004). Gender differences in judgments of multiple emotions from facial expressions. Emotion, 4(2), 201. doi:10.1037/1528-3542.4.2.201.
Hampson, E., van Anders, S. M., & Mullin, L. I. (2006). A female advantage in the recognition of emotional facial expressions: Test of an evolutionary hypothesis. Evolution and Human Behavior, 27, 401–416. doi:10.1016/j.evolhumbehav.2006.05.002.
Hawk, S. T., van Kleef, G. A., Fischer, A. H., & van der Schalk, J. (2009). “Worth a thousand words”: Absolute and relative decoding of nonlinguistic affect vocalizations. Emotion, 9(3), 293. doi:10.1037/a0015178.
Jiang, X., Paulmann, S., Robin, J., & Pell, M. D. (2015). More than accuracy: Nonverbal dialects modulate the time course of vocal emotion recognition across cultures. Journal of Experimental Psychology: Human Perception and Performance, 41(3), 597. doi:10.1037/xhp0000043.
Johnstone, T., & Scherer, K. R. (2000). Vocal communication of emotion. In M. Lewis & J. M. Haviland-Jones (Eds.), Handbook of emotions (2nd ed., pp. 220–235). New York: Guilford Press.
Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129, 770–814. doi:10.1037/0033-2909.129.5.770.
Kensinger, E. A., & Corkin, S. (2004). Two routes to emotional memory: Distinct neural processes for valence and arousal. Proceedings of the National Academy of Sciences of the United States of America, 101(9), 3310–3315. doi:10.1073/pnas.0306408101.
Koeda, M., Belin, P., Hama, T., Masuda, T., Matsuura, M., & Okubo, Y. (2013). Cross-cultural differences in the processing of non-verbal affective vocalizations by Japanese and Canadian listeners. Frontiers in Psychology, 4, 105. doi:10.3389/fpsyg.2013.00105.
Kotchoubey, B., Kaiser, J., Bostanov, V., Lutzenberger, W., & Birbaumer, N. (2009). Recognition of affective prosody in brain-damaged patients and healthy controls: A neurophysiological study using EEG and whole-head MEG. Cognitive, Affective, & Behavioral Neuroscience, 9, 153–167. doi:10.3758/CABN.9.2.153.
Kotz, S. A., Kalberlah, C., Bahlmann, J., Friederici, A. D., & Haynes, J. D. (2013). Predicting vocal emotion expressions from the human brain. Human Brain Mapping, 34(8), 1971–1981. doi:10.1002/hbm.22041.
Lang, P. J., Bradley, M. M., & Cuthbert, B. N. (2008). International Affective Picture System (IAPS): Technical manual and affective ratings. Gainesville, FL: Center for Research in Psychophysiology, University of Florida.
Laukka, P. (2005). Categorical perception of vocal emotion expressions. Emotion, 5, 277–295. doi:10.1037/1528-3542.5.3.277.
Laukka, P., Elfenbein, H. A., Söder, N., Nordström, H., Althoff, J., Chui, W., et al. (2013). Cross-cultural decoding of positive and negative non-linguistic emotion vocalizations. Frontiers in Psychology, 4, 353. doi:10.3389/fpsyg.2013.00353.
Laukka, P., Juslin, P., & Bresin, R. (2005). A dimensional approach to vocal expression of emotion. Cognition and Emotion, 19, 633–653. doi:10.1080/02699930441000445.
Lewis, P. A., Critchley, H. D., Rotshtein, P., & Dolan, R. J. (2007). Neural correlates of processing valence and arousal in affective words. Cerebral Cortex, 17(3), 742–748. doi:10.1093/cercor/bhk024.
Lima, C. F., Alves, T., Scott, S. K., & Castro, S. L. (2014). In the ear of the beholder: How age shapes emotion processing in nonverbal vocalizations. Emotion, 14(1), 145. doi:10.1037/a0034287.
Lima, C. F., Castro, S. L., & Scott, S. K. (2013). When voices get emotional: A corpus of nonverbal vocalizations for research on emotion processing. Behavior Research Methods, 45, 1234–1245. doi:10.3758/s13428-013-0324-3.
Liu, P., & Pell, M. D. (2012). Recognizing vocal emotions in Mandarin Chinese: A validated database of Chinese vocal emotional stimuli. Behavior Research Methods, 44(4), 1042–1051. doi:10.3758/s13428-012-0203-3.
Liu, T., Pinheiro, A. P., Deng, G., Nestor, P. G., McCarley, R. W., & Niznikiewicz, M. A. (2012). Electrophysiological insights into processing nonverbal emotional vocalizations. NeuroReport, 23, 108–112. doi:10.1097/WNR.0b013e32834ea757.
McClure, E. B. (2000). A meta-analytic review of sex differences in facial expression processing and their development in infants, children, and adolescents. Psychological Bulletin, 126, 424–453. doi:10.1037/0033-2909.126.3.424.
Morris, J. S., Scott, S. K., & Dolan, R. J. (1999). Saying it with feeling: Neural responses to emotional vocalizations. Neuropsychologia, 37(10), 1155–1163. doi:10.1016/S0028-3932(99)00015-9.
Owren, M. J., & Bachorowski, J. A. (2003). Reconsidering the evolution of nonlinguistic communication: The case of laughter. Journal of Nonverbal Behavior, 27(3), 183–200. doi:10.1023/A:1025394015198.
Pakosz, M. (1983). Attitudinal judgments in intonation: Some evidence for a theory. Journal of Psycholinguistic Research, 12, 311–326. doi:10.1007/BF01067673.
Paulmann, S., Jessen, S., & Kotz, S. A. (2012). It’s special the way you say it: An ERP investigation on the temporal dynamics of two types of prosody. Neuropsychologia, 50, 1609–1620. doi:10.1016/j.neuropsychologia.2012.03.014.
Paulmann, S., & Kotz, S. A. (2008). Early emotional prosody perception based on different speaker voices. NeuroReport, 19, 209–213. doi:10.1097/WNR.0b013e3282f454db.
Paulmann, S., & Pell, M. D. (2011). Is there an advantage for recognizing multi-modal emotional stimuli? Motivation and Emotion, 35(2), 192–201. doi:10.1007/s11031-011-9206-0.
Pell, M. D., Monetta, L., Paulmann, S., & Kotz, S. A. (2009a). Recognizing emotions in a foreign language. Journal of Nonverbal Behavior, 33(2), 107–120. doi:10.1007/s10919-008-0065-7.
Pell, M. D., Paulmann, S., Dara, C., Alasseri, A., & Kotz, S. A. (2009b). Factors in the recognition of vocally expressed emotions: A comparison of four languages. Journal of Phonetics, 37, 417–435. doi:10.1016/j.wocn.2009.07.005.
Pell, M. D., Rothermich, K., Liu, P., Paulmann, S., Sethi, S., & Rigoulot, S. (2015). Preferential decoding of emotion from human non-linguistic vocalizations versus speech prosody. Biological Psychology, 111, 14–25. doi:10.1016/j.biopsycho.2015.08.008.
Pinheiro, A. P., Barros, C., & Pedrosa, J. (2016). Salience in a social landscape: Electrophysiological effects of task-irrelevant and infrequent vocal change. Social Cognitive and Affective Neuroscience, 11(1), 127–139. doi:10.1093/scan/nsv103.
Pinheiro, A. P., Del Re, E., Mezin, J., Nestor, P. G., Rauber, A., McCarley, R. W., et al. (2013). Sensory-based and higher-order operations contribute to abnormal emotional prosody processing in schizophrenia: An electrophysiological investigation. Psychological Medicine, 43, 603–618. doi:10.1017/S003329171200133X.
Pinheiro, A. P., Dias, M., Pedrosa, J., & Soares, A. P. (in press). Minho Affective Sentences (MAS): Probing the role of sex, mood and empathy in affective ratings of verbal stimuli. Behavior Research Methods. doi:10.3758/s13428-016-0726-0.
Pinheiro, A. P., Rezaii, N., Rauber, A., Liu, T., Nestor, P. G., McCarley, R. W., et al. (2014). Abnormalities in the processing of emotional prosody from single words in schizophrenia. Schizophrenia Research, 152, 235–241. doi:10.1016/j.schres.2013.10.042.
Rosenthal, R., & Rubin, D. B. (1989). Effect size estimation for one-sample multiple-choice-type data: Design, analysis, and meta-analysis. Psychological Bulletin, 106(2), 332. doi:10.1037/0033-2909.106.2.332.
Sauter, D. A., & Eimer, M. (2010). Rapid detection of emotion from human vocalizations. Journal of Cognitive Neuroscience, 22, 474–481. doi:10.1162/jocn.2009.21215.
Sauter, D. A., Eisner, F., Calder, A. J., & Scott, S. K. (2010a). Perceptual cues in nonverbal vocal expressions of emotion. The Quarterly Journal of Experimental Psychology, 63, 2251–2272. doi:10.1080/17470211003721642.
Sauter, D. A., Eisner, F., Ekman, P., & Scott, S. K. (2010b). Cross-cultural recognition of basic emotions through nonverbal emotional vocalizations. Proceedings of the National Academy of Sciences, 107(6), 2408–2412. doi:10.1073/pnas.0908239106.
Sauter, D. A., Panattoni, C., & Happé, F. (2013). Children’s recognition of emotions from vocal cues. British Journal of Developmental Psychology, 31(1), 97–113. doi:10.1111/j.2044-835X.2012.02081.x.
Sauter, D. A., & Scott, S. K. (2007). More than one kind of happiness: Can we recognize vocal expressions of different positive states? Motivation and Emotion, 31(3), 192–199. doi:10.1007/s11031-007-9065-x.
Scherer, K. R., & Ellgring, H. (2007). Multimodal expression of emotion: Affect programs or componential appraisal patterns? Emotion, 7, 158–171. doi:10.1037/1528-3542.7.1.158.
Scherer, K. R., Ladd, D. R., & Silverman, K. E. (1984). Vocal cues to speaker affect: Testing two models. The Journal of the Acoustical Society of America, 76, 1346–1356. doi:10.1121/1.391450.
Schirmer, A., & Kotz, S. A. (2003). ERP evidence for a sex-specific Stroop effect in emotional speech. Journal of Cognitive Neuroscience, 15, 1135–1148. doi:10.1162/089892903322598102.
Schirmer, A., Kotz, S. A., & Friederici, A. D. (2002). Sex differentiates the role of emotional prosody during word processing. Cognitive Brain Research, 14, 228–233. doi:10.1016/S0926-6410(02)00108-8.
Schirmer, A., Kotz, S. A., & Friederici, A. D. (2005a). On the role of attention for the processing of emotions in speech: Sex differences revisited. Cognitive Brain Research, 24(3), 442–452. doi:10.1016/j.cogbrainres.2005.02.022.
Schirmer, A., Striano, T., & Friederici, A. D. (2005b). Sex differences in the preattentive processing of vocal emotional expressions. NeuroReport, 16, 635–639. doi:10.1097/00001756-200504250-00024.
Schirmer, A., Zysset, S., Kotz, S. A., & von Cramon, D. Y. (2004). Gender differences in the activation of inferior frontal cortex during emotional speech perception. NeuroImage, 21, 1114–1123. doi:10.1016/j.neuroimage.2003.10.048.
Schröder, M. (2003). Experimental study of affect bursts. Speech Communication, 40, 99–116. doi:10.1016/S0167-6393(02)00078-X.
Scott, S. K., Lavan, N., Chen, S., & McGettigan, C. (2014). The social life of laughter. Trends in Cognitive Sciences, 18(12), 618–620. doi:10.1016/j.tics.2014.09.002.
Soares, A. P., Comesaña, M., Pinheiro, A. P., Simões, A., & Frade, C. S. (2012). The adaptation of the Affective Norms for English words (ANEW) for European Portuguese. Behavior Research Methods, 44, 256–269. doi:10.3758/s13428-011-0131-7.
Soares, A. P., Pinheiro, A. P., Costa, A., Frade, C. S., Comesaña, M., & Pureza, R. (2013). Affective auditory stimuli: Adaptation of the international affective digitized sounds (IADS-2) for European Portuguese. Behavior Research Methods, 45, 1168–1181. doi:10.3758/s13428-012-0310-1.
Soares, A. P., Pinheiro, A. P., Costa, A., Frade, C. S., Comesaña, M., & Pureza, R. (2015). Adaptation of the International Affective Picture System (IAPS) for European Portuguese. Behavior Research Methods, 47(4), 1159–1177. doi:10.3758/s13428-014-0535-2.
Stevenson, R. A., & James, T. W. (2008). Affective auditory stimuli: Characterization of the International Affective Digitized Sounds (IADS) by discrete emotional categories. Behavior Research Methods, 40, 315–321. doi:10.3758/BRM.40.1.315.
Stevenson, R. A., Mikels, J. A., & James, T. W. (2007). Characterization of the affective norms for English words by discrete emotional categories. Behavior Research Methods, 39, 1020–1024. doi:10.3758/BF03192999.
Thompson, A. E., & Voyer, D. (2014). Sex differences in the ability to recognize non-verbal displays of emotion: A meta-analysis. Cognition and Emotion, 28, 1164–1195. doi:10.1080/02699931.2013.875889.
Wagner, H. L. (1993). On measuring performance in category judgment studies of nonverbal behavior. Journal of Nonverbal Behavior, 17(1), 3–28. doi:10.1007/BF00987006.
Wallbott, H. G. (1988). Big girls don’t frown, big boys don’t cry—Gender differences of professional actors in communicating emotion via facial expression. Journal of Nonverbal Behavior, 12, 98–106.
Acknowledgments
The authors gratefully acknowledge all the participants who collaborated in the study.
Funding
This work was supported by Grants IF/00334/2012 and PTDC/MHN-PCN/3606/2012, funded by Fundação para a Ciência e a Tecnologia (FCT, Portugal) and Fundo Europeu de Desenvolvimento Regional (FEDER) through the European programs Quadro de Referência Estratégico Nacional (QREN) and Programa Operacional Factores de Competitividade (COMPETE), awarded to A.P.P., and by a Doctoral Grant (SFRH/BD/52400/2013) funded by Fundação para a Ciência e a Tecnologia (Portugal), awarded to M.V.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Cite this article
Vasconcelos, M., Dias, M., Soares, A.P. et al. What is the Melody of That Voice? Probing Unbiased Recognition Accuracy with the Montreal Affective Voices. J Nonverbal Behav 41, 239–267 (2017). https://doi.org/10.1007/s10919-017-0253-4
DOI: https://doi.org/10.1007/s10919-017-0253-4