Abstract
This paper investigates how an intelligent agent could be designed both to predict whether it is bonding with its user and to convey appropriate facial expression and body language responses to foster bonding. Video and Kinect recordings are collected from a series of naturalistic conversations, and a reliable measure of bonding is adapted and verified. A qualitative and quantitative analysis is conducted to determine the non-verbal cues that characterize both high- and low-bonding conversations. We then train a deep neural network classifier using one-minute segments of facial expression and body language data, and show that it is able to accurately predict bonding in novel conversations.
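As a concrete illustration of the segmentation step described above, the sketch below splits a session's per-frame features into one-minute summary vectors. This is a minimal sketch: the frame rate, feature dimensionality, and per-segment mean summarization are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

FPS = 30                 # assumed frame rate of the recordings (placeholder)
SEG_LEN = 60 * FPS       # frames per one-minute segment

def one_minute_segments(frame_features: np.ndarray) -> np.ndarray:
    """Split a (n_frames, n_features) array of per-frame facial-expression and
    body-language features into one summary vector per one-minute segment.
    Summarizing each segment by its per-feature mean is an assumption here."""
    n_full = frame_features.shape[0] // SEG_LEN
    segments = frame_features[:n_full * SEG_LEN].reshape(n_full, SEG_LEN, -1)
    return segments.mean(axis=1)            # shape: (n_segments, n_features)

# Example: a 10-minute conversation with 20 per-frame features.
session = np.random.rand(10 * 60 * FPS, 20)
print(one_minute_segments(session).shape)   # (10, 20)
```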
Notes
1. Even if some participants did speak with exaggerated lip movements, this would not affect our later analysis.
2. The participant is the one who completes the B-WAI about their partner.
3. Note again that bonding is not symmetric, and neither is the matrix in Fig. 4; it is computed based on the participant’s perception of bonding, not her partner’s.
4. There are several strong differences in inner eyebrow raising; however, this AU can be associated with either sadness or happiness, making it difficult to interpret [18].
5. After CFS, two body part features that are highly correlated (for example, the left and right hips) will be represented by only one of the pair (e.g. the right hip).
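As a rough illustration of this pruning effect (not Hall's full CFS algorithm, which evaluates feature subsets by merit rather than applying a fixed threshold), the sketch below drops one feature from each highly correlated pair; the feature names and threshold are assumptions.

```python
import numpy as np
import pandas as pd

def prune_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Keep only one feature from each pair whose absolute correlation
    exceeds the threshold (e.g. left/right hip trajectories)."""
    corr = df.corr().abs()
    # Upper triangle of the correlation matrix, excluding the diagonal.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=drop)

# Example: the left and right hip positions move almost identically,
# so only one of the pair survives the pruning step.
rng = np.random.default_rng(0)
left_hip = rng.normal(size=200)
df = pd.DataFrame({
    "hip_left_y": left_hip,
    "hip_right_y": left_hip + rng.normal(scale=0.01, size=200),
    "head_pitch": rng.normal(size=200),
})
print(list(prune_correlated(df).columns))   # ['hip_left_y', 'head_pitch']
```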
6. A similar heatmap was generated, but there is insufficient space to show it here.
7. The other parameter settings were: learning rate = 0.01, batch size = 20, L2 regularization β = 0.01, no dropout.
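As a hedged illustration only, a configuration matching these settings could look like the following Keras-style TensorFlow sketch; the optimizer choice (plain SGD), layer sizes, and input dimensionality are assumptions, and only the four values listed in the note come from the paper.

```python
import tensorflow as tf

N_FEATURES = 64   # placeholder input dimensionality

# Values from the note: learning rate 0.01, batch size 20,
# L2 weight regularization beta = 0.01, and no dropout layers.
l2 = tf.keras.regularizers.l2(0.01)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu",
                          kernel_regularizer=l2, input_shape=(N_FEATURES,)),
    tf.keras.layers.Dense(1, activation="sigmoid", kernel_regularizer=l2),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(X_train, y_train, batch_size=20, epochs=...)  # batch size from the note
```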
References
Ambady, N., Rosenthal, R.: Thin slices of expressive behavior as predictors of interpersonal consequences: a meta-analysis. Psychol. Bull. 111, 256 (1992)
Horvath, A., Greenberg, L.: Development and validation of the working alliance inventory. J. Couns. Psychol. 36(2), 223 (1989)
Pentland, A.: Social dynamics: signals and behavior. In: International Conference on Developmental Learning, vol. 5 (2004)
Valstar, M., et al.: Meta-analysis of the first facial expression recognition challenge. Syst. Man Cybern. 42(4), 966–979 (2012)
Avola, D., Cinque, L., Levialdi, S., Placidi, G.: Human body language analysis: a preliminary study based on kinect skeleton tracking. In: Petrosino, A., Maddalena, L., Pala, P. (eds.) ICIAP 2013. LNCS, vol. 8158, pp. 465–473. Springer, Heidelberg (2013). doi:10.1007/978-3-642-41190-8_50
Yang, Z., Metallinou, A., Narayanan, S.: Analysis and predictive modeling of body language behavior in dyadic interactions from multimodal interlocutor cues. Multimedia 16(6), 1766–1778 (2014)
Gratch, J., Wang, N., Gerten, J., Fast, E., Duffy, R.: Creating rapport with virtual agents. In: Pelachaud, C., Martin, J.-C., André, E., Chollet, G., Karpouzis, K., Pelé, D. (eds.) IVA 2007. LNCS (LNAI), vol. 4722, pp. 125–138. Springer, Heidelberg (2007). doi:10.1007/978-3-540-74997-4_12
Kahl, S., Kopp, S.: Modeling a social brain for interactive agents: integrating mirroring and mentalizing. In: Brinkman, W.-P., Broekens, J., Heylen, D. (eds.) IVA 2015. LNCS (LNAI), vol. 9238, pp. 77–86. Springer, Heidelberg (2015). doi:10.1007/978-3-319-21996-7_8
Zhao, R., Papangelis, A., Cassell, J.: Towards a dyadic computational model of rapport management for human-virtual agent interaction. In: Bickmore, T., Marsella, S., Sidner, C. (eds.) IVA 2014. LNCS (LNAI), vol. 8637, pp. 514–527. Springer, Heidelberg (2014). doi:10.1007/978-3-319-09767-1_62
Wong, J.W.-E., McGee, K.: Frown more, talk more: effects of facial expressions in establishing conversational rapport with virtual agents. In: Nakano, Y., Neff, M., Paiva, A., Walker, M. (eds.) IVA 2012. LNCS (LNAI), vol. 7502, pp. 419–425. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33197-8_43
Cuperman, R., Ickes, W.: Big five predictors of behavior and perceptions in initial dyadic interactions. J. Pers. Soc. Psych. 97(4), 667 (2009)
McDuff, D., et al.: AFFDEX SDK: a cross-platform real-time multi-face expression recognition toolkit. In: CHI, pp. 3723–3726. ACM (2016)
Ekman, P., Friesen, W.: Facial action coding system (1977)
Hall, M.A.: Correlation-based feature subset selection for machine learning, Ph.D. thesis, University of Waikato, Hamilton, New Zealand (1998)
Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems. Software (2015). tensorflow.org
Provine, R.R.: Laughter: A Scientific Investigation. Penguin, New York (2001)
Meeren, H., van Heijnsbergen, C., de Gelder, B.: Rapid perceptual integration of facial expression and emotional body language. PNAS 102, 16518–16523 (2005)
Kohler, C., et al.: Differences in facial expressions of four universal emotions. Psychiatr. Res. 128(3), 235–244 (2004)
Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge (2012)
Cite this paper
Jaques, N., McDuff, D., Kim, Y.L., Picard, R. (2016). Understanding and Predicting Bonding in Conversations Using Thin Slices of Facial Expressions and Body Language. In: Traum, D., Swartout, W., Khooshabeh, P., Kopp, S., Scherer, S., Leuski, A. (eds) Intelligent Virtual Agents. IVA 2016. Lecture Notes in Computer Science, vol 10011. Springer, Cham. https://doi.org/10.1007/978-3-319-47665-0_6