ABSTRACT
Emotions are associated with neural and behavioral responses that are detectable through scalp electroencephalogram (EEG) signals and measures of facial expressions. We propose a multimodal deep representation learning approach for emotion recognition from EEG and facial expression signals. The method jointly learns unimodal representations, each aligned with the other modality through a cosine-similarity objective, and fuses them with a gated fusion mechanism. We evaluated our method on two databases: DAI-EF and MAHNOB-HCI. The results show that our deep representations capture mutual and complementary information between EEG signals and facial behaviors, measured through action units and head and eye movements extracted from face videos, in a manner that generalizes across databases, and that the proposed fusion outperforms comparable fusion methods on this task.
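To make the fusion recipe in the abstract concrete, below is a minimal PyTorch sketch of gated fusion of two unimodal representations combined with a cosine-similarity alignment loss. The layer sizes, the single shared gate, the tanh/sigmoid activations, and the loss weight are illustrative assumptions for exposition, not the paper's exact architecture; the sketch assumes each modality has already been encoded into a fixed-size feature vector.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedFusion(nn.Module):
    """Sketch: gated fusion of EEG and face features with a cosine alignment loss.

    Assumes each modality is already encoded as a fixed-size vector
    (e.g., EEG features and action-unit/gaze/head-movement features).
    All dimensions here are illustrative.
    """

    def __init__(self, eeg_dim: int, face_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.eeg_proj = nn.Linear(eeg_dim, hidden_dim)    # EEG branch projection
        self.face_proj = nn.Linear(face_dim, hidden_dim)  # face branch projection
        # The gate sees both projections and outputs, per hidden unit,
        # how much the fused vector should rely on each modality.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, eeg_feat: torch.Tensor, face_feat: torch.Tensor):
        h_eeg = torch.tanh(self.eeg_proj(eeg_feat))
        h_face = torch.tanh(self.face_proj(face_feat))
        z = torch.sigmoid(self.gate(torch.cat([h_eeg, h_face], dim=-1)))
        fused = z * h_eeg + (1.0 - z) * h_face  # gated convex combination
        # Alignment term: pull the two unimodal representations toward the
        # same direction; the loss is 0 when they are perfectly aligned.
        align_loss = 1.0 - F.cosine_similarity(h_eeg, h_face, dim=-1).mean()
        return self.classifier(fused), align_loss


# Toy usage: combine the classification and alignment objectives.
model = GatedFusion(eeg_dim=128, face_dim=35, hidden_dim=64, num_classes=3)
eeg, face = torch.randn(8, 128), torch.randn(8, 35)
logits, align_loss = model(eeg, face)
labels = torch.randint(0, 3, (8,))
loss = F.cross_entropy(logits, labels) + 0.1 * align_loss  # 0.1 is an assumed weight
loss.backward()
```

Relative to plain concatenation, a gate of this form lets the network down-weight an unreliable modality per sample, which is one practical motivation for gated fusion in EEG-plus-face settings.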