ABSTRACT
Emotions are associated with neural and behavioral responses that are detectable through scalp electroencephalogram (EEG) signals and measures of facial expressions. We propose a multimodal deep representation learning approach for emotion recognition from EEG and facial expression signals. The method jointly learns unimodal representations, each aligned with the other modality through a cosine-similarity objective, and fuses them with a gated fusion mechanism. We evaluated our method on two databases: DAI-EF and MAHNOB-HCI. The results show that our deep representations capture mutual and complementary information between EEG signals and facial behaviors, measured through action units and head and eye movements extracted from face videos, in a manner that generalizes across databases, and that the proposed fusion outperforms comparable fusion methods on this task.
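To make the fusion recipe in the abstract concrete, below is a minimal PyTorch sketch of gated fusion of two unimodal representations combined with a cosine-similarity alignment loss. The layer sizes, the single shared gate, the tanh/sigmoid activations, and the loss weight are illustrative assumptions for exposition, not the paper's exact architecture; the sketch assumes each modality has already been encoded into a fixed-size feature vector.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedFusion(nn.Module):
    """Sketch: gated fusion of EEG and face features with a cosine alignment loss.

    Assumes each modality is already encoded as a fixed-size vector
    (e.g., EEG features and action-unit/gaze/head-movement features).
    All dimensions here are illustrative.
    """

    def __init__(self, eeg_dim: int, face_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.eeg_proj = nn.Linear(eeg_dim, hidden_dim)    # EEG branch projection
        self.face_proj = nn.Linear(face_dim, hidden_dim)  # face branch projection
        # The gate sees both projections and outputs, per hidden unit,
        # how much the fused vector should rely on each modality.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, eeg_feat: torch.Tensor, face_feat: torch.Tensor):
        h_eeg = torch.tanh(self.eeg_proj(eeg_feat))
        h_face = torch.tanh(self.face_proj(face_feat))
        z = torch.sigmoid(self.gate(torch.cat([h_eeg, h_face], dim=-1)))
        fused = z * h_eeg + (1.0 - z) * h_face  # gated convex combination
        # Alignment term: pull the two unimodal representations toward the
        # same direction; the loss is 0 when they are perfectly aligned.
        align_loss = 1.0 - F.cosine_similarity(h_eeg, h_face, dim=-1).mean()
        return self.classifier(fused), align_loss


# Toy usage: combine the classification and alignment objectives.
model = GatedFusion(eeg_dim=128, face_dim=35, hidden_dim=64, num_classes=3)
eeg, face = torch.randn(8, 128), torch.randn(8, 35)
logits, align_loss = model(eeg, face)
labels = torch.randint(0, 3, (8,))
loss = F.cross_entropy(logits, labels) + 0.1 * align_loss  # 0.1 is an assumed weight
loss.backward()
```

Relative to plain concatenation, a gate of this form lets the network down-weight an unreliable modality per sample, which is one practical motivation for gated fusion in EEG-plus-face settings.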