DOI: 10.1145/3474085.3479209
short-paper

Facial Action Unit-based Deep Learning Framework for Spotting Macro- and Micro-expressions in Long Video Sequences

Published: 17 October 2021

ABSTRACT

In this paper, we utilize facial action unit (AU) detection to construct an end-to-end deep learning framework for the macro- and micro-expression spotting task in long video sequences. The proposed framework focuses on individual components of facial muscle movement rather than processing the whole image, which eliminates the influence of image changes caused by noise such as body or head movement. Compared with existing approaches that deploy deep learning with classical Convolutional Neural Network (CNN) models, the proposed framework uses Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), or our proposed Concat-CNN models to learn the characteristic correlation between the AUs of distinct frames. The Concat-CNN applies three convolutional kernels of different sizes to observe features over different durations, and emphasizes both local and global mutation features by changing the dimensionality (max-pooling size) of the output space. Our proposal achieves state-of-the-art performance in terms of overall F1-score: 0.2019 on CAS(ME)2-cropped, 0.2736 on SAMM Long Video, and 0.2118 on CAS(ME)2. It not only outperforms the baseline but also ranked 3rd in the FME Challenge 2021 on the combined CAS(ME)2-cropped and SAMM-LV datasets.
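The multi-kernel idea behind the Concat-CNN can be sketched as follows: per-frame AU feature vectors are convolved with kernels of several temporal widths, each feature map is max-pooled over time, and the pooled responses are concatenated. This is a minimal NumPy sketch under stated assumptions; the function names, kernel sizes, and random weights are illustrative, not the authors' implementation.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Valid 1-D convolution over time followed by ReLU.
    x: (T, F) sequence of per-frame AU features; kernel: (k, F)."""
    k = kernel.shape[0]
    T = x.shape[0]
    out = np.array([np.sum(x[t:t + k] * kernel) for t in range(T - k + 1)])
    return np.maximum(out, 0.0)  # ReLU activation

def concat_cnn_features(x, kernel_sizes=(2, 4, 8), seed=0):
    """Convolve with kernels of different temporal widths, global
    max-pool each feature map, and concatenate the responses."""
    rng = np.random.default_rng(seed)
    feats = []
    for k in kernel_sizes:
        kernel = rng.standard_normal((k, x.shape[1])) * 0.1  # toy weights
        fmap = conv1d_valid(x, kernel)
        feats.append(fmap.max())  # global max-pool over time
    return np.array(feats)

# Toy sequence: 30 frames, 17 AU intensities per frame (OpenFace-style)
x = np.random.default_rng(1).standard_normal((30, 17))
print(concat_cnn_features(x).shape)  # one pooled response per kernel size
```

Short kernels respond to brief micro-expression onsets, while wider kernels capture the longer dynamics of macro-expressions; max-pooling keeps the strongest response regardless of where in the clip it occurs.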


      • Published in

        cover image ACM Conferences
        MM '21: Proceedings of the 29th ACM International Conference on Multimedia
        October 2021
        5796 pages
        ISBN:9781450386517
        DOI:10.1145/3474085

        Copyright © 2021 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 995 of 4,171 submissions (24%)
