Abstract
We propose a method for pose-invariant facial expression recognition from monocular video sequences. Unlike existing methods, it describes facial expressions with a simple model, the variable-intensity template, which makes it possible to prepare a model for each person with very little time and effort. A variable-intensity template describes how the intensities of multiple points, defined in the vicinity of facial parts, vary across facial expressions. Using this model within a particle filter framework, the method estimates facial pose and expression simultaneously. In experiments, it achieves a recognition rate of over 90% for horizontal, vertical, and in-plane facial orientations within ±40 degrees, ±20 degrees, and ±40 degrees of the frontal view, respectively.
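The idea of estimating pose and expression jointly with a particle filter can be illustrated with a deliberately simplified sketch. It is not the implementation evaluated in the paper. It assumes a 1-D "image", reduces pose to a single translation, and uses a plain Gaussian likelihood. The template values, point positions, noise levels, and the names `make_frame`, `likelihood`, `step`, and `track` are all hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical toy templates: expected intensity at each interest point,
# one list per expression (values invented for illustration).
TEMPLATES = {
    "neutral":   [0.2, 0.8, 0.5, 0.3],
    "happy":     [0.6, 0.4, 0.9, 0.1],
    "surprised": [0.9, 0.1, 0.2, 0.7],
}
POINTS = [2.0, 5.0, 8.0, 11.0]   # interest-point positions in the face frame
EXPRESSIONS = list(TEMPLATES)
OBS_SIGMA = 0.1                  # assumed intensity noise level


def make_frame(pose, expression):
    """Return a toy 1-D 'image': a function from position to intensity.

    Each interest point contributes a Gaussian bump whose height is the
    template intensity; 'pose' is reduced to a 1-D translation.
    """
    centers = [p + pose for p in POINTS]
    heights = TEMPLATES[expression]

    def intensity(x):
        return sum(h * math.exp(-0.5 * (x - c) ** 2)
                   for c, h in zip(centers, heights))
    return intensity


def likelihood(frame, pose, expression):
    """Gaussian likelihood of one (pose, expression) hypothesis."""
    cost = 0.0
    for p, expected in zip(POINTS, TEMPLATES[expression]):
        residual = frame(p + pose) - expected
        cost += residual ** 2
    return math.exp(-cost / (2.0 * OBS_SIGMA ** 2))


def step(particles, frame, rng, pose_sigma=0.3, switch_prob=0.1):
    """One predict / weight / resample cycle; returns (particles, estimate)."""
    predicted = []
    for pose, expr in particles:
        pose += rng.gauss(0.0, pose_sigma)       # random-walk pose model
        if rng.random() < switch_prob:           # occasional expression change
            expr = rng.choice(EXPRESSIONS)
        predicted.append((pose, expr))
    weights = [likelihood(frame, pose, expr) for pose, expr in predicted]
    total = sum(weights)
    if total == 0.0:                             # every hypothesis underflowed
        weights, total = [1.0] * len(predicted), float(len(predicted))
    # Pose estimate: weighted mean. Expression estimate: largest total weight.
    pose_hat = sum(w * pose for w, (pose, _) in zip(weights, predicted)) / total
    mass = {e: 0.0 for e in EXPRESSIONS}
    for w, (_, expr) in zip(weights, predicted):
        mass[expr] += w
    expr_hat = max(mass, key=mass.get)
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, (pose_hat, expr_hat)


def track(num_frames=30, num_particles=500, seed=0):
    """Track a synthetic sequence whose pose drifts and expression changes."""
    rng = random.Random(seed)
    particles = [(rng.gauss(0.0, 1.0), rng.choice(EXPRESSIONS))
                 for _ in range(num_particles)]
    estimates = []
    for t in range(num_frames):
        true_pose = 0.1 * t
        true_expr = "neutral" if t < num_frames // 2 else "happy"
        particles, est = step(particles, make_frame(true_pose, true_expr), rng)
        estimates.append((true_pose, true_expr, est))
    return estimates
```

Calling `track()` returns, for each synthetic frame, the true pose and expression alongside the filter's estimate. Each particle carries both a continuous pose and a discrete expression label, so a single likelihood evaluation scores the two together, which is what allows them to be estimated simultaneously in this sketch.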
Cite this article
Kumano, S., Otsuka, K., Yamato, J. et al. Pose-Invariant Facial Expression Recognition Using Variable-Intensity Templates. Int J Comput Vis 83, 178–194 (2009). https://doi.org/10.1007/s11263-008-0185-x
DOI: https://doi.org/10.1007/s11263-008-0185-x