Abstract
Automatic analysis of head gestures and facial expressions is a challenging research area with significant applications in human-computer interfaces. In this study, facial landmark points are detected and tracked over successive video frames using a robust method based on subspace regularization, Kalman prediction, and refinement. The trajectories (time series) of the facial landmark positions during a head gesture or facial expression are organized in a spatiotemporal matrix, and discriminative features are extracted from this trajectory matrix. Alternatively, appearance-based features are extracted from the DCT coefficients of several face patches. Finally, the AdaBoost algorithm is used to learn a set of discriminative spatiotemporal DCT features for face and head gesture (FHG) classification. We report classification results obtained with Support Vector Machines (SVMs) applied to the outputs of the features learned by AdaBoost, achieving 94.04% subject-independent classification performance over seven FHGs.
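The pipeline described above — landmark trajectories arranged in a spatiotemporal matrix, low-frequency DCT coefficients taken as features, and an SVM as the final classifier — can be sketched roughly as follows. This is a minimal illustrative reconstruction, not the authors' implementation: the landmark count, trajectory layout, coefficient block size, and the synthetic "nod"/"shake" data are all assumptions made for the example, and the AdaBoost feature-selection stage is omitted for brevity.

```python
import numpy as np
from scipy.fft import dct
from sklearn.svm import SVC

def trajectory_dct_features(traj, k=8):
    """traj: (T, 2L) spatiotemporal matrix of L landmark x/y positions
    over T frames. Returns the k-by-k low-frequency block of the 2-D DCT,
    flattened into a feature vector. (Illustrative stand-in for the
    paper's spatiotemporal DCT features; k is an assumption.)"""
    coeffs = dct(dct(traj, axis=0, norm='ortho'), axis=1, norm='ortho')
    return coeffs[:k, :k].ravel()

# Synthetic toy data: two "gesture" classes, nod-like (vertical) vs
# shake-like (horizontal) oscillation of all landmarks, plus noise.
rng = np.random.default_rng(0)
T, L = 32, 17  # frames and landmarks; both values arbitrary here

def make_sample(kind):
    t = np.linspace(0, 2 * np.pi, T)[:, None]
    traj = rng.normal(0.0, 0.05, (T, 2 * L))
    if kind == 'nod':
        traj[:, L:] += np.sin(2 * t)   # y-coordinates oscillate
    else:
        traj[:, :L] += np.sin(2 * t)   # x-coordinates oscillate
    return trajectory_dct_features(traj)

X = np.array([make_sample('nod') for _ in range(20)] +
             [make_sample('shake') for _ in range(20)])
y = np.array([0] * 20 + [1] * 20)

# Final-stage classifier, as in the paper's SVM step.
clf = SVC(kernel='linear').fit(X, y)
```

Keeping only the low-frequency DCT block compresses each trajectory to a fixed-length descriptor regardless of gesture duration, which is what makes a standard vector classifier such as the SVM applicable.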
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Çınar Akakın, H., Sankur, B. (2010). Spatiotemporal-Boosted DCT Features for Head and Face Gesture Analysis. In: Salah, A.A., Gevers, T., Sebe, N., Vinciarelli, A. (eds) Human Behavior Understanding. HBU 2010. Lecture Notes in Computer Science, vol 6219. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14715-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14714-2
Online ISBN: 978-3-642-14715-9