Abstract
The recent introduction of Hankelets to describe time series relies on the assumption that the time series has been generated by a vector autoregressive (VAR) model of order p. The success of Hankelet-based time series representations, mainly in nearest-neighbor classifiers, raises the question of whether and how this representation can be used in kernel machines without resorting to the usual mid-level representations (such as codebook-based representations). It is also of interest to investigate how this representation relates to probabilistic approaches to time series modeling, and which characteristics of the VAR model a Hankelet can capture. This paper aims to fill these gaps by deriving a time series kernel function for Hankelets (TSK4H), demonstrating the relations between the derived TSK4H and earlier dissimilarity/similarity scores, and highlighting an alternative probabilistic interpretation of Hankelets.
Experiments with an off-the-shelf SVM implementation, together with extensive validation in action classification and emotion recognition on several feature representations, show that the proposed TSK4H achieves classification accuracy on par with, or even superior to, the state of the art reported in past work. In contrast to state-of-the-art time series kernel functions, which suffer from numerical issues and tend to produce diagonally dominant kernel matrices, empirical results suggest that the TSK4H has limited numerical issues in high-dimensional spaces. On three widely used public benchmarks, TSK4H consistently outperforms other time series kernel functions despite its simplicity and limited time complexity.
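As a rough illustration of the representation the abstract refers to, the sketch below builds a block-Hankel matrix from a time series and compares two such matrices with a normalized-Gram dissimilarity of the kind used in earlier Hankelet work. It is not the TSK4H, which the abstract does not define. The function names `block_hankel`, `normalized_gram`, and `hankelet_dissimilarity`, and the parameter `num_block_rows`, are hypothetical names chosen for this example.

```python
import numpy as np


def block_hankel(series, num_block_rows):
    """Build a block-Hankel matrix from a (T, d) time series.

    Block row i, column j holds the observation at time i + j, so the
    result has shape (num_block_rows * d, T - num_block_rows + 1).
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]  # treat a 1-D input as a univariate series
    num_steps, _ = series.shape
    num_cols = num_steps - num_block_rows + 1
    if num_cols < 1:
        raise ValueError("series is too short for the requested block rows")
    # Each block row is the series shifted by one more time step.
    rows = [series[i:i + num_cols].T for i in range(num_block_rows)]
    return np.vstack(rows)


def normalized_gram(hankel):
    """Return H H^T scaled to unit Frobenius norm (assumes H is nonzero)."""
    gram = hankel @ hankel.T
    return gram / np.linalg.norm(gram, "fro")


def hankelet_dissimilarity(hankel_a, hankel_b):
    """Assumed score for illustration: 2 - ||G_a + G_b||_F on normalized Grams.

    Both matrices must have the same number of rows. The score is 0 when
    the two normalized Gram matrices coincide.
    """
    total = normalized_gram(hankel_a) + normalized_gram(hankel_b)
    return 2.0 - np.linalg.norm(total, "fro")
```

Because the Gram matrix is normalized, the score is unaffected by rescaling either series. An all-zero series would cause a division by zero in `normalized_gram`, so a real implementation would need to guard against that case.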
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Lo Presti, L., La Cascia, M. (2017). A Novel Time Series Kernel for Sequences Generated by LTI Systems. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10113. Springer, Cham. https://doi.org/10.1007/978-3-319-54187-7_29
DOI: https://doi.org/10.1007/978-3-319-54187-7_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54186-0
Online ISBN: 978-3-319-54187-7
eBook Packages: Computer Science (R0)