Abstract
In temporal data analysis, noisy data is inevitable in both training and testing, and this noise can seriously degrade analysis performance. To address this problem, we propose a novel method, termed Selective Temporal Filtering, that builds a noise-free model for classification during training and, during testing, identifies key-feature vectors, that is, noise-filtered data, from the input sequence. The use of these key-feature vectors makes the classifier robust to noise within the input space. The proposed method is validated on a synthetic dataset and a database of American Sign Language. Using key-feature vectors yields performance that is robust with respect to the noise content. Furthermore, we show that the proposed method outperforms Conditional Random Fields and Hidden Markov Models not only in noisy environments but also in a well-controlled environment in which we assume no significant noise vectors exist.
Notes
1 American Sign Language Database, http://www.bu.edu/asllrp/ncslgr.html
Acknowledgments
This work was partly supported by the ICT R&D program of MSIP/IITP [B0101-15-0552, Development of Predictive Visual Intelligence Technology] and also supported by the Implementation of Technologies for Identification, Behavior, and Location of Human based on Sensor Network Fusion Program through the Ministry of Trade, Industry and Energy (Grant No. 10041629).
Additional information
Myung-cheol Roh is currently with S-1 Corporation.
Cite this article
Roh, MC., Fazli, S. & Lee, SW. Selective temporal filtering and its application to hand gesture recognition. Appl Intell 45, 255–264 (2016). https://doi.org/10.1007/s10489-015-0757-8
DOI: https://doi.org/10.1007/s10489-015-0757-8