Abstract
A tree-based approach to integrated action segmentation, localization and recognition is proposed. An action is represented as a sequence of joint HOG/optical-flow descriptors extracted independently from each frame. During training, a set of action prototypes is first learned using k-means clustering, and a binary tree model is then constructed over the prototype set by hierarchical k-means clustering. Each tree node is characterized by a shape-motion descriptor and a rejection threshold, and each leaf node (corresponding to a prototype) additionally stores an action segmentation mask. During testing, an action is localized by mapping each test frame to its nearest-neighbor prototype via a fast matching search over the learned tree, followed by global filtering refinement. An action is recognized by maximizing, over the test frames, the sum of the joint probabilities of the action category and action prototype. Our approach does not explicitly rely on human tracking or background subtraction, and enables action localization and recognition in realistic and challenging conditions (such as crowded backgrounds). Experimental results show that our approach achieves recognition rates of 100% on the CMU action dataset and 100% on the Weizmann dataset.
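The two training-time structures described above (a prototype set from k-means, and a binary tree from hierarchical k-means) and the test-time nearest-neighbor descent can be sketched as follows. This is an illustrative sketch, not the authors' implementation: descriptors are assumed to be fixed-length NumPy vectors, the 2-means routine, `Node` class, and `leaf_size` parameter are our own simplifications, and the per-node rejection thresholds and segmentation masks from the paper are omitted.

```python
import numpy as np

def two_means(X, iters=10, seed=0):
    """Plain Lloyd's k-means with k=2; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=2, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Assign each descriptor to its nearest of the two centroids.
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                C[k] = X[labels == k].mean(axis=0)
    return C, labels

class Node:
    def __init__(self, center):
        self.center = center   # mean descriptor at this node
        self.children = []     # empty for leaves (prototypes)

def build_tree(prototypes, leaf_size=1):
    """Hierarchical 2-means over prototype descriptors -> binary tree."""
    node = Node(prototypes.mean(axis=0))
    if len(prototypes) <= leaf_size:
        return node
    _, labels = two_means(prototypes)
    parts = [prototypes[labels == k] for k in range(2)]
    if any(len(p) == 0 for p in parts):
        return node            # degenerate split: keep as a leaf
    node.children = [build_tree(p, leaf_size) for p in parts]
    return node

def nearest_prototype(node, q):
    """Greedy descent: at each node follow the child whose center is
    closest to the query frame descriptor q; the reached leaf is the
    matched prototype."""
    while node.children:
        node = min(node.children,
                   key=lambda c: np.linalg.norm(c.center - q))
    return node.center
```

The greedy descent visits O(depth) nodes per frame rather than comparing against every prototype, which is what makes per-frame prototype matching fast enough for localization over whole sequences.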
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Jiang, Z., Lin, Z., Davis, L.S. (2012). A Tree-Based Approach to Integrated Action Localization, Recognition and Segmentation. In: Kutulakos, K.N. (eds) Trends and Topics in Computer Vision. ECCV 2010. Lecture Notes in Computer Science, vol 6553. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35749-7_9
DOI: https://doi.org/10.1007/978-3-642-35749-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35748-0
Online ISBN: 978-3-642-35749-7
eBook Packages: Computer Science, Computer Science (R0)