Discriminative Orderlet Mining for Real-Time Recognition of Human-Object Interaction

Yu, Gang; Liu, Zicheng; Yuan, Junsong

doi:10.1007/978-3-319-16814-2_4

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9007)

Abstract

This paper presents a novel visual representation, called orderlets, for real-time human action recognition with depth sensors. An orderlet is a mid-level feature that captures the ordinal pattern among a group of low-level features. For skeletons, an orderlet captures a specific spatial relationship among a group of joints. For a depth map, an orderlet characterizes a comparative relationship of the shape information among a group of subregions. The orderlet representation has two useful properties. First, it is insensitive to small noise, since an orderlet depends only on the comparative relationship among individual features. Second, it is a frame-level representation and is thus suitable for real-time online action recognition. Experimental results demonstrate its superior performance on online action recognition and cross-environment action recognition.
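
The noise-insensitivity argument can be illustrated with a minimal sketch. The abstract does not give the exact orderlet definition, so this example assumes the ordinal pattern is simply the position of the largest value within a chosen group of features; the function name `ordinal_pattern`, the feature values, and the group indices are all hypothetical.

```python
import random


def ordinal_pattern(features, group):
    """Return the position (within `group`) of the largest selected feature value.

    Assumption: this stands in for an ordinal pattern over a feature group;
    it is not the paper's exact orderlet definition.
    """
    values = [features[i] for i in group]
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical per-frame feature values, e.g. one coordinate of six skeleton joints.
frame = [0.10, 0.85, 0.40, 0.62, 0.05, 0.33]
group = (1, 3, 4)  # hypothetical group of three joint indices

clean = ordinal_pattern(frame, group)

# Perturb every value by at most 0.05. The selected values (0.85, 0.62, 0.05)
# differ by at least 0.23, so the ordering inside the group cannot change.
rng = random.Random(0)
noisy_frame = [v + rng.uniform(-0.05, 0.05) for v in frame]
noisy = ordinal_pattern(noisy_frame, group)

print(clean, noisy)  # prints "0 0"
```

Because the output depends only on which feature in the group is largest, it stays the same under any perturbation smaller than half the gap between the compared values, and it is computed from a single frame.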

G. Yu: This work was done while Gang Yu was an intern at Microsoft Research. This work is supported in part by a Singapore MoE Tier-1 grant.

Notes

1. The dataset can be downloaded from http://research.microsoft.com/en-us/um/people/zliu/ActionRecoRsrc/default.htm.

Author information

Authors and Affiliations

School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
Gang Yu & Junsong Yuan
Microsoft Research, Redmond, WA, USA
Zicheng Liu

Corresponding author

Correspondence to Gang Yu.

Editor information

Editors and Affiliations

Technische Universität München, Garching, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

Cite this paper

Yu, G., Liu, Z., Yuan, J. (2015). Discriminative Orderlet Mining for Real-Time Recognition of Human-Object Interaction. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds) Computer Vision – ACCV 2014. Lecture Notes in Computer Science, vol 9007. Springer, Cham. https://doi.org/10.1007/978-3-319-16814-2_4

DOI: https://doi.org/10.1007/978-3-319-16814-2_4
Published: 17 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16813-5
Online ISBN: 978-3-319-16814-2
eBook Packages: Computer Science, Computer Science (R0)
