Abstract
Hand motion capture has been an active research topic, following the success of full-body pose tracking. Despite similarities, hand tracking proves to be more challenging, characterized by a higher dimensionality, severe occlusions and self-similarity between fingers. For this reason, most approaches rely on strong assumptions, like hands in isolation or expensive multi-camera systems, that limit practical use. In this work, we propose a framework for hand tracking that can capture the motion of two interacting hands using only a single, inexpensive RGB-D camera. Our approach combines a generative model with collision detection and discriminatively learned salient points. We quantitatively evaluate our approach on 14 new sequences with challenging interactions.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
The annotated dataset sequences and the supplementary material are available at http://files.is.tue.mpg.de/dtzionas/GCPR_2014.html.
References
Albrecht, I., Haber, J., Seidel, H.P.: Construction and animation of anatomically based human hand models. In: SCA, pp. 98–109 (2003)
Athitsos, V., Sclaroff, S.: Estimating 3d hand pose from a cluttered image. In: CVPR, pp. 432–439 (2003)
Ballan, L., Taneja, A., Gall, J., Van Gool, L., Pollefeys, M.: Motion capture of hands in action using discriminative salient points. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 640–653. Springer, Heidelberg (2012)
Baran, I., Popović, J.: Automatic rigging and animation of 3d characters. TOG 26(3), 72 (2007)
Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. PAMI 24(4), 509–522 (2002)
Bregler, C., Malik, J., Pullen, K.: Twist based acquisition and tracking of animal and human kinematics. IJCV 56(3), 179–194 (2004)
de Campos, T., Murray, D.: Regression-based hand pose estimation from multiple cameras. In: CVPR, pp. 782–789 (2006)
Canny, J.: A computational approach to edge detection. PAMI 8(6), 679–698 (1986)
Chen, Y., Medioni, G.: Object modeling by registration of multiple range images. In: ICRA, pp. 2724–2729 (1991)
Ekvall, S., Kragic, D.: Grasp recognition for programming by demonstration. In: ICRA, pp. 748–753 (2005)
Erol, A., Bebis, G., Nicolescu, M., Boyle, R.D., Twombly, X.: Vision-based hand pose estimation: a review. CVIU 108(1–2), 52–73 (2007)
Felzenszwalb, P.F., Huttenlocher, D.P.: Distance transforms of sampled functions. Technical report. Cornell Computing and Information Science (2004)
Gall, J., Fossati, A., Van Gool, L.: Functional categorization of objects using real-time markerless motion capture. In: CVPR, pp. 1969–1976 (2011)
Gall, J., Yao, A., Razavi, N., Van Gool, L., Lempitsky, V.: Hough forests for object detection, tracking, and action recognition. PAMI 33(11), 2188–2202 (2011)
Hamer, H., Schindler, K., Koller-Meier, E., Van Gool, L.: Tracking a hand manipulating an object. In: ICCV, pp. 1475–1482 (2009)
Hamer, H., Gall, J., Weise, T., Van Gool, L.: An object-dependent hand pose prior from sparse training data. In: CVPR, pp. 671–678 (2010)
Heap, T., Hogg, D.: Towards 3d hand tracking using a deformable model. In: FG, pp. 140–145 (1996)
Holzer, S., Rusu, R., Dixon, M., Gedikli, S., Navab, N.: Adaptive neighborhood selection for real-time surface normal estimation from organized point cloud data using integral images. In: IROS, pp. 2684–2689 (2012)
Jones, M.J., Rehg, J.M.: Statistical color models with application to skin detection. IJCV 46(1), 81–96 (2002)
Keskin, C., Kıraç, F., Kara, Y.E., Akarun, L.: Hand pose estimation and hand shape classification using multi-layered randomized decision forests. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 852–863. Springer, Heidelberg (2012)
Kim, D., Hilliges, O., Izadi, S., Butler, A.D., Chen, J., Oikonomidis, I., Olivier, P.: Digits: freehand 3d interactions anywhere using a wrist-worn gloveless sensor. In: UIST, pp. 167–176 (2012)
Kyriazis, N., Argyros, A.: Physically plausible 3d scene tracking: the single actor hypothesis. In: CVPR, pp. 9–16 (2013)
Lewis, J.P., Cordner, M., Fong, N.: Pose space deformation: a unified approach to shape interpolation and skeleton-driven deformation. In: SIGGRAPH, pp. 165–172 (2000)
MacCormick, J., Isard, M.: Partitioned sampling, articulated objects, and interface-quality hand tracking. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1843, pp. 3–19. Springer, Heidelberg (2000)
Moeslund, T.B., Hilton, A., Krüger, V.: A survey of advances in vision-based human motion capture and analysis. CVIU 104(2), 90–126 (2006)
Murray, R.M., Sastry, S.S., Zexiang, L.: A Mathematical Introduction to Robotic Manipulation (1994)
Oikonomidis, I., Kyriazis, N., Argyros, A.: Efficient model-based 3d tracking of hand articulations using kinect. In: BMVC, pp. 101.1–101.11 (2011)
Oikonomidis, I., Kyriazis, N., Argyros, A.: Full dof tracking of a hand interacting with an object by modeling occlusions and physical constraints. In: ICCV, pp. 2088–2095 (2011)
Oikonomidis, I., Kyriazis, N., Argyros, A.A.: Tracking the articulated motion of two strongly interacting hands. In: CVPR, pp. 1862–1869 (2012)
Paris, S., Durand, F.: A fast approximation of the bilateral filter using a signal processing approach. IJCV 81(1), 24–52 (2009)
Pons-Moll, G., Rosenhahn, B.: Model-Based Pose Estimation, pp. 139–170 (2011)
Rehg, J.M., Kanade, T.: Visual tracking of high dof articulated structures: an application to human hand tracking. In: Eklundh, J.-O. (ed.) ECCV 1994. LNCS, vol. 801, pp. 35–46. Springer, Heidelberg (1994)
Rehg, J., Kanade, T.: Model-based tracking of self-occluding articulated objects. In: ICCV, pp. 612–617 (1995)
Romero, J., Kjellström, H., Kragic, D.: Monocular real-time 3d articulated hand pose estimation. In: HUMANOIDS, pp. 87–92 (2009)
Romero, J., Kjellström, H., Kragic, D.: Hands in action: real-time 3d reconstruction of hands in interaction with objects. In: ICRA, pp. 458–463 (2010)
Rosales, R., Athitsos, V., Sigal, L., Sclaroff, S.: 3d hand pose reconstruction using specialized mappings. In: ICCV, pp. 378–387 (2001)
Rosenhahn, B., Brox, T., Weickert, J.: Three-dimensional shape knowledge for joint image segmentation and pose tracking. IJCV 73(3), 243–262 (2007)
Rusinkiewicz, S., Levoy, M.: Efficient variants of the icp algorithm. In: 3DIM, pp. 145–152 (2001)
Rusinkiewicz, S., Hall-Holt, O., Levoy, M.: Real-time 3d model acquisition. TOG 21(3), 438–446 (2002)
Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., Blake, A.: Real-time human pose recognition in parts from single depth images. In: CVPR, pp. 1297–1304 (2011)
Sridhar, S., Oulasvirta, A., Theobalt, C.: Interactive markerless articulated hand motion tracking using rgb and depth data. In: ICCV, pp. 2456–2463 (2013)
Stenger, B., Mendonca, P., Cipolla, R.: Model-based 3D tracking of an articulated hand. In: CVPR, pp. 310–315 (2001)
Stolfi, J.: Oriented Proj. Geometry: A Framework for Geom. Computation (1991)
Teschner, M., Kimmerle, S., Heidelberger, B., Zachmann, G., Raghupathi, L., Fuhrmann, A., Cani, M.P., Faure, F., Magnetat-Thalmann, N., Strasser, W.: Collision detection for deformable objects. In: Eurographics, pp. 119–139 (2004)
Thayananthan, A., Stenger, B., Torr, P.H.S., Cipolla, R.: Shape context and chamfer matching in cluttered scenes. In: CVPR, pp. 127–133 (2003)
Tzionas, D., Gall, J.: A comparison of directional distances for hand pose estimation. In: Weickert, J., Hein, M., Schiele, B. (eds.) GCPR 2013. LNCS, vol. 8142, pp. 131–141. Springer, Heidelberg (2013)
Vaezi, M., Nekouie, M.A.: 3d human hand posture reconstruction using a single 2d image. IJHCI 1(4), 83–94 (2011)
Wang, R.Y., Popović, J.: Real-time hand-tracking with a color glove. TOG 28(3), 68:1–68:8 (2009)
Acknowledgments
The authors acknowledge the help of Javier Romero and Jessica Purmort of MPI-IS regarding the acquisition of the personalized hand model, the assistance of Philipp Rybalov with annotation and the public software release of the FORTH tracker by the CVRL lab of FORTH-ICS, enabling comparison to [27]. Financial support was provided by the DFG Emmy Noether program (GA 1927/1-1).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Tzionas, D., Srikantha, A., Aponte, P., Gall, J. (2014). Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points. In: Jiang, X., Hornegger, J., Koch, R. (eds) Pattern Recognition. GCPR 2014. Lecture Notes in Computer Science(), vol 8753. Springer, Cham. https://doi.org/10.1007/978-3-319-11752-2_22
Download citation
DOI: https://doi.org/10.1007/978-3-319-11752-2_22
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11751-5
Online ISBN: 978-3-319-11752-2
eBook Packages: Computer ScienceComputer Science (R0)