Latent-Class Hough Forests for 3D Object Detection and Pose Estimation

Tejani, Alykhan; Tang, Danhang; Kouskouridas, Rigas; Kim, Tae-Kyun

doi:10.1007/978-3-319-10599-4_30

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8694)

Included in the following conference series:

European Conference on Computer Vision

Abstract

In this paper we propose a novel framework, Latent-Class Hough Forests, for 3D object detection and pose estimation in heavily cluttered and occluded scenes. First, we adapt the state-of-the-art template matching feature, LINEMOD [14], into a scale-invariant patch descriptor and integrate it into a regression forest using a novel template-based split function. In training, rather than explicitly collecting representative negative samples, our method is trained on positive samples only, and we treat the class distributions at the leaf nodes as latent variables. During inference we iteratively update these distributions, providing accurate estimation of background clutter and foreground occlusions and thus a better detection rate. Furthermore, as a by-product, the latent class distributions can provide accurate occlusion-aware segmentation masks, even in the multi-instance scenario. In addition to an existing public dataset, which contains only single-instance sequences with large amounts of clutter, we have collected a new, more challenging dataset for multiple-instance detection containing heavy 2D and 3D clutter as well as foreground occlusions. We evaluate the Latent-Class Hough Forest on both of these datasets, where we outperform state-of-the-art methods.

References

Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: COLT. ACM (1998)
Breiman, L.: Random forests. Machine Learning (2001)
Chan, J., Koprinska, I., Poon, J.: Co-training with a single natural feature set applied to email classification. In: WIC (2004)
Choi, C., Christensen, H.I.: 3D pose estimation of daily objects using an rgb-d camera. In: IROS (2012)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)
Dollár, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: A benchmark. In: CVPR (2009)
Drost, B., Ulrich, M., Navab, N., Ilic, S.: Model globally, match locally: Efficient and robust 3D object recognition. In: CVPR (2010)
Fanelli, G., Gall, J., Van Gool, L.: Real time head pose estimation with random regression forests. In: CVPR (2011)
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. PAMI (2010)
Gall, J., Yao, A., Razavi, N., Van Gool, L., Lempitsky, V.: Hough forests for object detection, tracking, and action recognition. PAMI (2011)
Girshick, R., Shotton, J., Kohli, P., Criminisi, A., Fitzgibbon, A.: Efficient regression of general-activity human poses from depth images. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 415–422. IEEE (2011)
Goldman, S., Zhou, Y.: Enhancing supervised learning with unlabeled data. In: ICML (2000)
Hinterstoisser, S., Benhimane, S., Lepetit, V., Navab, N.: Simultaneous recognition and homography extraction of local patches with a simple linear classifier (2008)
Hinterstoisser, S., Holzer, S., Cagniart, C., Ilic, S., Konolige, K., Navab, N., Lepetit, V.: Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes. In: ICCV (2011)
Hinterstoisser, S., Lepetit, V., Ilic, S., Fua, P., Navab, N.: Dominant orientation templates for real-time detection of texture-less objects. In: CVPR (2010)
Hinterstoisser, S., Lepetit, V., Ilic, S., Holzer, S., Bradski, G., Konolige, K., Navab, N.: Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012, Part I. LNCS, vol. 7724, pp. 548–562. Springer, Heidelberg (2013)
Hsiao, E., Hebert, M.: Occlusion reasoning for object detection under arbitrary viewpoint. In: CVPR (2012)
Johnson, A.E., Hebert, M.: Using spin images for efficient object recognition in cluttered 3D scenes. PAMI (1999)
Khan, S.S., Madden, M.G.: One-class classification: Taxonomy of study and review of techniques. arXiv preprint arXiv:1312.0049 (2013)
Leibe, B., Leonardis, A., Schiele, B.: Combined object categorization and segmentation with an implicit shape model. In: ECCV (2004)
Leibe, B., Leonardis, A., Schiele, B.: Robust object detection with interleaved categorization and segmentation. IJCV (2008)
Liu, R., Cheng, J., Lu, H.: A robust boosting tracker with minimum error bound in a co-training framework. In: ICCV (2009)
Moya, M., Koch, M., Hostetler, L.: One-class classifier networks for target recognition applications. Tech. rep. (1993)
Newcombe, R.A., Davison, A.J., Izadi, S., Kohli, P., Hilliges, O., Shotton, J., Molyneaux, D., Hodges, S., Kim, D., Fitzgibbon, A.: Kinectfusion: Real-time dense surface mapping and tracking. In: 2011 10th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pp. 127–136. IEEE (2011)
Okada, R.: Discriminative generalized hough transform for object dectection. In: ICCV (2009)
Opelt, A., Pinz, A., Zisserman, A.: Learning an alphabet of shape and appearance for multi-class object detection. IJCV (2008)
Perronnin, F., Sánchez, J., Liu, Y.: Large-scale image categorization with explicit data embedding. In: CVPR (2010)
Rios-Cabrera, R., Tuytelaars, T.: Discriminatively trained templates for 3D object detection: A real time scalable approach. In: ICCV (2013)
Shotton, J., Sharp, T., Kipman, A., Fitzgibbon, A., Finocchio, M., Blake, A., Cook, M., Moore, R.: Real-time human pose recognition in parts from single depth images. ACM (2013)
Skanect (2014), http://skanect.manctl.com/
Steger, C.: Similarity measures for occlusion, clutter, and illumination invariant object recognition. In: Radig, B., Florczyk, S. (eds.) DAGM 2001. LNCS, vol. 2191, pp. 148–154. Springer, Heidelberg (2001)
Tang, D., Liu, Y., Kim, T.K.: Fast pedestrian detection by cascaded random forest with dominant orientation templates. In: BMVC (2012)
Tang, D., Yu, T.H., Kim, T.K.: Real-time articulated hand pose estimation using semi-supervised transductive regression forests. In: ICCV (2013)
Tax, D.M.: One-class classification (2001)
Torralba, A., Efros, A.A.: Unbiased look at dataset bias. In: CVPR (2011)
Weise, T., Wismer, T., Leibe, B., Van Gool, L.: In-hand scanning with online loop closure. In: 2009 IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), pp. 1630–1637. IEEE (2009)
Yu, S., Krishnapuram, B., Rosales, R., Steck, H., Rao, R.B.: Bayesian co-training. In: NIPS (2007)
Zhang, Z.: Iterative point matching for registration of free-form curves and surfaces. International Journal of Computer Vision 13(2), 119–152 (1994)

Author information

Authors and Affiliations

Imperial College London, UK
Alykhan Tejani, Danhang Tang, Rigas Kouskouridas & Tae-Kyun Kim

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
ESAT - PSI, iMinds, KU Leuven, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

About this paper

Cite this paper

Tejani, A., Tang, D., Kouskouridas, R., Kim, TK. (2014). Latent-Class Hough Forests for 3D Object Detection and Pose Estimation. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8694. Springer, Cham. https://doi.org/10.1007/978-3-319-10599-4_30

DOI: https://doi.org/10.1007/978-3-319-10599-4_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10598-7
Online ISBN: 978-3-319-10599-4
eBook Packages: Computer Science, Computer Science (R0)