FPM: Fine Pose Parts-Based Model with 3D CAD Models

Lim, Joseph J.; Khosla, Aditya; Torralba, Antonio

doi:10.1007/978-3-319-10599-4_31

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8694)

Included in the following conference series: European Conference on Computer Vision

Abstract

We introduce a novel approach to the problem of localizing objects in an image and estimating their fine pose. Given exact CAD models and a few real training images with aligned models, we propose to leverage the geometric information from CAD models and appearance information from real images to learn a model that can accurately estimate fine pose in real images. Specifically, we propose FPM, a fine pose parts-based model that combines geometric information, in the form of shared 3D parts in deformable part-based models, with appearance information, in the form of objectness, to achieve both fast and accurate fine pose estimation. Our method significantly outperforms current state-of-the-art algorithms in both accuracy and speed.
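
The abstract describes combining a geometric part-based score with an objectness score to rank pose hypotheses. The following is a minimal sketch of that general idea only, not the chapter's actual formulation: the `PoseCandidate` structure, the linear weighting, and the weights `w_parts` and `w_obj` are all assumptions introduced for illustration, and the numbers are made up.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PoseCandidate:
    """One hypothesized fine pose of a CAD model (hypothetical structure)."""
    pose_id: str
    part_scores: Sequence[float]  # assumed per-part filter responses
    objectness: float             # assumed objectness of the projected region


def combined_score(c: PoseCandidate, w_parts: float = 1.0, w_obj: float = 1.0) -> float:
    # Assumed linear combination of geometric (part) and appearance (objectness) terms.
    return w_parts * sum(c.part_scores) + w_obj * c.objectness


def best_pose(candidates: Sequence[PoseCandidate],
              w_parts: float = 1.0, w_obj: float = 1.0) -> PoseCandidate:
    # Return the highest-scoring pose hypothesis under the combined score.
    return max(candidates, key=lambda c: combined_score(c, w_parts, w_obj))


# Illustrative, made-up candidates.
candidates = [
    PoseCandidate("pose_a", (0.5, 0.25), 0.25),
    PoseCandidate("pose_b", (0.5, 0.5), 0.5),
    PoseCandidate("pose_c", (1.0, 0.25), 0.0),
]
```

With both terms weighted equally, `best_pose(candidates)` selects `pose_b`; setting `w_obj=0.0` drops the appearance term and the part scores alone select `pose_c`, which shows how the objectness term can change the ranking in this toy setup.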

Author information

Authors and Affiliations

Massachusetts Institute of Technology, USA
Joseph J. Lim, Aditya Khosla & Antonio Torralba

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
ESAT - PSI, iMinds, KU Leuven, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

Cite this paper

Lim, J.J., Khosla, A., Torralba, A. (2014). FPM: Fine Pose Parts-Based Model with 3D CAD Models. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8694. Springer, Cham. https://doi.org/10.1007/978-3-319-10599-4_31

DOI: https://doi.org/10.1007/978-3-319-10599-4_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10598-7
Online ISBN: 978-3-319-10599-4
eBook Packages: Computer Science, Computer Science (R0)