3D scene reconstruction using a texture probabilistic grammar

Li, Dan; Hu, Disheng; Sun, Yuke; Hu, Yingsong

doi:10.1007/s11042-018-6052-z

Published: 01 May 2018

Volume 77, pages 28417–28440, (2018)

Multimedia Tools and Applications

Abstract

In this paper, a texture probabilistic grammar is defined for the first time. We develop an algorithm that recovers the 3D information in a 2D scene by training the texture probabilistic grammar on a prebuilt model library. The trained grammar can also be applied to 3D reconstruction. The process consists of four steps: dividing the 2D scene into texture fragments; assigning the most suitable 3D object label to each 2D texture fragment; using the texture probabilistic grammar to predict the 3D information of the texture fragments in the 2D scene image; and constructing the 3D model of the original 2D scene image. Experiments show that the algorithm achieves better results when reconstructing indoor scenes and building structures, and that it outperforms the traditional reconstruction method based on point clouds. Tests on different datasets and reconstructed objects verify the robustness of the algorithm. As a result, the algorithm can handle large numbers of scenes with similar semantics, and it is fast enough for online 3D reconstruction.
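
The abstract does not give the form of the grammar, so the following is only a toy illustration of the general idea of learning label-conditioned probabilities from a model library and using them to predict 3D information for labelled fragments. It is not the authors' method. The function names, the label and shape vocabulary, and the library contents are all hypothetical, and segmentation and labelling of the image are assumed to have been done already.

```python
from collections import Counter, defaultdict


def train_grammar(library):
    """Estimate P(shape | label) by relative frequency.

    `library` is a list of (label, shape) pairs standing in for a
    prebuilt model library (hypothetical format).
    """
    counts = defaultdict(Counter)
    for label, shape in library:
        counts[label][shape] += 1
    grammar = {}
    for label, shape_counts in counts.items():
        total = sum(shape_counts.values())
        grammar[label] = {s: n / total for s, n in shape_counts.items()}
    return grammar


def predict_shape(grammar, label):
    """Return the most probable 3D shape for a fragment label, with its probability."""
    rules = grammar[label]
    shape = max(rules, key=rules.get)
    return shape, rules[shape]


# Hypothetical toy library: object label paired with a coarse 3D primitive.
library = [
    ("wall", "vertical_plane"),
    ("wall", "vertical_plane"),
    ("wall", "box"),
    ("floor", "horizontal_plane"),
    ("table", "box"),
    ("table", "box"),
    ("table", "horizontal_plane"),
    ("table", "box"),
]
grammar = train_grammar(library)

# Labels assumed to come from an earlier segmentation and labelling step.
fragments = ["floor", "wall", "table"]
scene = [(label, *predict_shape(grammar, label)) for label in fragments]
for label, shape, prob in scene:
    print(f"{label}: {shape} (p={prob:.2f})")
```

A real system would condition on texture features and spatial context rather than on the label alone; the relative-frequency estimate is used here only because it is the simplest way to show a trained, probabilistic rule set.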

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61502185) and the Fundamental Research Funds for the Central Universities (No. 2017KFYXJJ071).

The authors would like to thank Shuang Liu for helping to collect the experimental material.

Author information

Authors and Affiliations

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
Dan Li, Disheng Hu, Yuke Sun & Yingsong Hu

Corresponding author

Correspondence to Yuke Sun.

About this article

Cite this article

Li, D., Hu, D., Sun, Y. et al. 3D scene reconstruction using a texture probabilistic grammar. Multimed Tools Appl 77, 28417–28440 (2018). https://doi.org/10.1007/s11042-018-6052-z

Received: 10 November 2017
Revised: 17 April 2018
Accepted: 23 April 2018
Published: 01 May 2018
Issue Date: November 2018
DOI: https://doi.org/10.1007/s11042-018-6052-z
