Abstract
In this paper, we present a framework for semantic scene parsing and object recognition based on dense depth maps. Five view-independent 3D features that vary with object class are extracted from dense depth maps at the superpixel level and used to train a classifier with the randomized decision forest technique. Our formulation integrates multiple features in a Markov Random Field (MRF) framework to segment and recognize different object classes in query street-scene images. We evaluate our method both quantitatively and qualitatively on the challenging Cambridge-driving Labeled Video Database (CamVid). The results show that, using dense depth information alone, we achieve more accurate segmentation and recognition than with sparse 3D features or appearance, or even the combination of the two, advancing the state of the art. Furthermore, by aligning dense-depth-based 3D features into a unified coordinate frame, our algorithm can handle the special case of view changes between training and testing scenarios. Preliminary cross-training-and-testing evaluation shows promising results.
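The pipeline the abstract outlines — per-superpixel class scores from a trained classifier, smoothed by an MRF over neighbouring superpixels — can be sketched as follows. This is an illustrative toy, not the paper's implementation: a nearest-centroid score stands in for the random-forest class posteriors, the superpixel adjacency graph is reduced to a chain, and the MRF energy is minimized with iterated conditional modes rather than graph cuts. All data, class labels, and parameter values are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 12 superpixels along a scan line, 5-D
# depth-derived feature vectors, 3 object classes (illustrative only).
centroids = np.array([[0., 0, 0, 0, 0],
                      [5., 5, 5, 5, 5],
                      [10., 10, 10, 10, 10]])
true_labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
features = centroids[true_labels] + rng.normal(0, 1.0, (12, 5))
features[2] = centroids[1]  # corrupt site 2 so the unary term alone errs

def unary_cost(x):
    """Data cost per class: squared distance to each class centroid,
    standing in for negative log random-forest posteriors."""
    return ((centroids - x) ** 2).sum(axis=1)

def icm(features, lam=70.0, iters=10):
    """Iterated conditional modes on a chain MRF with a Potts pairwise
    term: each site is greedily set to the label minimizing its unary
    cost plus lam per disagreeing neighbour."""
    labels = np.array([unary_cost(x).argmin() for x in features])
    n = len(labels)
    for _ in range(iters):
        for i in range(n):
            costs = unary_cost(features[i]).copy()
            for j in (i - 1, i + 1):  # chain neighbours
                if 0 <= j < n:
                    costs += lam * (np.arange(len(centroids)) != labels[j])
            labels[i] = costs.argmin()
    return labels

smoothed = icm(features)
```

The corrupted site is mislabeled by the unary term alone but corrected by the pairwise smoothing, which is the role the MRF plays in the full system (where graph cuts or belief propagation would be used on the actual superpixel graph).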
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Zhang, C., Wang, L., Yang, R. (2010). Semantic Segmentation of Urban Scenes Using Dense Depth Maps. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6314. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15561-1_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15560-4
Online ISBN: 978-3-642-15561-1