Scale Estimation of Monocular SfM for a Multi-modal Stereo Camera

Sumikura, Shinya; Sakurada, Ken; Kawaguchi, Nobuo; Nakamura, Ryosuke

doi:10.1007/978-3-030-20893-6_18

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11363)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

This paper proposes a novel method of estimating the absolute scale of monocular SfM for a multi-modal stereo camera. In the fields of computer vision and robotics, scale estimation for monocular SfM has been widely investigated in order to simplify systems. This paper addresses the scale estimation problem for a stereo camera system in which two cameras capture different spectral images (e.g., RGB and FIR), whose feature points are difficult to directly match using descriptors. Furthermore, the number of matching points between FIR images can be comparatively small, owing to the low resolution and lack of thermal scene texture. To cope with these difficulties, the proposed method estimates the scale parameter using batch optimization, based on the epipolar constraint of a small number of feature correspondences between the invisible light images. The accuracy and numerical stability of the proposed method are verified by synthetic and real image experiments.
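The scale estimation idea can be illustrated with a minimal sketch. This is not the authors' formulation, since the abstract does not give their cost function; it is a simplified algebraic variant under stated assumptions. It assumes that monocular SfM on the RGB camera yields world-to-camera poses (R_k, t_k) up to an unknown scale s, that the RGB-to-FIR stereo extrinsics (R_s, t_s) are known in metric units, and that FIR correspondences are given in normalized homogeneous coordinates. Under these assumptions the relative FIR translation between two frames is a + s*b, so the epipolar residual is linear in s and a batch least-squares estimate has a closed form. All function and variable names are hypothetical.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rot(axis, angle):
    """Rotation matrix from an axis and an angle (Rodrigues' formula)."""
    axis = np.asarray(axis, dtype=float)
    K = skew(axis / np.linalg.norm(axis))
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def estimate_scale(pairs, R_s, t_s):
    """Least-squares scale from algebraic epipolar residuals (assumed model).

    pairs: iterable of (R_i, t_i, R_j, t_j, x_i, x_j), where (R, t) are
    up-to-scale world-to-RGB-camera poses from SfM and x_i, x_j are (N, 3)
    normalized homogeneous FIR points matched between frames i and j.
    """
    num, den = 0.0, 0.0
    for R_i, t_i, R_j, t_j, x_i, x_j in pairs:
        # Relative FIR rotation, and FIR translation written as a + s * b.
        R_ij = (R_s @ R_j) @ (R_s @ R_i).T
        a = t_s - R_ij @ t_s
        b = R_s @ t_j - R_ij @ (R_s @ t_i)
        # Residual per match: x_j^T [a + s b]_x R_ij x_i = p + s * q.
        p = np.einsum("nk,kl,nl->n", x_j, skew(a) @ R_ij, x_i)
        q = np.einsum("nk,kl,nl->n", x_j, skew(b) @ R_ij, x_i)
        num += np.dot(p, q)
        den += np.dot(q, q)
    # Minimizer of sum (p + s q)^2 over all pairs and matches.
    return -num / den

# Synthetic, noise-free check with made-up geometry (illustrative values only).
rng = np.random.default_rng(0)
s_true = 2.5
R_s, t_s = rot([0, 0, 1], 0.05), np.array([0.2, 0.0, 0.0])
X = rng.uniform([-2, -2, 4], [2, 2, 8], size=(30, 3))   # metric 3D points
R = [rot([0, 1, 0], 0.1 * k) for k in range(3)]
t_metric = [np.array([0.3 * k, 0.0, 0.05 * k]) for k in range(3)]
t_sfm = [t / s_true for t in t_metric]                  # SfM loses the scale

def fir_points(k):
    """Normalized FIR image points of X at frame k."""
    X_fir = (R_s @ (R[k] @ X.T + t_metric[k][:, None]) + t_s[:, None]).T
    return X_fir / X_fir[:, 2:3]

pairs = [(R[i], t_sfm[i], R[j], t_sfm[j], fir_points(i), fir_points(j))
         for i, j in [(0, 1), (1, 2), (0, 2)]]
s_est = estimate_scale(pairs, R_s, t_s)
print(s_est)
```

In this simplified model the constant term a vanishes when the camera rig does not rotate between the two frames, so a pair with pure translation carries no information about s. Summing over many frame pairs, as in the batch setting the abstract describes, lets a small number of FIR matches per pair still constrain a single scale parameter.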

Acknowledgements

This research is supported by the Hori Sciences & Arts Foundation, the New Energy and Industrial Technology Development Organization (NEDO) and JSPS KAKENHI Grant Number 18K18071.

Author information

Authors and Affiliations

Nagoya University, Nagoya, Japan
Shinya Sumikura & Nobuo Kawaguchi
National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
Ken Sakurada & Ryosuke Nakamura

Corresponding author

Correspondence to Shinya Sumikura.

Editor information

Editors and Affiliations

IIIT Hyderabad, Hyderabad, India
C. V. Jawahar
ANU, Canberra, ACT, Australia
Hongdong Li
Simon Fraser University, Burnaby, BC, Canada
Greg Mori
ETH Zurich, Zurich, Zürich, Switzerland
Konrad Schindler

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 2 (mp4 30669 KB)

Supplementary material 1 (pdf 15031 KB)

Cite this paper

Sumikura, S., Sakurada, K., Kawaguchi, N., Nakamura, R. (2019). Scale Estimation of Monocular SfM for a Multi-modal Stereo Camera. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11363. Springer, Cham. https://doi.org/10.1007/978-3-030-20893-6_18

DOI: https://doi.org/10.1007/978-3-030-20893-6_18
Published: 29 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20892-9
Online ISBN: 978-3-030-20893-6
eBook Packages: Computer Science, Computer Science (R0)
