Abstract
This paper proposes a novel method for estimating the absolute scale of monocular SfM for a multi-modal stereo camera. In the fields of computer vision and robotics, scale estimation for monocular SfM has been widely investigated in order to simplify systems. This paper addresses the scale estimation problem for a stereo camera system in which the two cameras capture images in different spectral bands (e.g., RGB and FIR), so that feature points are difficult to match directly across the two cameras using descriptors. Furthermore, the number of matching points between FIR images can be comparatively small, owing to their low resolution and the lack of thermal texture in the scene. To cope with these difficulties, the proposed method estimates the scale parameter by batch optimization, based on the epipolar constraint for a small number of feature correspondences between the invisible-light (e.g., FIR) images. The accuracy and numerical stability of the proposed method are verified in experiments on synthetic and real images.
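To illustrate how the epipolar constraint between invisible-light images can constrain the scale, the following is a minimal sketch, not the formulation used in the paper. It assumes that monocular SfM on the RGB images gives world-to-camera poses (R_i, t_i) up to an unknown scale s, that the RGB-to-FIR extrinsics (R_e, t_e) are known in metric units, and that FIR correspondences are given in normalized homogeneous coordinates. Under these assumptions the relative FIR translation is affine in s, so the algebraic epipolar residual is linear in s and the batch least-squares estimate has a closed form. All function and variable names are hypothetical.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def epipolar_terms(R_i, t_i, R_j, t_j, R_e, t_e, x_i, x_j):
    """Return (p, q) such that the epipolar residual of each match is s*p + q.

    Assumed conventions (not taken from the paper):
      RGB camera, metric:  X_cam = R_i @ X_world + s * t_i
      FIR camera:          X_fir = R_e @ X_cam + t_e
      x_i, x_j: (N, 3) normalized homogeneous FIR points in frames i and j.
    """
    R_ij = (R_e @ R_j) @ (R_e @ R_i).T       # relative FIR rotation, i -> j
    a = R_e @ t_j - R_ij @ (R_e @ t_i)       # part of t_ij multiplied by s
    b = t_e - R_ij @ t_e                     # part of t_ij independent of s
    Rx = x_i @ R_ij.T                        # rows are R_ij @ x_i
    p = np.einsum('nk,kl,nl->n', x_j, skew(a), Rx)
    q = np.einsum('nk,kl,nl->n', x_j, skew(b), Rx)
    return p, q

def estimate_scale(pairs, R_e, t_e):
    """Least-squares scale over all frame pairs: minimize sum (s*p + q)^2.

    pairs: iterable of (R_i, t_i, R_j, t_j, x_i, x_j) tuples.
    """
    ps, qs = [], []
    for R_i, t_i, R_j, t_j, x_i, x_j in pairs:
        p, q = epipolar_terms(R_i, t_i, R_j, t_j, R_e, t_e, x_i, x_j)
        ps.append(p)
        qs.append(q)
    p, q = np.concatenate(ps), np.concatenate(qs)
    return -float(p @ q) / float(p @ p)
```

In this sketch the scale is observable only when the camera rotates between frames: for pure translation the scale-independent term b vanishes and the residual carries no information about s. A geometric error (e.g., the Sampson distance) with a robust loss would be a natural replacement for the algebraic residual when the correspondences are noisy.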
Acknowledgements
This research is supported by the Hori Sciences & Arts Foundation, the New Energy and Industrial Technology Development Organization (NEDO) and JSPS KAKENHI Grant Number 18K18071.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Sumikura, S., Sakurada, K., Kawaguchi, N., Nakamura, R. (2019). Scale Estimation of Monocular SfM for a Multi-modal Stereo Camera. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11363. Springer, Cham. https://doi.org/10.1007/978-3-030-20893-6_18
DOI: https://doi.org/10.1007/978-3-030-20893-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20892-9
Online ISBN: 978-3-030-20893-6
eBook Packages: Computer Science (R0)