Abstract
A traditional approach to extracting geometric information from a large scene is to compute multiple 3-D depth maps from stereo pairs or direct range finders, and then to merge the 3-D data. However, the merged depth maps may be subject to merging errors if the relative poses between depth maps are not known exactly. In addition, the 3-D data may have to be resampled before merging, which adds complexity and further potential sources of error.
This paper provides a means of directly extracting 3-D data covering a very wide field of view, thus bypassing the need to merge numerous depth maps. In our work, cylindrical images are first composited from sequences of images taken while the camera is rotated 360° about a vertical axis. By taking such image panoramas at different camera locations, we can recover 3-D data of the scene using a set of simple techniques: feature tracking, an 8-point structure-from-motion algorithm, and multibaseline stereo. We also investigate the effect of median filtering on the recovered 3-D point distributions, and show the results of our approach applied to both synthetic and real scenes.
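To make the geometry behind recovering 3-D points from panoramas at different camera locations concrete, here is a minimal sketch. It is not the authors' method: the abstract describes feature tracking, an 8-point structure-from-motion algorithm to recover the relative poses, and multibaseline stereo. The sketch assumes the camera positions are already known, the panoramas share the same orientation, and one feature has been matched between two panoramas. The function names, the pixel-to-azimuth convention, and the parameters `focal_px` and `v_center` are all illustrative assumptions.

```python
import math

def cylindrical_pixel_to_ray(u, v, width, focal_px, v_center):
    """Map a cylindrical-panorama pixel to a unit ray direction.

    Assumed convention (not taken from the paper): column u spans the full
    360 degrees across `width` pixels, the y axis is the vertical rotation
    axis, and `focal_px` is the cylinder radius in pixels.
    """
    theta = 2.0 * math.pi * u / width          # azimuth about the vertical axis
    x, y, z = math.sin(theta), (v - v_center) / focal_px, math.cos(theta)
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + t1*d1 and c2 + t2*d2.

    A simple two-view stand-in for the multibaseline stereo step, assuming
    known camera centres c1 and c2.
    """
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is not recoverable")
    t1 = (b * e - c * d) / denom               # parameter along the first ray
    t2 = (a * e - b * d) / denom               # parameter along the second ray
    p1 = tuple(p + t1 * q for p, q in zip(c1, d1))
    p2 = tuple(p + t2 * q for p, q in zip(c2, d2))
    return tuple((p + q) / 2.0 for p, q in zip(p1, p2))
```

With noisy matches the two rays generally do not intersect, which is why the sketch returns the midpoint of their closest approach. Using more than two panoramas, as a multibaseline approach does, would further constrain the estimate.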
Cite this article
Kang, S.B., Szeliski, R. 3-D Scene Data Recovery Using Omnidirectional Multibaseline Stereo. International Journal of Computer Vision 25, 167–183 (1997). https://doi.org/10.1023/A:1007971901577
DOI: https://doi.org/10.1023/A:1007971901577