Abstract
This paper describes a method for bringing two videos (recorded at different times) into spatiotemporal alignment, then comparing and combining corresponding pixels for applications such as background subtraction, compositing, and increasing dynamic range. We align a pair of videos by searching for frames that best match according to a robust image registration process. This process uses locally weighted regression to interpolate and extrapolate high-likelihood image correspondences, allowing new correspondences to be discovered and refined. Image regions that cannot be matched are detected and ignored, providing robustness to changes in scene content and lighting, which allows a variety of new applications.
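The interpolation and extrapolation step can be illustrated with a minimal sketch of locally weighted regression over sparse correspondences. This is not the authors' implementation: the function name `lwr_predict_offsets`, the Gaussian distance kernel, the affine local model, and the `bandwidth` and `ridge` parameters are all assumptions chosen for illustration. Each known correspondence is a pixel position in one frame, its offset to the matching position in the other frame, and a confidence that stands in for the match likelihood.

```python
import numpy as np

def lwr_predict_offsets(points, offsets, confidences, queries,
                        bandwidth=30.0, ridge=1e-6):
    """Predict 2D offsets at query pixels by locally weighted regression.

    Hypothetical helper: fits a separate affine model around each query,
    weighting every known correspondence by its confidence times a
    Gaussian falloff in image distance (both are illustrative choices).
    """
    points = np.asarray(points, dtype=float)            # (n, 2) positions
    offsets = np.asarray(offsets, dtype=float)          # (n, 2) displacements
    confidences = np.asarray(confidences, dtype=float)  # (n,) match likelihoods
    queries = np.asarray(queries, dtype=float)          # (m, 2) positions

    # Constant column plus (x, y): each local fit is affine in position.
    design = np.hstack([np.ones((len(points), 1)), points])
    predictions = np.empty((len(queries), 2))

    for i, q in enumerate(queries):
        sq_dist = np.sum((points - q) ** 2, axis=1)
        w = confidences * np.exp(-sq_dist / (2.0 * bandwidth ** 2))
        # Weighted normal equations; the small ridge term keeps the
        # system solvable when few correspondences carry weight.
        lhs = design.T @ (w[:, None] * design) + ridge * np.eye(3)
        rhs = design.T @ (w[:, None] * offsets)
        beta = np.linalg.solve(lhs, rhs)                # (3, 2) coefficients
        predictions[i] = np.array([1.0, q[0], q[1]]) @ beta
    return predictions
```

Queries inside the cloud of known correspondences are interpolated, and queries outside it are extrapolated by the same local affine fit. A correspondence with zero confidence contributes nothing, which loosely mirrors ignoring regions that cannot be matched.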
Index Terms
- Video matching
Video matching. SIGGRAPH '04: ACM SIGGRAPH 2004 Papers.