Abstract
We propose a robust methodology for 3D model-based markerless tracking of textured objects in monocular image sequences. The technique is based on mutual information maximization, a widely used criterion for multi-modal image registration, and employs an efficient multiresolution strategy to gain robustness while keeping computation time low, yielding near real-time performance for visual tracking of complex textured surfaces.
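To make the criterion concrete, the following is a minimal sketch of a histogram-based mutual information estimate between two equally sized sets of pixel intensities. It is an illustration only: the abstract does not state which estimator the method uses, and the function name `mutual_information` and the parameters `bins` and `max_value` are assumptions introduced here.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=256):
    """Histogram-based mutual information (in nats) between two
    equally sized sequences of pixel intensities in [0, max_value).

    Illustrative sketch only; not the estimator used in the paper.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Quantize intensities into a small number of levels so that the
    # joint histogram is well populated.
    qa = [min(int(v * bins / max_value), bins - 1) for v in a]
    qb = [min(int(v * bins / max_value), bins - 1) for v in b]

    n = len(qa)
    joint = Counter(zip(qa, qb))  # joint histogram counts
    count_a = Counter(qa)         # marginal histogram of a
    count_b = Counter(qb)         # marginal histogram of b

    mi = 0.0
    for (i, j), c in joint.items():
        # p_ij * log(p_ij / (p_i * p_j)), written in terms of raw counts
        mi += (c / n) * math.log(c * n / (count_a[i] * count_b[j]))
    return mi


# Hypothetical usage: an intensity pattern, its inverted copy, and a flat image.
pattern = [0, 64, 128, 192] * 4
inverted = [255 - v for v in pattern]
flat = [10] * len(pattern)

mi_inverted = mutual_information(pattern, inverted)
mi_flat = mutual_information(pattern, flat)
```

In this toy case the inverted copy scores as high as the pattern does against itself, while the flat image scores zero. This insensitivity to the intensity mapping is the property that makes mutual information suitable for multi-modal registration.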
Cite this article
Panin, G., Knoll, A. Mutual Information-Based 3D Object Tracking. Int J Comput Vis 78, 107–118 (2008). https://doi.org/10.1007/s11263-007-0083-7
DOI: https://doi.org/10.1007/s11263-007-0083-7