Abstract
This paper describes a model-assisted system for reconstructing 3D faces from a single consumer-quality camera using a structure-from-motion approach. Typical multi-view stereo approaches use the motion of a sparse set of features to compute camera pose, followed by a dense matching step to compute the final object structure. Accurate pose estimation depends on precise identification and matching of feature points between images, but because large areas of the face lack texture, matching is prone to errors.
To deal with outliers in both the sparse and dense matching stages, previous work either relies on a strong prior model of face geometry or imposes restrictions on the camera motion. Strong prior models seriously compromise final reconstruction quality: the results typically bear a signature resemblance to a generic or mean face. Model-based techniques, while giving the appearance of face detail, in fact carry this detail over from the model prior, so face features such as beards, moles, and other characteristic geometry are lost. Motion restrictions, such as allowing only pure rotation, are nearly impossible for an end user to satisfy, especially with a handheld camera.
We significantly improve the robustness and flexibility of existing monocular face reconstruction techniques by introducing a deformable generic face model only at the pose estimation, face segmentation, and preprocessing stages. To preserve data fidelity in the final reconstruction, this generic model is then discarded completely, and dense matching outliers are removed using tensor voting, a purely data-driven technique. Results are shown from a complete end-to-end system.
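To illustrate what data-driven outlier removal can look like, the following is a minimal sketch of a single unoriented "ball voting" pass, the first stage of the tensor voting framework. It is not the authors' implementation: the function names, the Gaussian decay with scale `sigma`, and the median-relative threshold `min_fraction` are assumptions made for this example, and a full tensor voting system adds further voting passes and structure extraction on top of this step.

```python
import numpy as np

def ball_vote_saliency(points, sigma):
    """Surface saliency (lambda1 - lambda2) after one unoriented ball-voting pass.

    points: (n, 3) array of 3D points; sigma: assumed scale of the Gaussian decay.
    """
    n = len(points)
    saliency = np.zeros(n)
    for i in range(n):
        diff = points[i] - points                  # vectors from each voter to receiver i
        dist2 = np.einsum("ij,ij->i", diff, diff)  # squared distances
        mask = dist2 > 0                           # a point does not vote for itself
        v = diff[mask] / np.sqrt(dist2[mask])[:, None]
        w = np.exp(-dist2[mask] / sigma**2)        # closer voters count more
        # Each voter contributes w * (I - v v^T): the unknown surface normal
        # at the receiver should be perpendicular to the direction v.
        tensor = w.sum() * np.eye(3) - np.einsum("k,ki,kj->ij", w, v, v)
        lam = np.linalg.eigvalsh(tensor)           # eigenvalues in ascending order
        saliency[i] = lam[2] - lam[1]              # large when neighbors form a surface
    return saliency

def remove_outliers(points, sigma, min_fraction=0.05):
    """Keep points whose saliency is at least min_fraction of the median saliency.

    The threshold rule is a hypothetical choice for this sketch.
    """
    s = ball_vote_saliency(points, sigma)
    return points[s >= min_fraction * np.median(s)]
```

Points supported by many neighbors lying on a common surface receive high saliency, while isolated points receive almost none, so no face model is needed to reject them. A call such as `remove_outliers(cloud, sigma=0.2)` would filter an `(n, 3)` array `cloud`; a suitable `sigma` depends on the point spacing of the data.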
© 2007 Springer-Verlag Berlin Heidelberg
Cite this paper
Fidaleo, D., Medioni, G. (2007). Model-Assisted 3D Face Reconstruction from Video. In: Zhou, S.K., Zhao, W., Tang, X., Gong, S. (eds) Analysis and Modeling of Faces and Gestures. AMFG 2007. Lecture Notes in Computer Science, vol 4778. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75690-3_10
DOI: https://doi.org/10.1007/978-3-540-75690-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75689-7
Online ISBN: 978-3-540-75690-3
eBook Packages: Computer Science (R0)