doi:10.1016/j.imavis.2003.12.005
Copyright © 2004 Published by Elsevier Science B.V.
Support vector machine based multi-view face detection and recognition
Yongmin Li a, Shaogang Gong b, Jamie Sherrah c and Heather Liddell b
a Department of Information Systems and Computing, Brunel University, Uxbridge, Middlesex UB8 3PH, UK
b Department of Computer Science, Queen Mary, University of London, London E1 4NS, UK
c Safehouse Technology Pty Ltd, 2a/68 Oxford Street, Collingwood, Victoria 3066, Australia
Received 19 October 2003; revised 15 December 2003; accepted 18 December 2003.
Available online 13 February 2004.
Abstract
Detecting faces across multiple views is more challenging than in a fixed view, e.g. frontal view, owing to the significant non-linear variation caused by rotation in depth, self-occlusion and self-shadowing. To address this problem, a novel approach is presented in this paper. The view sphere is separated into several small segments, and a face detector is constructed on each segment. We explicitly estimate the pose of an image regardless of whether or not it is a face, using a pose estimator constructed with Support Vector Regression. The pose information is then used to choose the appropriate face detector to determine whether the image is a face. With this pose-estimation based method, considerable computational efficiency is achieved. Meanwhile, the detection accuracy is also improved since each detector is constructed on a small range of views. We developed a novel algorithm for face detection by combining the Eigenface and SVM methods, which performs almost as fast as the Eigenface method but with significantly improved accuracy. Detailed experimental results are presented in this paper, including tuning the parameters of the pose estimators and face detectors, performance evaluation, and applications to video based face detection and frontal-view face recognition.
Author Keywords: Face recognition; Multi-view face detection; Head pose estimation; Support vector machines
Fig. 1. The framework for multi-view face detection. Motion estimation, skin-colour detection and background subtraction are adopted for selective attention to obtain the ROIs that may contain faces. Pose estimation is performed first for each image patch of the search on the ROIs, regardless of whether it contains a face. The pose information is used to select an appropriate face detector to determine if it contains a face.
Fig. 2. Representation of face patterns. (a) From top to bottom: the original face images, the patterns filtered with horizontal and vertical Sobel operators, and the patterns reconstructed from the first 20 PCs. (b) The first 10 significant PCs.
Fig. 3. Modelling multi-view faces. Only four detectors need to be constructed based on the symmetry property of human face: up profile, up frontal, down profile, down frontal. When detecting faces, only one of the detectors is chosen if pose information is available.
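The detector selection described in this caption can be sketched as follows. This is a minimal illustration and not the authors' implementation: the angle convention (yaw 0 is frontal, negative yaw is the mirror image of positive yaw, tilt >= 0 means looking up), the 45-degree boundary between frontal and profile, and the names `select_detector` and `prepare_patch` are all assumptions made for the example.

```python
import numpy as np

# Hypothetical boundary (degrees) between "frontal" and "profile" yaw.
# The value is an assumption for illustration, not taken from the paper.
PROFILE_YAW_THRESHOLD = 45.0


def select_detector(yaw, tilt):
    """Map an estimated pose to one of four detectors plus a mirror flag.

    Assumed convention: yaw 0 is frontal, and poses with negative yaw are
    treated as mirror images of poses with positive yaw; tilt >= 0 means
    the head is looking up.
    """
    vertical = "up" if tilt >= 0 else "down"
    horizontal = "profile" if abs(yaw) > PROFILE_YAW_THRESHOLD else "frontal"
    mirror = yaw < 0  # reuse the same detector on the flipped patch
    return f"{vertical}_{horizontal}", mirror


def prepare_patch(patch, mirror):
    """Flip the image patch left-to-right when the mirrored pose is needed."""
    return patch[:, ::-1] if mirror else patch
```

With these assumptions, a patch whose estimated pose is yaw -70 and tilt -10 would be flipped horizontally and passed to the "down_profile" detector, so only four detectors cover both sides of the view sphere.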
Fig. 4. The hybrid method of Eigenface and SVM.
Fig. 5. Pose estimation performance vs. tolerance coefficient. Results of (b,c) are computed on validation images.
Fig. 6. Pose estimation performance vs. PCA dimension
Fig. 7. Pose estimation on a test sequence. In (b) and (c), the solid curves are the estimated pose in yaw and tilt and the dotted curves are the ground-truth pose which is measured by the data acquisition system.
Fig. 8. Sample frames from a test sequence. From top to bottom are the face detection results of the SVM, Eigenface and hybrid methods. For each frame, detection is performed within the outer box. The small white box is the ground-truth position of the face, and the dark box is the detected face pattern.
Fig. 9. Comparison results of, from left to right, the SVM, Eigenface and hybrid methods for multi-view face detection on a test sequence: (a) shows the detection time in seconds on each frame; (b) and (c) are the position errors in pixels from the ground-truth position in horizontal (X) and vertical (Y) direction, respectively.
Fig. 10. Face detection on a video sequence. The larger boxes are obtained by motion-colour based selective attention. Face detection is then performed on these bounding boxes only. The final detections are labelled with the smaller boxes inside the larger ones.
Table 1. Parameters of the SVM based algorithm for pose estimation

Table 2. Parameters used to train the multi-view face detectors

Table 3. Test results on four sequences
