Prediction of eigenvalues and regularization of eigenfeatures for human face verification

https://doi.org/10.1016/j.patrec.2009.10.006

Abstract

We present a prediction and regularization strategy that alleviates the conventional problems of LDA and its variants. A procedure is proposed for predicting eigenvalues using a few reliable eigenvalues from the range space. The entire eigenspectrum is divided using two control points; however, the effective low-dimensional discriminative vectors are extracted from the whole eigenspace. The estimated eigenvalues are used to regularize the eigenfeatures in the eigenspace. This prediction and regularization enable discriminant evaluation to be performed in the full eigenspace. The proposed method is evaluated and compared with eight popular subspace based methods on the face verification task. Experimental results on popular face databases show that our method consistently outperforms the others.

Introduction

Human beings are experts at verifying a subject’s identity just by analyzing face images (photographs). This ability is very appealing, and automating it has become an active research area in many real-life machine vision applications. Face verification (FV) is an important tool for the authentication of an individual and plays a significant role in many security and e-commerce applications (Zhao et al., 2003).

Face verification and identification are the main applications of face recognition (FR). A face verification system has to discriminate between two kinds of events: either the person claiming a given identity is the true claimant, or the person is an impostor. In recent years, many subspace based approaches such as PCA and LDA have been applied to the FR problem (Zhao et al., 2003). The results are not satisfactory because PCA does not encode class information, and LDA suffers from instability of the eigenvalue decomposition due to the small number of training samples and the high dimensionality of face images. Moreover, in Fisherfaces (FLDA) (Belhumeur et al., 1997), the nonsingularity of the scatter matrices is not guaranteed (Zhuang and Dai, 2007).

In recent times, many researchers have noticed these problems and tried to solve them using different methods. Bayesian maximum likelihood (BML) is proposed in (Moghaddam et al., 2000, Moghaddam and Pentland, 1997). It uses a probabilistic similarity measure based on the Bayesian belief that image intensity differences are characteristic of typical variations in the appearance of an individual. Their similarity measure is expressed in terms of probability using two classes of facial image variations: intrapersonal variations and extrapersonal variations. Although this method performs well for the FR task, one needs to store the original face image of each individual in the database, which is, in general, of very high dimensionality. Moreover, the computation of their distance measure has very high time complexity, as it involves both the distance-in-feature-space and the distance-from-feature-space (Moghaddam et al., 1998, Moghaddam et al., 2000, Jiang et al., 2006).

To deliver promising FR results, a myriad of algorithms based on PCA and FLDA have recently been proposed in the literature (Zhao et al., 2003, Shakhnarovich and Moghaddam, 2005, Stan et al., 2004). The direct LDA (DLDA) approach (Yu et al., 2001) removes the null space of the between-class scatter matrix and extracts the eigenvectors corresponding to the smallest eigenvalues of the within-class scatter matrix. However, an argument against the DLDA algorithm is presented in (Gao and Davis, 2006), where it is shown that DLDA is actually a special case of LDA that directly takes the linear space of the class means as the LDA solution; the pooled covariance estimate is completely ignored. They also demonstrate that DLDA is not equivalent to traditional LDA in dealing with the small sample size problem and may impose performance limitations in general applications (Gao and Davis, 2006).

The null space LDA (NDA) approach is proposed in (Liu et al., 2004, Huang et al., 2002). The authors show that the null space of the total scatter matrix is the common null space of both the within-class and between-class scatter matrices. The algorithm first removes the null space of the total scatter matrix and projects the samples onto the null space of the within-class scatter matrix. It then removes the null space of the between-class scatter matrix in this subspace to obtain the optimal discriminant vectors. The basic notion of this algorithm is that the null space of the within-class scatter matrix carries particularly useful discriminative information. Interestingly, this appears to contradict the popular FLDA, which uses only the principal space and discards the null space. A problem common to all these approaches is that they lose some discriminative information, either in the principal space or in the null space.

To take advantage of both subspaces, dual space LDA (DSL) is proposed in (Wang and Tang, 2004a). Using the probabilistic visual model (Moghaddam and Pentland, 1997), the eigenvalue spectrum in the null space of the within-class scatter matrix is estimated. DSL performs discriminant analysis in both subspaces, and the discriminative features are combined in the recognition phase. The features in the complementary subspace are scaled by the average eigenvalue of the within-class scatter matrix over this subspace. As the eigenvalues in this subspace are not well estimated (Wang and Tang, 2004a), their average may not be a good scaling factor relative to those in the principal subspace. Features extracted from the two complementary subspaces are fused using a summed normalized distance (Yang et al., 2005). Open questions for these two approaches are how to divide the space into the principal and complementary subspaces, and how to apportion a given number of features between the two. Furthermore, as discriminative information resides in both subspaces, it is inefficient and only suboptimal to extract features separately from each.

Another popular approach, the unified framework of subspaces (UFS) (Wang et al., 2004b), addresses the problems of instability and noise disturbances in LDA based methods. Using this framework, the authors demonstrate the importance of noise suppression. The approach applies three stages of subspace decomposition sequentially to the face training data, and the dimensionality reduction occurs at the very first stage. However, as addressed in the literature (Jiang et al., 2007, Cevikalp et al., 2005, Wang and Tang, 2004a), applying PCA for dimensionality reduction may lose discriminative information. Another open question for UFS is how to choose the number of principal dimensions for the first two stages of subspace decomposition before selecting the final number of features in the third stage. The experimental results in (Wang et al., 2004b) show that the recognition performance is sensitive to these choices.

In this paper, we revisit the shortcomings of the FLDA approach for the FV task and related ideas proposed in (Mandal et al., 2008). FLDA has an instability problem due to the limited number of training samples and the high dimensionality of face images. Moreover, it loses important discriminative information in the range and/or null space. To alleviate these problems, we propose to partition the entire eigenspace into reliable, unreliable and null regions using two control points. A procedure for eigenvalue prediction is proposed, and the forecasted eigenvalues are used to regularize the eigenfeatures in the eigenspace. This prediction and regularization enable discriminant evaluation to be performed in the full eigenspace and effective low-dimensional discriminative features to be extracted from face images. We evaluate and compare our approach with eight other popular subspace based methods on the FV task.

In the following section, we present the partitioning of subspaces and the eigenspectrum modeling. In Section 3, we discuss the eigenfeature scaling and extraction procedures. Experimental results and discussions are presented in Section 4. Finally, conclusions are drawn in Section 5.

Section snippets

Partitioning of subspaces and eigenspectrum modeling

Given a set of properly normalized h-by-w face images, we can form a training set of column vectors $\{X_{ij}\}$, where $X_{ij} \in \mathbb{R}^n$, $n = h \times w$, is the image column vector obtained by lexicographic ordering of the pixel elements of image $j$ of person $i$. Let the training set contain $p$ persons and $q_i$ sample images for person $i$. The total number of training samples is $l = \sum_{i=1}^{p} q_i$. For face recognition, each person is a class with prior probability $c_i$. The within-class scatter matrix is defined by
$$S_w = \sum_{i=1}^{p} \frac{c_i}{q_i} \sum_{j=1}^{q_i} \left(X_{ij} - \bar{X}_i\right)\left(X_{ij} - \bar{X}_i\right)^T,$$
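To make the definition concrete, the following is a minimal NumPy sketch of this computation. It is an illustration only, not the authors' code; the function name, the list-of-arrays input layout, and the default empirical priors $c_i = q_i/l$ are assumptions.

```python
import numpy as np

def within_class_scatter(classes, priors=None):
    """Within-class scatter S_w = sum_i (c_i/q_i) sum_j (X_ij - Xbar_i)(X_ij - Xbar_i)^T.

    classes : list of arrays; classes[i] has shape (q_i, n), its rows being
              the image column vectors X_ij of person i.
    priors  : class priors c_i; defaults to the empirical priors q_i / l
              (an assumption -- the paper only states each class has prior c_i).
    """
    l = sum(Xi.shape[0] for Xi in classes)        # total sample count l
    n = classes[0].shape[1]                       # pixel dimensionality n = h*w
    if priors is None:
        priors = [Xi.shape[0] / l for Xi in classes]
    Sw = np.zeros((n, n))
    for c_i, Xi in zip(priors, classes):
        q_i = Xi.shape[0]
        D = Xi - Xi.mean(axis=0)                  # center by the class mean Xbar_i
        Sw += (c_i / q_i) * (D.T @ D)             # sum over j of outer products
    return Sw
```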

Eigenfeature scaling and extraction

The partitioning of the eigenspectrum has helped in identifying the face, noise and null regions. Eigenvalues are then forecasted using the few reliable eigenvalues from the range space. From Fig. 2 it is evident that the noise component is small compared to the face component in region F but dominates in region N. Thus, the predicted eigenspectrum $\tilde{\lambda}_k^w$ is given by
$$\tilde{\lambda}_k^w = \begin{cases} \lambda_k^w, & k < m_1, \\[2pt] \dfrac{\alpha}{k + \beta}, & m_1 \leqslant k \leqslant m_2, \\[2pt] \dfrac{\alpha}{r_w + 1 + \beta}, & m_2 < k \leqslant n. \end{cases}$$
The proposed feature weighting function is then
$$\tilde{w}_k^w = \frac{1}{\sqrt{\tilde{\lambda}_k^w}}, \qquad k = 1, 2, \ldots, n.$$
Fig. 2 shows the proposed …
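The piecewise model translates directly into code. Below is a minimal sketch, assuming the control points $m_1$, $m_2$, the rank $r_w$ of $S_w$, and the decay parameters $\alpha$, $\beta$ (fitted to the reliable face-region eigenvalues) are already given; indices are 1-based as in the text.

```python
import numpy as np

def predicted_eigenspectrum(lam, m1, m2, r_w, alpha, beta):
    """Regularized eigenspectrum of S_w under the three-region piecewise model.

    lam : eigenvalues lambda_k^w of S_w, sorted in descending order.
    """
    lam = np.asarray(lam, dtype=float)
    k = np.arange(1, lam.size + 1)                   # 1-based index k
    return np.where(
        k < m1, lam,                                 # face region: keep as-is
        np.where(k <= m2, alpha / (k + beta),        # noise region: 1/(k+beta) decay
                 alpha / (r_w + 1 + beta)))          # null region: constant floor

def feature_weights(lam_tilde):
    """Whitening-style weights w_k = 1 / sqrt(lambda_tilde_k)."""
    return 1.0 / np.sqrt(lam_tilde)
```

Eigenfeatures are then scaled by `feature_weights(predicted_eigenspectrum(...))` before the subsequent discriminant evaluation.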

Experimental results and discussions

AR, FERET database 1 and FERET database 2 are used in our experiments. In all the experiments reported in this work, images are preprocessed, aligned and normalized following the CSU Face Identification Evaluation System (Beveridge et al., 2003), which also employs the FERET database. Face verification is performed by accepting a claimant if the subject’s matching score is greater than or equal to a threshold, and rejecting the claimant if the matching score is lower than the threshold. Verification …
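As an illustration of this accept/reject rule, here is a minimal sketch of the thresholding decision and the resulting verification error rates; the function names and the higher-score-is-better-match convention are assumptions, not part of the paper's protocol description.

```python
import numpy as np

def verify(score, threshold):
    """Accept the identity claim iff the matching score reaches the threshold."""
    return score >= threshold

def error_rates(genuine_scores, impostor_scores, threshold):
    """False rejection and false acceptance rates at a given threshold."""
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    frr = np.mean(genuine < threshold)     # true claimants wrongly rejected
    far = np.mean(impostor >= threshold)   # impostors wrongly accepted
    return frr, far
```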

Conclusions

Subspace based approaches such as FLDA, DLDA, NDA and UFS discard a subspace before the discriminant evaluation. The extracted features are only suboptimal, as they are the most discriminative only within a subspace. Although BML works in the whole space, it does not evaluate the discriminant value and, hence, the whole face image must be used in matching. The DSL approach scales features in the complementary subspace by the average eigenvalue of the within-class scatter matrix over this subspace. As …

Acknowledgement

This work was supported by the Institute for Infocomm Research, A∗STAR, Singapore.

References

  • Huang, R., Liu, Q., Lu, H., Ma, S., 2002. Solving the small sample size problem of LDA. In: Proc. 16th Internat. Conf. on...
  • Jiang, X.D., 2009. Asymmetric principal component and discriminant analyses for pattern classification. IEEE Trans. Pattern Anal. Machine Intell.
  • Jiang, X.D., et al., 2006. Enhanced maximum likelihood face recognition. IEE Electron. Lett.
  • Jiang, X.D., Mandal, B., Kot, A., 2007. Face recognition based on discriminant evaluation in the whole space. In: IEEE...
  • Jiang, X.D., et al., 2008. Eigenfeature regularization and extraction in face recognition. IEEE Trans. Pattern Anal. Machine Intell.

This paper is based on work that received the best biometrics student paper award at the 19th International Conference on Pattern Recognition (ICPR), Tampa, FL, December 10, 2008.
