Two-dimensional principal component analysis based on Schatten p-norm for image feature extraction☆
Introduction
In many image classification and recognition applications, e.g. face recognition, feature extraction plays a key role because it alleviates the so-called "curse of dimensionality" [1] and reduces the computational burden. In the past two decades, researchers have proposed many vector-based feature extraction methods, including principal component analysis (PCA) [2], linear discriminant analysis (LDA) [3], locality preserving projection (LPP) [4], margin Fisher analysis (MFA) [5], and sparsity preserving projection (SPP) [6]. Although vector-based feature extraction methods have been successfully applied in many real-world image classification and recognition applications, they first need to transform image matrices into image vectors. This matrix-to-vector transformation inevitably discards the underlying spatial information of images, which may make vector-based methods suboptimal for extracting the most representative or discriminative features [7]. Recently, Yu et al. presented several subspace learning and feature extraction methods, such as the sparse patch alignment framework (SPAF) [8], adaptive hyper-graph learning [9], multimodal hyper-graph learning [10], high-order distance-based multi-view stochastic learning (HD-MSL) [11], and multi-view subspace learning [12]. These methods have been successfully applied to image clustering, image classification, and web image re-ranking. Moreover, Liu and Tao [13] introduced multi-view Hessian regularization (mHR) into multi-view semi-supervised learning (mSSL) for image annotation. Xu et al. [14] extended the theory of the information bottleneck (IB) and proposed a large-margin multi-view information bottleneck (LMIB) method, which models the multi-view learning problem as a communication system with multiple senders, each of which represents one view of the data.
Furthermore, they also proposed a multi-view intact space learning algorithm [15] to integrate the encoded complementary information of multiple views of the data.
In order to adequately utilize the underlying spatial information of images, many matrix-based feature extraction methods have been developed, such as two-dimensional PCA (2DPCA) [16], two-dimensional LDA (2DLDA) [17], two-directional maximum scatter difference (2DMSD) [18], two-dimensional LPP (2DLPP) [19], and Binary 2DPCA [20]. Different from vector-based feature extraction methods, matrix-based ones directly treat an image as a 2D matrix rather than as a 1D vector.
It must be pointed out that most of the above-mentioned methods are based on the L2-norm or Frobenius-norm criterion. Such methods are susceptible to outliers because the L2- or Frobenius-norm criterion amplifies the effect of outliers; as a result, the presence of outliers may push the projection vectors away from the desired directions. It is well known that the L1-norm is more robust than the L2- or Frobenius-norm [21], [22]. Hence, many L1-norm-based methods have been developed for feature extraction; representative examples are L1-norm PCA (L1-PCA) [23], R1-PCA [24], PCA-L1 [25], LDA-R1 [26], and LDA-L1 [27], [28]. Among the L1-norm-based PCA methods [23], [24], [25], PCA-L1 is an efficient, robust, and rotationally invariant method, and, as reported in Ref. [25], its optimization technique is intuitive, simple, and easy to implement. However, since Ref. [25] solves the L1-norm maximization problem with a greedy strategy, it is prone to getting stuck in a local solution. To address this issue, Nie et al. [29] proposed a robust PCA with non-greedy L1-norm maximization, in which an efficient non-greedy algorithm solves the optimization problem of PCA-L1. Moreover, Kwak [30] proposed several PCA methods based on the Lp-norm criterion, which seek projections that maximize the general Lp-norm with arbitrary p in the feature space. Following the basic idea of PCA-L1, Li et al. [7] proposed an L1-norm-based 2DPCA (2DPCA-L1) method, a robust L1-norm version of the conventional 2DPCA; for notational clarity, we refer to the conventional 2DPCA as 2DPCA-L2 hereafter. Pang et al. [31] further presented an L1-norm-based tensor analysis (TPCA-L1) method.
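As a rough illustration of the greedy strategy discussed above, the following sketch computes a single projection vector in the spirit of PCA-L1 [25] by iterative sign flipping; this is our own minimal NumPy rendering of the idea, and the function and variable names are ours, not from the paper:

```python
import numpy as np

def pca_l1_first_component(X, n_iter=100, seed=0):
    """Greedy PCA-L1 sketch for one projection vector:
    maximize sum_i |w^T x_i| subject to ||w|| = 1 by iteratively
    flipping signs. X holds one (centered) sample per row."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        signs = np.sign(X @ w)
        signs[signs == 0] = 1            # avoid the degenerate zero case
        w_new = X.T @ signs              # direction increasing the L1 objective
        w_new /= np.linalg.norm(w_new)
        if np.allclose(w_new, w):        # fixed point reached
            break
        w = w_new
    return w
```

Because each update can only increase the L1 objective, the iteration converges, but only to a local maximizer, which is exactly the limitation that the non-greedy algorithm of Nie et al. [29] addresses.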
As a supervised method, LDA-L1 [27], [28] is an effective and robust L1-norm version of the conventional LDA [3]; it seeks a set of locally optimal projection vectors by maximizing the ratio of the L1-norm-based between-class scatter to the L1-norm-based within-class scatter in the feature space.
Over the last few years, the Schatten p-norm criterion has attracted much attention in the machine learning and pattern recognition fields. Nie et al. [32] proposed a low-rank matrix recovery method based on Schatten p-norm minimization to recover a low-rank matrix with a fraction of its entries arbitrarily corrupted, and derived an efficient algorithm to solve the Schatten p-norm-based optimization problem. They also proposed a robust matrix completion method based on joint Schatten p-norm and Lp-norm minimization [33], [34], which better approximates the rank minimization problem and is more robust to outliers. Luo et al. [35] developed a Schatten p-norm-based matrix regression model for image classification and presented a general framework for solving the Schatten p-norm minimization problem with Lq regularization. Gu et al. [36] proposed a discriminative metric based on the Schatten p-norm. By analyzing the statistical properties of the Schatten p-norm metric, they explained why, under the Frobenius-norm metric, the differences between facial images caused by imaging factors (e.g. illumination, view direction, and expression) are larger than the differences due to identity variations, and showed that the Schatten p-norm metric is more robust to such factors [36]. Based on these observations, they proposed a Schatten 1-norm PCA (SPCA) [36] method. Zhang et al. [37] further pointed out that SPCA is only an approximate algorithm, because it imposes a very strict constraint on the maximization of the Schatten 1-norm-based PCA criterion, namely that the projection matrix must be orthogonal, whereas in many real-world applications the projection matrix is column-rank-deficient. To address this issue, they proposed an exact nuclear-norm-based 2DPCA (N-2DPCA) [37] for image feature extraction. However, both SPCA and N-2DPCA are only concerned with a special case of the Schatten p-norm, namely p = 1.
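For concreteness, the Schatten p-norm of a matrix is the Lp-norm of its vector of singular values; p = 1 recovers the nuclear norm used by SPCA and N-2DPCA, and p = 2 recovers the Frobenius norm. A minimal NumPy sketch (our own illustration, not code from any of the cited works):

```python
import numpy as np

def schatten_p_norm(X, p):
    """Schatten p-norm: the Lp-norm of the singular values of X.
    For 0 < p < 1 it is only a quasi-norm, which is the regime
    argued to be more robust to outliers."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values
    return np.sum(s ** p) ** (1.0 / p)

A = np.array([[3.0, 0.0],
              [0.0, 4.0]])                  # singular values: 4 and 3
print(schatten_p_norm(A, 2))  # Frobenius norm: sqrt(9 + 16) = 5
print(schatten_p_norm(A, 1))  # nuclear norm: 3 + 4 = 7
```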
In this paper, we propose a Schatten p-norm-based 2DPCA (2DPCA-Sp) method that maximizes the total scatter criterion based on the Schatten p-norm in the low-dimensional feature space. Since different values of p suit different applications, the proposed 2DPCA-Sp can be regarded as a general framework for 2DPCA. It is easy to verify that the conventional 2DPCA-L2 [16] and SPCA [36] (or N-2DPCA [37]) are special cases of 2DPCA-Sp with p = 2 and p = 1, respectively. Although 2DPCA-Sp is theoretically defined for arbitrary p > 0, we focus on 0 < p < 1 to make it more robust to outliers and less sensitive to the imaging factors. We also derive an efficient iterative algorithm to solve the objective of 2DPCA-Sp with 0 < p < 1.
The remainder of this paper is organized as follows. Section 2 briefly reviews the conventional 2DPCA. The proposed 2DPCA-Sp is presented in Section 3. Section 4 gives the experimental results. Finally, Section 5 concludes this paper.
Section snippets
Conventional 2DPCA
As a classical matrix-based feature extraction method, 2DPCA [16] has been widely used in the machine learning and pattern recognition fields. Suppose there are N training image matrices, and let the ith training image matrix be A_i ∈ R^{m×n}. 2DPCA aims at finding a projection matrix W ∈ R^{n×d} (d ≤ n) to transform A_i into the feature space, i.e. Y_i = A_i W, and meanwhile maximize the total scatter criterion in the feature space, J(W) = Σ_{i=1}^{N} ||(A_i − Ā)W||_F², where Ā = (1/N) Σ_{i=1}^{N} A_i is the mean of all training image matrices.
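The conventional 2DPCA reviewed above reduces to an eigendecomposition of an n×n image covariance matrix. A minimal NumPy sketch of this standard procedure (variable names are ours):

```python
import numpy as np

def twod_pca(images, d):
    """Conventional 2DPCA: project each m-by-n image onto the top-d
    eigenvectors of the image covariance matrix
    G = (1/N) * sum_i (A_i - mean)^T (A_i - mean)  (an n-by-n matrix)."""
    A = np.asarray(images, dtype=float)        # shape (N, m, n)
    centered = A - A.mean(axis=0)
    # Image total scatter (covariance) matrix, size n x n
    G = np.einsum('imn,imk->nk', centered, centered) / len(A)
    # eigh returns eigenvalues in ascending order; keep the top d vectors
    _, vecs = np.linalg.eigh(G)
    W = vecs[:, ::-1][:, :d]                   # n x d projection matrix
    features = A @ W                           # each image -> m x d feature matrix
    return W, features
```

Note that each image is mapped to an m×d feature matrix, not a vector, which is what preserves the row-wise spatial structure.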
Schatten p-norm-based 2DPCA
The conventional 2DPCA-L2 tries to find the optimal projection matrix that maximizes the total scatter criterion based on the Frobenius norm in the feature space. However, the Frobenius-norm metric is known to be sensitive to outliers, which means that the presence of outliers may make the solution of 2DPCA-L2 deviate from the desired one. Motivated by the existing Schatten p-norm-based models [32], [33], [34], [35], [36], [37], we propose a general Schatten p-norm-based 2DPCA (2DPCA-Sp) method.
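The Schatten p-norm total scatter that 2DPCA-Sp maximizes can be evaluated as below; this is only an illustrative sketch of the objective (with our own names), not the paper's iterative optimization algorithm:

```python
import numpy as np

def scatter_sp(images, W, p):
    """Schatten p-norm total scatter of the projected images:
    sum_i ||(A_i - mean) W||_Sp^p, i.e. for each centered, projected
    image the sum of its singular values raised to the power p."""
    A = np.asarray(images, dtype=float)        # shape (N, m, n)
    mean = A.mean(axis=0)
    total = 0.0
    for Ai in A:
        sv = np.linalg.svd((Ai - mean) @ W, compute_uv=False)
        total += np.sum(sv ** p)
    return total
```

With p = 2 this quantity equals the Frobenius-norm total scatter of 2DPCA-L2, since the squared Frobenius norm is the sum of squared singular values; choosing 0 < p < 1 shrinks the influence of samples with large projected scatter, which is the intuition behind the claimed robustness to outliers.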
Experiments
In this section, we conduct experiments on three popular image databases, namely the ORL database [39], the CMU PIE database [40], and the Extended Yale B database [41], to evaluate the performance of the proposed 2DPCA-Sp with 0 < p < 1 for image feature extraction. Without loss of generality, the value of p in all experiments is set to 1/4, 1/2, and 3/4, respectively. In addition, since the proposed 2DPCA-Sp is an unsupervised method, we only compare it with several state-of-the-art unsupervised feature extraction methods.
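Methods in this line are commonly evaluated by feeding the extracted matrix features to a nearest-neighbor classifier; the following minimal sketch shows such an evaluation under an assumed protocol (our own illustration, not necessarily the paper's exact experimental setup):

```python
import numpy as np

def nn_accuracy(train_feats, train_labels, test_feats, test_labels):
    """1-nearest-neighbor classification accuracy on matrix features,
    using the Frobenius distance between feature matrices."""
    correct = 0
    for F, y in zip(test_feats, test_labels):
        dists = [np.linalg.norm(F - G) for G in train_feats]
        if train_labels[int(np.argmin(dists))] == y:
            correct += 1
    return correct / len(test_feats)
```

In a typical protocol, a few images per subject are randomly selected for training, the projection matrix is learned on the training set only, and the accuracy is averaged over several random splits.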
Conclusions
In this paper, a novel Schatten p-norm-based 2DPCA (2DPCA-Sp) method was proposed for image feature extraction, in which p can take different values to suit applications where the data follow different distributions. A simple but efficient iterative algorithm was also presented to solve the optimization problem of 2DPCA-Sp with 0 < p < 1. Experimental results on the ORL, CMU PIE, and Extended Yale B image databases show that the proposed 2DPCA-Sp is effective and more robust to outliers than several state-of-the-art unsupervised methods.
Acknowledgments
We would like to thank all reviewers and editors for their detailed reviews, constructive suggestions, and valuable comments. This work is supported in part by the National Natural Science Foundation of China (Nos. 61374134 and 61304132) and the Key Scientific Research Project of Universities in Henan Province, China (No. 15A413009).
References (41)
- et al., Sparsity preserving projections with applications to face recognition, Pattern Recogn. (2010)
- et al., Image clustering based on sparse patch alignment framework, Pattern Recogn. (2014)
- et al., Two-directional maximum scatter difference discriminant analysis for face recognition, Neurocomputing (2008)
- et al., 2D-LPP: a two-dimensional extension of locality preserving projections, Neurocomputing (2007)
- et al., Linear discriminant analysis using rotational invariant L1 norm, Neurocomputing (2010)
- et al., Statistical pattern recognition: a review, IEEE Trans. Pattern Anal. Mach. Intell. (2000)
- et al., Eigenfaces for recognition, J. Cognitive Neurosci. (1991)
- et al., Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. (1997)
- et al., Face recognition using Laplacianfaces, IEEE Trans. Pattern Anal. Mach. Intell. (2005)
- et al., Graph embedding and extensions: a general framework for dimensionality reduction, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
- L1-norm-based 2DPCA, IEEE Trans. Syst. Man Cybern. B Cybern.
- Adaptive hypergraph learning and its application in image classification, IEEE Trans. Image Process.
- Click prediction for web image reranking using multimodal sparse coding, IEEE Trans. Image Process.
- High-order distance-based multiview stochastic learning in image classification, IEEE Trans. Cybern.
- Modern Machine Learning Techniques and Their Applications in Cartoon Animation Research
- Multiview Hessian regularization for image annotation, IEEE Trans. Image Process.
- Large-margin multi-view information bottleneck, IEEE Trans. Pattern Anal. Mach. Intell.
- Two-dimensional PCA: a new approach to appearance-based face representation and recognition, IEEE Trans. Pattern Anal. Mach. Intell.
- 2D-LDA: a statistical linear discriminant analysis for image matrix, Pattern Recogn. Lett.
☆ This paper has been recommended for acceptance by Yehoshua Zeevi.