
Neural Networks

Volume 46, October 2013, Pages 190-198

2DPCA with L1-norm for simultaneously robust and sparse modelling

https://doi.org/10.1016/j.neunet.2013.06.002

Abstract

Robust dimensionality reduction is an important issue in processing multivariate data. Two-dimensional principal component analysis based on the L1-norm (2DPCA-L1) is a recently developed technique for robust dimensionality reduction in the image domain. The basis vectors of 2DPCA-L1, however, are still dense, whereas sparse modelling is beneficial for image analysis. In this paper, we propose a new dimensionality reduction method, referred to as 2DPCA-L1 with sparsity (2DPCAL1-S), which effectively combines the robustness of 2DPCA-L1 with the sparsity-inducing lasso regularization. It is a sparse variant of 2DPCA-L1 for unsupervised learning. We design an iterative algorithm to compute the basis vectors of 2DPCAL1-S. Experiments on image data sets confirm the effectiveness of the proposed approach.

Introduction

Dimensionality reduction (DR) is of great importance for multivariate data analysis. For classifying the typically high-dimensional patterns encountered in practice, DR can effectively relieve the “curse of dimensionality” (Jain, Duin, & Mao, 2000). Principal component analysis (PCA) (Jolliffe, 1986) is perhaps the most popular DR technique. It seeks a few basis vectors such that the variances of the projected samples are maximized. In the domain of image analysis, two-dimensional PCA (2DPCA) (Yang, Zhang, Frangi, & Yang, 2004) is more efficient, owing to its direct formulation on raw two-dimensional images.
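For reference, the underlying objectives can be written in standard form (textbook formulations, not excerpted from this paper). PCA seeks

$w^* = \arg\max_{\|w\|_2 = 1} w^\top S w,$

where $S$ is the sample covariance matrix of the vectorized data, while 2DPCA seeks

$v^* = \arg\max_{\|v\|_2 = 1} v^\top G v, \qquad G = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^\top (X_i - \bar{X}),$

where the image scatter matrix $G$ is computed directly from the $q \times p$ images $X_i$ (with mean image $\bar{X}$), avoiding the vectorization of each image into a $qp$-dimensional vector.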

Although PCA and 2DPCA have been widely applied in many fields, they are vulnerable in the presence of atypical samples because they employ the L2-norm in the variance formulation. As a robust alternative, L1-norm-based approaches have been developed. Specifically, the L1-norm-based PCA variants include L1-PCA (Ke & Kanade, 2005), R1-PCA (Ding, Zhou, He, & Zha, 2006), PCA-L1 (Kwak, 2008), and non-greedy PCA-L1 (Nie, Huang, Ding, Luo, & Wang, 2011). Li, Pang, and Yuan (2009) developed the L1-norm-based 2DPCA (2DPCA-L1), which demonstrated encouraging performance for image analysis.

A limitation of the above methods is that the learned basis vectors are still dense, which makes the resulting features difficult to interpret. It is desirable to select the most relevant or salient elements from a large number of features. To address this issue, sparse modelling has been developed and has received increasing attention in the pattern classification community (Wright et al., 2010). Sparsity is achieved by regularizing the objective variables with a lasso penalty term based on the L1-norm (Chen et al., 1998; Tibshirani, 1996). Mathematically, the classic PCA approach can be reformulated as a regression-type optimization problem, on which the sparsity-inducing lasso penalty is then imposed, resulting in sparse PCA (SPCA) (Zou, Hastie, & Tibshirani, 2006). The sparsity was further generalized to a structured version, producing structured sparse PCA (Jenatton, Obozinski, & Bach, 2010). Within the graph embedding framework (Yan et al., 2007), various DR approaches were endowed with a unified sparse formulation via the L1-norm penalty (Cai et al., 2007; Wang, 2012; Zhou et al., 2011). Recently, the robustness of SPCA was improved by L1-norm maximization (Meng, Zhao, & Xu, 2012).

The sparse modelling of 2DPCA-L1, however, has not yet been addressed. Note that the L1-norm used in 2DPCA-L1 serves as a robust measure of sample dispersion rather than as a regularizer on the basis vectors. A common way of enforcing sparsity is to penalize the L1-norm of the basis vectors while fixing their L2-norm (i.e., their length).

In this paper, we limit our attention to image analysis and consider extending 2DPCA-L1 with sparsity; the resulting method is referred to as 2DPCAL1-S. Since the L1-norm serves as the lasso penalty in sparsity-inducing modelling, we propose imposing an L1-norm lasso penalty, together with a fixed L2-norm, on the basis vectors of 2DPCA-L1. Consequently, 2DPCAL1-S maximizes the L1-dispersion of the samples subject to an elastic net (i.e., L2-norm plus L1-norm) (Zou et al., 2006) constraint on the basis vectors. Formally, we combine the L1-dispersion and the elastic net constraint into a single objective function. As can be seen, we use the L1-norm for robust and sparse modelling simultaneously. Because the L1-norm is involved in both aspects, the optimization of 2DPCAL1-S is not straightforward; we design an iterative algorithm to solve it.
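The precise formulation is developed in Section 3; a formulation consistent with the description above (notation ours, with $\rho \ge 0$ denoting the lasso regularization parameter) is

$v^* = \arg\max_{v} \sum_{i=1}^{n} \|X_i v\|_1 - \rho \|v\|_1 \quad \text{subject to} \quad \|v\|_2 = 1.$

The unit L2-norm constraint fixes the length of $v$, while the L1 penalty drives small entries of $v$ to exactly zero, which yields sparse basis vectors.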

The remainder of this paper is organized as follows. The conventional 2DPCA-L1 method is briefly reviewed in Section 2. The formulation of 2DPCAL1-S is proposed in Section 3. Section 4 reports experimental results, and Section 5 concludes the paper.

Section snippets

Brief review of 2DPCA-L1

The 2DPCA-L1 approach, proposed by Li et al. (2009), finds basis vectors that maximize the dispersion of the projected image samples in terms of the L1-norm. Suppose that $X_1, \ldots, X_n$ are a set of training images of size $q \times p$, where $n$ is the number of images. These images are assumed to be mean-centred.

Let $v \in \mathbb{R}^p$ be the first basis vector of 2DPCA-L1. It maximizes the L1-norm-based dispersion of the projected samples, $g(v) = \sum_{i=1}^{n} \|X_i v\|_1$, subject to $\|v\|_2 = 1$, where $\|\cdot\|_1$ and $\|\cdot\|_2$ denote the L1-norm and the L2-norm, respectively.
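To make the update rule concrete, the following is a minimal Python/NumPy sketch (our illustration, not the authors' code; function and parameter names are ours) of the sign-flipping scheme of PCA-L1 (Kwak, 2008), adapted to the 2D objective above by treating every row of every image as a vector sample:

import numpy as np

def first_basis_2dpca_l1(images, n_iter=200, tol=1e-8, seed=0):
    # images: array of shape (n, q, p), mean-centred training images.
    # Returns a unit vector v (shape (p,)) that locally maximizes
    # g(v) = sum_i ||X_i v||_1 = sum over all image rows of |row . v|.
    rows = images.reshape(-1, images.shape[-1])  # stack rows: (n*q, p)
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(rows.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        signs = np.sign(rows @ v)     # polarity of each projected row
        signs[signs == 0] = 1.0       # avoid zero polarities
        v_new = signs @ rows          # polarity-weighted sum of rows
        v_new /= np.linalg.norm(v_new)
        if np.linalg.norm(v_new - v) < tol:
            return v_new              # converged
        v = v_new
    return v

Each iteration provably does not decrease g(v), so the procedure converges to a local maximum; subsequent basis vectors are obtained in the same way on data deflated by the previously found projections.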

Basic idea

Sparse modelling has been receiving explosive attention in computer vision and pattern classification (Wright et al., 2010). The basis vectors obtained by 2DPCA-L1, however, are still dense (Li et al., 2009); in other words, the projection procedure involves all the original features. A typical image usually has a large number of features, among which some may be irrelevant or redundant for classification. It is important to find a few salient features, which correspond to specific…

Experiments

In order to evaluate the proposed 2DPCAL1-S algorithm, we compare its image classification and reconstruction performance with four unsupervised learning algorithms: PCA, PCA-L1, 2DPCA, and 2DPCA-L1. Two benchmark face databases, FERET and AR, are used in our experiments.

In the experiments, the initial components of PCA-L1 are set as the corresponding components of PCA. The initial components of 2DPCA-L1 and 2DPCAL1-S are set as the corresponding components of 2DPCA.
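For instance, the 2DPCA initialization has a closed form: the leading eigenvectors of the image scatter matrix. A minimal sketch (ours; tdpca_l1 below stands for a hypothetical 2DPCA-L1 solver interface, not an actual API):

import numpy as np

def tdpca_basis(images, k):
    # Closed-form 2DPCA: top-k eigenvectors of the image scatter
    # matrix G = sum_i Xi^T Xi (images assumed mean-centred).
    G = sum(X.T @ X for X in images)      # p x p scatter matrix
    vals, vecs = np.linalg.eigh(G)        # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]    # indices of the top-k eigenvalues
    return vecs[:, order]                 # p x k basis matrix

# Hypothetical usage: seed the iterative L1 solvers with the
# corresponding L2 components, as described above.
# V0 = tdpca_basis(train_images, k)        # 2DPCA components
# V  = tdpca_l1(train_images, k, init=V0)  # 2DPCA-L1 (hypothetical API)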

There are two tuning parameters…

Conclusion

A new subspace learning method, called 2DPCAL1-S, is developed for image analysis in this paper. It uses the L1-norm for both robust and sparse modelling. The role of the L1-norm is two-fold: one is the robust measurement of the dispersion of samples, as in 2DPCA-L1; the other is to introduce a penalty that yields sparse projection vectors. 2DPCAL1-S thus performs feature extraction and feature selection simultaneously and robustly. Computationally, an iterative algorithm is designed…

Acknowledgements

The authors would like to thank the anonymous referees for the constructive recommendations, which greatly improved the paper. This work was supported in part by the National Natural Science Foundation of China under Grants 61075009 and 31130025, in part by the Natural Science Foundation of Jiangsu Province under Grant BK2011595, in part by the Program for New Century Excellent Talents in University of China under Grant NCET-12-0115, and in part by the Qing Lan Project of Jiangsu Province.

References (18)

  • D. Meng et al. Improve robustness of sparse PCA by L1-norm maximization. Pattern Recognition (2012).
  • H. Wang. Structured sparse linear graph embedding. Neural Networks (2012).
  • J. Yang et al. Two-dimensional PCA: a new approach to appearance-based face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (2004).
  • D. Cai et al. Spectral regression: a unified approach for sparse subspace learning. In Proceedings... (2007).
  • S.S. Chen et al. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing (1998).
  • C. Ding et al. R1-PCA: rotational invariant L1-norm principal component analysis for... (2006).
  • A.K. Jain et al. Statistical pattern recognition: a review. IEEE Transactions on Pattern Analysis and Machine Intelligence (2000).
  • R. Jenatton et al. Structured sparse principal component analysis. In Proceedings of the... (2010).
  • I.T. Jolliffe. Principal component analysis (1986).
There are more references available in the full text version of this article.
