Pattern Recognition

Volume 70, October 2017, Pages 112-125

Low-rank preserving embedding

https://doi.org/10.1016/j.patcog.2017.05.003

Highlights

  • We use low-rank representation for linear dimensionality reduction (LRPE).

  • LRPE retains the global discriminative structure of data in the reduced space.

  • We recast related methods into a unified problem and then solve it.

  • LRPE is more effective and robust, with low computational cost.

Abstract

In this paper, we consider the problem of linear dimensionality reduction with the technique of low-rank representation, a promising tool for discovering the subspace structures of given data. Existing approaches based on graph embedding usually capture the structure of data by stacking the local structure of each datum, using, e.g., a neighborhood graph, an ℓ1-graph or an ℓ2-graph. Yet they lack explicit discrimination between those local structures and suffer from corrupted samples. To address these issues, we propose a new linear dimensionality reduction method built on the lowest rank representation (LRR) of data, which is dubbed low-rank preserving embedding (LRPE). Different from the traditional routes, LRPE computes all data self-representations jointly and can thus extract the global structure of a data set as a whole. The global low-rank constraint explicitly enforces the LRR matrix to be block-diagonal, so that samples with a similar intrinsic structure, which are more likely to be from the same class, are described by a similar set of bases. Hence, LRPE is discriminative even if no class labels are provided. Benefiting from the robust LRR, LRPE is also robust to various noises and errors in data. Besides, we rewrite all related methods into a unified formulation, followed by a detailed solution and clear comparisons. Finally, we conduct extensive experiments on publicly available data sets for data visualization and classification. The experimental results show the effectiveness, low computational cost and robustness of the proposed method.

Introduction

In computer vision and pattern analysis of high-dimensional data, an underlying assumption is that the intrinsic dimensionality of the data is much lower than its ambient dimensionality [1,2]. Usually, dimensionality reduction (DR) is exploited to discover meaningful compact representations of the high-dimensional data samples, while maintaining the intrinsic information of the data as much as possible. DR is important in a wide range of domains, including visualization [1], classification [3] and information retrieval [4], since it can alleviate "the curse of dimensionality" and other undesired properties of high-dimensional spaces. To date, countless linear and nonlinear DR methods have been developed under unsupervised, semi-supervised and supervised scenarios.

Traditionally, DR is performed using a linear technique that learns a projection matrix from the given data, such as principal component analysis (PCA) [5], semi-supervised discriminant analysis (SDA) [6] and linear discriminant analysis (LDA) [7]. However, linear techniques cannot adequately handle complex nonlinear data. Thereby, nonlinear DR methods have been proposed to learn a nonlinear mapping function from the original space to the reduced space, such as Isomap [8], the autoencoder [9], locally linear embedding (LLE) [1] and kernel principal component analysis [10]. Their semi-supervised and supervised variants have also been presented, such as semi-supervised LLE [11], semi-supervised Isomap [11], supervised LLE [12] and supervised Isomap [13]. Although nonlinear methods have more expressive power, they suffer from the out-of-sample issue [14] and from unsatisfactory performance in real-world tasks [3]. In this paper, we focus on unsupervised linear DR due to its good generalization and extensibility.

Compared with nonlinear DR, linear DR methods have many merits: the objective problem can be solved simply and efficiently; the projection matrix can be applied everywhere, to either training or test data; and the reduced representation is obtained by simple algebraic manipulations [15]. Thus, many nonlinear methods based on the manifold assumption have been recast into linearized versions. For instance, neighborhood preserving embedding (NPE) [16] is a linearized version of LLE; isometric projection [17] can be seen as a linearized Isomap; locality preserving projections (LPP) [18] is a linearized Laplacian eigenmaps [19]. In fact, the DR methods of Riemannian manifold learning [20] aim to preserve the intrinsic geometric structure of the data in the reduced space, and can be subsumed under the graph embedding framework [21], [22]. Linear DR under this framework consists of two steps: constructing a graph that captures meaningful structures of the data, followed by embedding the graph into the reduced space via solving a generalized eigenvalue decomposition problem.
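To make the second step concrete, the following is a minimal sketch of the LPP-style embedding step, assuming the affinity graph W has already been built in the first step; the function name, the small regularizer reg and the dense solver are our own illustrative choices, not the paper's notation.

    import numpy as np
    from scipy.linalg import eigh

    def graph_embed(X, W, d, reg=1e-6):
        """Embed an affinity graph linearly (LPP-style).
        X: m x n data matrix (columns are samples); W: n x n symmetric
        affinity matrix; d: target dimension."""
        D = np.diag(W.sum(axis=1))                   # degree matrix
        L = D - W                                    # graph Laplacian
        A = X @ L @ X.T
        B = X @ D @ X.T + reg * np.eye(X.shape[0])   # regularized for definiteness
        # Generalized eigenproblem A p = lambda B p; eigh returns eigenvalues
        # in ascending order, so the first d eigenvectors give P (m x d).
        vals, vecs = eigh(A, B)
        return vecs[:, :d]

A new sample x is then reduced by computing P.T @ x, which is what keeps the linear route free of the out-of-sample issue.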

In the graph embedding framework, the pivotal and widely studied issue is how to construct the graph of data relationships. The traditional strategy is to manually choose the nearest neighbors of each datum, followed by setting the weights with some distance metric (e.g., LPP) or by least-squares reconstruction (e.g., LLE and NPE). However, determining the nearest neighbors is difficult and unstable, especially in a high-dimensional noisy space. A preferable strategy is to exploit the self-representation of each datum by reconstructing it from all other data points [23]. In [24], sparsity preserving projections (SPP) exploits the sparse representation of each datum to construct an ℓ1-graph; this is also studied in [25]. Collaborative representation based projections (CRP) [26] builds an ℓ2-graph by ridge-regularized linear reconstruction of each datum from the remaining data. Compared with LPP and NPE, which use neighborhood graphs, SPP and CRP achieve not only a significant improvement in discriminative feature learning but also enhanced robustness to noise in the data.
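As an illustration of the self-representation strategy, the sketch below builds a CRP-style ℓ2-graph by ridge regression, reconstructing each sample from all the others; the regularization weight lam and the plain loop are illustrative assumptions rather than the exact algorithm of [26].

    import numpy as np

    def l2_graph(X, lam=0.1):
        """CRP-style l2-graph: reconstruct each column x_i of the m x n
        data matrix X from the remaining columns by ridge regression."""
        m, n = X.shape
        W = np.zeros((n, n))
        for i in range(n):
            idx = [j for j in range(n) if j != i]
            Xi = X[:, idx]                            # all samples except x_i
            # Closed-form ridge solution: (Xi^T Xi + lam I)^{-1} Xi^T x_i
            w = np.linalg.solve(Xi.T @ Xi + lam * np.eye(n - 1), Xi.T @ X[:, i])
            W[idx, i] = w                             # column i reconstructs x_i
        return W

An ℓ1-graph is obtained analogously by replacing the ridge penalty with an ℓ1 penalty and a sparse coding solver.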

However, both the ℓ1-graph and the ℓ2-graph capture only the local structure of each datum, because the self-representation of each datum is computed individually. They lose sight of the global structure of the entire data set, such as multiple clusters [27], multiple subspaces [28] and multiple manifolds [29]. Yet the cluster assumption [30], widely used in clustering and semi-supervised learning [31], states that data on the same structure (a cluster, a subspace or a manifold) are likely to share the same label. On the other hand, the independent coding process makes these graphs less effective in noisy cases, especially in the presence of grossly corrupted samples: the ℓ1-graph based on sparse representation may not be robust to noise and outliers when no extra clean data are available [32], while the ℓ2-graph based on ridge regression is likely to reconstruct the noise, and even the errors, in a datum by exploiting samples corrupted in the same way. These shortcomings reduce the performance and robustness of SPP and CRP.

To this end, in this paper, we investigate the recently proposed low-rank representation (LRR) of data, an encouraging method for subspace clustering [28]. LRR obtains the reconstruction coefficients of all the data at once, instead of one datum at a time, by imposing a low-rank constraint on the coefficient matrix. Thereby, LRR can capture the global structure of a data set as a whole. Moreover, LRR can correct possible errors (including noise, outliers and corruptions) and simultaneously segment all samples into their respective subspaces, where each subspace is likely to correspond to one class [28]. Since LRR robustly captures the global discriminative structure of data, we exploit LRR to construct the graph of data relationships, and then embed the high-dimensional data into the reduced space. The proposed method, which preserves the low-rank reconstructive relationship, is referred to as low-rank preserving embedding (LRPE).
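For reference, the robust LRR problem of [28], on which LRPE builds, can be written as

    \min_{Z,E} \; \|Z\|_{*} + \lambda \|E\|_{2,1} \quad \text{s.t.} \quad X = XZ + E,

where \|Z\|_{*} denotes the nuclear norm (the sum of singular values) of the coefficient matrix Z, the \ell_{2,1}-norm on E models sample-specific corruptions, and \lambda > 0 balances the two terms. In the noiseless case (E = 0), the minimizer has the known closed form Z^{*} = VV^{\top}, where X = U\Sigma V^{\top} is the skinny SVD of X [28].

Our primary contributions can be summarized as follows.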

  • (1)

    We propose a linear DR method, called LRPE, to learn discriminative features in unsupervised scenarios. Benefiting from LRR, LRPE is not only robust to possible errors and even gross corruption in samples, but also yields discriminative features even when no class labels are available.

  • (2)

    We reformulate the related problems and approaches (including NPE, SPP and CRP) and group them, together with LRPE, into the unified problem of reconstructive relationship preserving embedding. We then present a solution scheme that reduces the problem of finding a projection matrix to the common problem of finding projection vectors, a step often ignored in the pertinent literature.

  • (3)

    Different from the existing methods of reconstructive relationship preserving embedding, LRPE aims to retain the global discriminative structure of the data rather than merely local structures. The experimental results demonstrate its effectiveness, low computational cost and robustness. In addition, LRPE can easily be extended to supervised and semi-supervised scenarios by employing existing DR frameworks [21], [33].

Note that a similar method, low-rank preserving projections (LRPP) [34], has been reported. LRPE differs from LRPP: LRPP combines the objective of computing the LRR with the objective of LPP, using the LRR as a data similarity; LRPE, in contrast, is a two-stage method that uses the LRR as a reconstructive relationship and is easily interpreted within the graph embedding framework.

The rest of this paper is organized as follows. In Section 2, we review and reformulate the prior related works. Section 3 presents the details of LRPE, including its motivations, objective function and solution scheme. We show our experimental results together with some discussions in Section 4, and finally conclude this paper in Section 5.

Section snippets

Related works

In order to set the stage for our further discussions, in this section we briefly review some related works, including PCA, NPE, SPP and CRP. Note that we reformulate these methods so that they admit a unified expression. Based on the unified formulation, we make some comparisons and discussions. Specifically, the problem we consider can be described as follows:

Problem 1 (Linear Dimensionality Reduction). Given a sample matrix $X=[x_1,x_2,\ldots,x_n]\in\mathbb{R}^{m\times n}$, let $P=[p_1,p_2,\ldots,p_d]\in\mathbb{R}^{m\times d}$ be the projection matrix to be learned.
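Based on NPE [16] and the unified view announced in the abstract, the shared objective of these methods presumably takes the form

    \min_{P} \sum_{i=1}^{n} \Big\| P^{\top}x_{i} - \sum_{j} W_{ij}\, P^{\top}x_{j} \Big\|_{2}^{2}
    = \min_{P} \operatorname{tr}\!\left( P^{\top} X (I-W)^{\top}(I-W) X^{\top} P \right),

subject to a scale constraint such as $P^{\top}XX^{\top}P = I$, where W is the method-specific reconstruction weight matrix: neighborhood weights for NPE, sparse codes for SPP, ridge codes for CRP and the LRR matrix for LRPE. This statement reflects our reading of the cited works, not necessarily the paper's exact formulation.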

Low-rank preserving embedding

In this section, we present the motivations, the objective function and the solution of the proposed low-rank preserving embedding (LRPE) in detail.
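Since the full derivation is given in the paper body, the following is only a minimal sketch of the two-stage scheme described above, assuming the noiseless closed-form LRR (Z = VV^T) in place of the robust ALM solver and an NPE-style embedding step; the graph construction details are our assumptions, not necessarily the paper's exact algorithm.

    import numpy as np
    from scipy.linalg import eigh

    def lrpe(X, d, tol=1e-10, reg=1e-6):
        """Sketch of LRPE. X: m x n data matrix (columns are samples);
        d: target dimension. Returns the projection matrix P (m x d)."""
        # Stage 1: low-rank self-representation. In the noiseless case the
        # minimizer of min ||Z||_* s.t. X = XZ is Z = V V^T, where
        # X = U S V^T is the skinny SVD of X [28]. (The robust LRR also
        # handles an l2,1 error term E via ALM; omitted here.)
        _, s, Vt = np.linalg.svd(X, full_matrices=False)
        V = Vt[s > tol * s[0]].T                     # numerically nonzero part
        Z = V @ V.T                                  # n x n reconstruction weights

        # Stage 2: preserve the low-rank reconstructive relationship,
        # NPE-style: min_P tr(P^T X (I-Z)^T (I-Z) X^T P) s.t. P^T X X^T P = I.
        n = X.shape[1]
        M = (np.eye(n) - Z).T @ (np.eye(n) - Z)
        A = X @ M @ X.T
        B = X @ X.T + reg * np.eye(X.shape[0])       # regularized for definiteness
        vals, vecs = eigh(A, B)                      # ascending eigenvalues
        return vecs[:, :d]

For the 2-dimensional visualizations in Section 4, one would call P = lrpe(X, 2) and plot the columns of P.T @ X.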

Experiments

In this section, we conduct experiments on publicly available data sets to verify the efficacy of the proposed method LRPE. First, we visualize two data sets from the UCI repository in a 2-dimensional subspace for insight into the discrimination of the resultant features. Then, we evaluate the face recognition performance of LRPE on two widely used databases. Finally, we test its robustness to various noises and errors. In addition, we mainly compare with the related DR methods: PCA, LPP, NPE,

Conclusions

In this paper, we propose a novel linear dimensionality reduction technique relying on the graph embedding framework [21], dubbed LRPE. It exploits low-rank representation (LRR) to construct an LRR graph that implicitly encodes the global discriminative structure of the data, and then preserves the low-rank reconstructive relationship in the reduced space. We recast the aforementioned methods into a unified formulation, named reconstructive relationship preserving embedding. Based on this, we provide the

Acknowledgments

We would like to thank the editors and the anonymous reviewers for their helpful comments.

References (58)

  • I.T. Jolliffe

    Principal Component Analysis

    (2002)
  • D. Cai et al.

    Semi-supervised discriminant analysis

  • R.A. Fisher

    The use of multiple measurements in taxonomic problems

    Ann. Eugen.

    (1936)
  • J.B. Tenenbaum et al.

    A global geometric framework for nonlinear dimensionality reduction

    Science

    (2000)
  • G.E. Hinton et al.

    Reducing the dimensionality of data with neural networks

    Science

    (2006)
  • B. Schölkopf et al.

    Kernel principal component analysis

  • X. Yang et al.

    Semi-supervised nonlinear dimensionality reduction

  • D. De Ridder et al.

    Supervised locally linear embedding

  • X. Geng et al.

    Supervised nonlinear dimensionality reduction for visualization and classification

    IEEE Trans. Syst. Man Cybern. Part B (Cybern.)

    (2005)
  • Y. Bengio et al.

    Out-of-sample extensions for lle, isomap, mds, eigenmaps, and spectral clustering

  • X. He et al.

    Neighborhood preserving embedding

  • D. Cai et al.

    Isometric projection

  • X. He et al.

    Locality preserving projections

  • M. Belkin et al.

    Laplacian eigenmaps for dimensionality reduction and data representation

    Neural Comput.

    (2003)
  • T. Lin et al.

    Riemannian manifold learning

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2008)
  • S. Yan et al.

    Graph embedding and extensions: a general framework for dimensionality reduction

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2007)
  • D. Cai et al.

    Spectral regression for efficient regularized subspace learning

  • E. Elhamifar et al.

    Sparse subspace clustering: algorithm, theory, and applications

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2013)
  • B. Cheng et al.

    Learning with ℓ1-graph for image analysis

    IEEE Trans. Image Process.

    (2010)

Yupei Zhang received the B.Eng. degree in computer science and technology from East China University of Technology, China, in 2009, and received the M.Eng. degree in computer software and theory from Zhengzhou University, China, in 2013. He is currently a Ph.D. candidate in the department of computer science and technology, Xi'an Jiaotong University, China. His current research interests mainly include sparse and low-rank representation, pattern recognition and machine learning.

Ming Xiang received the B.Eng. and Ph.D. degrees from Northwestern Polytechnical University, Xi'an, China, in 1987 and 1999, respectively, and currently works as an associate professor in the department of computer science and technology in Xi'an Jiaotong University, Xi'an, China. His current research interests mainly include information fusion, pattern recognition and machine learning.

Bo Yang received the B.Eng. degree in computer science and technology from Xi'an University of Posts & Telecommunication, Xi'an, China, in 2005, and received the M.Eng. degree in computer system architecture from Xidian University, Xi'an, China, in 2009. He is currently a Ph.D. candidate in the department of computer science and technology, Xi'an Jiaotong University, Xi'an, China. His current research interests mainly include manifold learning, pattern recognition and machine learning.
