Robust domain adaptation image classification via sparse and low rank representation

https://doi.org/10.1016/j.jvcir.2015.09.005

Highlights

  • A domain adaptation sparse and low-rank representation (DASLRR) method is proposed.

  • DASLRR considers both global and local discriminative information as well as distribution adaptation.

  • An SSL framework is further constructed based on DASLRR.

  • Extensive experiments for various cross-domain image classification tasks are conducted.

Abstract

Domain adaptation image classification addresses the problem of adapting the image distribution of the source domain to the target domain for an effective learning task, where the same classification task is intended but the data distributions are different. However, corrupted data (e.g., noise and outliers, which exist universally in real-world domains) can cause significant deterioration of the practical performance of existing methods in cross-domain image classification. This motivates us to propose a robust domain adaptation image classification method based on sparse and low rank representation. Specifically, we first obtain an optimal Domain Adaptation Sparse and Low Rank Representation (DASLRR) for all the data from both domains by incorporating a distribution adaptation regularization term, which is expected to minimize the distribution discrepancy between the source and target domains, into the existing low rank and sparse representation objective function. Formulating an optimization problem that combines the objective function of sparse and low rank representation with distribution adaptation and local consistency constraints, we propose an algorithm that alternates between learning an effective dictionary and computing the DASLRR, making the new representations robust to the distribution difference. Based on the obtained DASLRR, we then provide a flexible semi-supervised learning framework, which can propagate the labels of labeled data from both domains to unlabeled data from In-Sample as well as Out-of-Sample datasets by simultaneously learning a prediction label matrix and a classifier model. The proposed method can capture the global mixture of the clustering structure (by the sparseness and low rankness), the locally consistent structure (by the local graph regularization), and the distribution difference (by the distribution adaptation) of the data from both domains. Hence, the proposed method is robust and can accurately classify cross-domain images that may be corrupted by noise or outliers. Extensive experiments demonstrate the effectiveness of our method on several types of image and video datasets.

Introduction

With the development of computer network and storage technologies, there has been explosive growth of web images. In the field of multimedia and computer vision, many researchers have recently proposed a variety of machine learning and data mining algorithms for image classification, also termed image annotation [1], [2], [3]. While these works have shown promising achievements in overcoming the well-known semantic gap by applying machine learning algorithms to image classification, explosive amounts of emerging image data have brought a dilemma of data deluge and label scarcity to the task of image classification with traditional methods [4]. In other words, on the one hand, expensive and time-consuming human labor is required to collect labels for newly emerging images. On the other hand, there exists a large amount of outdated labeled images from previous tasks. While exploiting the vast amount of unlabeled data directly (e.g., via the semi-supervised learning (SSL) paradigm [5]) is valuable in its own right, it is beneficial to leverage labeled data of relevant categories across data sources. For example, it is increasingly popular to enrich our limited collection of training samples with those from the Internet. One problem with this strategy, however, comes from the possible misalignment of the target domain of interest and the source domain that provides the auxiliary data and labels. This misalignment corresponds to a shift in data distribution in a certain feature space. To be precise, the marginal distribution of the samples in the source domain and that in the target domain are different. This makes it harmful to incorporate data from the source domain into the target domain directly: in theory, the disparity violates the basic assumption underpinning supervised learning; in practice, the resulting performance degrades considerably on the target test samples [4].

The above theoretical and practical paradox has inspired recent research efforts on the domain adaptation learning (DAL) problem [6] in computer vision and machine learning [7]. Domain adaptation (or cross-domain) image classification has emerged as one of the major techniques to address the above-mentioned dilemma; it aims to adapt the feature distribution in the source domain to the target domain. In general, a domain refers to data of a certain type, from a certain source, or generated in a certain period of time, etc. Unlike conventional image classification methods, in domain adaptation image classification a related source (or auxiliary) domain is provided to assist the classification process in the target domain, and the discrepancy between the data distributions of the source and target domains should be minimized in the process of adaptation. Recently, many cross-domain learning techniques have been proposed to solve the problem of distribution mismatch in the field of image or video concept detection and classification [4], [8], [9], [10], [11], [12], [13], [14], [15].

Existing solutions to DAL vary in setting and methodology. Depending on how the source information is exploited, the division is between classifier-based and representation-based adaptation. The former advocates implicit adaptation to the target distribution by adjusting a classifier from the source domain (e.g., [14], [15]), whereas the latter attempts to achieve alignment by adjusting the representation of the source data via learning a transformation [9], [10], [13], [16]; e.g., Pan et al. [16] proposed to extract a “good” feature representation through which the probability distributions of labeled and unlabeled data are drawn close, which achieves much better classification performance by explicitly reducing distribution divergence. Orthogonal to this, the extant methods can also be classified into supervised (e.g., [8], [12], [14], [15]) and unsupervised (e.g., [11], [16]) adaptation, based on whether labels have been exploited during the adaptation.

While the effectiveness and efficiency of cross-domain image classification make it of particular use in practice, it also brings a new challenge, i.e., how to handle the errors (e.g., noise and corruption) which may exist in the training data. Since the image data may be randomly obtained from the Internet or other open source websites such as Flickr and YouTube, noise and outliers may by nature abound in the training data [17]. The common issues with the existing cross-domain image classification methods are twofold. First, during the adaptation, they typically deal with source samples separately, without accounting for the mixed (both local and global) structures of the data. This may (either implicitly or explicitly) cause the adapted distribution to be arbitrarily scattered around [18], and any structural information of the source data beyond individual samples may be undermined. Second, they blindly translate all image data, including the noise and particularly possible outliers, from the source domain to the target domain [4]. The latter can lead to significantly distorted or corrupted classification models when they are learned. Hence, how to guarantee the robustness of cross-domain image classification by handling data that may not strictly follow the domain distribution structures is an important challenge for robust domain adaptation image classification tasks.

Note that image representation is a crucial procedure for robust image processing and understanding [19]. From this viewpoint, in this paper, we study the robust cross-domain image classification problem using robust feature representations such as sparse representation (SR) [20], [21] and low-rank representation (LRR) [22], [23], [24]. To this end, by exploiting the advances in LRR [23], [24] and DAL [4], we propose a Robust Domain Adaptation Learning framework (RDAL) based on the obtained Domain Adaptation Sparse and Low Rank Representation (DASLRR) of the data from both domains for cross-domain image classification. Specifically, we first aim to represent all of the data from both domains as a linear combination of learned bases, where the representation coefficients should be sparse, low rank, and locally consistent, as well as robust to the distribution difference. Distribution adaptation ensures that DASLRR is effective for data drawn from different distributions. Low rankness ensures that DASLRR can better capture the global cluster structure of the data and is more robust to noise and outliers, while sparsity and local consistency capture the mixed (global and local) discriminative information. We formulate a constrained optimization problem that incorporates the minimum distribution distance criterion, which brings the new representations of both domains close to each other, and enforces local consistency in the objective function of low rank and sparse representation [24]. Subsequently, we further propose a robust semi-supervised learning (SSL) framework based on the newly obtained DASLRR, thus smoothly implementing label propagation from the source domain to the target domain [8]. Our method for learning an optimal DASLRR is markedly different from previous works in that we propose to jointly optimize the DASLRR with the constructed over-complete dictionary and constrain the optimal DASLRR coefficients to be adaptive to the distribution divergence. Moreover, the proposed classification framework is also intrinsically different from existing works [25], [26] in such aspects as the construction of the Laplacian regularization term and the feature representation. Extensive experiments on public databases for various cross-domain image classification tasks demonstrate that the proposed method can significantly improve the performance of domain adaptation image classification. These results clearly verify that the proposed framework is more robust and discriminative than conventional methods.
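To make the structure of such a formulation concrete, the following display is an illustrative sketch only (the dictionary $D$, the MMD coefficient matrix $M$, the graph Laplacian $L$, and the trade-off weights $\lambda,\alpha,\beta,\gamma$ are notational assumptions here, not the paper's exact formulation): a representation $Z$ of all domain data $X$ is sought that is simultaneously low rank, sparse, distribution adaptive, and locally consistent,

$$\min_{Z,E}\;\|Z\|_{*}+\lambda\|Z\|_{1}+\alpha\,\mathrm{tr}\!\left(ZMZ^{\top}\right)+\beta\,\mathrm{tr}\!\left(ZLZ^{\top}\right)+\gamma\|E\|_{2,1}\quad\text{s.t.}\quad X=DZ+E,$$

where the nuclear norm $\|Z\|_{*}$ captures the global cluster structure, the $\ell_1$ term enforces sparsity, $\mathrm{tr}(ZMZ^{\top})$ measures the MMD-based distribution distance between source and target coefficients, $\mathrm{tr}(ZLZ^{\top})$ enforces local graph consistency, and $E$ absorbs sample-specific corruptions.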

The main contributions of this paper can be summarized as follows:

  • (1)

    We present a robust domain adaptation image representation method, termed DASLRR, using sparse and low-rank representation regularized by both local consistency and distribution adaptation. It extends conventional low-rank and sparse representation to cross-domain learning scenarios where the training data may be drawn from a different distribution. In addition, we alternate between learning an optimal dictionary and obtaining the DASLRR of the entire dataset. The dictionary learned by our method has good reconstruction and discriminative capabilities. With this high-quality dictionary, we are able to learn an optimal DASLRR and, further, a robust classification model.

  • (2)

    Based on the obtained DASLRR, we further provide a semi-supervised learning framework, which can propagate the labels of labeled data to unlabeled data from In-Sample as well as Out-of-Sample datasets by jointly learning a prediction label matrix and a classifier model (a generic sketch of one such joint formulation is given after this list).

  • (3)

    To the best of our knowledge, this is the first work to simultaneously consider both global and local discriminative information as well as distribution adaptation in the LRR objective and to form a unified optimization function. The proposed method can capture the global mixture of clustering structure (by the sparseness and low rankness) and the local intrinsic structure (by the local graph regularization) as well as the distribution difference (by the distribution adaptation) of the data.

  • (4)

    We have conducted extensive experiments on public databases for various cross-domain image classification tasks. In many of these experiments, the proposed method significantly improves DAL performance. These results clearly demonstrate that the proposed framework is more informative and discriminative than conventional methods.
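For reference, a common graph-based semi-supervised formulation that jointly learns a label matrix and a classifier takes the following form; this is an illustrative sketch under assumed notation rather than the paper's exact model ($F$ is the soft prediction label matrix, $Y$ the given labels with a diagonal indicator matrix $U$ selecting labeled samples, $(W,b)$ a linear classifier acting on the representation $Z$, and $L_Z$ a graph Laplacian built from the DASLRR coefficients):

$$\min_{F,W,b}\;\mathrm{tr}\!\left(F^{\top}L_Z F\right)+\mu\,\mathrm{tr}\!\left((F-Y)^{\top}U(F-Y)\right)+\eta\,\big\|Z^{\top}W+\mathbf{1}b^{\top}-F\big\|_F^{2}+\rho\,\|W\|_F^{2}.$$

The first term propagates labels along the graph induced by the representation (In-Sample data), the second keeps predictions consistent with the known labels from both domains, and the last two couple a regularized classifier to the soft labels so that Out-of-Sample data can be classified directly by $(W,b)$.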

The rest of the paper is organized as follows. In Section 2, previous related works are discussed. In Section 3, the preliminaries, including LRR and the maximum mean discrepancy (MMD), are first introduced; then, in Section 4, we detail the DASLRR objective function and its corresponding solutions, followed by a robust domain adaptation image classification framework based on DASLRR. The experimental results on cross-domain image classification are discussed in Section 5. Finally, we draw conclusions and discuss future work in Section 6.

Section snippets

Related work

As one of the robust tools for finding sparse representations of signals and capturing high-level semantics in image data, sparse representation (SR) can represent images using only a few active coefficients. This makes the sparse representations easy to interpret and manipulate, thus facilitating efficient image analysis. However, even when the data to be analyzed is a set of images that are from the same class and share common (correlated) features, existing SR methods would still be performed

Preliminaries on low rank representation and variants

In this section, the LRR methods are introduced in more detail. Given the observation data matrix $X=[x_1,x_2,\dots,x_n]\in\mathbb{R}^{d\times n}$, the signal can be represented as $X=X_0+E$, where $X_0$ denotes the original ground-truth matrix and $E$ represents noise. LRR aims to recover the clean data $X_0$ from only the corrupted observation data $X$. Specifically, LRR aims to find the lowest rank representation of $X$ with respect to an appropriate dictionary. The objective function of LRR can then be formulated as follows.
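Stated in its standard form, with sample-specific corruptions penalized by the $\ell_{2,1}$ norm as is usual in the LRR literature cited above (the symbols below follow that convention and are not a verbatim restatement of the paper's equation):

$$\min_{Z,E}\;\|Z\|_{*}+\lambda\|E\|_{2,1}\quad\text{s.t.}\quad X=AZ+E,$$

where $A$ is the dictionary (often taken to be $X$ itself), $\|Z\|_{*}$ is the nuclear norm (the sum of singular values) used as a convex surrogate for $\mathrm{rank}(Z)$, and $\lambda>0$ balances low rankness against the reconstruction error.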

Notations and problem statement

In the sequel, we refer to the training set as the source domain $\mathcal{D}_s=\{(x_i^s,y_i^s)\}_{i=1}^{n_s}$, where $x_i^s\in\mathbb{R}^d$ lies in the $d$-dimensional input space and $y_i^s\in\mathcal{Y}$ is the output label, with $\mathcal{Y}=\{1,2,\dots,c\}$ the label set and $c$ the number of classes. The total number of samples in the source domain is $n_s$. We also assume that the testing samples are available. We denote the testing set as $\mathcal{D}_t=\{x_j\}_{j=n_s+1}^{n_s+n_t}$, where $x_j\in\mathbb{R}^d$ is the input, and the total number of samples in the target domain is $n_t$. The total number of data in both domains is $n=n_s+n_t$.
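Since the preliminaries also cover the maximum mean discrepancy (MMD), its standard empirical form over the two domains is recorded here for reference (standard definition; the feature map $\phi$ into an RKHS $\mathcal{H}$, equivalently a kernel $k$, is left unspecified):

$$\mathrm{MMD}^2(\mathcal{D}_s,\mathcal{D}_t)=\Bigg\|\frac{1}{n_s}\sum_{i=1}^{n_s}\phi(x_i^s)-\frac{1}{n_t}\sum_{j=n_s+1}^{n_s+n_t}\phi(x_j)\Bigg\|_{\mathcal{H}}^{2}.$$

In practice the squared MMD can be written compactly as $\mathrm{tr}(KM)$, where $K$ is the kernel matrix over all $n_s+n_t$ samples and $M$ is a constant coefficient matrix whose entries are $1/n_s^2$, $1/n_t^2$, or $-1/(n_s n_t)$ according to the domains of the corresponding pair of samples.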

Experiments

In this section, we present a set of experiments where we use our method for several cross-domain image classification tasks including face recognition, web image annotation, object recognition and video concept recognition. For all the data sets, true labels are available for instances from source domains. All the labeled samples from both the source and target domains are selected for training, while only the unlabeled samples from the target domain are selected for testing.

Conclusion and future work

We propose a robust domain adaptation image classification framework via sparse and low rank representation regularized by inter-domain distribution divergence and local consistency. The optimization problem of the proposed representation method is solved using an efficient alternating algorithm. Based on this representation, we learn a robust label propagation model for unlabeled instances from the target domain following the classical GSSL paradigm. An important work worthy of further study

Acknowledgments

This work was supported in part by the Humanities and Social Science Foundation of Ministry of Education of China under Grant 13YJAZH084, by the Natural Science Foundation of Zhejiang Province under Grants LY14F020009, LY16F030012 and LY13F020011, and by the Natural Science Foundation of Ningbo City under Grants 2014A610024 and 2014A610066.

References (49)

  • J. Tao, et al., Sparsity regularization label propagation for domain adaptation learning, Neurocomputing, 2014.
  • J. Tao, et al., On minimum distribution discrepancy support vector machine for domain adaptation, Pattern Recog., 2012.
  • Y. Zheng, Low-rank representation with local constraint for graph construction, Neurocomputing, 2013.
  • Y. Yang, Web and personal image annotation by mining label correlation with relaxed visual graph embedding, IEEE Trans. Image Process., 2012.
  • R. Datta, et al., Image retrieval: ideas, influences, and trends of the new age, ACM Comput. Surv., 2008.
  • M.S. Lew, et al., Content-based multimedia information retrieval: state of the art and challenges, ACM Trans. Multimedia Comput., Commun. Appl., 2006.
  • I.-H. Jhuo, Robust visual domain adaptation with low-rank reconstruction.
  • X. Zhu, Semi-Supervised Learning Literature Survey, Computer Sciences Technical Report 1530, University of...
  • L. Bruzzone, et al., Domain adaptation problems: a DASVM classification technique and a circular validation strategy, IEEE Trans. Pattern Anal. Mach. Intell., 2010.
  • S.J. Pan, et al., A survey on transfer learning, IEEE Trans. Knowl. Data Eng., 2010.
  • L. Duan, I.W. Tsang, D. Xu, S.J. Maybank, Domain transfer SVM for video concept detection, in: Proceedings of the IEEE...
  • L. Duan, et al., Domain transfer multiple kernel learning, IEEE Trans. Pattern Anal. Mach. Intell., 2012.
  • B. Geng, et al., DAML: domain adaptation metric learning, IEEE Trans. Image Process., 2011.
  • J. Tao, et al., A kernel learning framework for domain adaptation learning, Sci. China Inf. Sci., 2012.
  • J. Yang, R. Yan, A.G. Hauptmann, Cross-domain video concept detection using adaptive SVMs, in: Proceedings of the ACM...
  • L. Duan, et al., Domain adaptation from multiple sources: a domain-dependent regularization approach, IEEE Trans. Neural Netw. Learn. Syst., 2012.
  • S.J. Pan, et al., Domain adaptation via transfer component analysis, IEEE Trans. Neural Netw., 2011.
  • H. Wang, F. Nie, H. Huang, Robust and discriminative self-taught learning, in: Proceedings of the 30th...
  • B. Quanz, J. Huan, M. Mishra, Knowledge transfer with low-quality data: a feature extraction issue, 2011 IEEE 27th...
  • M. Long, Transfer sparse coding for robust image representation.
  • J. Wright, Sparse representation for computer vision and pattern recognition, Proc. IEEE, 2010.
  • J. Wright, et al., Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell., 2009.
  • G. Liu, et al., Robust recovery of subspace structures by low-rank representation, IEEE Trans. Pattern Anal. Mach. Intell., 2013.
  • X. Lu, et al., Graph-regularized low-rank representation for destriping of hyperspectral images, IEEE Trans. Geosci. Rem. Sens., 2013.

    This paper has been recommended for acceptance by Prof. M.T. Sun.
