Visual comparison based on linear regression model and linear discriminant analysis

https://doi.org/10.1016/j.jvcir.2018.10.026

Highlights

  • Given image pairs, Pairwise Relative Attributes are used for visual comparison.

  • Linear Discriminant Analysis is used to obtain discriminant features.

  • Linear Discriminant Analysis greatly reduces the running time of visual comparison.

  • The Linear Regression Model achieves promising performance in visual comparison.

Abstract

Visual comparison is the task of predicting, given two images, which one exhibits a particular visual attribute more than the other. Existing relative attribute methods rely on ranking SVM functions to conduct visual comparison; however, ranking SVM functions are sensitive to the choice of support vectors, and when few effective samples are available, the performance of the ranking SVM model degrades considerably. To address this issue, we propose a pairwise relative attribute method for visual comparison that trains a Linear Regression Model (LRM), formulated as learning a mapping function between a vector-formed input (the pairwise image feature difference) and a scalar-valued output. In addition, we propose a novel feature reduction method based on Linear Discriminant Analysis (LDA) to obtain a low-dimensional, discriminant feature. Experimental results on the three databases UT-Zap50K-1, OSR and PubFig demonstrate the advantages of the proposed method.

Introduction

Visual comparison is a significant technique in computer vision: given two images and a specified attribute, the task is to predict which image exhibits the attribute more than the other. Imagine you are given a pair of images of shoes and the attribute 'comfortable', and you need to judge which shoes are more 'comfortable'. As shown in Fig. 1, we can observe that shoes B are more 'comfortable' than shoes A.

Attributes, which are visual properties describable in words, can capture anything from material properties (plastic, wooden) and shapes (pointy, round) to feelings (serious, smiling). Since their emergence, attributes have inspired a great deal of work in image search [13], [26], [12], [11], biometrics [24], [4], and language-based supervision for recognition [15], [21], [25], [2]. Attribute models mainly take two forms: binary attributes and relative attributes. Whereas binary attributes are suitable only for clear-cut predicates, such as 'boxy', relative attributes can capture real-valued properties that inherently exhibit a range of strengths, such as 'comfortable'. Relative attributes [21] were first proposed by learning global ranking SVM functions, followed by much recent work on visual comparison based on ranking SVM functions [22], [12], [18], [30], [23], [32]. However, this kind of global ranking tends to fail when faced with unrelated training pairs. To address this issue, a local learning method based on ranking SVM [31] was proposed for fine-grained visual comparison. While more accurate for fine-grained visual comparison, the local learning method incurs greater computational cost because it must find the nearest-neighbor pairs and train a local ranking function for each testing image pair. Xiao and Lee [28] proposed learning the spatial extent of relative attributes for relative attribute ranking, which suffered from long running time and suboptimal performance. The ranker network (RN) method [27] was proposed to simultaneously learn to rank and localize relative attributes with a deep convolutional network, which requires more memory and computation owing to the nature of deep convolutional networks.

In order to improve the performance of visual comparison and reduce the computational cost, we propose a pairwise relative attribute method for visual comparison by training a linear regression model, formulated as learning a mapping function between a vector-formed input (the pairwise feature difference) and a scalar-valued output. Unlike ranking SVM functions, which are sensitive to the choice of support vectors, the linear regression model fits all of the training data. Because the training data in visual comparison inevitably include some unrelated image pairs, a ranking SVM may select those unrelated pairs as support vectors, which degrades the comparison result; when few effective samples are available, its performance drops considerably.
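The pairwise formulation above can be illustrated with a minimal sketch. All data here are synthetic, and the closed-form ridge solve with penalty `lam` is an assumed stand-in for whatever solver the paper actually uses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 image pairs with 20-D features each.
# y[k] = +1 if the first image of pair k exhibits the attribute more, else -1.
X1 = rng.normal(size=(200, 20))
X2 = rng.normal(size=(200, 20))
w_true = rng.normal(size=20)          # hidden attribute direction for the toy data
y = np.sign((X1 - X2) @ w_true)

# Vector-formed input: pairwise feature difference; scalar output: order label.
D = X1 - X2

# Closed-form ridge regression: w = (D^T D + lam * I)^{-1} D^T y
lam = 1.0
w = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)

def compare(xa, xb):
    """Return +1 if image a shows the attribute more than image b, else -1."""
    return np.sign(w @ (xa - xb))

train_acc = np.mean(np.sign(D @ w) == y)
```

Because the model is fit in closed form on all pairwise differences, no support-vector selection is involved, which is the robustness property argued for above.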

Nevertheless, the linear regression based method for visual comparison suffers from high-dimensional features. To obtain a low-dimensional, discriminant feature, a number of feature reduction methods, such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), have been proposed and applied in different domains [9], [33], [10], [19]. These methods are attractive in that they not only provide a low-dimensional, discriminant feature but also reduce the computational cost. Although PCA preserves more of the original data's information, it takes little account of the intrinsic class structure. Unlike PCA, LDA [9] finds the most discriminant eigenvectors, those that maximize the ratio of between-class to within-class variance. LDA is a supervised learning algorithm, and its eigenvectors are usually non-orthogonal. Similar to LDA, a feature reduction approach called Cross-view Quadratic Discriminant Analysis (XQDA) [19] was proposed for person re-identification; however, it assumes that the data classes have Gaussian structure with different variances and equal means. In this paper, we propose a feature reduction method based on LDA to obtain a low-dimensional, discriminant visual feature.
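The LDA computation described above can be sketched on synthetic two-class data (the class means, dimensions, and sample counts are all hypothetical; the paper's actual reduction operates on its pairwise visual features):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy data: two classes of 5-D features with shifted means.
Xa = rng.normal(loc=0.0, size=(50, 5))
Xb = rng.normal(loc=2.0, size=(50, 5))
X = np.vstack([Xa, Xb])
y = np.array([0] * 50 + [1] * 50)

mean_all = X.mean(axis=0)
Sw = np.zeros((5, 5))   # within-class scatter
Sb = np.zeros((5, 5))   # between-class scatter
for c in (0, 1):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    Sw += (Xc - mc).T @ (Xc - mc)
    diff = (mc - mean_all)[:, None]
    Sb += len(Xc) * (diff @ diff.T)

# Most discriminant eigenvectors: maximize the between/within variance ratio,
# i.e. the leading eigenvectors of Sw^{-1} Sb (generally non-orthogonal).
evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
order = np.argsort(evals.real)[::-1]
W = evecs[:, order[:1]].real    # with C = 2 classes, at most C - 1 = 1 direction

Z = X @ W                       # low-dimensional, discriminant feature
```

After projection the two classes are well separated along the retained direction, while the feature dimension has dropped from 5 to 1.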

The main contribution of this paper is the idea of learning pairwise relative attributes by linear regression, which to our knowledge has not been explored for visual comparison in any prior work. The other contribution is a novel feature reduction method based on LDA for visual comparison with pairwise relative attributes. Tests on three challenging datasets show that the proposed approach improves on state-of-the-art visual comparison methods.

Section snippets

Related work

Comparing attributes has gained a lot of interest recently. The relative attribute approach learned a global linear ranking function for each attribute [21], which was extended to non-linear ranking functions in [16], [17] by training a hierarchy of rankers and normalizing predictions at the leaf nodes. Aside from learning to rank formulations, researchers have applied the Elo rating system for biometrics [24], and a local learning method based on the ranking SVM [31] was proposed for

Approach

We use the LDA based method to reduce the feature dimension, and apply linear regression to efficiently train pairwise attribute models for visual comparison. In the following, we first introduce pairwise relative attributes, then present the feature reduction, and next present the relative attribute model by linear regression. Finally, we give the outline of the proposed method.
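Under hypothetical data, that outline can be sketched end to end. Here the two order labels play the role of the LDA classes, a single Fisher direction stands in for the paper's full LDA-based reduction, and the ridge penalty is an assumed default:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: n image pairs with d-dim features and order labels in {+1, -1}.
n, d = 400, 20
X1, X2 = rng.normal(size=(n, d)), rng.normal(size=(n, d))
y = np.sign((X1 - X2) @ rng.normal(size=d))

# Step 1: pairwise differences; the two order labels act as the two classes.
D = X1 - X2

# Step 2 (sketch of the reduction): project onto the Fisher direction
# Sw^{-1} (m+ - m-) computed from the two classes of differences.
Dp, Dm = D[y > 0], D[y < 0]
Sw = (Dp - Dp.mean(0)).T @ (Dp - Dp.mean(0)) \
   + (Dm - Dm.mean(0)).T @ (Dm - Dm.mean(0))
u = np.linalg.solve(Sw, Dp.mean(0) - Dm.mean(0))
Z = D @ u                       # low-dimensional (here 1-D) discriminant feature

# Step 3: linear regression model (ridge) from reduced differences to labels.
lam = 1.0
w = (Z @ y) / (Z @ Z + lam)

# Step 4: visual comparison on a new pair.
def compare(xa, xb):
    return np.sign(w * ((xa - xb) @ u))

train_acc = np.mean(np.sign(w * Z) == y)
```

The regression in the reduced space is a scalar fit, which is where the running-time saving claimed in the highlights comes from in this simplified setting.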

Experiments

To validate the advantages of the proposed method, we compare it with several state-of-the-art methods on three datasets: UT-Zap50K-1 [31], the Outdoor Scene Recognition dataset [20] (OSR), and a subset of the Public Figures faces dataset [14] (PubFig). All methods are run over 10 random train/test splits on all ordered pairs, and we report the average accuracy, i.e. the percentage of correctly ordered pairs. We use the same labeled data as in [21]. Specifically, on UT-Zap50K-1 we randomly select almost

Conclusion

In this paper, we have proposed a novel visual comparison method, which first applies LDA for feature dimension reduction and then uses the LRM for visual comparison. Comprehensive experimental results on three benchmark datasets verified that LDA-based feature dimension reduction is an effective approach for pairwise relative attribute visual comparison, owing to its property of maximizing the between-class covariance while minimizing the within-class covariance. Meanwhile, the LRM can

Acknowledgement

This work was funded by the Natural Science Foundation of Anhui Province (1508085QF127), the Natural Science Foundation of Anhui Higher Education Institutions of China (KJ2017B017, KJ2018B05) and Co-Innovation Center for Information Supply & Assurance Technology, Anhui University (ADXXBZ201604).

Hanqin Shi received her B.Eng. degree in computer science and technology in 2006 from the Anhui Normal University in China and her M.Eng. degree in computer science and technology in 2009 from the Anhui University in China. Currently, she is pursuing a Ph.D. degree in computer science and technology at the Anhui University in China. Her main research interests include image processing, vision processing and pattern recognition.

References (33)

  • Senjian An, Wanquan Liu, Svetha Venkatesh, Face recognition using kernel ridge regression, in: IEEE Conference on...
  • Arijit Biswas, Devi Parikh, Simultaneous active learning of classifiers & attributes via relative feedback, in: IEEE...
  • Antoni B. Chan et al., Counting people with low-level features and Bayesian regression, IEEE Trans. Image Process. (2012)
  • Ke Chen, Shaogang Gong, Tao Xiang, Chen Change Loy, Cumulative attribute space for age and crowd density estimation,...
  • Yun Fu et al., Human age estimation with regression on discriminative aging manifold, IEEE Trans. Multimedia (2008)
  • Guodong Guo et al., Image-based human age estimation by manifold learning and locally adjusted robust regression, IEEE Trans. Image Process. (2008)
  • Guodong Guo, Guowang Mu, Yun Fu, Thomas S. Huang, Human age estimation using bio-inspired features, in: IEEE...
  • Yoel Haitovsky, On multivariate ridge regression, Biometrika (1987)
  • Tae Kyun Kim et al., Locally linear discriminant analysis for multimodally distributed classes for face recognition with a single model image, IEEE Trans. Pattern Anal. Mach. Intell. (2005)
  • Martin Köstinger, Martin Hirzer, Paul Wohlhart, Peter M. Roth, Horst Bischof, Large scale metric learning from...
  • Adriana Kovashka, Kristen Grauman, Attribute pivots for guiding relevance feedback in image search, in: IEEE Conference...
  • Adriana Kovashka, Devi Parikh, Kristen Grauman, Whittlesearch: Image search with relative attribute feedback, in: IEEE...
  • Neeraj Kumar, Peter Belhumeur, Shree Nayar, Facetracer: a search engine for large collections of images with faces, in:...
  • Neeraj Kumar et al., Attribute and simile classifiers for face verification
  • Christoph H. Lampert et al., Learning to detect unseen object classes by between-class attribute transfer
  • Shaoxin Li et al., Relative forest for attribute prediction

    Liang Tao received the Ph.D. degree in information and communication engineering in 2003 from the University of Science and Technology of China, P. R. China. From August 1998 to August 1999, supported by the China Scholarship Council, he was a visiting scholar at the University of Windsor, Windsor, Ontario, Canada. Currently, he is a Professor with the School of Computer Science and Technology, Anhui University in China. He has published over 100 papers. His main research interests include digital signal and image processing, and pattern recognition.
