Elsevier

Pattern Recognition

Volume 47, Issue 4, April 2014, Pages 1731-1739

A novel approach to combine features for salient object detection using constrained particle swarm optimization

https://doi.org/10.1016/j.patcog.2013.11.012

Highlights

  • We use a constrained particle swarm optimization approach to combine features.

  • A new objective function is proposed to compute the weights.

  • An adaptive threshold is used for pixel classification instead of a fixed threshold.

  • Our model achieves the best precision, recall, F-measure and AUC values among the compared methods.

Abstract

Despite a significant amount of research, the best available visual attention models still lag far behind human performance in predicting salient objects. In this paper, we present a novel two-phase approach to detecting a salient object. In the first phase, three features, namely multi-scale contrast, center-surround histogram and color spatial distribution, are obtained as described in the model of Liu et al. In the second phase, constrained particle swarm optimization is used to determine an optimal weight vector for combining these features into a saliency map that distinguishes a salient object from the image background. To achieve this, we define a simple fitness function that highlights the salient object region with a well-defined boundary and effectively suppresses the background regions of the image. The performance is evaluated both qualitatively and quantitatively on a publicly available dataset. Experimental results demonstrate that the proposed model outperforms existing state-of-the-art methods in terms of precision, recall, F-measure and area under the curve.

Introduction

Salient object detection [1], [2] is one of the key problems in computer vision and has received continuous attention since its inception. Visual saliency refers to the ability to locate the relevant information (object) in an image quickly and efficiently. The output of the salient object detection process is a saliency map [1] in which each pixel is assigned a measure of relevance [2]. This is achieved by giving a high score to interesting information and a low score to irrelevant information.

Salient object detection provides fast solutions for many complex real-time applications, such as surveillance systems [3] that track vehicles, pedestrians or other objects. It is also used in remote sensing [4] and image retrieval [5], [6]. Additionally, it is used for automatic target detection, such as finding traffic signs [1], [7] along the road or military vehicles in a savanna [7], and in robotics to find salient objects in the environment that serve as navigation landmarks. It can also be applied to image and video compression [7], by giving higher quality to salient objects at the expense of degrading background clutter, and to automatic cropping/centering [8] of images for display on small portable screens [9]. Further applications include detecting tumors in mammograms [10], advertising design [7], image collection browsing [11] and image enhancement [12], among others.

Several approaches have been suggested to model visual saliency based on neurobiological concepts and on computational and mathematical methods. They can be broadly classified into two major categories [13]: bottom-up and top-down. In bottom-up models, multiple low-level visual features (such as intensity, color, orientation, and texture) are extracted from the image. These features are then normalized and combined into a saliency map. Salient locations are identified using winner-take-all [1] and inhibition-of-return [1] operations. In contrast, top-down models are task-dependent and use a priori knowledge of the visual system. They are always integrated with bottom-up models to generate saliency maps for localizing objects of interest.
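The winner-take-all and inhibition-of-return operations mentioned above can be illustrated with a minimal sketch. It assumes the saliency map is already available as a 2-D list of numbers; the function name `attend`, the square suppression window and its `radius` are illustrative choices, not details taken from [1].

```python
def attend(saliency, n_fixations, radius=1):
    """Return the first `n_fixations` attended locations as (row, col) pairs.

    Winner-take-all picks the most salient remaining location; inhibition of
    return then suppresses a square window around it so it is not picked again.
    """
    s = [row[:] for row in saliency]          # work on a copy
    rows, cols = len(s), len(s[0])
    fixations = []
    for _ in range(n_fixations):
        # Winner-take-all: location with the highest remaining saliency.
        r, c = max(((i, j) for i in range(rows) for j in range(cols)),
                   key=lambda p: s[p[0]][p[1]])
        fixations.append((r, c))
        # Inhibition of return: suppress the neighbourhood of the winner.
        for i in range(max(0, r - radius), min(rows, r + radius + 1)):
            for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                s[i][j] = float("-inf")
    return fixations


# Hypothetical 4x4 saliency map with two salient spots.
saliency_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 1.0, 0.3, 0.0],
    [0.1, 0.2, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.8],
]
fixations = attend(saliency_map, n_fixations=2)
```

In this toy map the strongest spot is attended first, and the second fixation moves to the other spot only because the first neighbourhood has been suppressed.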

Recently, Liu et al. [14] proposed a salient object detection model based on a combination of bottom-up and top-down approaches [13]. It combined multi-scale contrast, center-surround histogram and color spatial distribution features using a conditional random field under the maximum likelihood estimation (MLE) criterion. MLE is a well-known parameter estimation technique with many advantages [15]. It provides a consistent and asymptotically efficient approach to parameter estimation. Its estimates become unbiased when the sample size is large. They have approximately normal distributions and approximate sample variances that can be used to generate confidence bounds and hypothesis tests for the parameters, and they have a lower variance than estimates from other methods. However, MLE also has certain disadvantages: it can be heavily biased for small samples and is highly sensitive to the choice of starting values. MLE is a derivative-based approach that requires the function to have an analytical form. Moreover, the numerical estimation is usually non-trivial and does not always converge. In addition, Liu et al. [14] used a common linear weight vector, obtained by MLE, to combine the feature maps for all test images. This weight vector may not give good saliency results for images that differ significantly from the training set. The weight vectors for such images must be learned so that they give better saliency results for their corresponding images.
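The small-sample bias of MLE mentioned above can be made concrete with a standard textbook case, which is not an example from [14] or [15]. Assume n independent samples x_1, ..., x_n from a Gaussian with unknown mean and variance \sigma^2; the maximum likelihood estimate of the variance then underestimates \sigma^2 on average:

```latex
\hat{\sigma}^2_{\mathrm{MLE}} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2,
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\mathbb{E}\!\left[\hat{\sigma}^2_{\mathrm{MLE}}\right] = \frac{n-1}{n}\,\sigma^2 .
```

For n = 5 the expected estimate is 20% too small, whereas for n = 1000 the shortfall is only 0.1%, which matches the statement that the bias vanishes as the sample size grows.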

In this paper, we use a modified form of particle swarm optimization (PSO) [16], a commonly used optimization method, to obtain a weight vector that optimally combines the features extracted from the image. PSO uses a fitness function to search for the optimal solution, and we propose a new fitness function for this purpose to obtain better saliency results. To check the efficacy of the proposed model, its performance is evaluated in terms of precision, recall, F-measure, area under the curve and computation time. Experiments are carried out on a publicly available image dataset, and the performance is compared with the model of Liu et al. [14] and 10 other popular state-of-the-art models.
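To make the idea of searching for a weight vector with PSO more tangible, here is a minimal sketch of a generic constrained PSO. It is not the authors' algorithm: the constraints (non-negative weights that sum to 1), the constraint handling by clipping and rescaling, the parameter values and the toy fitness function are all assumptions made for illustration, since the paper's fitness function and settings are not given in this excerpt.

```python
import random


def project_to_simplex(w):
    """Constraint handling (an assumption): clip negatives, rescale to sum 1."""
    clipped = [max(0.0, x) for x in w]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(w)] * len(w)
    return [x / total for x in clipped]


def constrained_pso(fitness, dim=3, n_particles=20, n_iters=100,
                    inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over non-negative weight vectors that sum to 1."""
    rng = random.Random(seed)
    pos = [project_to_simplex([rng.random() for _ in range(dim)])
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_val = [fitness(p) for p in pos]           # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
            moved = [pos[i][d] + vel[i][d] for d in range(dim)]
            pos[i] = project_to_simplex(moved)     # keep the particle feasible
            val = fitness(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


# Toy fitness (hypothetical): squared distance to a known target weighting.
target = [0.2, 0.5, 0.3]
weights, value = constrained_pso(
    lambda w: sum((a - b) ** 2 for a, b in zip(w, target)))
```

In a saliency setting the toy fitness would be replaced by a function that scores the saliency map produced by a candidate weight vector; the swarm update itself would stay the same.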

The paper is organized as follows. Section 2 reviews state-of-the-art methods for salient object detection. The proposed model is discussed in Section 3. The experimental setup and results are presented in Section 4. Conclusions and future work are presented in Section 5.

Section snippets

Bottom-up methods

Itti et al. [1] proposed a biologically plausible model that computes a saliency map by combining intensity, color and orientation features at multiple scales. Walther and Koch [17] extended the Itti et al. model to detect proto-object regions. Harel et al. [18] used graph-theoretic ideas to determine activation maps from the raw features. The model gives high saliency values to the nodes at the center of the image. Han et al. [19] integrated Itti's model with Markov random

Proposed salient object detection framework

Liu et al. [14] extracted multi-scale contrast, center-surround histogram and color spatial distribution feature maps from the image. These features can be combined in many ways to obtain a saliency map. One possible way is to give equal weight to all three features. However, an image may be salient in terms of only a single feature, a combination of two features with different weights, or a combination of all three features with different weights. So an appropriate
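The difference between equal weighting and an image-specific weighting can be sketched as a pixel-wise weighted sum of the three feature maps. The function name `combine_features` and the 2x2 maps below are hypothetical and serve only to illustrate the linear combination described above.

```python
def combine_features(feature_maps, weights):
    """Pixel-wise weighted sum of equally sized 2-D feature maps."""
    if len(feature_maps) != len(weights):
        raise ValueError("one weight per feature map is required")
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(w * m[r][c] for w, m in zip(weights, feature_maps))
             for c in range(cols)]
            for r in range(rows)]


# Hypothetical 2x2 feature maps with values in [0, 1].
contrast = [[0.9, 0.1], [0.2, 0.1]]    # multi-scale contrast
histogram = [[0.1, 0.1], [0.8, 0.1]]   # center-surround histogram
spatial = [[0.2, 0.1], [0.1, 0.1]]     # color spatial distribution
maps = [contrast, histogram, spatial]

equal = combine_features(maps, [1 / 3, 1 / 3, 1 / 3])
contrast_only = combine_features(maps, [1.0, 0.0, 0.0])
```

With equal weights the contrast response at the top-left pixel is diluted by the other two maps, while the single-feature weighting preserves it, which is the kind of image-dependent behaviour that motivates learning the weights.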

Experimental setup and results

To check the efficacy of the proposed approach for salient object detection, its performance is evaluated both qualitatively and quantitatively and compared with that of existing approaches.

In the Salient Object Detection using Constrained Particle Swarm Optimization (SOD-C-PSO) procedure, the parameters are set according to Table 1.

All the experiments are carried out in a Windows 7 environment on an Intel(R) Xeon(R) processor running at 2.27 GHz with 4 GB RAM.
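The quantitative measures named in this paper (precision, recall and F-measure) can be computed from a binarized saliency map and a ground-truth mask, as in the following sketch. The adaptive threshold rule used here (twice the mean saliency of the image) and the value beta^2 = 0.3 are common choices in the saliency literature and are assumptions; this excerpt does not state the rule or the weighting actually used by the authors.

```python
def adaptive_threshold(saliency, factor=2.0):
    """Assumed rule: `factor` times the mean saliency of the image."""
    flat = [v for row in saliency for v in row]
    return factor * sum(flat) / len(flat)


def evaluate(saliency, ground_truth, beta_sq=0.3):
    """Return (precision, recall, F-measure) for one image."""
    t = adaptive_threshold(saliency)
    tp = fp = fn = 0
    for s_row, g_row in zip(saliency, ground_truth):
        for s, g in zip(s_row, g_row):
            predicted = s >= t            # pixel classified as salient
            if predicted and g:
                tp += 1
            elif predicted:
                fp += 1
            elif g:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta_sq * precision + recall
    f_measure = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical 3x3 saliency map and binary ground-truth mask.
saliency_map = [[0.9, 0.8, 0.1],
                [0.7, 0.1, 0.0],
                [0.1, 0.0, 0.0]]
mask = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
precision, recall, f_measure = evaluate(saliency_map, mask)
```

Because the threshold depends on each image's own mean saliency, a bright map and a dim map are binarized on their own scales rather than against one fixed cut-off.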

Conclusion and future work

The Liu et al. model used a conditional random field (CRF) under the maximum likelihood criterion to combine three features based on multi-scale contrast, center-surround histogram and color spatial distribution. The CRF learning generated a common linear weight vector that was applied to all the test images. This weight vector was not suitable for images that differ significantly from the training set and hence gave inappropriate detection results. We proposed a new fitness function for detecting salient

Conflict of interest statement

None declared.

Acknowledgments

The authors are indebted to the reviewers for their constructive suggestions, which significantly helped to improve the quality of this paper. In addition, the first author expresses his gratitude to the University Grants Commission (UGC), India, for the financial support received for this research work.

Navjot Singh obtained an M.Tech (Computer Science and Technology) from Jawaharlal Nehru University, New Delhi. Presently, he is pursuing a Ph.D. (Computer Vision and Pattern Recognition) at Jawaharlal Nehru University, New Delhi. His current research areas are computer vision, image processing, object detection, pattern recognition, feature extraction, and classification.

References (30)

  • D. Walther et al., Modeling attention to salient proto-objects, Neural Netw. (2006)
  • L. Itti et al., A model of saliency based visual attention for rapid scene analysis, IEEE Trans. Pattern Anal. Mach. Intell. (1998)
  • A. Borji, D.N. Sihite, L. Itti, Salient object detection: a benchmark, in: Proceedings of the European Conference on...
  • R. Graefe et al., A novel approach for the detection of vehicles on freeways by real time vision, Intell. Veh. (1996)
  • Z. Li et al., Saliency and gist features for target detection in satellite images, IEEE Trans. Image Process. (2011)
  • Y. Amit, 2D Target Detection and Recognition, Models, Algorithms and Networks (2002)
  • R.C. Gonzalez et al., Digital Image Processing (2002)
  • L. Itti, Models of Bottom Up and Top Down Visual Attention (Dissertation), California Institute of Technology,...
  • A. Santella, M. Agrawala, D. Decarlo, D. Salesin, M. Cohen, Gaze based interaction for semi automatic photo cropping,...
  • L. Chen, X. Xie, X. Fan, W. Ma, H. Shang, H. Zhou, A Visual Attention Model for Adapting Images on Small Displays,...
  • N. Karssemeijer, Detection of stellate distortions in mammograms, IEEE Trans. Med. Imaging (2006)
  • C. Rother, L. Bordeaux, Y. Hamadi, A. Blake, Autocollage, ACM SIGGRAPH25 (2006)...
  • F. Gasparini et al., Low quality image enhancement using visual attention, Opt. Eng. (2007)
  • W. Zhang et al., An adaptive computational model for salient object detection, IEEE Trans. Multimed. (2010)
  • T. Liu et al., Learning to detect a salient object, IEEE Trans. Pattern Anal. Mach. Intell. (2011)

    Rinki Arya obtained an M.Tech (Computer Science and Technology) from Jawaharlal Nehru University, New Delhi. Presently, she is pursuing a Ph.D. (Computer Vision and Pattern Recognition) at Jawaharlal Nehru University, New Delhi. Her current research areas are computer vision, object detection, pattern recognition, and feature extraction.

    R.K. Agrawal obtained an M.Tech (Computer Application) from the Indian Institute of Technology Delhi, New Delhi, and a Ph.D. (Computational Physics) from the University of Delhi, Delhi. Presently, he is working as a Professor at the School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi. His current research areas are classification, feature extraction and selection for pattern recognition problems in the domains of image processing, security, and bioinformatics.
