doi:10.1016/j.cag.2006.01.019
Copyright © 2006 Elsevier Ltd All rights reserved.
Shape reasoning on mis-segmented and mis-labeled objects using approximated Fisher criterion
Hervé Glotin (a), Sabrina Tollari (a) and Pascale Giraudet (a, b)
(a) System and Information Sciences Lab, UMR CNRS 6168, University Sud Toulon Var, BP 20132, F-83957 La Garde cedex, France
(b) Department of Biology, University Sud Toulon Var, BP 20132, F-83957 La Garde cedex, France
Available online 23 February 2006.
Abstract
Automatically determining the semantics of a shape, or generating a set of keywords that describe the content of a given image, is a difficult problem for three reasons: (a) the high dimensionality of the feature space, (b) the unsolved problem of automatic object segmentation (mis-segmentation), and (c) the lack of large, well-labeled image databases (mis-labeling). To tackle (a) despite (b), (c) and the high cost of manual image segmentation and labeling, visual features should be selected automatically so that they convey the most robust and discriminant information at low computational cost. We therefore propose a novel method, ‘Approximation of Linear Discriminant Analysis’ (ALDA), which is more generic than LDA: ALDA does not require explicit class labeling of each training sample. We show theoretically that, under a weak assumption, ALDA allows efficient estimation of the ranking of the discriminant powers of the visual features. We apply ALDA to the COREL database (10K images, 267 words) segmented with the Normalized Cuts algorithm. First, we demonstrate an image classification gain of 43% while reducing the feature set by a factor of 10. Second, we demonstrate that for some words (such as ‘Door’ and ‘Flag’), even low-level shape features (convex hull or moment of inertia) are more discriminant than any color or texture feature.
Keywords: LDA; Approximation; Shape; Segmentation; Mis-label; High-dimensional problem; Classification; CBIR; COREL
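As a point of comparison for the abstract, the classical per-feature Fisher criterion (between-class scatter divided by within-class scatter) can be sketched as below. This is not ALDA: it assumes an explicit class label for every sample, which is exactly the requirement the abstract says ALDA relaxes. The function names `fisher_scores` and `rank_features`, and the toy data in the usage note, are hypothetical and chosen only for illustration.

```python
import numpy as np


def fisher_scores(X, y):
    """Per-feature Fisher criterion with explicit labels (classical LDA setting).

    X: (n_samples, n_features) array of visual features.
    y: (n_samples,) array of class labels (assumed known here, unlike in ALDA).
    Returns one score per feature: between-class / within-class scatter.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        # Scatter of the class mean around the global mean, weighted by class size.
        between += len(Xc) * (class_mean - overall_mean) ** 2
        # Scatter of the samples around their own class mean.
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    # Guard against division by zero for constant features.
    return between / np.maximum(within, 1e-12)


def rank_features(X, y):
    """Feature indices sorted from most to least discriminant."""
    return np.argsort(-fisher_scores(X, y))
```

For example, with `X = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.0], [1.1, 1.0]]` and `y = [0, 0, 1, 1]`, the first feature separates the two classes and the second does not, so `rank_features(X, y)` places feature 0 first. Keeping only the top-ranked features is one simple way to reduce the feature set in the spirit of the selection described in the abstract.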
Fig. 1. Illustrations of mis-segmentation and mis-labeling. We represent blobs segmented with Normalized Cuts [1] and [6]. On the left, the penguin is split into five blobs; moreover, as in the right illustration, the relation between labels and blobs is unknown.
Fig. 2. Word distributions over A(wk), B(wk) and cT/cG, showing that C(x;wk) is negligible in the estimations for nearly all the words. These distributions were calculated on 6K images (around 60K blobs), considering that for each word wk, CS equals the number of images labeled by wk.
Fig. 3. Maximum values of the normalized estimated discriminant power in each SHAPE, COLOR or TEXTURE feature set. Each word is represented by a dot followed by the most frequent word, possibly indexed by the dimension index of the shape feature maximizing this power: 4 is the ratio of the blob's area to its perimeter squared; 6 is the ratio of the blob's area to the area of its convex hull. The line of equal discriminant power is drawn for easy comparison. Results are intuitively correct: FLOWER, SNOW and PLANT are better discriminated by color; LEAF by color and shape; WATER and TREE (variable shapes and colors) by texture. The ratio of the blob's area to the area of its convex hull is nearly always more discriminant for FLAG than any color or texture feature. DOOR and ENTRANCE are better discriminated by shape than by color.
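The two shape features named in the caption (area over perimeter squared, and area over convex hull area) can be illustrated with a minimal sketch. It assumes the blob outline is available as a simple polygon given as a list of `(x, y)` tuples, whereas the paper's blobs are segmented image regions; the function names are hypothetical, and the hull is computed with Andrew's monotone chain algorithm, a standard technique not taken from the source.

```python
import math


def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula."""
    n = len(pts)
    s = 0.0
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def polygon_perimeter(pts):
    """Sum of the edge lengths of a closed polygon."""
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))


def convex_hull(pts):
    """Convex hull by Andrew's monotone chain, counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each chain: it repeats the start of the other.
    return lower[:-1] + upper[:-1]


def shape_features(pts):
    """Area / perimeter^2 and area / convex-hull area for a polygonal blob."""
    area = polygon_area(pts)
    perimeter = polygon_perimeter(pts)
    hull_area = polygon_area(convex_hull(pts))
    return {
        "area_over_perimeter_sq": area / perimeter ** 2,
        "area_over_hull_area": area / hull_area,
    }
```

A unit square gives an area-to-hull ratio of 1, while an L-shaped outline such as `[(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]` gives 3 / 3.5. A convex, rectangular object such as a flag would therefore sit close to 1 on this feature, which is consistent with the caption's observation about FLAG.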
Fig. 4. Examples of images labeled by words that have a high shape feature discrimination power (top to bottom): HORIZON, BIGHORN, FLAG, DOOR. These images indeed contain recurrent shapes, but neither recurrent color nor recurrent texture, which yields the highest shape feature discrimination power.
Table 1. Overview of classification results (mean Normalized Score (NS)) for the baseline AHC system with 40 features (40DIM), or using only the features selected by ALDA (NADAPT) for τ=0.3 (Eq. (9))

NS values are averaged over the 52 most frequent words and 2.5K COREL test images.

Corresponding author. Tel.: +33 4 94 14 28 24; fax: +33 4 94 14 28 97.