
1 Introduction

The appearance of objects is the result of interactions between light, surfaces, and the camera sensor. The human visual system preserves the appearance of objects under changing illumination by adjusting the gain of the different cone types, an ability known as human color constancy. In the case of an imaging device, the sensed image depends on the surface reflectances, the light color, and the camera spectral sensitivity. When the light color changes, the sensed image colors change even if the surface reflectances and the camera spectral sensitivity remain the same. Chromatic adaptation [21], or computational color constancy, adjusts image colors according to an estimate of the light color. This estimate is the camera's response to the light originating from one or several sources that illuminate the scene. It is generally the output of a color constancy algorithm [14, 16] used to correct image colors and enhance image content [30]. For these algorithms, surface reflectances and the scene illuminant are unknown, so illuminant estimation is an under-constrained problem; color constancy algorithms therefore rely on assumptions and prior knowledge, such as the dichromatic model [25], the Lambertian model [22], the Grey world assumption [3], or other independent assumptions [2, 8]. Based on the assumptions and prior knowledge used, existing algorithms fall into two major categories: dichromatic model based methods [5, 26] and Lambertian model based methods. Depending on the strategy used, the Lambertian algorithms can be further subdivided into static methods [3, 10, 19, 29] and learning-based methods [11].

This work builds on the dichromatic model and formulates an additional hypothesis based on the observation that the scene illuminant lies among the bright image colors. This hypothesis is close to the one used in [8]. From this assumption we show that the scene illuminant is the leading eigenvector of the inner product matrix of image chromaticities. The paper is organized as follows: Sect. 2 presents the new hypothesis and its use for illuminant color estimation. The evaluation of the proposed algorithm on large datasets is presented in Sect. 3.

2 Problem Formulation

2.1 Maximal Square Projections’ Mean Assumption

Objects in nature are composed of different surface types, and dielectric objects are widely present in natural scenes. For dielectric surfaces, the reflected light is a linear combination of two parts: a specular component and a diffuse component. When imaged under a given light source, the resulting image is also a linear combination of specular and diffuse components. The diffuse part encodes the color of the surface, since it is a function of the surface reflectance properties. The specular part, however, is independent of the surface reflectance and is therefore considered the image of the scene light, called the illuminant. Based on these observations, the pixels of a dielectric surface image can be represented in a 2D sub-space known as the dichromatic space [25], spanned by the specular vector and the diffuse vector.
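Formally, a standard formulation of the dichromatic reflection model (written here in our own notation, consistent with [25]) expresses each pixel p(x) of a dielectric surface as a weighted sum of a diffuse color vector and a specular color vector:

$$\begin{aligned} p(x) = m_{d}(x)\,d + m_{s}(x)\,s \end{aligned}$$

where d depends on the surface reflectance, s is the color of the scene light, and the scalar weights \(m_{d}(x)\) and \(m_{s}(x)\) depend on the imaging geometry at pixel x.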

In real scenes, where more than one surface may exist, several 2D sub-spaces corresponding to the existing surfaces can be defined. If the scene is illuminated by a single light source, only one specular component (i.e. the illuminant) is present, while several diffuse components, each representing a surface, exist (see Fig. 1). It follows that estimating the illuminant is equivalent to estimating the intersection of these sub-spaces [20, 24, 27]. However, this requires prior knowledge of the surfaces present in the scene, and hence an image segmentation. Another way to proceed is to consider just one surface plane and impose some constraints on the sub-spaces [9, 28]. The prior segmentation can be avoided by exploiting the relationship between the specular vector and the sub-space vectors. Let us assume that several chromaticities are closer to the specular axis than to any diffuse axis. These chromaticities belong to one or more surfaces, but identifying them is not a trivial task; several works, such as [5], attempt to do so. We argue that it is not necessary to identify exactly all the chromaticities close to the specular axis: a rough subset of them is enough to estimate the illuminant. The cardinality of this subset must be small enough to keep the computational complexity low, yet large enough to support a statistical estimation with acceptable bias. The specification of the subset is explained in Sect. 2.3.

Fig. 1. Representation of dichromatic sub-spaces points and their distances to the specular component for dielectric surfaces.

Fig. 2. The refined gamut calculated from the three datasets ([1, 4, 12]) (Color figure online).

As shown in Fig. 1, all the chromaticities share the origin (0, 0). Moreover, since we are only interested in the illuminant color, the chromaticity vectors are normalized to unit magnitude. In this case, the proximity between the chromaticity vector c and any other vector \( x_i\), where \(i \in [1,n]\), can be measured by their dot product. Since the chromaticity components are all positive, the angles between the \(x_i\) and c are less than or equal to 90 degrees. In this configuration, the dot product and the Euclidean distance between c and a given chromaticity are equivalent proximity measures (a larger dot product corresponds to a smaller distance); other proximity measures could also be used. In other words, given a set of normalized near-specular chromaticities \(\mathcal L =\) \(\left\{ x_{i} \right\} \) and an estimator c of the real illuminant, we search for the vector c that is closest, in terms of dot product, to all elements of \(\mathcal L\). In practice, the squared dot product is used as the objective function; the resulting criterion can be interpreted in terms of the dispersion of the chromaticities along the axis c. For the n chromaticities that will be selected, the objective function is the sum of the squared dot products. More formally,

$$\begin{aligned} m_{I,\mathcal L} = arg max_{c} \;\sum _{i=1}^{n}(\overrightarrow{x_{i}}.\overrightarrow{c})^{2} \text { subject to the constraint } c^{t}c=1 \end{aligned}$$
(1)

It can be rewritten in matrix form as follows:

$$\begin{aligned} m_{I,\mathcal L} = arg max_{c} \; c^{t}\varSigma c \text { subject to the constraint } c^{t}c=1 \end{aligned}$$
(2)

where \(\varSigma =X^{t}X \) is the inner product matrix of the selected chromaticities \(\mathcal L\), X being the matrix whose rows are the selected chromaticity vectors. One might think that the illuminant estimation problem is equivalent to a PCA problem. However, the chromaticities are not centred, and therefore the problem in Eq. (2) is not a classical PCA problem. It is, as stated in [17], a PCA variant called uncentred PCA.
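To make the computation concrete, a minimal NumPy sketch of this uncentred-PCA solution is given below. It assumes `X` is an n \(\times \) d array holding the selected chromaticities \(\mathcal L\) as rows (d = 2 for rg-chromaticities); all names are illustrative and not taken from the paper's implementation.

```python
import numpy as np

def estimate_illuminant(X):
    """Illuminant direction as the eigenvector of the uncentred inner product
    matrix Sigma = X^t X associated with its largest eigenvalue (Eq. 2)."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-length chromaticity vectors
    Sigma = X.T @ X                                   # inner product matrix (not centred)
    eigvals, eigvecs = np.linalg.eigh(Sigma)          # Sigma is symmetric
    c = eigvecs[:, np.argmax(eigvals)]                # leading eigenvector
    if c.sum() < 0:                                   # resolve the sign ambiguity
        c = -c
    return c / np.linalg.norm(c)
```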

2.2 Assumption Validation

To validate our assumption, we carried out the following experiment. We used the SFU Lab dataset [1], which contains 321 images with their corresponding true illuminants, and for each image we selected the set \(\mathcal L\) of chromaticities closest to the true illuminant l. Several chromatic spaces could be used to calculate chromaticities; in agreement with [6], we used the rg-chromaticity space, the most common space for computing image chromaticities. Then, for each image I, all the selected chromaticities \(\mathcal L\), including the illuminant l, were projected onto each other, and the vector c that maximises the square projections' mean was recorded. To compare the recorded vectors with the real illuminants, we used the dot product. We validated our assumption using the 1 % of chromaticities closest to the real illuminants, a percentage that contains a sufficient number of points for statistical analysis; numerically, we used the 2981 chromaticities closest to the real illuminant l. The real illuminants and the chromaticities yielding the maximal square projections' mean are depicted in Fig. 3. The majority of these chromaticities are close to the real illuminants of the dataset. Indeed, we found that in 98.75 % of the dataset images, the maximal square projections' mean was obtained by projecting the chromaticities onto chromaticities within 3 degrees of the real illuminants. The binned histogram of angular errors between these chromaticities and the true illuminants, shown in Fig. 4, has a clear maximum near the origin, which means that many of the chromaticities producing the maximal square projections' mean were true illuminants. This experiment shows that the maximal square projections' mean assumption is realistic.
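As an illustration, the per-image check can be sketched as follows (a hypothetical sketch: `chroma` is an n \(\times \) 2 array of rg-chromaticities of one image, `l` its ground-truth illuminant in the same space, and `estimate_illuminant` is the routine sketched in Sect. 2.1; the 1 % selection follows the description above).

```python
import numpy as np

def check_assumption(chroma, l, fraction=0.01):
    """Keep the fraction of chromaticities closest to the true illuminant l and
    measure the angle between l and the direction maximising Eq. (1)."""
    l = l / np.linalg.norm(l)
    X = chroma / np.linalg.norm(chroma, axis=1, keepdims=True)
    k = max(int(fraction * len(X)), 1)
    nearest = X[np.argsort(X @ l)[-k:]]        # largest dot products with l
    c = estimate_illuminant(nearest)           # maximal square projections' mean
    return np.degrees(np.arccos(np.clip(c @ l, -1.0, 1.0)))
```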

Fig. 3. The set of true illuminants and estimated illuminants (i.e. vectors yielding the maximal projections' mean).

Fig. 4. The binned histogram of angular errors between true illuminants and chromaticities maximising the mean projection.

2.3 Chromaticities Selection

The accuracy of the estimated illuminant depends on the chromaticities \(\mathcal L\) involved in the estimation of the inner product matrix \(\varSigma \). The cardinality of \(\mathcal L\) must be small enough to reduce the computational time, but large enough to yield low estimation errors. The chromaticities closest to the true illuminant l are generally the bright pixels of the image [18], so a representative sample of all chromaticities can be selected according to an adequate threshold. This raises the following question: how can the use of an arbitrary or inadequate threshold be avoided? In this section, we propose to take the bright pixels that fall inside a gamut of suitable chromaticities, which we call the refined gamut. We first take a percentage \(T\%\) of bright pixels and keep from this set the chromaticities that lie inside the refined gamut. If the number of resulting chromaticities is not sufficient for statistical analysis, we take a greater percentage and again keep the chromaticities inside the refined gamut, as sketched below. To construct the refined gamut, we ran our algorithm with different percentages (1 %, 3 %, 5 %, 7 %, 10 %) on three well-known datasets ([1, 4, 12]) and gathered the chromaticities yielding the best illuminant estimates. The construction of the refined gamut is done separately, in a training step. The refined gamut and the gamut of the real illuminants of the three datasets are plotted in Fig. 2. One can observe that the gamut of real illuminants is inside the refined gamut. This means that the selected chromaticities involved in the illuminant estimation are always inside the gamut of real illuminants, i.e. the selected chromaticities are physically feasible.
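A possible sketch of this selection loop is given below (illustrative assumptions: `in_refined_gamut` stands for a membership test against the precomputed refined gamut returning a boolean mask, and `n_min` is a hypothetical minimum cardinality; the percentage schedule is the one used when constructing the gamut).

```python
import numpy as np

def select_chromaticities(image_rgb, in_refined_gamut, n_min=50,
                          percentages=(1, 3, 5, 7, 10)):
    """Take increasing percentages of bright pixels and keep the ones whose
    rg-chromaticities fall inside the refined gamut, until enough remain."""
    pixels = image_rgb.reshape(-1, 3).astype(float)
    order = np.argsort(pixels.sum(axis=1))[::-1]       # brightest pixels first
    selected = np.empty((0, 2))
    for T in percentages:
        top = pixels[order[: max(int(len(pixels) * T / 100.0), 1)]]
        chroma = top[:, :2] / np.maximum(top.sum(axis=1, keepdims=True), 1e-12)
        selected = chroma[in_refined_gamut(chroma)]
        if len(selected) >= n_min:
            break
    return selected
```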

2.4 Maximal Square Projections’ Mean Algorithm

Based on the experimentation described in Sect. 2.2, for a given image the illuminant c is the vector maximizing the square projections' mean of the projected data. Recall that the space of chromaticities is normalized (i.e., \(r+g+b=1\)), so the illuminant intensity cannot be estimated. This is not a limitation because only the illuminant direction is used to correct image colors. The solution of this optimization problem is straightforward: the illuminant c is the eigenvector of the matrix \(\varSigma \) corresponding to the largest eigenvalue. However, even if the mathematical solution exists, it might not be physically feasible. To overcome this problem, additional constraints are needed: the illuminant c must be close to physically feasible illuminants. One could add a constraint enforcing physical feasibility, such as requiring c to belong to the real illuminants' gamut; this constraint is unnecessary here because it is already fulfilled by the selection phase described in Sect. 2.3. Moreover, none of the components of the illuminant vector c can be negative, so we impose the constraint \(c-\epsilon > 0\). Taking all constraints into account, we propose to minimize their linear combination. The illuminant estimation problem with constraints can be written as:

$$\begin{aligned} m_{I,\mathcal L} = arg min_{c} \; - c^{t}\varSigma c + \lambda _{1}(\epsilon -c) +\lambda _{2} (c^{t}c-1) \end{aligned}$$
(3)

Another important issue concerns whether the chromaticities are centred or not; the \(\varSigma \) matrix can accordingly take positive or negative values. When the chromaticities are centred, this matrix is simply the covariance matrix. In the case of uncentred chromaticities, the matrix \(\varSigma \) is the inner product matrix of the chromaticities. In this case the dimensionality reduction method is called mean vector component analysis [17], which preserves the Euclidean length and the direction of the mean vector. For our case (i.e. illuminant estimation), the mean vector direction m is a good implicit constraint on the estimated illuminant c: the mean m of the bright chromaticities is generally a color close to the physically feasible illuminant colors. Moreover, \(\varSigma \) is a strictly positive matrix, hence irreducible, and the Perron-Frobenius theorem [23] can be applied to it. According to this theorem, there exists a largest eigenvalue \(\lambda \) whose corresponding eigenvector v is composed only of positive elements. Therefore, using the Perron-Frobenius result, the problem can be re-written without the explicit vector positivity constraint, as in Eq. (4). Consequently, we propose to use the Perron-Frobenius theorem and compute the eigenvector of the inner product matrix \(\varSigma \) as the illuminant estimator.

$$\begin{aligned} m_{I,\mathcal L} =arg max_{c} \; c^{t} \varSigma c + \lambda (c^{t}c-1) \end{aligned}$$
(4)
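Because \(\varSigma \) has strictly positive entries, the Perron eigenvector can also be obtained by a simple power iteration, which stays in the positive orthant when started from a positive vector. The sketch below is illustrative only and not the paper's implementation.

```python
import numpy as np

def perron_eigenvector(Sigma, n_iter=100, tol=1e-12):
    """Power iteration on a strictly positive matrix: by the Perron-Frobenius
    theorem it converges to the unit eigenvector with all-positive entries."""
    c = np.full(Sigma.shape[0], 1.0 / np.sqrt(Sigma.shape[0]))  # positive start
    for _ in range(n_iter):
        c_next = Sigma @ c
        c_next /= np.linalg.norm(c_next)
        if np.linalg.norm(c_next - c) < tol:
            return c_next
        c = c_next
    return c
```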

3 Experimental Results

The performance evaluation of the proposed algorithm is carried out in two experiments. In the first experiment, we used the SFU Lab dataset, a collection of 321 laboratory images of size 637\(\,\times \,\)468, consisting of 31 objects imaged under 11 lights. We also used the SFU Grey Ball collection [4], which consists of 11346 images (874\(\,\times \,\)583) taken from a video registration of indoor and outdoor scenes. This collection includes a wide variety of scenes and illumination conditions. For both collections the ground truth is available.

For the sake of comparison, we report the accuracy of some well-known algorithms: Grey world (GW) [3], White patch (Max-RGB) [19], Shades of grey (SHGR) [10], and Grey edge (GRED) [29]. From the learning category, the Natural image statistics (NIS) [15] algorithm is selected, while the Zeta image (Zeta) [5] represents the dichromatic based category. For the implementation of GW, Max-RGB, SHGR, and GRED, we used the software platform of [29], while for Zeta and NIS we compared against the scores reported in [5, 15]. For comparison purposes, the algorithm giving the best scores is considered the best algorithm. In what follows, we refer to the proposed algorithm as the maximum projection algorithm (MPA).

The performance measures are the mean and the median of the angular errors. They are widely used in state-of-the-art methods, and we use them to allow a fair comparison. The angular error (Eq. 5) is derived from the dot product of the normalized estimated illuminant vector c and the normalized ground truth vector e. Because it is independent of the illumination intensity, Hordley [16] argued that this measure can be used to evaluate algorithms that estimate only the illuminant chromaticity. Recall that MPA operates in the 2D chromatic space, whereas the other tested algorithms operate in RGB space. Finlayson et al. [7] argued that the performance of an algorithm designed for the RGB color space, such as Grey world, deteriorates when it is tested in the 2D chromaticity space.

$$\begin{aligned} Ang\_\,Error = \arccos (\dfrac{c^{t} e}{\Vert c \Vert \; \Vert e \Vert }) \end{aligned}$$
(5)
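For completeness, the measure of Eq. (5), expressed here in degrees, can be computed as follows (a direct NumPy transcription):

```python
import numpy as np

def angular_error(c, e):
    """Angular error between estimated illuminant c and ground-truth illuminant e."""
    cos_angle = (c @ e) / (np.linalg.norm(c) * np.linalg.norm(e))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
```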

The second experiment evaluates the computational efficiency of MPA compared to the GW, Max-RGB, SHGR, and GRED algorithms. The datasets used are the SFU Lab dataset [1] and the SFU Grey Ball dataset [4].

3.1 Algorithm Performance

The scores of the selected algorithms, including MPA, on the three datasets SFU Lab, Color Checker, and SFU Grey Ball are reported in Table 1. The obtained scores confirm that the MPA algorithm outperforms the other algorithms in terms of mean and median angular errors on the three datasets. On the SFU Lab dataset, MPA reduces the mean and median errors of the best competing algorithm, Zeta, by 43 % and 57 % respectively. The scores obtained on the SFU Grey Ball dataset show that MPA improves the mean and median errors of the best competing algorithm, NIS, by roughly 21 % and 12 % respectively. These improvements can be considered important, since an enhancement of more than 5–6 % is considered perceptually significant [13].

Table 1. Mean and median angular errors estimated on two datasets (SFU Lab [1] and SFU Grey Ball [4]), with computational times in seconds.

We also investigated the computational efficiency of MPA as a function of image content and size, compared to the four other algorithms. We ran the different algorithms on an Alienware machine with an Intel Core i7-3820 processor and 16 GB of RAM. MPA achieves the lowest computational time among the tested algorithms on two of the three datasets. Max-RGB, the fastest of the first four tested algorithms, processes 23 images per second, while MPA estimates the illuminants of 25 images per second on the SFU Grey Ball dataset. However, MPA takes more time on the SFU Lab dataset (over 11 images per second) than GW (over 14 images per second) and Max-RGB (over 15 images per second). This is due to the iterations made by MPA to reach an acceptable cardinality of the set of selected chromaticities.

4 Conclusion

In this paper, we presented the maximum projection algorithm for illuminant color estimation. We observed that projecting selected chromaticities onto the illuminant vector leads to an efficient and fast algorithm for illuminant estimation. This algorithm is nothing other than an uncentred PCA problem, since we search for the sub-space that maximises the dispersion of the chromaticities projected onto it. Instead of using all the chromaticities of an image, only a subset of them is used, which makes the algorithm faster. The method was tested on three image collections, and the angular errors obtained are lower than those of previous works. In future work, we will investigate other performance measures and refine the chromaticities selection criterion.