doi:10.1016/S0031-3203(02)00030-4
Copyright © 2002 Pattern Recognition Society. Published by Elsevier Science B.V.
The importance of being random: statistical principles of iris recognition
The Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, UK
Received 21 December 2001.
Available online 5 March 2002.
Abstract
The statistical variability that is the basis of iris recognition is analysed in this paper using new large databases. The principle underlying the recognition algorithm is the failure of a test of statistical independence on iris phase structure encoded by multi-scale quadrature wavelets. Combinatorial complexity of this phase information across different persons spans about 249 degrees-of-freedom and generates a discrimination entropy of about 3.2 bits/mm² over the iris, enabling real-time identification decisions with great enough accuracy to support exhaustive searches through very large databases. This paper presents the results of 9.1 million comparisons among several thousand eye images acquired in trials in Britain, the USA, Japan and Korea.
Author Keywords: Statistical variability; Epigenesis; Wavelets; Texture; Iris recognition; Decision theory
Fig. 1. Examples of iris patterns, imaged monochromatically with NIR illumination in the 700–900 nm band at distances of about 35 cm. The outline overlays show results of the iris and pupil localization and eyelid detection steps. The bit streams are the results of demodulation with complex-valued 2D Gabor wavelets to encode iris patterns as a sequence of phasor quadrants.
Fig. 2. The phase demodulation process used to encode iris patterns. Local regions of an iris are projected (Eq. (2)) onto quadrature 2D Gabor wavelets, generating complex-valued projection coefficients whose real and imaginary parts specify the coordinates of a phasor in the complex plane. The angle of each phasor is quantized to one of the four quadrants, setting two bits of phase information. This process is repeated all across the iris with many wavelet sizes, frequencies, and orientations, to extract 2048 bits.
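The quadrant quantization step described in the caption of Fig. 2 can be illustrated with a minimal Python sketch. It assumes the complex projection coefficients have already been computed; the wavelet projection itself (Eq. (2)) is not reproduced here. The function names, the sample coefficients, and the convention that a non-negative part maps to bit 1 are assumptions made for illustration only.

```python
from typing import Iterable, List, Tuple


def phase_bits(coefficient: complex) -> Tuple[int, int]:
    """Quantize one phasor to a quadrant, returned as two bits.

    Assumed convention: a bit is 1 when the corresponding part
    (real or imaginary) is non-negative, and 0 otherwise.
    """
    real_bit = 1 if coefficient.real >= 0 else 0
    imag_bit = 1 if coefficient.imag >= 0 else 0
    return real_bit, imag_bit


def encode(coefficients: Iterable[complex]) -> List[int]:
    """Concatenate the two phase bits of every coefficient."""
    bits: List[int] = []
    for c in coefficients:
        bits.extend(phase_bits(c))
    return bits


# Hypothetical coefficients, one per quadrant of the complex plane.
sample = [complex(0.7, 0.2), complex(-0.3, 0.9),
          complex(-0.5, -0.1), complex(0.4, -0.8)]
code = encode(sample)
```

Because each coefficient contributes two bits, 1024 coefficients would yield the 2048 bits mentioned in the caption. Only the quadrant of the phasor is kept; its magnitude is discarded.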
Fig. 3. Illustration that even for poorly focused eye images, the bits of a demodulation phase sequence are still set, primarily by random CCD noise. This prevents poorly focused eye images from resembling each other in the pattern matching stage, in the way that, e.g., poorly resolved face images look alike and can be confused with each other.
Fig. 4. Distribution of Hamming distances obtained from all 9.1 million possible comparisons between different pairs of irises in the database. The histogram forms a perfect binomial distribution with p=0.5 and N=249 degrees-of-freedom, as shown by the solid curve (Eq. (4)). The data imply that it is extremely improbable for two different irises to disagree in less than about a third of their phase information.
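The binomial model named in the caption of Fig. 4 can be explored numerically. The sketch below assumes that Eq. (4) is the standard binomial probability of m disagreements in N Bernoulli trials, with the fractional Hamming distance taken as m/N; the equation itself is not reproduced in this excerpt, and the function names are illustrative.

```python
from math import comb

N = 249   # degrees of freedom, from the caption of Fig. 4
P = 0.5   # probability that any one degree of freedom disagrees


def binomial_pmf(m: int, n: int = N, p: float = P) -> float:
    """Probability of exactly m disagreements among n trials."""
    return comb(n, m) * p**m * (1 - p)**(n - m)


def cumulative_up_to(hd: float, n: int = N, p: float = P) -> float:
    """Probability that the fractional Hamming distance m/n is <= hd."""
    m_max = int(hd * n)
    return sum(binomial_pmf(m, n, p) for m in range(m_max + 1))


# Mass in the left tail at or below one third disagreement.
tail_one_third = cumulative_up_to(1 / 3)

# Standard deviation of the fractional Hamming distance under the model.
sigma = (P * (1 - P) / N) ** 0.5
```

Under these assumptions the standard deviation is sqrt(p(1-p)/N), so a Hamming distance of one third lies more than five standard deviations below the mean of 0.5, which is why the left tail carries so little probability.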
Fig. 5. Quantile–quantile plot of the observed cumulatives under the left tail of the histogram in Fig. 4, versus the predicted cumulatives under the theoretical binomial distribution. Their close agreement over several orders of magnitude strongly confirms the binomial model for phase bit comparisons between different irises.
Fig. 6. Distribution of Hamming distances between genetically identical irises in 648 paired eyes from 324 persons. The data are statistically indistinguishable from that shown in Fig. 4 comparing unrelated irises. Unlike eye colour, the phase structure of iris patterns therefore appears to be epigenetic, arising from random events and circumstances in the morphogenesis of this tissue.
Fig. 7. Distribution of Hamming distances from the same set of 9.1 million comparisons as seen in Fig. 4, but allowing for seven relative rotations and preserving only the best match found for each pair. This “best of n” test of agreement skews the distribution to the left and reduces its mean from about 0.5 to 0.458. The solid curve is the theoretical prediction for such “extreme-value” sampling, as described by Eqs. (4), (8), (9), (10) and (11).
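The leftward skew of the “best of n” test in Fig. 7 can be sketched with the usual extreme-value argument: if a single comparison has cumulative distribution F0, the minimum of n independent comparisons has cumulative 1 - (1 - F0)^n. Treating the seven rotations as independent draws from the binomial of Fig. 4 is an assumption of this sketch, as are the function names; Eqs. (8) to (11) are not reproduced in this excerpt.

```python
from math import comb, expm1, log1p

N = 249          # degrees of freedom, from the caption of Fig. 4
P = 0.5          # per-trial disagreement probability
ROTATIONS = 7    # relative rotations tried per pair, from Fig. 7


def binomial_pmf(m: int) -> float:
    """Probability of exactly m disagreements among N trials."""
    return comb(N, m) * P**m * (1 - P)**(N - m)


def extreme_cdf(f0: float, n_rot: int = ROTATIONS) -> float:
    """CDF of the minimum of n_rot independent draws with CDF value f0.

    Computes 1 - (1 - f0)**n_rot via log1p/expm1 so that very small
    tail probabilities are not lost to floating-point rounding.
    """
    f0 = min(f0, 1.0)
    if f0 >= 1.0:
        return 1.0
    return -expm1(n_rot * log1p(-f0))


def best_of_n_mean(n_rot: int = ROTATIONS) -> float:
    """Mean fractional Hamming distance after keeping the best match."""
    mean, cumulative, previous = 0.0, 0.0, 0.0
    for m in range(N + 1):
        cumulative += binomial_pmf(m)
        current = extreme_cdf(cumulative, n_rot)
        mean += (m / N) * (current - previous)
        previous = current
    return mean


mean_best_of_7 = best_of_n_mean()
```

With a single rotation the function reduces to the plain binomial mean of 0.5; with seven, the result can be compared against the mean of about 0.458 reported in the caption. For very small F0 the best-of-n cumulative is approximately n times F0, so trying seven rotations raises a false match probability by roughly a factor of seven.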
Fig. 8. Calculated cumulatives under the left tail of the distribution seen in Fig. 7, up to sequential points, using the same theoretical PDF described by Eqs. (4), (8), (9), (10) and (11). The extremely rapid attenuation of these cumulatives reflects the binomial combinatorics with large N in Eq. (4). This accounts for the astronomic confidence levels against a false match when persons are recognized by failing this test of statistical independence.
Fig. 9. The decision environment for iris recognition under relatively unfavourable conditions, using images acquired at different distances, and by different optical platforms.
Fig. 10. The decision environment for iris recognition under very favourable conditions, using always the same camera, distance, and lighting, in a laboratory setting.
Table 1. Cumulatives under Eq. (11) giving false match probabilities for various HD criteria.

Table 2. Speeds of various stages in the iris recognition process.
