Elsevier

Pattern Recognition

Volume 43, Issue 6, June 2010, Pages 2210-2223

Emulating biological strategies for uncontrolled face recognition

https://doi.org/10.1016/j.patcog.2009.12.026

Abstract

Face recognition technology is of great significance for applications involving national security and crime prevention. Despite enormous progress in this field, machine-based systems are still far from matching the versatility and reliability of human face recognition. In this paper, we show that a simple system designed by emulating biological strategies of the human visual system can largely surpass the state-of-the-art performance on uncontrolled face recognition. In particular, the proposed system integrates dual retinal texture and color features for face representation, an incremental robust discriminant model for high-level face coding, and a hierarchical cue-fusion method for similarity qualification. We demonstrate the strength of the system on the large-scale face verification task following the evaluation protocol of the Face Recognition Grand Challenge (FRGC) version 2 Experiment 4. The results are surprisingly good: its modules significantly outperform their state-of-the-art counterparts, such as Gabor image representation, local binary patterns, and the enhanced Fisher linear discriminant model. Furthermore, when the integrated system is applied to the FRGC version 2 Experiment 4, the verification rate at a false acceptance rate of 0.1 percent reaches 93.12 percent.

Introduction

Over the last decade, machine-based face recognition systems have improved significantly. Many algorithms function effectively, and some even achieve 100 percent recognition accuracy, on data sets with constrained variations. A recent study even reported that leading machine-based algorithms can surpass human performance in matching faces over changes in illumination [1]. US government-funded evaluations (e.g. FERET [2] and the Face Recognition Grand Challenge (FRGC) [3], [4]), however, caution that state-of-the-art face recognition algorithms still cannot achieve satisfactory performance in many challenging scenarios, such as face recognition from uncontrolled imagery. Recently, we further pointed out “automatic face recognition is a complex pattern-recognition problem involved with early processing, perceptual coding, and cue-fusion mechanisms. Although countless solid contributions have been made, 100 percent accuracy in automatic face recognition in real-world settings remains an ambitious goal [5]”. The core challenge of uncontrolled face recognition results from image variation: any given face can cast innumerable images onto the 2D imaging system depending on its position, distance, orientation, lighting and background. Even frontal faces that are accurately located and aligned remain difficult for machines to discriminate. As evidenced in Figs. 1 and 5, the variations between controlled and uncontrolled images of the same face due to illumination conditions, distortion (expression, adornments and cosmetics) and cameras (degradation and perspective distortion) are significantly larger than the image variations due to a change in face identity. In this respect, uncontrolled face recognition seems to be a paradigmatic and enduring research problem for computer vision and pattern recognition [6].

The only system that can reliably cope with real-world image variability is the human visual system. Over millions of years of evolution, the human brain has developed a highly specialized mechanism for face perception [7], which holds a distinct edge over machines for face recognition in uncontrolled settings. Consequently, as stated by Sinha, “increased knowledge about the ways people recognize each other may help to guide efforts to develop practical automatic face-recognition systems [8]”. However, current methods mostly treat face recognition as a purely theoretical exercise, typically characterized by undersampled discriminant analysis [9], [10], statistical learning theory [11], [12], and manifold learning [13], [14], [15]. Although some features inspired by fragmentary knowledge of early vision [9], [16], [11], [17] appear to be very useful, these partial biological models have proved insufficient by themselves to deal with real-world recognition problems [18]. For instance, the top-performing algorithm [11] in the FRGC 2005 evaluation, which capitalizes on the Gabor wavelet representation motivated by the receptive fields in the primary visual cortex, achieves only a 76 percent verification rate at a 0.1 percent false accept rate in the uncontrolled face recognition task. For machines to truly match human vision, additional issues such as perceptual learning and cue fusion still require further research.

Inspired by evidence from neurobiological and psychophysical studies, we have developed an uncontrolled face recognition system that aims to emulate the early processing, face coding, and cue-fusion strategies of human face recognition [5], as follows:

  1. In the feature extractor, a new set of biologically inspired features is derived from the response properties of neurons in the early stages of the visual pathways, which are mostly sensitive to distinct visual properties, such as spectral and spatial frequencies, orientation, and color opponency [19]. The feature set aims to encode the most salient facial cues, including the texture and color information from both internal and external facial regions [8], [19].

  2. A dimensionality reduction module is developed based on an incremental robust discriminant model (IRDM), which encodes the feature vector by its coordinates along the selected directions of greatest identity separability. The incremental learning process can train face discrimination on feature sets from a massive number of faces, emulating the perceptual learning that happens in later stages of the visual pathway to extract higher-level (identity-specific) codes.

  3. Face coding is performed separately on each facial cue, yielding separate visual pathways [20]. The recognition decision is finally made by the information fusion module, which hierarchically integrates the similarities judged from the face codes of multiple pathways. This stage aims to approximate, in simplified form, the highly complex cue-fusion strategy of human face recognition, which adaptively integrates all available cues to attain its impressive performance.
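The hierarchical cue fusion described in the third item can be illustrated with a minimal sketch. This excerpt does not specify the normalization or fusion rules used by the authors, so the code below assumes z-score normalization and unweighted sum fusion at each level (texture plus color within a region, then internal plus external); the function names and the nested dictionary layout are hypothetical.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Shift and scale similarity scores to zero mean and unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores are identical, so they carry no ranking information.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def sum_fuse(*score_lists):
    """Add scores element-wise across pathways (assumed fusion rule)."""
    return [sum(values) for values in zip(*score_lists)]


def hierarchical_fusion(pathways):
    """Fuse per-pathway similarities of one probe against N gallery faces.

    pathways maps a region name ("internal", "external") to a dict that maps
    a cue name ("texture", "color") to a list of N raw similarity scores.
    """
    region_scores = []
    for region in ("internal", "external"):
        cues = pathways[region]
        # Level 1: fuse texture and color within the same facial region.
        fused = sum_fuse(z_normalize(cues["texture"]), z_normalize(cues["color"]))
        # Re-normalize so that both regions contribute on the same scale.
        region_scores.append(z_normalize(fused))
    # Level 2: fuse the internal and external region scores.
    return sum_fuse(*region_scores)
```

Normalizing before each sum keeps a pathway with a wide raw score range from dominating the others, which is one plausible reason normalization matters for fusion.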

Extensive performance evaluation studies were conducted using the standardized procedures provided by the FRGC version 2 Experiment 4, which is designed for the verification of a controlled single still image versus an uncontrolled single still image. In particular, we performed comparative studies of 21 face recognition schemes covering popular intensity-, color-, and texture-based methods. The notable results include: (1) the proposed retinal texture and color features provide a better face representation than state-of-the-art counterparts such as Gabor wavelets and local binary patterns; (2) the proposed IRDM algorithm effectively solves the model selection problem of the popular “PCA+LDA” face encoding methods; (3) similarity normalization is essential for face verification applications, even in a system based on a single classifier; and (4) single-feature-based face recognition performance appears to saturate once the training sample size reaches a critical number, such as 10,000. The effectiveness of the proposed face recognition strategies of (i) texture+color feature fusion and (ii) internal+external cue fusion is also validated. Our findings show that the simple process of similarity fusion can dramatically boost uncontrolled face recognition performance. Finally, compared with other state-of-the-art face recognition systems, our system demonstrates a large improvement in the face verification rate, from below 86 percent to over 93 percent at a 0.1 percent false acceptance rate (see Table 3), with comparable recognition efficiency.
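To make the headline metric concrete, the sketch below shows one way to compute a verification rate at a fixed false acceptance rate from two lists of similarity scores. It is an illustration under stated assumptions, not the official FRGC scoring code: the function name is hypothetical, higher scores are assumed to mean greater similarity, and the threshold is chosen conservatively so that the realized false acceptance rate never exceeds the target.

```python
def verification_rate_at_far(genuine, impostor, far=0.001):
    """Fraction of genuine pairs accepted when at most `far` of impostor pairs are.

    genuine:  similarity scores of same-identity pairs
    impostor: similarity scores of different-identity pairs
    """
    if not genuine or not impostor:
        raise ValueError("need at least one genuine and one impostor score")
    if not 0.0 <= far < 1.0:
        raise ValueError("far must lie in [0, 1)")
    ranked = sorted(impostor, reverse=True)
    # Number of impostor acceptances the target rate permits (rounded down).
    allowed = int(far * len(ranked))
    # Accept only scores strictly above this threshold, so at most
    # `allowed` impostor scores can pass.
    threshold = ranked[allowed]
    accepted = sum(1 for score in genuine if score > threshold)
    return accepted / len(genuine)
```

With `far=0.001`, at least 1,000 impostor scores are needed before any false acceptance is permitted; with fewer, the threshold falls back to the highest impostor score.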

Section snippets

Background and related works

A successful face recognition methodology depends heavily on the particular choices of the facial appearance descriptor and the feature coding scheme. In this section, we review some biologically inspired aspects of these issues which motivate the design of the proposed system.

Biologically inspired facial features

This section details the biologically inspired feature set, which aims to represent the most salient facial cues for face recognition. The proposed feature set extends conventional face descriptors by (1) characterizing both internal and external facial cues, (2) mimicking multiple processes in the early stages of human vision, and (3) increasing the tolerance to spatial shifts and size scaling.

Incremental robust discriminant model

Perceptual visual learning usually leads to low-dimensional feature coding [29]. In our system, the face coding task is carried out by a new computational model, called the incremental robust discriminant model (IRDM), which improves on the popular “PCA+LDA” methods [36], [9], [10] through higher scalability and better applicability.
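For orientation, the conventional “PCA+LDA” baseline that IRDM is said to improve on can be sketched as follows. This is the textbook two-stage procedure, not IRDM itself, whose details are not given in this excerpt. The function name and parameters are hypothetical, and NumPy is assumed to be available. The intermediate dimension `n_pca` is a free parameter that must be small enough to keep the within-class scatter invertible, which is one common form of model selection in such pipelines.

```python
import numpy as np


def fit_pca_lda(X, y, n_pca, n_lda):
    """Return (mean, W) such that low-dimensional codes are (X - mean) @ W."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean = X.mean(axis=0)
    Xc = X - mean

    # Stage 1 (PCA): keep the n_pca leading principal directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    W_pca = vt[:n_pca].T
    Z = Xc @ W_pca

    # Within-class (Sw) and between-class (Sb) scatter in the PCA subspace.
    Sw = np.zeros((n_pca, n_pca))
    Sb = np.zeros((n_pca, n_pca))
    for label in np.unique(y):
        Zc = Z[y == label]
        mu = Zc.mean(axis=0)
        Sw += (Zc - mu).T @ (Zc - mu)
        Sb += len(Zc) * np.outer(mu, mu)  # the global mean of Z is zero

    # Stage 2 (LDA): whiten Sw, then diagonalize Sb in the whitened space.
    evals_w, evecs_w = np.linalg.eigh(Sw)
    if evals_w.min() <= 1e-10 * evals_w.max():
        raise ValueError("within-class scatter is singular; reduce n_pca")
    whiten = evecs_w / np.sqrt(evals_w)
    evals_b, evecs_b = np.linalg.eigh(whiten.T @ Sb @ whiten)
    order = np.argsort(evals_b)[::-1][:n_lda]
    W_lda = whiten @ evecs_b[:, order]
    return mean, W_pca @ W_lda
```

A new face feature vector `x` would then be coded as `(x - mean) @ W` and compared with other codes by a similarity measure of choice.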

Biologically inspired face recognition system

The proposed biologically inspired face recognition system consists of three modules: feature vector extraction, dimensionality reduction, and information fusion, as shown in Fig. 4.

In the feature vector extraction module, the retinal texture and color features are extracted from the pure and full face image, respectively, which encompass the processes of the type that may take part in the early stages in human vision. Specifically, the RTF mimics the center-surround processing in the retina
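Center-surround processing is classically modeled as a difference of Gaussians: a narrow "center" blur minus a wide "surround" blur. The sketch below illustrates that classical model only; it is not the authors' retinal texture feature, whose definition is truncated in this excerpt. The function names, the default sigma values, and the edge-replication boundary handling are all assumptions.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about three sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve each row with the kernel, replicating edge pixels."""
    radius = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        out.append([
            sum(k * row[min(max(x + i - radius, 0), n - 1)]
                for i, k in enumerate(kernel))
            for x in range(n)
        ])
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable 2D Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def center_surround(image, sigma_center=1.0, sigma_surround=2.0):
    """Difference of Gaussians response for a 2D list of gray levels."""
    center = gaussian_blur(image, sigma_center)
    surround = gaussian_blur(image, sigma_surround)
    return [[c - s for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(center, surround)]
```

Because both kernels sum to one, a uniform brightness offset cancels out of the response, which hints at why filters of this kind are often used to reduce sensitivity to illumination.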

The FRGC version 2 Experiment 4

The Face Recognition Grand Challenge (FRGC) [3], [4] is a recent large-scale face recognition evaluation sponsored by the US government. The data for FRGC version 2 consists of 50,000 recordings divided into non-overlapping training and validation partitions. Two types of images are involved in the FRGC data set: controlled images and uncontrolled images. The controlled images were taken in a studio setting typical of those used for passport photographs with two facial expressions (neutral and

Conclusion and future works

In this paper, we showed that a simple system designed by emulating biological strategies of the human visual system is capable of largely surpassing state-of-the-art performance on uncontrolled face recognition. In order to construct this biologically inspired system, we integrated (1) the IRDM-based facial texture and color codes, which emulate the visual information processing of visual pathways in a biologically meaningful manner; and (2) the strategy of fusing the perceptually meaningful

Acknowledgments

The authors would like to thank the anonymous reviewers for their critical and constructive comments and suggestions, which were very helpful in improving both the technical and the literary quality of this paper. This work was done when W. Deng was a postgraduate exchange student in the School of Information Technologies, University of Sydney. W. Deng would like to thank Prof. David Feng and Tom Cai for providing the excellent research environment in the BMIT Laboratory. This work was partially

References (55)

  • L.-F. Chen et al.

    Why recognition in a statistics-based face recognition system should be based on the pure face portion: a probabilistic decision-based proof

    Pattern Recognition

    (2001)
  • T. Maenpaa et al.

    Classification with color and texture: jointly or separately?

    Pattern Recognition

    (2004)
  • A.J. O’Toole et al.

    Face recognition algorithms surpass humans matching faces over changes in illumination

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2007)
  • P. Phillips et al.

    The FERET evaluation method for face recognition algorithms

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2000)
  • P.J. Phillips, P.J. Flynn, T. Scruggs, K. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, W. Worek, Overview of the...
  • P.J. Phillips, P.J. Flynn, T. Scruggs, K. Bowyer, W. Worek, Preliminary face recognition grand challenge results, in:...
  • W. Deng et al.

    Comment on “100% accuracy in automatic face recognition”

    Science

    (2008)
  • J. Daugman

    Face and gesture recognition: overview

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1997)
  • D.Y. Tsao et al.

    A cortical region consisting entirely of face-selective cells

    Science

    (2006)
  • P. Sinha et al.

    Face recognition by humans: 19 results all computer vision researchers should know about

    Proc. IEEE

    (2006)
  • C. Liu et al.

    Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition

    IEEE Trans. Image Process.

    (2002)
  • X. Wang et al.

    A unified framework for subspace face recognition

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2004)
  • C. Liu

    Capitalize on dimensionality increasing techniques for improving face recognition grand challenge performance

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2006)
  • B.V.K.V. Kumar et al.

    Correlation pattern recognition for face recognition

    Proc. IEEE

    (2006)
  • X. He et al.

    Face recognition using Laplacianfaces

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2005)
  • D. Cai et al.

    Orthogonal Laplacianfaces for face recognition

    IEEE Trans. Image Process.

    (2006)
  • W. Deng et al.

    Comments on “globally maximizing, locally minimizing: unsupervised discriminant projection with applications to face and palm biometrics”

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2008)
  • W. Deng, J. Hu, J. Guo, Gabor–Eigen–Whiten–Cosine: a robust scheme for face recognition, in: Lecture Notes in Computer...
  • E. Meyers et al.

    Using biologically inspired features for face recognition

    Int. J. Comput. Vis.

    (2000)
  • Y. Adini et al.

    Face recognition: the problem of compensating for changes in illumination direction

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1997)
  • S. Ullman et al.

    Visual features of intermediate complexity and their use in classification

    Nature Neurosci.

    (2002)
  • E. DeYoe et al.

    Pattern-color separable pathways predict sensitivity to simple colored patterns

    Vision Res.

    (1996)
  • D.J. Field

    What is the goal of sensory coding?

    Neural Comput.

    (1994)
  • L. Wiskott et al.

    Face recognition by elastic bunch graph matching

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1997)
  • D.J. Jobson et al.

    Properties and performance of a center/surround retinex

    IEEE Trans. Image Process.

    (1997)
  • E. Land et al.

    Lightness and retinex theory

    J. Opt. Soc. Am.

    (1971)
  • H. Wang, S.Z. Li, Y. Wang, Generalized quotient image, in: Proceedings of the IEEE Computer Vision and Pattern...

    About the Author—WEIHONG DENG received the B.E. degree in information engineering and the Ph.D. degree in signal and information processing from the Beijing University of Posts and Telecommunications (BUPT), Beijing, China, in 2004 and 2009, respectively. From October 2007 to December 2008, he was a postgraduate exchange student in the School of Information Technologies, University of Sydney, Australia, under the support of the China Scholarship Council. He is currently a Lecturer in the School of Information and Communication Engineering, BUPT. His research interests include statistical pattern recognition and computer vision, with a particular emphasis on face recognition. He has published over 10 technical papers in international journals and conferences, including a technical comment on face recognition in SCIENCE magazine. He also serves as an active reviewer for several international journals, such as IEEE TPAMI, Pattern Recognition and IEEE SMC-B.

    About the Author—JIANI HU received the B.E. degree in telecommunication engineering from China University of Geosciences in 2003, and the Ph.D. degree in signal and information processing from Beijing University of Posts and Telecommunications (BUPT), Beijing, China, in 2008. She is currently a Lecturer in School of Information and Communication Engineering, BUPT. Her research interests include information retrieval, statistical pattern recognition and computer vision.

    About the Author—JUN GUO received the B.E. and M.E. degrees from BUPT, China, in 1982 and 1985, respectively, and the Ph.D. degree from Tohoku-Gakuin University, Japan, in 1993. At present he is a professor and the dean of the School of Information and Communication Engineering, BUPT. His research interests include pattern recognition theory and application, information retrieval, content-based information security, and network management. He has published over 200 papers, some of them in world-famous journals or conferences including SCIENCE, IEEE Transactions on PAMI, IEICE, ICPR, ICCV, SIGIR, etc. His book “Network management” was named one of the finest textbooks for higher education by the government of Beijing city in 2004.

    His team has won a number of prizes in national and international academic competitions, including first place in a national test of handwritten Chinese character recognition (1995), first place in a national test of face detection (2004), first place in a national test of text classification (2004), first place in the paper design competition held by the IEEE Industry Application Society (2005), and second place in the CSIDC competition held by the IEEE Computer Society (2006).

    About the Author—WEIDONG CAI received the B.S. degree from HuaQiao University, Quanzhou, China, in 1989, and the Ph.D. degree from the University of Sydney, Sydney, Australia, in 2001, both in computer science. Prior to his doctoral study, he worked in industry for five years. After graduation, he was a Postdoctoral Research Associate at the Centre for Multimedia Signal Processing (CMSP), Hong Kong Polytechnic University. In 2001, he was a Lecturer and is currently a Senior Lecturer in the School of Information Technologies, University of Sydney. His research interests include computer graphics, image processing and analysis, data compression and retrieval, and multimedia database and computer modeling with biomedical applications.

    About the Author—DAGAN FENG received the M.E. degree in electrical engineering and computing science from Shanghai JiaoTong University, Shanghai, China, in 1982, and the M.Sc. degree in biocybernetics and the Ph.D degree in computer science from the University of California, Los Angeles, in 1985 and 1988, respectively. After briefly working as an Assistant Professor at the University of California, Riverside, he joined the University of Sydney, Sydney, Australia, at the end of 1988, where he was a Lecturer, Senior Lecturer, Reader, Professor, Head of Department of Computer Science, Head of School of Information Technologies, and is currently Associate Dean of the Faculty of Science. He is also the Honorary Research Consultant, Royal Prince Alfred Hospital, Sydney; the Chair-Professor of Information Technology, Hong Kong Polytechnic University, Hong Kong; the Advisory Professor, Shanghai JiaoTong University; and a Guest Professor with Northwestern Polytechnic University, Xian, China, with Northeastern University, Shenyang, China, and with Tsinghua University, Beijing, China. He is the Founder and Director of the Biomedical and Multimedia Information Technology Research Group. He has published over 400 scholarly research papers, pioneered several new research directions, and made a number of landmark contributions in his field with significant scientific impact and social benefit. His research area is biomedical and multimedia information technology.

    Dr. Feng is a Fellow of the Australia Computer Society, the Australian Academy of Technological Sciences and Engineering, Hong Kong Institution of Engineers, and the Institution of Electrical Engineers, U.K., and the Institute of Electrical and Electronics Engineers. He is also the special Area Editor of the IEEE Transactions on Information Technology in Biomedicine and is the current Chairman of IFAC-T.

    1 This work was done when Weihong Deng was a postgraduate exchange student at the University of Sydney.
