Can holistic processing be learned for inverted faces?
Introduction
For ordinary adults, faces form a special class of visual object. A specific cortical area in the right fusiform gyrus activates more strongly for faces than for other within-class object discrimination (Kanwisher, McDermott, & Chun, 1997), and much evidence shows that a holistic or configural style of cognitive processing makes faces “special”.
The exact definition of “holistic” or “configural” processing remains a matter of debate, but it is generally taken to mean integration of information from the whole of the internal face region, in which there is either no decomposition into component parts (eyes, nose, etc.; Tanaka & Farah, 1993), or in which each part is processed in interaction with multiple other parts (e.g. Rhodes, 1988). Holistic processing is usually contrasted with “part-based” processing, in which decomposition based on local information (e.g. lip shape, eye colour) supports performance.
Initial evidence for this distinction was that faces show a disproportionate inversion effect on recognition memory: while memory is poorer for all objects when studied and tested upside-down (inverted) than when studied and tested upright, this effect is much larger for faces than for other objects (e.g. Diamond & Carey, 1986; Yin, 1969). This result is commonly interpreted as reflecting holistic processing only for upright faces, a conclusion which has been supported by paradigms providing direct tests of holistic vs. part-based processing styles (for a recent review see Maurer, Le Grand, & Mondloch, 2002). For example, in the part–whole paradigm (Tanaka & Farah, 1993), a particular face part (e.g. Bill's nose) is remembered more accurately when tested in the whole studied face (Bill's nose in Bill's face vs. John's nose in Bill's face) than when tested alone (Bill's nose vs. John's nose). In the composite effect (Young, Hellawell, & Hay, 1987), two different half-faces appear to fuse into a new face, slowing naming of one half-face, when the halves are aligned as compared to offset (unaligned). In both paradigms holistic effects occur in the upright orientation, but not inverted (see also Hole, 1994; Tanaka et al., 1998; Tanaka & Sengco, 1997). Thus, for faces, it has been shown that (a) holistic processing occurs only in the upright orientation, (b) inverted faces are processed in a part-based non-holistic manner, and (c) the processing of local feature information is largely unaffected by orientation (also see Bartlett & Searcy, 1993; Murray et al., 2000; Rhodes et al., 1993; Thompson, 1980).
There are two general theories of why, in adults, holistic processing is limited to faces, and moreover to upright faces. According to the expertise hypothesis, holistic processing is a domain-general property of expertise in making within-class discriminations. Diamond and Carey (1986) noted that, for most people, faces are the only class of visual stimulus for which they become genuine experts. However, these authors suggested that face-like holistic processing might be learnable for any object type, as long as three conditions were met: (a) all exemplars of the stimulus class share a similar basic configuration (i.e. the same parts in the same left–right, above–below relationships), (b) individual exemplars differ from this shared first-order configuration only in minor (second-order) ways (e.g. the exact distance between two parts; the exact shape of a part), and (c) the subject has sufficient expertise with the stimulus domain to make reliable discrimination between individual exemplars (e.g. dog-show judges who can remember one Scotch Terrier as distinct from another). According to the expertise view, holistic processing for faces is limited to the upright orientation because this is the orientation in which faces are usually experienced.
An alternative theory is that there is some innately driven component to the adult “special” processing for faces (de Gelder & Rouw, 2001; Farah et al., 2000; Morton & Johnson, 1991). This is supported by the finding that young babies orient preferentially towards face-like stimuli (Johnson, Dziurawiec, Ellis, & Morton, 1991), and that there appears to be a critical period in early infancy for the development of holistic processing (Le Grand, Mondloch, Maurer, & Brent, 2001). According to this view, the fact that holistic processing develops only for upright faces may be because (a) there is some innate representation of the basic structure of an upright face (see Mondloch et al., 1999), and/or (b) a bias in infants' visual orienting of subcortical origin causes upright faces to be a frequent input to developing cortical systems (de Haan, Humphreys, & Johnson, 2002), and/or (c) very early exposure to upright faces fixes the axes of face-space to suit this orientation. Critically, this view does not deny that experience has a role in face recognition, but proposes that generic perceptual learning in adults reflects different mechanisms from those involved in learning upright faces in infancy.
If holistic processing is due to generic expertise then, as an adult, it should be possible to learn holistic processing for stimuli other than upright faces. In the literature to date several studies have explored this prediction via tests on experts with non-face objects.
For objects-of-expertise, disproportionate inversion effects have been obtained in long term memory with dog experts (Diamond & Carey, 1986), although these have not been replicated in a sequential matching task with car and bird experts (Gauthier, Skudlarski, Gore, & Anderson, 2000). Relatively few studies have used direct paradigms. Using the part–whole paradigm, Tanaka and Gauthier (1997) found some suggestion of an expertise effect – that is, a larger whole–part advantage in experts than in novices – for dog experts (interestingly, looking at dog faces), but not for car experts or biological cell experts. Other studies have investigated experiment-trained “greeble experts”. Greebles are an artificial object class that may be grouped into genders (by direction of protrusion) and families (by body shape); experts are trained using seven to ten 1 h sessions involving identification of greebles at the gender, family and individual-name levels. Three greeble studies have shown no training effect in the basic part–whole paradigm (Gauthier & Tarr, 1997; Gauthier & Tarr, 2002; Gauthier et al., 1998), and no composite effect in trained subjects (Gauthier & Tarr, 2002; Gauthier et al., 1998). These studies have, however, shown some suggestion of a face-like effect in a modification of the part–whole paradigm: one of three greeble parts produced better memory for the part in the original configuration than in an altered configuration (cf. Tanaka & Sengco, 1997). Overall, studies with object experts have produced some suggestion of holistic processing, although the evidence is currently less than convincing.
In addition to non-face objects, another stimulus class of interest is inverted faces. If it were the case that, as an adult, holistic processing could be learned for inverted faces, then this would provide compelling evidence for the expertise hypothesis.
An important question is then how much practice would be required. Two sets of literature are relevant to this issue. First, the authors of the greeble studies have suggested that expertise sufficient to support holistic processing can be learned in under 10 h of practice. Second, results from speeded object-naming tasks indicate that, for non-face objects, processing in the inverted orientation can come to share properties exhibited in the upright orientation with only a very small amount of training.
When objects are rotated in the picture plane they are initially named more slowly the further they are from upright (e.g. Jolicoeur, 1985), but this effect disappears with practice (for review, see McKone & Grenfell, 1999). For objects requiring within-class discrimination, Tarr and Pinker (1989) showed that, after practice at specified new orientations, naming times at intermediate positions then increased as a function of distance from the nearest-trained-orientation. They interpreted these results as evidence for “view-based” (i.e. template-like) representations of objects regardless of orientation: prior to the experiment, most stored views of a familiar object are in the upright (canonical) orientation and, within the experiment, new views are rapidly formed following exposure to novel orientations. Importantly in the present context, these upright-like representations in new orientations were created in less than 100 trials.
No previous studies have adequately assessed whether it is possible to learn holistic processing of faces in the inverted orientation. Occasional claims that holistic processing can be learned for inverted faces are partly a result of miscitation. Both Valentine (1988) and Sergent (1984) cite Bradshaw and Wallace (1971) as finding no difference between upright and inverted faces following practice; however, with the particular task used in that study, Bradshaw and Wallace in fact reported that both orientations were processed in a part-based manner. Valentine (1988) also contrasted the findings of Sergent (1984), who found only part-based processing for inverted faces in unpracticed subjects, with those of Takane and Sergent (1983), who he claimed found holistic processing after practice; however, Takane and Sergent did not actually present or analyze any data for their inverted condition (although they did state that it was “similar to the upright condition”, p. 405). Endo, Masame, and Maruyama (1990) provide the only direct claim of holistic processing in inverted faces after practice. They used highly schematic faces (e.g. an unusual head outline; circles for eyes; triangle for nose, etc.) in a vertical half-face version of the composite paradigm. The standard pattern – a composite effect upright but not inverted – was obtained when subjects were unpracticed. Following extensive training with inverted faces, a composite effect did emerge in a condition with different headshapes in each half-face. We suspect, however, that this could be attributed to the extreme violation of vertical symmetry that resulted in the aligned condition. When the two halves of the head were symmetric, there was no composite effect for inverted faces even after training.
In contrast to this suggested evidence for holistic processing of inverted faces, another series of studies (Martini et al., 2003; McKone, in press; McKone et al., 2001) argue against any such learning. Each of these studies was designed to isolate the holistic component of face processing, by identifying some phenomenon which existed for upright whole faces, but was completely absent for inverted faces. In the present context, the relevant point is that subjects were given hundreds or thousands of trials with the face stimuli in the inverted orientation, and yet showed no signs of developing the signature phenomena for holistic processing. Only a limited style of practice was used, however, presenting the same image repeatedly, rather than different views, as in real life. Farah et al. (2000) note that recognizing faces across different views is something prosopagnosics cannot do, suggesting that it requires holistic processing. Similarly, Tong and Nakayama (1999) state that a variety of views and contexts are needed to acquire a “robust representation” of a face. Thus, seeing an inverted face over a variety of views may be necessary to acquire a full holistic representation.
The aim of the present study was to assess whether, with appropriate practice, inverted faces could come to be processed holistically. A major aspect of our design was the use of identical twins as stimuli to encourage maximum reliance on holistic rather than part-based processing. In real-world face recognition, single local features do not generally differentiate people reliably (e.g. many individuals have blue eyes). In an experimental setting, however, where stimuli include a limited number of different faces, local features can contribute substantially to performance. Even discrimination of approximately similar individuals (e.g. the same sex and age) could be based on local information alone (e.g. eye colour; presence of a particular freckle), especially when subjects see the same faces over many hours of practice.
Thus, to give the best chance for any holistic processing for inverted faces to emerge, we wished to minimize local feature cues that might be used to identify the faces. Our hope was that, with identical twins, no single feature would differ enough between siblings to support reliable discrimination, and instead that identification would rely on information integrated across the entire face region (i.e. holistic processing). Use of multiple images and viewing angles also made very local information (e.g. exact shape at the corner of the mouth in one particular photograph) unreliable as a cue to identity, and made learning more similar to real-life (see discussion on multiple views above).
During training, each twin (e.g. “Liz Smith”) was individually named approximately 350 times (Experiment 1) or 280 times (Experiment 2). This level of practice exceeded that used in the greeble studies of trained expertise (120 training trials per individual greeble; Gauthier & Tarr, 1997), and also the number of trials needed, in studies of naming rotated objects, to produce upright-like representations at other orientations (fewer than 100 trials per object; Tarr & Pinker, 1989). Thus, although a training study can never hope to match the degree of real-world experience that people have with upright faces (or, for example, that expert dog-show judges have with their breed of expertise), we argue that the present study will at least provide a strong answer to the question of whether holistic processing for inverted faces can, or cannot, emerge in experiment-trained “experts”.
Experiment 1
In Experiment 1, we trained subjects to identify two sets of female identical twins, given pseudonyms Liz and Ruth Smith, and Ann and Clare Brown. Orientation was a between-subjects variable. Each subject completed 8 h of training sessions, in which feedback was given for decisions at three levels of categorization (Gauthier & Tarr, 1997), namely the individual level (e.g. Is this Liz?), the family level (e.g. Is this one of the Smiths?), and a gender level (e.g. Is this a female?). Individual
Experiment 2 (learning without eyebrows)
In Experiment 1, subjects failed to learn a holistic representation of the inverted faces, despite extensive practice. In Experiment 2, we explored what would happen if the featural cue used by subjects in the first experiment was unavailable. Specifically, we used the Smith twins only, and trained new subjects with the twins' eyebrows covered at all times.
With the most discriminating feature removed from the Smiths' faces, several outcomes were considered possible. First, subjects trained on
Experiment 3 (composite test for holistic processing)
So far, our only direct test of holistic processing for upright faces has been via the effects of removing one specific feature (the eyebrows). In Experiment 3, we used the composite paradigm (Young et al., 1987) to further assess whether trained twin discrimination had relied on holistic representations in Experiments 1 and 2. The composite paradigm is a well established direct test for holistic processing. Subjects name a half-face of a familiar person, presented simultaneously with the other
General discussion
The aim of the present study was to assess whether inverted faces could come to be processed holistically with practice. Our results argue strongly that this is not possible within the constraints of a training study. In terms of the amount of practice, we used a number of individually named trials with each twin easily greater than that used by Gauthier and Tarr (1997) in investigating face-like processing for greebles, and far greater than that shown by Tarr and Pinker (1989) to produce
Acknowledgements
This research was supported by Australian Research Council Small Grants (numbers F00093 and F01027) awarded to Elinor McKone. We very much appreciate the help of the twins and their parents, as well as the time commitment by friends who participated in Experiment 1. We also thank Anna Gilchrist for assistance in data coding, and Mark Edwards, Michael Tarr and three anonymous reviewers for comments on earlier drafts of this paper.
References (44)
- et al. (1993). Inversion and configuration of faces. Cognitive Psychology.
- et al. (2001). Beyond localisation: a dynamical dual route account of face recognition. Acta Psychologica.
- et al. (1997). Becoming a “greeble” expert: exploring mechanisms for face recognition. Vision Research.
- et al. (1998). Training ‘greeble’ experts: a framework for studying expert object recognition processes. Vision Research.
- et al. (1991). Newborns' preferential tracking of face-like stimuli and its subsequent decline. Cognition.
- et al. (2002). The many faces of configural processing. Trends in Cognitive Sciences.
- et al. (1993). What's lost in inverted faces? Cognition.
- et al. (1997). Expertise in object and face recognition. The Psychology of Learning and Motivation.
- et al. (1989). Mental rotation and orientation-dependence in shape recognition. Cognitive Psychology.
- et al. (1971). Models for the processing and identification of faces. Perception and Psychophysics.
- PsyScope: an interactive graphic system for designing and controlling experiments in the psychology laboratory using Macintosh computers. Behavior Research Methods, Instruments and Computers.
- Developing a brain specialised for face perception: a converging methods approach. Developmental Psychobiology.
- Why faces are and are not special: an effect of expertise. Journal of Experimental Psychology: General.
- A meta-analytic review of the distribution of practice effect: now you see it, now you don't. Journal of Applied Psychology.
- A limited use of configural information in the perception of inverted faces. Tohoku Psychologica Folia.
- Verbal vulnerability of perceptual expertise. Journal of Experimental Psychology: Learning, Memory, and Cognition.
- Early commitment of neural substrates for face recognition. Cognitive Neuropsychology.
- Expertise for cars and birds recruits brain areas involved in face recognition. Nature Neuroscience.
- Unraveling mechanisms for expert object recognition: bridging brain activity and behavior. Journal of Experimental Psychology: Human Perception and Performance.
- Configurational factors in the perception of unfamiliar faces. Perception.
- The time to name disoriented natural objects. Memory and Cognition.
- The fusiform face area: a module in human extrastriate cortex specialized for face perception. Journal of Neuroscience.