Elsevier

Cognition

Volume 88, Issue 1, May 2003, Pages 79-107

Can holistic processing be learned for inverted faces?

https://doi.org/10.1016/S0010-0277(03)00020-9

Abstract

The origin of “special” processing for upright faces has been a matter of ongoing debate. If it is due to generic expertise, as opposed to having some innate component, holistic processing should be learnable for stimuli other than upright faces. Here we assess inverted faces. We trained subjects to discriminate identical twins using up to 1100 exposures to each twin in different poses and images. In the upright orientation, twin discrimination was supported by holistic processing. Removal of a single face feature had no effect on performance, and a composite effect (Young, A.W., Hellawell, D., & Hay, D.C. (1987). Configurational information in face perception. Perception 16 (6), 747–759) was obtained. In the inverted orientation, however, above-chance identification ability relied on (a) image-specific learning, or (b) tiny local feature differences not noticed in the upright faces. The failure to learn holistic processing for inverted faces indicates that, in contrast to the situation for objects (Tarr, M.J., & Pinker, S. (1989). Mental rotation and orientation-dependence in shape recognition. Cognitive Psychology 21 (2), 233–282), orientation specificity of face processing is highly stable against practice.

Introduction

For ordinary adults, faces form a special class of visual object. A specific cortical area in the right fusiform gyrus activates more strongly for faces than for other within-class object discrimination (Kanwisher, McDermott, & Chun, 1997), and much evidence shows that a holistic or configural style of cognitive processing makes faces “special”.

The exact definition of “holistic” or “configural” processing remains a matter of debate, but it is generally taken to mean integration of information from the whole of the internal face region, in which there is either no decomposition into component parts (eyes, nose, etc.; Tanaka & Farah, 1993), or in which each part is processed in interaction with multiple other parts (e.g. Rhodes, 1988). Holistic processing is usually contrasted with “part-based” processing, in which decomposition based on local information (e.g. lip shape, eye colour) supports performance.

Initial evidence for this distinction was that faces show a disproportionate inversion effect on recognition memory: while memory is poorer for all objects when studied and tested upside-down (inverted) than when studied and tested upright, this effect is much larger for faces than for other objects (e.g. Diamond & Carey, 1986; Yin, 1969). This result is commonly interpreted as reflecting holistic processing only for upright faces, a conclusion which has been supported by paradigms providing direct tests of holistic vs. part-based processing styles (for a recent review see Maurer, Le Grand, & Mondloch, 2002). For example, in the part–whole paradigm (Tanaka & Farah, 1993), a particular face part (e.g. Bill's nose) is remembered more accurately when tested in the whole studied face (Bill's nose in Bill's face vs. John's nose in Bill's face) than when tested alone (Bill's nose vs. John's nose). In the composite effect (Young, Hellawell, & Hay, 1987), two different half-faces appear to fuse into a new face, slowing naming of one half-face, when the halves are aligned as compared to offset (unaligned). In both paradigms holistic effects occur in the upright orientation, but not inverted (see also Hole, 1994; Tanaka et al., 1998; Tanaka & Sengco, 1997). Thus, for faces, it has been shown that (a) holistic processing occurs only in the upright orientation, (b) inverted faces are processed in a part-based non-holistic manner, and (c) the processing of local feature information is largely unaffected by orientation (also see Bartlett & Searcy, 1993; Murray et al., 2000; Rhodes et al., 1993; Thompson, 1980).

There are two general theories of why, in adults, holistic processing is limited to faces, and moreover to upright faces. According to the expertise hypothesis, holistic processing is a domain-general property of expertise in making within-class discriminations. Diamond and Carey (1986) noted that, for most people, faces are the only class of visual stimulus for which they become genuine experts. However, these authors suggested that face-like holistic processing might be learnable for any object type, as long as three conditions were met: (a) all exemplars of the stimulus class share a similar basic configuration (i.e. the same parts in the same left–right, above–below relationships), (b) individual exemplars differ from this shared first-order configuration only in minor (second-order) ways (e.g. the exact distance between two parts; the exact shape of a part), and (c) the subject has sufficient expertise with the stimulus domain to make reliable discrimination between individual exemplars (e.g. dog-show judges who can remember one Scotch Terrier as distinct from another). According to the expertise view, holistic processing for faces is limited to the upright orientation because this is the orientation in which faces are usually experienced.

An alternative theory is that there is some innately driven component to the adult “special” processing for faces (de Gelder & Rouw, 2001; Farah et al., 2000; Morton & Johnson, 1991). This is supported by the finding that young babies orient preferentially towards face-like stimuli (Johnson, Dziurawiec, Ellis, & Morton, 1991), and that there appears to be a critical period in early infancy for the development of holistic processing (Le Grand, Mondloch, Maurer, & Brent, 2001). According to this view, the fact that holistic processing develops only for upright faces may be because (a) there is some innate representation of the basic structure of an upright face (see Mondloch et al., 1999), and/or (b) a bias in infants' visual orienting of subcortical origin causes upright faces to be a frequent input to developing cortical systems (de Haan, Humphreys, & Johnson, 2002), and/or (c) very early exposure to upright faces fixes the axes of face-space to suit this orientation. Critically, this view does not deny that experience has a role in face recognition, but proposes that generic perceptual learning in adults reflects different mechanisms from those involved in learning upright faces in infancy.

If holistic processing is due to generic expertise then, as an adult, it should be possible to learn holistic processing for stimuli other than upright faces. In the literature to date several studies have explored this prediction via tests on experts with non-face objects.

For objects-of-expertise, disproportionate inversion effects have been obtained in long-term memory with dog experts (Diamond & Carey, 1986), although these have not been replicated in a sequential matching task with car and bird experts (Gauthier, Skudlarski, Gore, & Anderson, 2000). Relatively few studies have used direct paradigms. Using the part–whole paradigm, Tanaka and Gauthier (1997) found some suggestion of an expertise effect – that is, a larger whole–part advantage in experts than in novices – for dog experts (interestingly, looking at dog faces), but not for car experts or biological cell experts. Other studies have investigated experiment-trained “greeble experts”. Greebles are an artificial object class that may be grouped into genders (by direction of protrusion) and families (by body shape); experts are trained using seven to ten 1-h sessions involving identification of greebles at the gender, family and individual-name levels. Three greeble studies have shown no training effect in the basic part–whole paradigm (Gauthier & Tarr, 1997; Gauthier & Tarr, 2002; Gauthier et al., 1998), and no composite effect in trained subjects (Gauthier & Tarr, 2002; Gauthier et al., 1998). These studies have, however, shown some suggestion of a face-like effect in a modification of the part–whole paradigm: one of three greeble parts produced better memory for the part in the original configuration than in an altered configuration (cf. Tanaka & Sengco, 1997). Overall, studies with object experts have produced some suggestion of holistic processing, although the evidence is currently less than convincing.

In addition to non-face objects, another stimulus class of interest is inverted faces. If it were the case that, as an adult, holistic processing could be learned for inverted faces, then this would provide compelling evidence for the expertise hypothesis.

An important question is then how much practice would be required. Two sets of literature are relevant to this issue. First, the authors of the greeble studies have suggested that expertise sufficient to support holistic processing can be learned in under 10 h of practice. Second, results from speeded object-naming tasks indicate that, for non-face objects, processing in the inverted orientation can come to share properties exhibited in the upright orientation with only a very small amount of training.

When objects are rotated in the picture plane they are initially named more slowly the further they are from upright (e.g. Jolicoeur, 1985), but this effect disappears with practice (for review, see McKone & Grenfell, 1999). For objects requiring within-class discrimination, Tarr and Pinker (1989) showed that, after practice at specified new orientations, naming times at intermediate positions then increased as a function of distance from the nearest-trained-orientation. They interpreted these results as evidence for “view-based” (i.e. template-like) representations of objects regardless of orientation: prior to the experiment, most stored views of a familiar object are in the upright (canonical) orientation and, within the experiment, new views are rapidly formed following exposure to novel orientations. Importantly in the present context, these upright-like representations in new orientations were created in less than 100 trials.

No previous studies have adequately assessed whether it is possible to learn holistic processing of faces in the inverted orientation. Occasional claims that holistic processing can be learned for inverted faces are partly a result of miscitation. Both Valentine (1988) and Sergent (1984) cite Bradshaw and Wallace (1971) as finding no difference between upright and inverted faces following practice; however, with the particular task used in that study, Bradshaw and Wallace in fact reported that both orientations were processed in a part-based manner. Valentine (1988) also contrasted the findings of Sergent (1984), who found only part-based processing for inverted faces in unpracticed subjects, with those of Takane and Sergent (1983), who he claimed found holistic processing after practice; however, Takane and Sergent did not actually present or analyze any data for their inverted condition (although they did state that it was “similar to the upright condition”, p. 405). Endo, Masame, and Maruyama (1990) provide the only direct claim of holistic processing in inverted faces after practice. They used highly schematic faces (e.g. an unusual head outline; circles for eyes; triangle for nose, etc.) in a vertical half-face version of the composite paradigm. The standard pattern – a composite effect upright but not inverted – was obtained when subjects were unpracticed. Following extensive training with inverted faces, a composite effect did emerge in a condition with different headshapes in each half-face. We suspect, however, that this could be attributed to the extreme violation of vertical symmetry that resulted in the aligned condition. When the two halves of the head were symmetric, there was no composite effect for inverted faces even after training.

In contrast to this suggested evidence for holistic processing of inverted faces, another series of studies (Martini et al., 2003; McKone, in press; McKone et al., 2001) argues against any such learning. Each of these studies was designed to isolate the holistic component of face processing, by identifying some phenomenon which existed for upright whole faces, but was completely absent for inverted faces. In the present context, the relevant point is that subjects were given hundreds or thousands of trials with the face stimuli in the inverted orientation, and yet showed no signs of developing the signature phenomena for holistic processing. Only a limited style of practice was used, however, presenting the same image repeatedly, rather than different views, as in real life. Farah et al. (2000) note that recognizing faces across different views is something prosopagnosics cannot do, suggesting that it requires holistic processing. Similarly, Tong and Nakayama (1999) state that a variety of views and contexts are needed to acquire a “robust representation” of a face. Thus, seeing an inverted face over a variety of views may be necessary to acquire a full holistic representation.

The aim of the present study was to assess whether, with appropriate practice, inverted faces could come to be processed holistically. A major aspect of our design was the use of identical twins as stimuli to encourage maximum reliance on holistic rather than part-based processing. In real-world face recognition, single local features do not generally differentiate people reliably (e.g. many individuals have blue eyes). In an experimental setting, however, where stimuli include a limited number of different faces, local features can contribute substantially to performance. Even discrimination of approximately similar individuals (e.g. the same sex and age) could be based on local information alone (e.g. eye colour; presence of a particular freckle), especially when subjects see the same faces over many hours of practice.

Thus, to give the best chance for any holistic processing for inverted faces to emerge, we wished to minimize local feature cues that might be used to identify the faces. Our hope was that, with identical twins, no single feature would differ enough between siblings to support reliable discrimination, and instead that identification would rely on information integrated across the entire face region (i.e. holistic processing). Use of multiple images and viewing angles also made very local information (e.g. exact shape at the corner of the mouth in one particular photograph) unreliable as a cue to identity, and made learning more similar to real-life (see discussion on multiple views above).

During training, each twin (e.g. “Liz Smith”) was individually named approximately 350 times (Experiment 1) or 280 times (Experiment 2). This level of practice exceeded that used in the greeble studies of trained expertise (120 training trials per individual greeble; Gauthier & Tarr, 1997), and also the number of trials in the “naming rotated objects studies” necessary to produce upright-like representations at other orientations (less than 100 trials per object; Tarr & Pinker, 1989). Thus, although a training study can never hope to match the degree of real-world experience that people have with upright faces (or, for example, that expert dog-show judges have with their breed of expertise), we argue that the present study will at least provide a strong answer to the question of whether holistic processing for inverted faces can, or cannot, emerge in experiment-trained “experts”.

Section snippets

Experiment 1

In Experiment 1, we trained subjects to identify two sets of female identical twins, given pseudonyms Liz and Ruth Smith, and Ann and Clare Brown. Orientation was a between-subjects variable. Each subject completed 8 h of training sessions, in which feedback was given for decisions at three levels of categorization (Gauthier & Tarr, 1997), namely the individual level (e.g. Is this Liz?), the family level (e.g. Is this one of the Smiths?), and a gender level (e.g. Is this a female?). Individual

Experiment 2 (learning without eyebrows)

In Experiment 1, subjects failed to learn a holistic representation of the inverted faces, despite extensive practice. In Experiment 2, we explored what would happen if the featural cue used by subjects in the first experiment was unavailable. Specifically, we used the Smith twins only, and trained new subjects with the twins' eyebrows covered at all times.

With the most discriminating feature removed from the Smiths' faces, several outcomes were considered possible. First, subjects trained on

Experiment 3 (composite test for holistic processing)

So far, our only direct test of holistic processing for upright faces has been via the effects of removing one specific feature (the eyebrows). In Experiment 3, we used the composite paradigm (Young et al., 1987) to further assess whether trained twin discrimination had relied on holistic representations in Experiments 1 and 2. The composite paradigm is a well established direct test for holistic processing. Subjects name a half-face of a familiar person, presented simultaneously with the other

General discussion

The aim of the present study was to assess whether inverted faces could come to be processed holistically with practice. Our results argue strongly that this is not possible within the constraints of a training study. In terms of the amount of practice, we used a number of individually named trials with each twin easily greater than that used by Gauthier and Tarr (1997) in investigating face-like processing for greebles, and far greater than that shown by Tarr and Pinker (1989) to produce

Acknowledgements

This research was supported by Australian Research Council Small Grants (numbers F00093 and F01027) awarded to Elinor McKone. We very much appreciate the help of the twins and their parents, as well as the time commitment by friends who participated in Experiment 1. We also thank Anna Gilchrist for assistance in data coding, and Mark Edwards, Michael Tarr and three anonymous reviewers for comments on earlier drafts of this paper.

References (44)

  • J. Cohen et al. (1993). PsyScope: an interactive graphic system for designing and controlling experiments in the psychology laboratory using Macintosh computers. Behavior Research Methods, Instruments and Computers.
  • M. de Haan et al. (2002). Developing a brain specialised for face perception: a converging methods approach. Developmental Psychobiology.
  • R. Diamond et al. (1986). Why faces are and are not special: an effect of expertise. Journal of Experimental Psychology: General.
  • J.D. Donovan et al. (1999). A meta-analytic review of the distribution of practice effect: now you see it, now you don't. Journal of Applied Psychology.
  • M. Endo et al. (1990). A limited use of configural information in the perception of inverted faces. Tohoku Psychologica Folia.
  • M. Fallshore et al. (1995). Verbal vulnerability of perceptual expertise. Journal of Experimental Psychology: Learning, Memory, and Cognition.
  • M.J. Farah et al. (2000). Early commitment of neural substrates for face recognition. Cognitive Neuropsychology.
  • I. Gauthier et al. (2000). Expertise for cars and birds recruits brain areas involved in face recognition. Nature Neuroscience.
  • I. Gauthier et al. (2002). Unraveling mechanisms for expert object recognition: bridging brain activity and behavior. Journal of Experimental Psychology: Human Perception and Performance.
  • G.J. Hole (1994). Configurational factors in the perception of unfamiliar faces. Perception.
  • P. Jolicoeur (1985). The time to name disoriented natural objects. Memory and Cognition.
  • N. Kanwisher et al. (1997). The fusiform face area: a module in human extrastriate cortex specialized for face perception. Journal of Neuroscience.