Elsevier

Neuropsychologia

Volume 49, Issue 9, July 2011, Pages 2273-2282

Reviews and perspectives
What the study of voice recognition in normal subjects and brain-damaged patients tells us about models of familiar people recognition

https://doi.org/10.1016/j.neuropsychologia.2011.04.027

Abstract

In recent years it has been shown that a disorder in recognizing familiar people can be observed in patients with lesions affecting the anterior parts of the temporal lobes and that these disorders can be multi-modal, simultaneously affecting the visual, auditory and linguistic channels that allow person identification. Several authors have also shown that patients with right anterior temporal atrophy are more impaired in assessing familiarity and in retrieving person-specific semantic information from faces than from names, whereas the opposite pattern of performance can be observed in patients with left temporal lobe atrophy. Voice recognition disorders have been studied much less, despite their clinical and theoretical importance. The aim of the present review, therefore, was to compare recognition of familiar faces and voices, taking into account not only results obtained in individual patients with right anterior temporal lesions, but also those of group studies of unselected right- and left-brain-damaged patients and results of experimental investigations conducted on face and voice recognition in normal subjects. Results of the review showed that: (1) voice recognition disorders are mainly due to right temporal lesions, similarly to face recognition disorders; (2) famous voice recognition disorders can be dissociated from unfamiliar voice discrimination impairments; (3) although face and voice recognition disorders tend to co-occur, they can also dissociate, and in these patients there is a prevalent involvement of the right fusiform gyrus when face recognition disorders are in the foreground, and of the right superior temporal gyrus when voice recognition disorders are prominent; (4) normal subjects have greater difficulty evaluating familiarity and drawing semantic information from the voices than from the faces of celebrities.
These data are at variance with models which assume that familiarity feelings may be generated at the level of person identity nodes (PINs) and that the latter may be considered as modality-free gateways to single semantic systems in which information about people is stored in an amodal format.

Highlights

► Voice recognition disorders are mainly due to right temporal lesions.
► Face and voice recognition disorders can both co-occur and dissociate.
► When voice recognition disorders prevail, the superior temporal gyrus is damaged.
► Familiarity feelings are generated in the modality-specific recognition units.
► ‘Familiar-only-experiences’ from face and from voice in normal subjects.

Introduction

Recognition of familiar people is based on three main sources of information: the face, when, for instance, you meet a person you know on the street or see a famous actor on TV; the voice, when you hear someone who is calling you on the phone; and the name, when you are told that someone has left a message for you. Each of these channels of information about known people can be selectively impaired in persons with brain damage. In some patients, however, the recognition disorder is multi-modal and simultaneously affects all of the visual, auditory and linguistic channels that allow person identification. According to current models of familiar people recognition (e.g. Brédart et al., 1995, Bruce and Young, 1986, Burton et al., 1990, Burton et al., 1999, Valentine et al., 1996, Young and Burton, 1999), several cognitive and subjective/behavioural stages are involved in the process of recognizing familiar people. Bruce and Young (1986) first investigated the nature and the sequence of these stages in a serial cognitive model focused on faces. This model assumes that identification of a familiar person involves first the formation of a view-independent structural description of a seen face, which can be compared with all known faces stored in the Face Recognition Units (FRUs). Subsequently, a similar process was hypothesized for other sources of person recognition, such as names (Burton et al., 1990) and voices (Valentine et al., 1996), which were deemed to be stored in similar Name Recognition Units (NRUs) or Voice Recognition Units (VRUs). The second step of the people identification process requires the convergence of information stored in these modality-specific units into person-identity nodes (PINs), allowing identification of a particular person and retrieval of the corresponding semantic (biographical) information. PINs (or accessed person-specific knowledge) could, in turn, activate the phonological codes underlying the production of the person's proper name.
The subjective/behavioural components of the process of people recognition corresponding to these cognitive stages concern first the emergence of a feeling of familiarity for the addressed person, second the retrieval of person-specific information (such as occupation, nationality and so on), and third the retrieval and production of the person's name.
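The sequence of stages just described can be made concrete with a toy sketch. The code below is only an illustration under stated assumptions, not an implementation of any of the cited models: the lookup tables, the pattern labels, the person identifier and the function name `recognise` are all hypothetical, and the familiarity signal is tied to a match in the modality-specific recognition units purely for simplicity (the models differ on where that signal arises).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy data: every entry is illustrative, none comes from the review.
# Modality-specific recognition units map a stored pattern onto a PIN.
RECOGNITION_UNITS = {
    "face": {"face_pattern_A": "person_1"},    # FRUs
    "voice": {"voice_pattern_A": "person_1"},  # VRUs
    "name": {"name_string_A": "person_1"},     # NRUs
}
# Person-specific (biographical) information reached through the PIN.
SEMANTICS = {"person_1": {"occupation": "actor", "nationality": "Italian"}}
# Phonological code for the proper name, reached last.
NAME_CODES = {"person_1": "Name A"}


@dataclass
class Recognition:
    familiar: bool               # stage 1: feeling of familiarity
    person_info: Optional[dict]  # stage 2: person-specific information
    name: Optional[str]          # stage 3: retrieval of the proper name


def recognise(modality: str, percept: str) -> Recognition:
    """Run one percept through the three serial stages."""
    # Compare the percept with the patterns stored for this modality.
    pin = RECOGNITION_UNITS[modality].get(percept)
    if pin is None:
        # No stored pattern matches: no familiarity, nothing retrieved.
        return Recognition(False, None, None)
    # A match activates the PIN, then the semantics, then the name code.
    return Recognition(True, SEMANTICS.get(pin), NAME_CODES.get(pin))
```

In this sketch a face and a voice belonging to the same person converge on the same PIN and therefore retrieve the same information, which is the convergence step described above; selective damage to one channel could be mimicked by emptying one of the three recognition-unit tables.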

Although there are general similarities between Bruce and Young's (1986) model and the subsequent models of other authors, there are also important differences concerning the locus in which familiarity feelings are generated and in which person-specific information is stored.

First, Bruce and Young's (1986) face identification model assumes that familiarity feelings are generated in modality-specific recognition units where, for instance, the structural description of a seen face is compared with familiar faces stored in the FRUs. By contrast, in the models of Burton et al. (1990, 1999), Brédart et al. (1995) and Valentine et al. (1996), decisions about familiarity are made at a supra-modal level, namely, the PINs, where information from different modalities is combined. Furthermore, Bruce and Young's (1986) model assumes that PINs store semantic information, whereas the models of Burton et al. (1990, 1999), Brédart et al. (1995) and Valentine et al. (1996) hold that PINs do not store semantic information but provide a modality-free gateway to a single semantic system in which information about people is stored in an amodal format. A final controversy over these models concerns the claim that biographical knowledge is represented in an abstract, amodal format in the brain. This derives from Snowden, Thompson, and Neary's (2004) demonstration that in patients with semantic dementia person-specific information accessed through faces and names was different depending on the prevalent side of atrophy. Patients with left temporal lobe atrophy identified faces better than names and performed better on the picture than on the word version of the semantic memory “Pyramids and Palm Trees” test (Howard & Patterson, 1992), whereas patients with right temporal lobe atrophy showed the opposite pattern of performance.

Because the anterior parts of the right and left temporal lobes seem to play a critical role in the recognition, identification and naming of famous people (Ellis et al., 1989, Hanley et al., 1989, Evans et al., 1995, Gainotti et al., 2003, Damasio et al., 2004, Snowden et al., 2004), in a previous review paper (Gainotti, 2007) we attempted to see whether these controversies could be clarified by a careful analysis of the patterns of famous people recognition impairment shown by patients with right and left anterior temporal atrophy. Results of our review were consistent with Bruce and Young's (1986) model and at variance with the models of Burton et al. (1990, 1999), Brédart et al. (1995) and Valentine et al. (1996) with respect to the locus of generation of familiarity feelings and the function of the PINs. Regarding the former, two main findings suggested that familiarity judgments are generated in the modality-specific recognition units rather than at the PIN level. The first finding was that familiarity judgments were much more impaired in right (R) than in left (L) temporal lobe (TL) patients, and the second was that in patients with right temporal lesions familiarity defects were modality-specific, concerning famous faces more than famous names. The interpretation of these findings was that familiarity feelings should be generated at the level of the FRUs and might be more represented in the RTL because of the major role played by the right hemisphere in face processing (De Renzi, 1986, De Renzi et al., 1994, Sergent and Signoret, 1992). Concerning the PIN function, two findings were inconsistent with the assumption that PINs provide a modality-free gateway to a single system in which semantic information about people is stored in an amodal format. The first was that the loss of person-specific semantic information shown by patients with RTL damage might not have been due to a PIN disruption, because it was greater for faces than for names.
The second was that an important imbalance between the amount of person-specific information available from faces and names was also found in right and left TL patients who showed intact or mildly impaired familiarity judgments; according to the previously mentioned cognitive models, these patients should therefore have no defect at the PIN level. One objection to our interpretation of the imbalance between results obtained with faces and names in right TL patients could be that face and name recognition are not equally difficult in normal subjects. Indeed, Haslam, Kay, Hanley, and Lyons (2004) showed that in these subjects both familiarity judgments and access to biographical information are more accurate in response to names than to faces. Therefore, the greater loss of familiarity feelings and biographical information obtained from faces by RTL patients could have been partially due to this normal asymmetry between faces and names. To check whether the differences between patients with right and left anterior temporal atrophy were mostly due to this recognition bias, Gainotti, Ferraccioli, and Marra (2010) recently administered two well-controlled tests of face and name recognition and identification (Bizzozero et al., 2005, Bizzozero et al., 2007) to two patients with selective mild difficulty in familiar people identification due to predominantly right and left temporal lobe atrophy. Even with this well-controlled material, the right TL patient showed severely impaired familiarity judgments and a greater loss of person-specific information for faces, in the context of spared familiarity and a greater amount of person-specific semantic information for names. These data, therefore, confirmed that familiarity judgments are generated at the level of modality-specific recognition units and that PINs do not provide a modality-free gateway to a unitary semantic system, where information about people is stored in an amodal format.

A second problem with the above-mentioned review was that it took into account only results obtained with famous faces and names and excluded the third important source of information about familiar people, namely, their voices. This was because voice recognition disorders had been studied in only a very small number of single case reports (in which all patients had right temporal lesions), often with rather poor methodology. However, when we analyzed the reports of patients considered for inclusion in our review, we had the impression that face and voice recognition disorders usually co-occurred in patients with right anterior temporal lesions but were dissociated in some patients.

The aim of the present review, therefore, was to shift attention from the comparison between face and name recognition disorders to that between recognition of familiar faces and voices by taking into account not only results obtained in individual patients with right anterior temporal lesions, but also those of group studies of unselected right and left brain-damaged patients and results of experimental investigations conducted on face and voice recognition in normal subjects.

The first line of research aimed to determine whether voice recognition disorders, labelled “phonagnosia” by Van Lancker and Canter (1982) and Van Lancker, Cummings, Kreiman, and Dobkin (1988), are mainly due to right-hemisphere lesions, as are face recognition disorders. In our discussion of the different formats that semantic representations might have at the level of the right and left temporal lobes (Gainotti, 2007), we suggested that right-hemisphere knowledge might be substantially based on a convergence of perceptual (visual/face and auditory/voice) information, whereas left-hemisphere knowledge might be based on a more complex integration between sensorimotor and linguistic (name) information, with relative dominance of the latter. The hypothesis that right-hemisphere knowledge might be based mainly on a convergence of perceptual information could be supported by the existence of a significant relationship between right-hemisphere lesions and voice recognition disorders and falsified by a lack of hemispheric asymmetries in these tasks.

The second line of inquiry aimed to check the clinical impressions we had derived from a superficial analysis of the single-case patients considered for our previous review and to see whether there were any fine-grained differences in the anatomy of lesions in patients with a prevalence of face or voice recognition disorders. Different anatomical structures are, indeed, involved in the processing of face and voice stimuli. Face processing is basically subserved by a network of cortical areas, which span from the inferior occipital cortex (Occipital Face Area/OFA of Gauthier et al., 2000) to the anterior temporal areas and have their centre in the mid-fusiform gyrus, where the fusiform face area is located (FFA; Kanwisher, McDermott, & Chun, 1997). By contrast, voice processing is mainly subserved by cortical areas located along the upper bank of the superior temporal gyrus (Belin et al., 2000, Belin, 2006).

The third line of research was undertaken to check findings in normal subjects reported by Hanley, Smith, and Hadfield (1998), Hanley and Turner (2000), Damjanovic and Hanley (2007), Brédart, Barsics, and Hanley (2009), Hanley and Damjanovic (2009), and Barsics and Brédart (2010). Indeed, they reported that it is more difficult to evaluate familiarity and to derive semantic information from the voices than from the faces of celebrities. Furthermore, results of the experimental investigations conducted in normal subjects by the above-mentioned authors are not only important for evaluating whether the differences between results obtained with faces and voices in brain-damaged patients are due to a normal bias between these two channels, but also have important implications for the models of familiar people recognition we outlined at the beginning of Section 1.

Therefore, in the first part of the present review we will consider some group studies of unselected right and left brain-damaged patients, in which famous voice recognition disorders were investigated in various experimental conditions to check the following: (a) whether famous voice recognition disorders are mainly observed in right brain-damaged patients and (b) whether they can be dissociated from unfamiliar voice discrimination impairments, just as in face recognition disorders.

In the second part of our review, we will analyze all the single case studies of patients with right anterior temporal lesions that we were able to find in the neuropsychological literature in which both face and voice recognition disorders were considered. The aim of this survey was to evaluate the following: (a) whether face and voice recognition disorders tend to co-occur, dissociating from name recognition disorders; (b) whether there are differences in the anatomy of lesions in patients with a prevalence of face or voice recognition disorders and whether the fusiform gyrus is prevalently involved in the former and the superior temporal gyrus in the latter.

In the last part of our survey, we will take into account studies conducted in normal subjects in which the authors matched difficulty in evaluating familiarity and in drawing semantic information from the voices and faces of celebrities in different experimental conditions to evaluate the implications of these data for models of familiar people recognition.

Group studies of unselected right and left brain-damaged patients in which famous voice recognition disorders were investigated in various experimental conditions

In our search of the neuropsychological literature, we found only four papers in which famous voice recognition disorders had been investigated in unselected groups of right and left brain-damaged patients. In these studies, famous voice recognition disorders (FVRD) had been investigated: – in isolation (Lang, Kneidl, Hielscher-Fastabend, & Heckmann, 2009), – in association with famous face recognition disorders (Van Lancker & Canter, 1982), – by contrasting famous voice recognition with

Discussion

The main results of the present review can be summarized as follows:

  (1) Voice recognition disorders are mainly due to right temporal lesions, similarly to face recognition disorders and in contrast with name recognition disorders, which tend to prevail in patients with left temporal lesions (Gainotti, 2007).

  (2) Famous voice recognition disorders can be dissociated from unfamiliar voice discrimination impairments, similarly to face recognition disorders (Van Lancker et al., 1988, Van Lancker et al., 1989

References (63)

  • J. Hocking et al. Dissociating verbal and nonverbal audiovisual object processing. Brain and Language (2009)
  • M. Ikeda et al. A horse of a different colour: Do patients with semantic dementia recognise different versions of the same object as the same? Neuropsychologia (2006)
  • F. Neuner et al. Neuropsychological impairments in the recognition of faces, voices, and personal names. Brain and Cognition (2000)
  • G. Thierry et al. Hemispheric dissociation in access to the human semantic system. Neuron (2003)
  • T. Tsukiura et al. Dissociable roles of the bilateral anterior temporal lobe in face–name associations: An event-related fMRI study. Neuroimage (2006)
  • D.R. Van Lancker et al. Impairment of voice and face recognition in patients with hemispheric damage. Brain and Cognition (1982)
  • D.R. Van Lancker et al. Phonagnosia: A dissociation between familiar and unfamiliar voices. Cortex (1988)
  • C. Barsics et al. Recalling episodic information about personally known faces and voices. Consciousness and Cognition (2010)
  • P. Belin. Voice processing in human and non-human primates. Philosophical Transactions of the Royal Society of London B Biological Sciences (2006)
  • P. Belin et al. Voice-selective areas in human auditory cortex. Nature (2000)
  • I. Bizzozero et al. Who is who: Italian norms for visual recognition and identification of celebrities. Neurological Sciences (2005)
  • I. Bizzozero et al. What do you know about Ho Chi Minh? Italian norms of proper name comprehension. Neurological Sciences (2007)
  • J. Boudouresques et al. Agnosia for faces: Evidence of functional disorganization of a certain type of recognition of objects in the physical world. Bulletin de l'Académie Nationale de Médecine (1979)
  • S. Brédart et al. Recalling semantic information about personally known faces and voices. European Journal of Cognitive Psychology (2009)
  • S. Brédart et al. An interactive activation model of face naming. Quarterly Journal of Experimental Psychology (1995)
  • V. Bruce et al. Understanding face recognition. British Journal of Psychology (1986)
  • A.M. Burton et al. Understanding face recognition with an interactive activation model. British Journal of Psychology (1990)
  • T. Busigny et al. Right anterior temporal lobe atrophy and person-based semantic defect: A detailed case study. Neurocase (2009)
  • C.R. Butler et al. The neural correlates of verbal and nonverbal semantic processing deficits in neurodegenerative disease. Cognitive and Behavioral Neurology (2009)
  • D. Chan et al. The clinical profile of right temporal lobe atrophy. Brain (2009)
  • L. Damjanovic et al. Recalling episodic and semantic information about famous faces and voices. Memory & Cognition (2007)