Elsevier

Behavioural Processes

Volume 73, Issue 3, 1 November 2006, Pages 348-359

Learning of auditory equivalence classes for vowels by rats

https://doi.org/10.1016/j.beproc.2006.08.005

Abstract

Four male Long-Evans rats were trained to discriminate between synthetic vowel sounds using a GO/NOGO response choice task. The vowels were characterized by an increase in fundamental frequency correlated with an upward shift in formant frequencies. In an initial phase we trained the subjects to discriminate between two vowel categories using two exemplars from each category. In a subsequent phase the ability of the rats to generalize the discrimination between the two categories was tested. To test whether rats might exploit the fact that attributes of training stimuli covaried, we used non-standard stimuli with a reversed relation between fundamental frequency and formants. The overall results demonstrate that rats are able to generalize the discrimination to new instances of the same vowels. We present evidence that the performance of the subjects depended on the relation between fundamental and formant frequencies that they had previously been exposed to. Simple simulation results with artificial neural networks could reproduce most of the behavioral results and support the hypothesis that equivalence classes for vowels are associated with an experience-driven process based on general properties of peripheral auditory coding mixed with elementary learning mechanisms. These results suggest that rats use spectral and temporal cues similarly to humans despite differences in basic auditory capabilities.

Introduction

Recognition and categorization of meaningful sounds are typical features of all behaving mammals. In addition to being able to make fine acoustic discriminations, an individual must also be able to determine, depending on the context, which cues carry information and which are irrelevant.

In human speech there is no one-to-one correspondence between a given physical acoustic pattern and a functionally meaningful element of information such as a vowel. Vowels are physically characterized by the first and second formant frequencies (F1 and F2, see, e.g., Peterson and Barney, 1952). However, vowels spoken by different individuals have different distributions in the F1 versus F2 plane, and these distributions often overlap. This has led speech theorists to postulate that a listener interprets the formant frequency patterns by using additional speaker-specific cues, a process referred to as speaker normalization (Joos, 1948, Ladefoged, 1967). It is generally accepted that both intrinsic and extrinsic factors play important roles in vowel normalization (Ainsworth, 1975, Nearey, 1989). Intrinsic factors, including the fundamental frequency (f0) and higher-order formants, appear necessary for identifying isolated vowels. Miller (1989) has proposed an auditory-perceptual theory of vowel perception in which the speaker's voice pitch serves as a reference for vowel normalization. Hirahara and Kato (1992) have shown that the perception of vowel quality did not change when all formant frequencies were shifted in the same direction as f0, but could change if this rule was not applied.

Human listeners assign a phonemic label to an acoustic pattern based on categories of speech, referred to as equivalence classes, but it is still unclear whether equivalence classes are innate or associated with an experience-driven process. One of the classic notions in speech studies, categorical perception, is associated with the ability to parcel out the perceptual space with sharp boundaries between speech sounds. Although some authors have proposed that these boundaries reflect innate non-linearities of the auditory system (Kuhl and Miller, 1978), many have put forward views which give greater weight to learning and general perceptual processes (Schouten, 1987, Schouten and van Hessen, 1992, Kluender, 1994, Kluender et al., 1998, Holt et al., 2001). Indeed, different languages partition the vowel space in different ways, which supports the idea that linguistic experience influences categorization. By 6 months of age, infants can reliably categorize vowel sounds even when there is considerable overlap between stimuli from each vowel category, and this ability is specific to the language that these infants have been exposed to (Kuhl et al., 1992, Kuhl, 1983).

The study of speech-sound perception by non-human animals is necessary to determine whether the processing of these stimuli rests upon general properties of the auditory system and of learning processes, or on unique human capabilities. Studies on animal responses to speech sounds have been carried out in cats (Dewson, 1964), chinchillas (Burdick and Miller, 1975), monkeys (Kuhl, 1991), budgerigars (Dooling and Brown, 1990), blackbirds and pigeons (Heinz et al., 1981), quails (Kluender et al., 1987), starlings (Kluender et al., 1998) and, recently, rats as well (Reed et al., 2003, Toro et al., 2003). Psychophysical tests of discrimination of vowel formant frequency by non-human species indicate that animal performance is only slightly below that of humans (Sinnot and Kreiter, 1991, Sommers et al., 1992, Heinz et al., 1996). These types of studies provide a behavioral counterpart to electrophysiological work aimed at determining how complex sounds (Nelken et al., 1999, Villa et al., 1999b) and vowels (Sachs and Young, 1979, Blackburn and Sachs, 1990, Recio et al., 2002) are encoded in the auditory system. A current hypothesis (Conley and Keilson, 1995, May et al., 1996) is that humans as well as non-human species discriminate vowel formant changes using a profile analysis based on the mean discharge rate of auditory nerve fibers (Green, 1988).

If animal species can be shown to discriminate and label speech sounds like humans, this would support the hypothesis that the underlying neural processes are similar and thus not unique to humans. There are many similarities between the anatomy and physiology of the auditory system of humans and rats. The rat is a common experimental subject, providing an opportunity to investigate basic neural mechanisms using techniques which are not available in humans. Although there is interest in the rat's auditory capabilities (Syka et al., 1996, Talwar and Gerstein, 1999), until recently few attempts had been made to test this animal in behavioral tasks involving anything other than simple auditory stimuli (Villa et al., 1999a). In a test of absolute pitch perception, rats perform like humans, in contrast to several avian species (Weisman et al., 2004).

The present study is aimed at investigating, firstly, whether discrimination of vowel categories could be learned by rats and, secondly, whether equivalence classes were based only on the vowel formants or on both formants and fundamental frequency. We tested two vowel categories: /ɛ/, as in “head”, and /ɔ/, as in “hawed”. The test stimuli were all synthetic vowels whose fundamental and formant frequencies were varied in a systematic way (with formants and f0s shifted in the same direction) to cover the range of variation recorded in human speech. After initial training with a small set of exemplars from each vowel category, we tested the ability of the rats to generalize to a larger set. Following an additional period of training with the full set of vowels, the rats were again tested with a set of vowels for which the f0–formant relation was reversed. A hypothesis about the perceptual organization of stimulus attributes is proposed with the help of a computational model.
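The distinction between "standard" stimuli (formants and f0 shifted in the same direction) and stimuli with a reversed f0–formant relation can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the base formants are approximate adult-male averages commonly quoted from Peterson and Barney (1952), the base f0 and the shift sizes are placeholders, and the names `make_stimulus`, `BASE_FORMANTS` and `BASE_F0` are hypothetical. The parameter values actually used in the study are not given in this excerpt.

```python
from dataclasses import dataclass

# Placeholder base values in Hz (assumed, not taken from the study).
# "eh" stands for the vowel in "head", "aw" for the vowel in "hawed".
BASE_FORMANTS = {
    "eh": (530.0, 1840.0, 2480.0),
    "aw": (570.0, 840.0, 2410.0),
}
BASE_F0 = 130.0  # assumed base fundamental frequency


@dataclass(frozen=True)
class Stimulus:
    vowel: str
    f0: float
    formants: tuple
    standard: bool


def make_stimulus(vowel, f0_shift, formant_shift, standard=True):
    """Build one stimulus description.

    f0_shift and formant_shift are fractional changes (0.1 means +10%).
    For a standard stimulus the formants move in the same direction as f0;
    for a reversed stimulus the sign of the formant shift is flipped.
    """
    signed_formant_shift = formant_shift if standard else -formant_shift
    f0 = BASE_F0 * (1.0 + f0_shift)
    formants = tuple(f * (1.0 + signed_formant_shift) for f in BASE_FORMANTS[vowel])
    return Stimulus(vowel, f0, formants, standard)


if __name__ == "__main__":
    for standard in (True, False):
        print(make_stimulus("eh", 0.5, 0.1, standard=standard))
```

With these placeholder shifts, both stimuli share the same raised f0, but the reversed stimulus has formants lowered instead of raised, which is the kind of contrast the reversed-relation test relies on.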

Subjects

Subjects were four male Long-Evans rats (Rattus norvegicus), aged 6 months at the beginning of the experiment. The rats were housed individually with free access to water and a restricted food supply. The rats were rewarded with sunflower seeds during the experimental sessions and were given supplemental pellets at the end of each session so as to maintain their body weight at no less than 90% of the ad libitum weight. All experimental procedures were carried out in accordance with the European Communities

Experiments

In order to enhance the attention of the subjects to the structure of the vowels, the standard stimuli were slightly modified after the fourth session: the first and third formants were attenuated by 20 dB, leaving the second formant unchanged. After eight sessions with these stimuli, performance was consistently above 80% correct. During the subsequent nine sessions the amplitudes of the first and third formants were raised in steps of 2–3 dB at each session, until the stimuli reached the standard
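The arithmetic of this attenuation schedule can be checked with a short sketch. A 20 dB attenuation corresponds to an amplitude ratio of 0.1, and nine steps of 2–3 dB can add up to exactly 20 dB. The particular split used below (seven steps of 2 dB and two of 3 dB) is an assumption chosen only so that the steps sum to 20 dB; the text does not give the actual per-session values, and the function names are hypothetical.

```python
def db_to_amplitude_ratio(db):
    """Convert a level change in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


def restoration_schedule(start_db=-20.0, steps_db=(2, 2, 2, 2, 2, 2, 2, 3, 3)):
    """Return the formant level (dB relative to standard) after each session.

    steps_db is an assumed split of 2-3 dB increments; the level is capped
    at 0 dB, i.e. the standard amplitude.
    """
    levels = []
    level = start_db
    for step in steps_db:
        level = min(0.0, level + step)
        levels.append(level)
    return levels


if __name__ == "__main__":
    for session, level in enumerate(restoration_schedule(), start=1):
        ratio = db_to_amplitude_ratio(level)
        print(f"session {session}: {level:+.0f} dB, amplitude ratio {ratio:.3f}")
```

Under this assumed split, the first and third formants return to the standard level at the ninth session, consistent with the nine-session ramp described above.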

General discussion

In the present study we report experimental observations that demonstrate the ability of rats to discriminate between categories of human vowels. Initially, the subjects were exposed to two exemplars of each of the vowel categories /ɛ/ and /ɔ/, one category being associated with a GO response and the other with a NOGO response in a response choice task. In a subsequent phase the subjects were exposed to seven new exemplars of each vowel category. These stimuli were “standard” because of the positive correlation between formants and

References (68)

  • L.L. Bloomfield et al. Open-ended categorization of chick-a-dee calls by black-capped chickadees (Poecile atricapilla). J. Comp. Psychol. (2003)
  • R.F. Braaten. Multiple levels of representation of song by European starlings (Sturnus vulgaris): open-ended categorization of starling song types and differential forgetting of song categories and exemplars. J. Comp. Psychol. (2000)
  • C.K. Burdick et al. Speech perception by the chinchilla: discrimination of sustained /a/ and /i/. J. Acoust. Soc. Am. (1975)
  • P.A. Cariani et al. Neural correlates of the pitch of complex tones. I. Pitch and pitch salience. J. Neurophysiol. (1996)
  • A.R. Chase. Music discrimination by carp (Cyprinus carpio). Anim. Learn. Behav. (2001)
  • P.J. Christie et al. Pitch shifts and song structure indicate male quality in the dawn chorus of black-capped chickadees. Behav. Ecol. Sociobiol. (2004)
  • R.A. Conley et al. Rate representation and discriminability of second formant frequencies for /ɛ/-like steady-state vowels in cat auditory nerve. J. Acoust. Soc. Am. (1995)
  • J.H. Dewson. Speech sound discrimination by cats. Science (1964)
  • R.J. Dooling et al. Speech perception by budgerigars (Melopsittacus undulatus): spoken vowels. Percept. Psychophys. (1990)
  • A. El-Barbary. Auditory nerve of the normal and jaundiced rat. II. Frequency selectivity and two-tone rate suppression. Hear. Res. (1991)
  • R.R. Fay. Hearing in Vertebrates: A Psychophysics Databook. (1988)
  • D.M. Green. Profile Analysis. (1988)
  • H.E. Heffner et al. Perception of the missing fundamental by cats. J. Acoust. Soc. Am. (1976)
  • R.D. Heinz et al. Vowel discrimination in cats: threshold for the detection of second formant changes in the vowel /ɛ/. J. Acoust. Soc. Am. (1996)
  • R.D. Heinz et al. Discrimination of steady-state vowels by blackbirds and pigeons. J. Acoust. Soc. Am. (1981)
  • T. Hirahara et al. The effect of F0 on vowel identification.
  • L.L. Holt et al. Influence of fundamental frequency on stop-consonant voicing perception: a case of learned covariance or auditory enhancement? J. Acoust. Soc. Am. (2001)
  • M. Joos. Acoustic phonetics. Linguistic Society of America (Ed.), Language Monograph No. 23 (1948)
  • J.B. Kelly. Rat auditory cortex.
  • D.H. Klatt. Software for a cascade/parallel formant synthesizer. J. Acoust. Soc. Am. (1980)
  • K.R. Kluender. Speech perception as a tractable problem.
  • K.R. Kluender et al. Japanese quail can learn phonetic categories. Science (1987)
  • K.R. Kluender et al. Effects of first formant onset frequency on [-voice] judgments result from auditory processes not specific to humans. J. Acoust. Soc. Am. (1994)
  • K.R. Kluender et al. Role of experience for language-specific functional mappings of vowel sounds. J. Acoust. Soc. Am. (1998)