Learning of auditory equivalence classes for vowels by rats
Introduction
Recognition and categorization of meaningful sounds are typical features of all behaving mammals. In addition to being able to make fine acoustic discriminations, an individual must also be able to determine, depending on the context, which cues carry information and which are irrelevant.
In human speech there is no one-to-one correspondence between a given physical acoustic pattern and a functionally meaningful element of information such as a vowel. Vowels are physically characterized by the first and second formant frequencies (F1 and F2, see, e.g., Peterson and Barney, 1952). However, vowels spoken by different individuals have different distributions in the F1 versus F2 plane, and these distributions often overlap. This has led speech theorists to postulate that a listener interprets the formant frequency patterns by using additional speaker-specific cues, a process referred to as speaker normalization (Joos, 1948, Ladefoged, 1967). It is generally accepted that both intrinsic and extrinsic factors play important roles in vowel normalization (Ainsworth, 1975, Nearey, 1989). Intrinsic factors, including the fundamental frequency (f0) and higher-order formants, appear necessary for identifying isolated vowels. Miller (1989) proposed an auditory-perceptual theory of vowel perception in which the speaker's voice pitch serves as a reference for vowel normalization. Hirahara and Kato (1992) showed that perceived vowel quality did not change when all formant frequencies were shifted in the same direction as f0, but could change when the formants and f0 were not shifted together.
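The idea of using voice pitch as a reference can be illustrated with a deliberately simplified sketch. It is not Miller's (1989) model: it merely assumes that each formant is expressed as a log ratio to f0, and the frequency values and the function name `pitch_referenced_features` are hypothetical. Under that assumption, scaling f0 and all formants by a common factor leaves the features unchanged, whereas moving f0 in the opposite direction changes them.

```python
import math


def pitch_referenced_features(f0_hz, formants_hz):
    """Express each formant as log2(F_i / f0), i.e. in octaves above f0.

    Simplified illustration only; not the normalization used in any
    particular published model.
    """
    return [math.log2(f / f0_hz) for f in formants_hz]


# Hypothetical values, chosen only to make the arithmetic concrete.
F0 = 130.0
FORMANTS = [530.0, 1840.0]
K = 1.2  # common scaling factor (assumption)

base = pitch_referenced_features(F0, FORMANTS)

# f0 and formants shifted in the same direction by the same factor.
same_direction = pitch_referenced_features(F0 * K, [f * K for f in FORMANTS])

# Formants shifted up while f0 is shifted down.
opposite = pitch_referenced_features(F0 / K, [f * K for f in FORMANTS])

print(base)
print(same_direction)  # matches `base` up to floating-point rounding
print(opposite)        # every feature is larger than in `base`
```

The invariance follows directly from log(kF / kf0) = log(F / f0); whether listeners, human or rat, rely on such a ratio is an empirical question that the sketch does not address.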
Human listeners assign a phonemic label to an acoustic pattern based on categories of speech, referred to as equivalence classes, but it is still unclear whether equivalence classes are innate or arise from an experience-driven process. One of the classic notions in speech studies, categorical perception, is associated with the ability to parcel out the perceptual space with sharp boundaries between speech sounds. Although some authors have proposed that these boundaries reflect innate non-linearities of the auditory system (Kuhl and Miller, 1978), many have put forward views which give greater weight to learning and general perceptual processes (Schouten, 1987, Schouten and van Hessen, 1992, Kluender, 1994, Kluender et al., 1998, Holt et al., 2001). Indeed, different languages partition the vowel space in different ways, which supports the idea that linguistic experience influences categorization. Already at 6 months of age, infants can reliably categorize vowel sounds even when there is considerable overlap between stimuli from each vowel category, and this ability is specific to the language to which the infants have been exposed (Kuhl et al., 1992, Kuhl, 1983).
The study of speech sound perception by non-human animals is necessary to determine whether the processing of these stimuli rests upon general properties of the auditory system and of learning processes, or on unique human capabilities. Studies on animal responses to speech sounds have been carried out in cats (Dewson, 1964), chinchillas (Burdick and Miller, 1975), monkeys (Kuhl, 1991), budgerigars (Dooling and Brown, 1990), blackbirds and pigeons (Heinz et al., 1981), quail (Kluender et al., 1987), starlings (Kluender et al., 1998) and, recently, rats as well (Reed et al., 2003, Toro et al., 2003). Psychophysical tests of vowel formant frequency discrimination by non-human species indicate that animal performance is only slightly below that of humans (Sinnot and Kreiter, 1991, Sommers et al., 1992, Heinz et al., 1996). These types of studies provide a behavioral counterpart to electrophysiological work aimed at determining how complex sounds (Nelken et al., 1999, Villa et al., 1999b) and vowels (Sachs and Young, 1979, Blackburn and Sachs, 1990, Recio et al., 2002) are encoded in the auditory system. A current hypothesis (Conley and Keilson, 1995, May et al., 1996) is that humans as well as non-human species discriminate vowel formant changes using a profile analysis based on the mean discharge rate of auditory nerve fibers (Green, 1988).
If animal species can be shown to discriminate and label speech sounds like humans, this would support the hypothesis that the underlying neural processes are similar and thus not unique to humans. There are many similarities between the anatomy and physiology of the auditory system of humans and rats. The rat is a common experimental subject providing an opportunity to investigate basic neural mechanisms using techniques which are not available in humans. Although there is interest in the rat's auditory capabilities (Syka et al., 1996, Talwar and Gerstein, 1999), few attempts have been made until recently to test this animal in behavioral tasks involving other than simple auditory stimuli (Villa et al., 1999a). In a test of absolute pitch perception rats perform like humans, in contrast to several avian species (Weisman et al., 2004).
The present study is aimed at investigating, firstly, whether discrimination of vowel categories could be learned by rats and, secondly, whether equivalence classes were based only on the vowel formants or on both formants and fundamental frequency. We tested two vowel categories: /ɛ/, as in “head”, and /ɔ/, as in “hawed”. The test stimuli were all synthetic vowels whose fundamental and formant frequencies were varied in a systematic way (with formants and f0s shifted in the same direction) to cover the range of variation recorded in human speech. After initial training with a small set of exemplars from each vowel category, we tested the ability of the rats to generalize to a larger set. Following an additional period of training with the full set of vowels, the rats were again tested with a set of vowels for which the f0–formant relation was reversed. A hypothesis about the perceptual organization of stimulus attributes is proposed with the help of a computational model.
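The two kinds of stimulus sets described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the synthesis procedure used in the study: the base frequencies are placeholder values of roughly the order found in adult male speech, the scale factors are hypothetical, a single multiplicative factor is assumed for "shifted in the same direction", and the reversed relation is modelled by applying the reciprocal factor to f0. The names `Vowel`, `standard_set` and `reversed_set` are invented for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Vowel:
    label: str
    f0: float                    # fundamental frequency in Hz
    formants: Tuple[float, ...]  # (F1, F2, F3) in Hz


# Placeholder base exemplars (assumption, not the study's stimuli).
# "eh" stands for the vowel in "head", "aw" for the vowel in "hawed".
BASE = {
    "eh": Vowel("eh", 130.0, (530.0, 1840.0, 2480.0)),
    "aw": Vowel("aw", 130.0, (570.0, 840.0, 2410.0)),
}

SCALES = (0.9, 1.0, 1.1)  # hypothetical scale factors


def shift(v, formant_scale, f0_scale):
    """Return a copy of `v` with formants and f0 scaled independently."""
    scaled = tuple(f * formant_scale for f in v.formants)
    return Vowel(v.label, v.f0 * f0_scale, scaled)


def standard_set(v, scales):
    """Formants and f0 move in the same direction (same factor assumed)."""
    return [shift(v, s, s) for s in scales]


def reversed_set(v, scales):
    """Formants move one way, f0 the other (reciprocal factor assumed)."""
    return [shift(v, s, 1.0 / s) for s in scales]


if __name__ == "__main__":
    for v in standard_set(BASE["eh"], SCALES) + reversed_set(BASE["eh"], SCALES):
        print(v)
```

A subject that categorizes on formants alone should treat both sets alike, while one that also uses f0 should respond differently to the reversed set; the sketch only makes that contrast concrete.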
Section snippets
Subjects
Subjects were four male Long-Evans rats (Rattus norvegicus), aged 6 months at the beginning of the experiment. The rats were housed individually with free access to water and a restricted food supply. The rats were rewarded with sunflower seeds during the experimental sessions and were given supplemental pellets at the end of each session so as to maintain their body weight at no less than 90% of the ad libitum weight. All experimental procedures were carried out in accordance with the European Communities
Experiments
In order to enhance the attention of the subjects to the structure of the vowels, the standard stimuli were slightly modified after the fourth session: the first and third formants were attenuated by 20 dB, leaving the second formant unchanged. After eight sessions with these stimuli, performance was consistently above 80% correct. During the subsequent nine sessions the amplitudes of the first and third formants were raised in steps of 2–3 dB at each session, until the stimuli reached the standard
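The attenuation figures can be made concrete with a short sketch. It converts decibels to amplitude ratios and walks through one possible step sequence. The per-session steps below are hypothetical: the text gives only the 2–3 dB range and the nine sessions, and the sketch assumes that the attenuation returns to 0 dB by the last session.

```python
def db_to_amplitude_ratio(db):
    """Convert a level change in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


def attenuation_schedule(start_db, steps_db):
    """Return the formant level (dB re. standard) after each session."""
    levels = []
    level = start_db
    for step in steps_db:
        level = min(0.0, level + step)  # never exceed the standard level
        levels.append(level)
    return levels


# Hypothetical steps within the stated 2-3 dB range (assumption).
HYPOTHETICAL_STEPS_DB = [2, 2, 2, 2, 2, 2, 2, 3, 3]

levels = attenuation_schedule(-20.0, HYPOTHETICAL_STEPS_DB)

if __name__ == "__main__":
    for session, level in enumerate(levels, start=1):
        ratio = db_to_amplitude_ratio(level)
        print(f"session {session}: {level:+.0f} dB, amplitude x{ratio:.3f}")
```

An attenuation of 20 dB corresponds to one tenth of the original amplitude, so the first and third formants start well below the second formant and are restored gradually.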
General discussion
In the present study we report experimental observations that demonstrate the ability of rats to discriminate between categories of human vowels. Initially, the subjects were exposed to two exemplars of vowel categories /ɛ/ and /ɔ/, one being associated with a GO and the other with a NOGO response choice task. In a subsequent phase the subjects were exposed to seven new exemplars of each vowel category. These stimuli were “standard” because of the positive correlation between formants and
References (68)
- et al.
Song perception in the song sparrow: birds classify by song type but not by singer
Anim. Behav.
(1994)
- et al.
Audiogram of the hooded Norway rat
Hear. Res.
(1994)
Levels of stimulus control: a functional approach
Cognition
(1990)
Perception of auditory equivalence classes for speech in early infancy
Infant Behav. Dev.
(1983)
- et al.
Auditory frequency and intensity discrimination in pigmented rats
Hear. Res.
(1996)
- et al.
A novel go/nogo conflict paradigm in rats suggests an interaction between stimulus evaluation and response systems
Behav. Process.
(1999)
- et al.
A behavior analysis of absolute pitch: sex, experience, and species
Behav. Process.
(2004)
Intrinsic and extrinsic factors in vowel judgements
Neural Networks for Pattern Recognition
(1995)
- et al.
The representation of the steady-state vowel /ɛ/ in the discharge patterns of cat anteroventral cochlear nucleus neurons
J. Neurophysiol.
(1990)
Open-ended categorization of chick-a-dee calls by black-capped chickadees (Poecile atricapilla)
J. Comp. Psychol.
Multiple levels of representation of song by European starlings (Sturnus vulgaris): open-ended categorization of starling song types and differential forgetting of song categories and exemplars
J. Comp. Psychol.
Speech perception by the chinchilla: discrimination of sustained /a/ and /i/
J. Acoust. Soc. Am.
Neural correlates of the pitch of complex tones. I. Pitch and pitch salience
J. Neurophysiol.
Music discrimination by carp (Cyprinus carpio)
Anim. Learn. Behav.
Pitch shifts and song structure indicate male quality in the dawn chorus of black-capped chickadees
Behav. Ecol. Sociobiol.
Rate representation and discriminability of second formant frequencies for /ɛ/-like steady-state vowels in cat auditory nerve
J. Acoust. Soc. Am.
Speech sound discrimination by cats
Science
Speech perception by budgerigars (Melopsittacus undulatus): spoken vowels
Percept. Psychophys.
Auditory nerve of the normal and jaundiced rat. II. Frequency selectivity and two-tone rate suppression
Hear. Res.
Hearing in Vertebrates: A Psychophysics Databook
Profile Analysis
Perception of the missing fundamental by cats
J. Acoust. Soc. Am.
Vowel discrimination in cats: threshold for the detection of second formant changes in the vowel /ɛ/
J. Acoust. Soc. Am.
Discrimination of steady-state vowels by blackbirds and pigeons
J. Acoust. Soc. Am.
The effect of F0 on vowel identification
Influence of fundamental frequency on stop-consonant voicing perception: a case of learned covariance or auditory enhancement?
J. Acoust. Soc. Am.
Acoustic phonetics
Linguistic Society of America (Ed.), Language Monograph No. 23
Rat auditory cortex
Software for a cascade/parallel formant synthesizer
J. Acoust. Soc. Am.
Speech perception as a tractable problem
Japanese quail can learn phonetic categories
Science
Effects of first formant onset frequency on [-voice] judgments result from auditory processes not specific to humans
J. Acoust. Soc. Am.
Role of experience for language-specific functional mappings of vowel sounds
J. Acoust. Soc. Am.
Cited by (26)
Expressive power of first-order recurrent neural networks determined by their attractor dynamics
2016, Journal of Computer and System Sciences
Frequency-based organization of speech sequences in a nonhuman animal
2016, Cognition
Citation Excerpt: Fundamental frequency changes were the same as those used by Gervain and Werker (2013), which allowed for more direct comparisons across species, but were less marked than those used by de la Mora et al. (2013). However, they are within the auditory limits that have been used with speech stimuli (e.g. Eriksson & Villa, 2006). Test stimuli were the same as in Experiment 1, thus pitting frequent–initial items (AXBY) against infrequent–initial items (XABY).
Biological context of Hebb learning in artificial neural networks, a review
2015, Neurocomputing
Citation Excerpt: There are numerous examples of reproducible behaviors in biological neurons. In behavioral, psychological and biological experiments, one can manipulate newly perceived patterns of a visual scene [53], complex sounds of speech-like utterances [67] and other sensory stimuli. Even perceptual illusions are used in experimental manipulations [68] and their neuronal mechanisms established.
Rodent Models of Speech Sound Processing
2015, Neurobiology of Language
Rule learning over consonants and vowels in a non-human animal
2013, Cognition
Citation Excerpt: These commonalities extend beyond the perception of simple sounds and extend to the processing of complex speech stimuli (Yip, 2006). For example, it has been observed that rats are able to discriminate across vowel categories (Eriksson & Villa, 2006), among consonants using the affricate-fricative continuum (Reed, Howell, Sackin, Pizzimenti, & Rosen, 2003), and can organize consonant sounds around categories using their frequency distribution (Pons, 2006). That is, there are consistent behavioral indicators that rodents can process consonants and vowels.
Evolving spiking neural networks for audiovisual information processing
2010, Neural Networks
Citation Excerpt: The outputs of the inner hair cells and auditory nerve fibers are properly represented with trains of spikes. This model has been used in Eriksson and Villa (2006b) to simulate the learning of synthetic vowels by rats reported in Eriksson and Villa (2006a). In this latter work, based on experimental measurements, besides proving that rats are able to discriminate and generalize instances of the same vowel, it is further suggested that, similar to humans, rats use spectral and temporal cues for sound recognition.