Introduction

Most auditory signals in the natural environment are masked by background noise. This noise is often temporally structured and shows correlated amplitude fluctuations over certain frequency ranges (Nelken et al. 1999). It has been shown in psychoacoustical experiments that the auditory system of vertebrates is able to use these correlated, i.e., comodulated, amplitude fluctuations to improve signal detection (e.g., Hall et al. 1984; Klump and Langemann 1995; Klump et al. 2001; McFadden 1986; Weik et al. 2005). This effect has been termed comodulation masking release (CMR) and has been attributed to auditory processing within one auditory channel (relying on within-channel cues) and/or across several auditory channels (relying on across-channel cues).

CMR can be studied in a flanking-band (FB) paradigm employing two narrow bands of noise as maskers (e.g., Schooneveldt and Moore 1987). One masker (the on-frequency masker, OFM) is always centered at the signal frequency, while the second masker (FB) can be positioned either within the same auditory channel as the signal to study the influence of within-channel cues or in a separate auditory channel to measure the amount of CMR due to across-channel cues (this is often referred to as “true CMR”). Schooneveldt and Moore (1987), for example, showed in humans that CMR tends to be greatest at FB center frequencies close to the signal frequency (i.e., at a small frequency separation between OFM and FB and, therefore, still within the same auditory filter), and this increase might be caused by the exploitation of changes in the temporal cues (i.e., the modulation pattern due to the beating between both maskers) when a signal is added.

Because human auditory filters are quite narrow and the equivalent rectangular bandwidth of the filter amounts to roughly 10% of the signal frequency (e.g., Moore and Glasberg 1983; Oxenham and Shera 2003), it is not possible to study the effects of within-channel cues for a large range of frequency separations (Berg 1996, however, proposed a model using one broad frequency filter including the OFM and FB to explain CMR in humans). Furthermore, the human filter bandwidth is similar to the bandwidth at which the envelope fluctuation becomes too fast to be of use for the auditory system (for a limit of modulation detection in humans see Viemeister 1979), i.e., the ability to analyze the temporal structure of the masker deteriorates at the same FB frequencies as the frequencies at which the border of the auditory filter is reached. This makes it difficult to clearly separate the effects of the masker spectrum from the effects of inherent temporal fluctuations of the maskers on the detection thresholds.

In the Naval Medical Research Institute (NMRI) strain of the house mouse, the auditory filter at 10 kHz—a frequency within the best hearing range of the subject—is relatively wide and amounts to at least 3.4 kHz (Ehret 1976; Weik et al. 2005). The large width of the filter offers the possibility to clearly separate conditions employing within-channel cues from conditions employing across-channel cues (i.e., the spectral and temporal effects on the threshold). Furthermore, by placing the FB at several positions within the same auditory filter, the amount of within-channel CMR that is due to the varying temporal cues resulting from different frequency separations between FB and OFM can be measured. Coupled with the possibility of integrating behavioral and neurophysiological data from the same species, we therefore think that the mouse is a suitable model for CMR studies, especially for studying spectral and temporal aspects of CMR independently.

Materials and methods

Subjects

Masked thresholds were obtained for 14 adult house mice (Mus musculus) of the NMRI strain (bred by Charles River Laboratories, Sulzfeld, Germany), a mouse strain with no abnormalities in the ear or the behavior and showing only moderate age-related hearing loss up to 18 months of age in the tested frequency range (see Ehret 1974). The subjects (seven females and seven males) were between 2 and 18 months old during the total period of testing, with the majority of the mice being between 5 and 13 months of age. Their hearing threshold for the 10-kHz signals at the end of the experiments was on average 21 dB SPL, which is only moderately higher than the average threshold of the subjects at the beginning of testing (7 dB SPL). The mean individual hearing loss from the beginning to the end of the experiments was 12 dB SPL. The subjects were housed in individual cages (Eurostandard Type III H, 43 × 27 × 19 cm; Tecniplast) with a hiding possibility (“mouse house”, Tecniplast) and a layer of wood shavings as bedding material (Raiffeisen). All cages were stored in a ventilated cage rack (Slim Line Sealsafe, Tecniplast). The animals had unrestricted access to water and were mildly food-deprived (their weight ranged from 28.2 to 46.9 g, which is above the mean weight at reaching maturity). The food rewards during the experiments consisted of 20-mg pellets (Bioserve Dustless Precision Pellets, Formula FO163), and additional rodent pellets (Altromin Type 1314) were given after the experiments to keep the animals’ weight about constant. Animals were moved from their cages to the experimental cage using a small transfer cage.

Apparatus

The animals were tested in sound-attenuating chambers (Industrial Acoustics type IAC 403 A: inner dimensions 224 × 214 × 199 cm or a custom-built chamber: inner dimensions 67 × 108 × 91 cm) lined with two to three layers of sound absorbing wedges (Illbruck Waffel type 70/125, mounted on Illbruck Plano type 50/0 SF or Illbruck Illsonic Pyramide 100/100, mounted on one or two layers of Illbruck Plano). The wedges had an absorption coefficient of more than 0.99 for frequencies above 500 Hz.

The custom-made experimental cage was shaped like a doughnut (outer diameter 22 cm, inner diameter 9 cm, height 14 cm; made from stainless steel wire mesh) and was located in the middle of the chamber on a rack constructed of thin metal bars (IAC chamber) or a wire construction lifting the cage above the sound absorbing wedges (custom-built chamber). The cage contained a small feeding dish with a nearby feeder light as a secondary reinforcer and a pedestal with a light-interrupting switch. A nearby pedestal light was used to provide feedback to the animals during testing. A custom-built feeder mounted at a distance of at least 30 cm was connected to the feeder dish by a flexible tube and dispensed the reward pellets. A loudspeaker (Canton Plus XS) was positioned a minimum of 30 cm above the pedestal at which the mouse sat in the experimental cage.

Stimulus generation

Masked thresholds were obtained for a 10-kHz pure tone signal (duration 800 ms, cosine rise/fall times of 10 ms) presented in continuous narrow-band noise. The noise in each experiment consisted of two 25-Hz-wide noise bands centered at various frequencies. The OFM was always centered at the signal frequency (i.e., 10 kHz). In the standard conditions, in which the general amount of CMR due to within- and across-channel processing was investigated in all subjects, the FB was centered at frequencies of either 5, 9, 10, 11, or 15 kHz. To test whether the frequency separation between both masker bands affects the size of within-channel CMR (due to changes in temporal cues), additional FB frequencies of 9.9 and 10.1 kHz were tested in a subgroup of the subjects (three individuals). The envelope of the FB was either uncorrelated or correlated with that of the OFM. In the 10-kHz FB condition, both bands were presented simultaneously and centered at 10 kHz, and the SPL of the resulting noise—compared to a single OFM—was on average 3 dB higher in the uncorrelated condition (reference condition) and 6 dB higher in the correlated condition.
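
The 3-dB and 6-dB level increases quoted for the 10-kHz FB condition follow from how two equal-power bands sum. The sketch below is only an illustration of that arithmetic: it assumes the uncorrelated bands are statistically independent and the correlated bands are identical, in-phase waveforms, and the function name and the `rho` parameterisation are hypothetical.

```python
import math


def level_increase_db(rho):
    """Level gain (dB) when a second, equal-power band is added to the first.

    rho is the correlation coefficient between the two waveforms (an assumed
    parameterisation): 0 for independent bands, 1 for identical in-phase bands.
    """
    # The power of the sum of two equal-power signals is 2 * (1 + rho)
    # times the power of a single band.
    return 10.0 * math.log10(2.0 * (1.0 + rho))


print(round(level_increase_db(0.0), 1))  # independent bands: 3.0
print(round(level_increase_db(1.0), 1))  # identical bands: 6.0
```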

The 10-kHz pure tone signal was generated with a Linux workstation with a standard sound card (Sound Blaster Model PCI 128, 44.1 kHz sampling rate; Creative Technology) and passed through a programmable attenuator (PA4, Tucker-Davis Technologies) to mute the signal completely between trials if necessary. The output sound pressure level of the signal was adjusted by a manual attenuator (Hewlett Packard 350D). The signal was amplified by a Rotel RMB-1066 amplifier and presented via the Canton XS loudspeaker. Sound pressure levels in the experimental setup were calibrated once per day with a sound level meter (Model 2238 mediator with Model 4188 microphone; Bruel & Kjaer) located at the position where the head of the animal would be during the experiment.

The continuous narrow-band noise was generated using a set of two real-time processors (RP2.1, Tucker-Davis Technologies). Each 25-Hz-wide noise band was produced by multiplying a continuously generated 12.5 Hz low-pass noise (Gaussian white noise sampled at a rate of 6 kHz which was cut-off at the 3 dB point using an FIR filter) with a sinusoid, thus generating a 25-Hz-wide noise band centered at the signal frequency of the pure tone (resampled at a rate of 50 kHz). Using the same low-pass noise as the source, two 25-Hz-wide noise bands with correlated amplitude fluctuations could be generated (see Fig. 1A-B, E-F), while using two independent low-pass noise bands produced two uncorrelated 25-Hz-wide noise bands (see Fig. 1C-D, G-H).
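
The multiplication scheme described above can be sketched offline as follows. This is not the real-time processor code: it assumes NumPy, generates the low-pass noise directly at 50 kHz instead of resampling from 6 kHz, and swaps the FIR filter for a brick-wall FFT low-pass. All function names are hypothetical.

```python
import numpy as np


def lowpass_noise(n, fs, cutoff_hz, rng):
    """Gaussian white noise limited to 0..cutoff_hz (brick-wall FFT stand-in for the FIR filter)."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=n)


def noise_band(lowpass, center_hz, fs):
    """Shift the low-pass noise to center_hz by multiplying with a sinusoid."""
    t = np.arange(lowpass.size) / fs
    return lowpass * np.sin(2.0 * np.pi * center_hz * t)


fs = 50_000                      # output sampling rate in Hz
n = fs                           # one second of noise
rng = np.random.default_rng(0)   # fixed seed, for reproducibility only

source_a = lowpass_noise(n, fs, 12.5, rng)
source_b = lowpass_noise(n, fs, 12.5, rng)

ofm = noise_band(source_a, 10_000, fs)         # on-frequency masker
fb_corr = noise_band(source_a, 10_100, fs)     # same source: correlated envelope
fb_uncorr = noise_band(source_b, 10_100, fs)   # independent source: uncorrelated envelope

correlated_masker = ofm + fb_corr
uncorrelated_masker = ofm + fb_uncorr
```

Multiplying a 12.5-Hz low-pass noise by a carrier produces sidebands on both sides of the carrier, which is why the resulting band is 25 Hz wide.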

FIG. 1

Examples of envelope fluctuations of correlated (A-B, E-F) and uncorrelated (C-D, G-H) 25-Hz narrow-band noise maskers without (left column) and with (right column) the pure tone signal present. The signal level equaled the mean SPL needed for the signal detection in the respective type of noise (i.e., at FB = 10.1 kHz: 36.4 dB SPL (correlated) and 54.1 dB SPL (uncorrelated), at FB = 11 kHz: 54.3 dB SPL (correlated) and 58.0 dB SPL (uncorrelated); noise spectrum level 40 dB SPL). Each noise masker in this figure consisted of either the OFM and an FB at 10.1 kHz center frequency (A-D) or of the OFM and an FB at 11 kHz center frequency (E-H). At a frequency separation of 100 Hz, the envelopes in both the correlated (A-B) and, to a smaller extent, the uncorrelated (C-D) noise show regular peaks and valleys at a modulation rate of 100 Hz in addition to the slow fluctuation of the 25 Hz noise bands. At a frequency separation of 1 kHz (E-H), however, the 1-kHz beating is too fast to be resolved by the auditory system and only the inherent fluctuation of the 25 Hz noise bands can be seen. Adding the 10-kHz signal to the noise results in a partial “filling” of the envelope minima (e.g., see Fig. 1E-F) and can be used as a cue for the detection of the signal.

The output signal was analyzed with a spectrum analyzer to make sure that no artifacts occurred. The noise spectrum level was adjusted daily to 40 dB SPL using a programmable attenuator (PA4, Tucker-Davis Technologies). Both signal and noise were mixed together before being presented through the Canton loudspeaker.

Procedure

The experimental paradigm was a Go/NoGo-procedure reinforcing the subject with food rewards (for initial training procedures and more details see Klink et al. 2006). The experimental protocol was controlled by the workstation using a custom-made program. A trial started when the subject jumped onto the pedestal in the experimental cage. After a random waiting interval of between 1 and 5 s, a single test stimulus was presented. The mouse was trained to jump off the pedestal when perceiving a test stimulus (Go condition), otherwise, it had to remain on the pedestal. If the subject responded correctly to a test stimulus (i.e., scored a “hit”) within 1 s, a feeder light was switched on, a food reward was given, and the next trial started. If the subject missed a test stimulus and remained seated on the pedestal, the pedestal light was switched off for 1 s before the next trial could be initiated. Thirty percent of all trials were catch trials in which no stimulus was given (NoGo condition), and the subject had to remain on the pedestal (“correct rejection”). These trials were used to measure the false alarm rate.

Signal detection thresholds were obtained with the method of constant stimuli. A block of ten trials, consisting of three catch trials and a set of seven test trials in which the sound pressure level of the tone differed in steps of 5 dB, was repeated six times. The level of the presented stimuli was adjusted in such a way that the two most salient stimuli were clearly detectable, while the two stimuli with the lowest SPL were below threshold. The trials within each block were presented in random order, thus making it impossible for the mouse to anticipate the next stimulus. The first ten trials of each session were used as a “warm-up” period (only test stimuli with the highest sound pressure level were presented) and were discarded from the analysis.
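
The block structure of the method of constant stimuli can be sketched as below. This is an illustration only, not the custom control program: the function names and the seed are hypothetical, and the warm-up trials are not modeled.

```python
import random


def make_block(top_level_db, rng, n_levels=7, step_db=5, n_catch=3):
    """One block: seven test levels in 5-dB steps plus three catch trials, shuffled."""
    levels = [top_level_db - i * step_db for i in range(n_levels)]
    trials = [("go", level) for level in levels] + [("nogo", None)] * n_catch
    rng.shuffle(trials)  # random order within the block
    return trials


def make_session(top_level_db, n_blocks=6, seed=0):
    """Concatenate independently shuffled blocks into one trial list."""
    rng = random.Random(seed)
    return [trial for _ in range(n_blocks) for trial in make_block(top_level_db, rng)]


session = make_session(top_level_db=60.0)
catch_fraction = sum(kind == "nogo" for kind, _ in session) / len(session)
```

With three catch trials in every block of ten, the catch-trial fraction equals the 30% stated in the Procedure.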

Each subject was tested one to three times a day, with a typical session lasting on average between 20 and 50 min (depending on the mouse). Every FB condition was tested only once in each animal, and the first session within each condition was excluded from the analysis as a practice session. To reduce any possible effects of aging, all FB conditions (aside from the additional FB conditions) were presented in random order, with the correlated and uncorrelated conditions within each FB condition presented successively.

Data analysis

At the end of each session, a psychometric function was compiled summarizing the results of the last 50 trials. Sessions were excluded from the analysis if the percentage of false alarms was greater than 20% and/or if the average hit rate in the trials with the two most salient test stimuli (i.e., those with the largest sound pressure level) was less than 80%. Threshold estimates of single sessions were calculated using signal-detection theory and a threshold criterion of d′ = 1.8. For each noise condition, two to three consecutive valid 50-trial sessions which did not differ from each other in threshold by more than 3 dB were combined into a single psychometric function (i.e., this corresponded to ten to 15 repetitions of the stimulus per sound pressure level). Using a threshold criterion of d′ = 1.8, mean signal detection thresholds were calculated from the combined psychometric function. The difference between the signal detection threshold in the uncorrelated and the correlated masker condition was defined as CMR.
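
A minimal sketch of the threshold computation, under stated assumptions: d′ is taken as z(hit rate) minus z(false-alarm rate), rates of 0 or 1 are pulled in by half a trial, and the threshold is found by linear interpolation between the two levels that bracket the criterion. The text does not specify the clamping rule or the interpolation method, and all function names are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, n_go, false_alarms, n_nogo):
    """d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    # Assumed convention: keep rates away from 0 and 1 so that z() stays finite.
    hit_rate = min(max(hits / n_go, 0.5 / n_go), 1.0 - 0.5 / n_go)
    fa_rate = min(max(false_alarms / n_nogo, 0.5 / n_nogo), 1.0 - 0.5 / n_nogo)
    return z(hit_rate) - z(fa_rate)


def threshold_db(levels_db, d_primes, criterion=1.8):
    """Level at which d' first crosses the criterion (linear interpolation, assumed)."""
    points = sorted(zip(levels_db, d_primes))
    for (l0, d0), (l1, d1) in zip(points, points[1:]):
        if d0 < criterion <= d1:
            return l0 + (criterion - d0) * (l1 - l0) / (d1 - d0)
    return None  # criterion not bracketed by the tested levels


def cmr_db(threshold_uncorrelated, threshold_correlated):
    """CMR as defined in the text: uncorrelated minus correlated threshold."""
    return threshold_uncorrelated - threshold_correlated
```

For example, `cmr_db(54.1, 36.4)` applied to the 10.1-kHz levels quoted in the Fig. 1 caption gives a release of about 17.7 dB.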

Applying a peripheral model for CMR

Buschermöhle et al. (2007) proposed that CMR in psychoacoustic experiments can be explained at least partially by the compressive processing mechanisms of the human auditory periphery. We adapted this model to the mouse by using a gammatone filter with a bandwidth of 3.3 kHz that is close to the critical bandwidth measured in the NMRI mouse (Ehret 1976; Weik et al. 2005) and a compression exponent of 0.2. The waveforms of the masker alone and the signal plus the masker were used as the input to the model. In modeling, we applied a range of signal levels and 16 signal-masker phase relations across which the model results were averaged. A d′ value was calculated from the difference of the averaged compressed envelopes of the filtered waveform obtained for signal plus masker and the masker alone, respectively. The time period over which the averaged envelope values were determined corresponded to the signal duration. The calculation included a single scaling parameter for d′ (cf. Buschermöhle et al. 2007) that was used for all model results and was derived by fitting the CMR values provided by the model to the measured CMR values at all separations of OFM and FB.
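
The decision stage of such a model can be sketched roughly as follows. This is a simplified illustration, not the Buschermöhle et al. (2007) implementation: it assumes NumPy, omits the gammatone filter stage entirely (all inputs are treated as already filtered), uses the Hilbert envelope as the envelope extractor, and leaves the scaling parameter to the caller. All function names are hypothetical.

```python
import numpy as np


def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed with an FFT."""
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))


def compressed_mean(x, exponent=0.2):
    """Time average of the compressed envelope (compression exponent from the text)."""
    return float(np.mean(hilbert_envelope(x) ** exponent))


def model_d_prime(masker, fs, signal_hz, signal_amp, scale, n_phases=16, exponent=0.2):
    """Scaled difference of averaged compressed envelopes, averaged over signal phases."""
    t = np.arange(masker.size) / fs
    baseline = compressed_mean(masker, exponent)
    diffs = []
    for k in range(n_phases):
        phase = 2.0 * np.pi * k / n_phases
        tone = signal_amp * np.sin(2.0 * np.pi * signal_hz * t + phase)
        diffs.append(compressed_mean(masker + tone, exponent) - baseline)
    return scale * float(np.mean(diffs))
```

A model threshold would then be the signal level at which `model_d_prime` reaches the criterion, with `scale` fitted once across all conditions as described above.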

Results

The present FB experiment was designed to demonstrate the effects of spectral and temporal cues on the amount of CMR in the mouse. Three different frequency separations between the OFM and the FB were examined: (1) very large frequency separations of 5 kHz (i.e., for FBs centered at 5 and 15 kHz, respectively) which addressed the processing of across-channel cues (i.e., measured the “true CMR”), (2) large frequency separations of 1 kHz within a single auditory filter (i.e., for FBs centered at 9 and 11 kHz, respectively), and (3) small frequency separations of 100 Hz within a single auditory filter (i.e., for FBs centered at 9.9 and 10.1 kHz, respectively). The latter two conditions only provide within-channel cues but with different temporal characteristics.

To investigate the role of spectral cues on CMR (i.e., the occurrence of within- and across-channel CMR), masked thresholds were measured in 14 NMRI mice. In total, 2,645 single sessions had to be run in order to obtain all masked thresholds. Of these sessions, 1,065 (40%) had to be discarded because they did not satisfy the criteria (i.e., false alarm rate and percent correct responses, or both) or had to be stopped before reaching the end (e.g., when the number of trials was very low after 30 min or the computer program crashed). On average, between 12 and 20 single sessions were needed until a stable threshold could be calculated for the standard conditions. For the additional conditions, on average, between 18 and 36 sessions were needed. Analyzing the full data set from all 14 subjects tested in the standard conditions, we found no influence of masker type or FB position on the number of sessions needed to obtain a threshold (general linear models (GLM) repeated-measures analysis of variance (ANOVA), all p > 0.1). The sequence of tests was randomized, and the age of the subjects did not vary between the different conditions (GLM repeated-measures ANOVA with age as the dependent variable and masker type and FB position as factors, all p > 0.1). In total, 431 psychometric functions of single sessions were obtained that resulted in 152 combined psychometric functions. Figure 2 shows examples from one subject of single and combined psychometric functions for the 9.9 kHz conditions (both correlated and uncorrelated).

FIG. 2

Single psychometric functions (pmfs) and the corresponding combined pmfs of a representative subject obtained for the 9.9-kHz FB condition showing the response probability (plotted as the discrimination measure d-prime or d′) in relation to the SPL of the signal. Panels A and B show the pmfs for the correlated masker condition, panels C and D show the pmfs for the uncorrelated masker condition. The different symbols in panels A and C depict the three single pmfs; panels B and D show the respective corresponding combined pmf. The threshold criterion was a d-prime of 1.8. Please note that all pmfs were plotted in such a way that the x-axis is the same in all conditions, clearly showing that the pmfs in the correlated and uncorrelated condition are shifted along the x-axis.

Figure 3A shows the mean signal detection thresholds (and standard deviations) for a 10-kHz tone presented either in a correlated or an uncorrelated masker condition in relation to the center frequency of the FB. In the reference condition, a mean signal threshold of 57.5 dB SPL could be obtained. Positioning an uncorrelated 25-Hz-wide FB at a center frequency below or above 10 kHz did not affect the signal threshold much. The mean signal threshold for FB center frequencies between 9 and 15 kHz ranged from 57.8 to 58.8 dB SPL; for an FB center frequency of 5 kHz, it was 53.8 dB SPL. Positioning a correlated 25-Hz-wide FB at a center frequency 1 or 5 kHz below or above the center frequency of the OFM (i.e., 10 kHz) lowered the detection threshold to a value of 53.1 to 54.7 dB SPL (except for an FB of 5 kHz, where the threshold was 51.4 dB SPL). A two-way repeated-measures ANOVA, with the signal threshold (in dB SPL) as the dependent variable and type of correlation and FB center frequency (in kHz) as factors, revealed a significant effect of type of correlation (p < 0.01, F = 16.674, df = 1) and FB center frequency (p < 0.01, F = 5.100, df = 4), and a significant interaction between the two factors (p < 0.01, F = 4.722, df = 4).

FIG. 3

Mean signal thresholds for up to 14 NMRI mice for the detection of a 10-kHz tone masked by 25-Hz narrow-band noise as a function of the center frequency of the FB (filled diamonds: uncorrelated noise bands, open diamonds: correlated noise bands, filled squares: CMR). Error bars show the standard deviation across subjects. The left column (panels A and C) depicts the results for the standard conditions for 14 subjects. The right column (panels B and D) shows a close-up of the results of FB conditions between 9 and 11 kHz for the three subjects tested with all conditions, including the additional FB conditions 9.9 and 10.1 kHz. Panels A and B show the signal threshold for the uncorrelated and correlated conditions; the filled symbols in panels C and D show the amount of masking release, i.e., the threshold difference between the uncorrelated condition and the correlated condition. The open symbols in panels C and D show the predictions calculated on the basis of the model by Buschermöhle et al. (2007) applied to the mouse.

To test the influence of correlation on the masked thresholds, pairwise comparisons between the masked thresholds for different FB frequencies (Tukey test) were obtained which revealed a significant difference between correlated and uncorrelated thresholds for FBs centered at 9 (p < 0.001, q = 6.236), 11 (p < 0.01, q = 4.131), and 15 kHz (p < 0.05, q = 3.397), indicating a possible CMR at these conditions. Furthermore, there was a significant difference in threshold in the correlated condition between the FB centered at 10 kHz and FBs centered at 5 (p < 0.001, q = 7.060), 9 (p < 0.01, q = 5.450), 11 (p < 0.05, q = 4.390), and 15 kHz (p < 0.05, q = 4.030), respectively, indicating a beneficial effect of the frequency separation between FB and OFM at these conditions. In the uncorrelated condition, only thresholds for the FBs centered at 5 and 9 kHz differed significantly from each other (p < 0.001, q = 6.280).

CMR was calculated as the difference between the signal threshold in an uncorrelated and a correlated masker condition (see Fig. 3C). A one-way repeated-measures ANOVA with the CMR as dependent variable revealed a significant effect of FB position (p < 0.01, F = 4.772, df = 4). Pairwise comparisons between the amount of CMR for different FB conditions (Tukey test) showed that the difference between the correlated and the uncorrelated threshold for both the OFM and the FB centered at 10 kHz (i.e., −1.6 dB) was significantly different from the amount of CMR for FBs centered at 9 (p < 0.01, q = 5.902) and 11 kHz (p < 0.05, q = 4.357), respectively. On average, the CMR was 5.6 and 3.7 dB for FBs centered at 9 and 11 kHz, respectively. No significant CMR effect in comparison to the threshold difference at 10 kHz could be found for the largest frequency separations tested (i.e., with an FB of 5 and 15 kHz, both p > 0.05). Thus, in the current study, no unambiguous evidence for “true CMR”, i.e., CMR occurring due to across-channel cues, could be found.

To investigate the role of temporal cues on CMR, additional masked thresholds at FB conditions of 9.9 and 10.1 kHz were measured in three NMRI mice (see also Fig. 3B). Please note that for all following analyses, only the masked thresholds of the three subjects tested with the additional FB conditions were taken into consideration. For the three subjects, a mean signal threshold of 54.3 dB SPL could be obtained in the reference condition. Positioning an uncorrelated 25-Hz-wide FB at a center frequency of between 5 and 15 kHz did not change the detection threshold significantly (the detection thresholds ranged between 49.8 and 56.7 dB SPL). Positioning a correlated 25-Hz-wide FB at a frequency separation of 100 Hz lowered the detection threshold of the three subjects tested with all FB conditions to a mean value of 36.6 dB SPL, while the thresholds for the remaining FB conditions with a greater frequency separation (1 and 5 kHz) ranged between 46.0 and 52.4 dB SPL. A two-way repeated-measures ANOVA with the masked threshold as the dependent variable and type of correlation and FB center frequency as factors revealed a significant effect of type of correlation (p < 0.05, F = 21.627, df = 1) and FB center frequency (p < 0.01, F = 5.691, df = 6). Furthermore, a significant interaction between type of correlation and FB center frequency could be found (p < 0.01, F = 5.370, df = 6). To test the influence of correlation on the masked thresholds, pairwise comparisons between the masked thresholds for the correlated and uncorrelated conditions (Tukey test) were obtained. Concentrating on the data within each FB condition revealed a significant difference between uncorrelated and correlated thresholds for FBs centered at 9.9 (p < 0.001, q = 7.645) and 10.1 kHz (p < 0.001, q = 7.546) only, indicating a CMR at these conditions. 
To test whether the FB position had any influence on the size of the masked threshold, pairwise comparisons between the masked thresholds for different FB center frequencies within the same type of correlation (Tukey test) were conducted. An analysis within the uncorrelated condition showed no influence of FB condition on threshold (all p > 0.05). In the correlated condition, however, there were significant differences in thresholds between the additional conditions (9.9 and 10.1 kHz, respectively) and the 9, 10, 11, and 15 kHz conditions, respectively (all p < 0.01), indicating a beneficial effect of the frequency separation between FB and OFM only for the smallest separations measured (100 Hz). A one-way repeated-measures ANOVA with the CMR as dependent variable revealed a significant effect of FB position (p < 0.01, F = 5.370, df = 6). Pairwise comparisons (Tukey test) showed that the “amount of CMR” for the FB centered at 10 kHz—here only defined as the difference between the correlated and uncorrelated condition—was significantly different from the amount of CMR for FBs centered at 9.9 (p < 0.01, q = 5.908) and 10.1 kHz (p < 0.01, q = 5.832), respectively. The mean CMR was 17.8 dB for FBs centered at 9.9 and 10.1 kHz (see Fig. 3D). Please note that the failure to obtain significant CMR at FBs of 9 and 11 kHz was mostly due to high standard deviation across the three tested subjects; if all subjects were taken into consideration (see above), the amount of CMR in both conditions became significant.

Discussion

The results of the FB experiment demonstrated significant within-channel CMR in the mouse while no unambiguous evidence could be found for CMR occurring due to across-channel processing (i.e., “true CMR”). The amount of within-channel CMR was dependent on the frequency separation between the FB and the OFM. It increased from a value of between about 4 and 6 dB for a frequency separation of 1 kHz to a value of 18 dB for a frequency separation of 100 Hz. For both frequency differences between FB and OFM, the CMR was slightly smaller for FBs positioned above rather than below the signal frequency, a tendency which is also found in humans (e.g., Hall et al. 1988; Piechowiak et al. 2007).

The largest CMR in the mouse in this FB experiment was higher than the CMR in other FB studies using 25 Hz narrow-band noise maskers. In European starlings, a within-channel CMR of 11.7 dB could be measured at a signal frequency of 2 kHz and a frequency separation between OFM and FB of 113 Hz (Klump et al. 2001). However, the starling also exhibited a large across-channel CMR of between 9.3 and 14.4 dB in the same experiment. CMR in humans measured with 25 Hz narrow-band maskers falling within the limits of an auditory filter ranged between 6 and 9 dB for 8 kHz tone signals; the largest CMR was observed for a frequency separation between OFM and FB of 100 Hz (Schooneveldt and Moore 1987). Schooneveldt and Moore also observed an across-channel CMR in this FB experiment. At frequencies lower than 8 kHz, CMR was also observed to be largest at a frequency separation between OFM and FB of 100 Hz (e.g., for 2-kHz signals Schooneveldt and Moore 1987 and Piechowiak et al. 2007 observed a CMR of up to 14 dB in humans). If the amount of CMR is dependent on temporal processing mechanisms, a larger value would be expected in the mouse compared to both starlings and humans, since the mouse has wider auditory filters which may allow for a better temporal resolution.

The absence of across-channel CMR (i.e., “true CMR”) in the current study is consistent with the results of a previous study in the mouse in which CMR was measured using a band-widening paradigm (Weik et al. 2005, 2006). In this experiment, a CMR of up to 13 dB could be found for masker bandwidths well below the auditory filter bandwidth of the mouse (i.e., below 3.4 kHz), and no additional CMR due to across-channel processing could be obtained. Both the current study and the band-widening study of Weik et al. support the hypothesis that the occurrence of CMR in the mouse can be explained on the basis of within-channel cues only. Also, in humans, it has been suggested that the main contribution to the CMR effect stems from processing of within-channel cues (Berg 1996; Verhey et al. 1999; Buschermöhle et al. 2007; Piechowiak et al. 2007).

What is the mechanism underlying CMR in the mouse that can explain both the lack of CMR for across-channel conditions and the huge increase in CMR for small frequency separations of 100 Hz? Moore (1992) suggested that the release from masking in the presence of correlated maskers within a single auditory channel might be attributed to the use of cues such as a change in the pattern of neuronal phase locking during the presentation of a signal. Phase locking can occur to the fine structure of a stimulus (i.e., to the signal frequency or the carrier frequency of the masker) and/or to its envelope fluctuation. In the mouse, phase locking to the fine structure ceases at a frequency of 4 kHz (Taberner and Liberman 2005). Therefore, in the current experiment, phase locking to the fine structure of signal and carrier cannot be used to explain the results in the mouse. Locking of action potentials to the masker envelope, however, still might provide usable cues. The envelope of the composite masker may be dominated by a temporal interaction between both maskers, leading to a beating with a rate depending on the frequency difference between both center frequencies (e.g., Schooneveldt and Moore 1987; Berg 1996; Piechowiak et al. 2007). At a frequency separation of 1 kHz, the beating occurs at a mean rate of 1 kHz, while a frequency separation of 100 Hz leads to a mean beating rate of 100 Hz. The addition of a tone signal to the masker reduces the depth of modulation which may lead to a reduced amount of locking of neural activity to the masker envelope (“locking suppression”); this change could be used as a cue for the detection of the signal (Nelken et al. 1999). This cue can be exploited efficiently when the frequency separation between OFM and the FB is small. The ability of the neurons to lock to the stimulus envelope depends on the modulation frequency. 
Tan and Borst (2007) found that in the mouse inferior colliculus, locking to sinusoidal amplitude fluctuations ceased for modulation frequencies of 160 Hz or more. This indicates that while locking to the 100-Hz beating of the masker envelope is likely, and a change in this locking might be usable as a detection cue, locking to the 1-kHz beating rate is not probable. This is further supported by data of Kelly et al. (2006), who found that the modulation detection threshold in the rat, a species closely related to the mouse with a similar auditory filter bandwidth (according to critical masking ratio data from Gourevitch 1965 and Ehret 1976), increased considerably at modulation frequencies above 100 Hz.
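
The dependence of the beating rate on the frequency separation can be seen from a standard trigonometric identity. The derivation below is a sketch for the correlated case only. It assumes that both bands share the same low-pass noise n(t) and that their carriers have equal amplitude and the same starting phase; the symbols f_1, f_2, Δf and f̄ are introduced here for illustration.

```latex
\[
n(t)\sin(2\pi f_1 t) + n(t)\sin(2\pi f_2 t)
  = 2\,n(t)\,\cos(\pi\,\Delta f\,t)\,\sin(2\pi \bar{f}\,t),
\qquad \Delta f = f_2 - f_1,\quad \bar{f} = \tfrac{1}{2}(f_1 + f_2)
\]
\[
\text{envelope}(t) = 2\,\lvert n(t)\rvert\,\lvert\cos(\pi\,\Delta f\,t)\rvert ,
\qquad \text{which repeats every } 1/\Delta f \text{ seconds.}
\]
```

Under these assumptions the slow noise envelope |n(t)| is multiplied by a regular beat at rate Δf, i.e., 100 Hz for the 9.9- and 10.1-kHz FBs and 1 kHz for the 9- and 11-kHz FBs.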

The ability of vertebrates (for an overview see also Dent et al. 2002) to detect amplitude modulation depends not only on the modulation frequency but also on the modulation depth of the stimulus. Adding a signal to the masker results both in a local increase in level and a reduction of the modulation depth and a change in the modulation pattern of the resulting stimulus (e.g., Eddins 2001; Moore 1992) which could be used for signal detection. However, the auditory system is not equally sensitive to changes in modulation depth across all modulation frequencies. Similar to humans (Viemeister 1979; Dau et al. 1997), the modulation detection threshold in rats (Kelly et al. 2006) increases with increasing modulation rate, and the required changes in modulation depth become increasingly higher until they cease to be usable as a cue (see also Fig. 4).

FIG. 4

Modulation spectra of the composite noise consisting of the OFM and an FB at 10.1 kHz center frequency (A-D), or the OFM and an FB at 11 kHz center frequency (E-H), with the two maskers either correlated (A + B, E + F) or uncorrelated (C + D, G + H) with each other. The left column shows spectra with only the maskers present; the right column shows spectra with the signal added at the mean SPL needed for signal detection in the respective type of noise (for level see Fig. 1 caption). The “peaks” in the 10.1-kHz condition (A-B) indicate the beating frequency (and multiples thereof). The gray area in each graph depicts the limit of modulation detection calculated from the modulation transfer function of rats (Kelly et al. 2006). The arrows highlight the modulation frequencies that might be usable as cues to improve signal detection.

Berg (1996) suggested a model that is based on the processing of modulation spectra to explain the CMR effect in an FB experiment with humans. His model uses a single envelope extracted from a broad range of frequencies that is low-pass filtered according to a temporal modulation transfer function (e.g., Viemeister 1979). The key features of the model, i.e., (1) signal and maskers are processed by a single broad filter and (2) modulation detection is considered, also apply to the present experiments in the mouse. Here, we have plotted the modulation spectra for different masker conditions and FB center frequencies in Figure 4. The black curve depicts the modulation depth of the respective stimulus at different modulation frequencies; the gray color marks the area in which the modulation cannot be detected by the rat (Kelly et al. 2006) and is likely not to be detectable by the mouse. The modulation spectra for correlated maskers and a frequency separation of 100 Hz are depicted in Figure 4A-B. Before the signal is added (Fig. 4A), the modulation spectrum shows high modulation depth (−3.6 dB corresponding to 66.1% modulation depth) at modulation frequencies corresponding to the beating rate of the composite masker (sharp peak at 100 Hz modulation frequency, see arrow) and to the inherent fluctuation rate of the narrow-band masker (broad peak at modulation frequencies below 50 Hz, see arrow). Similar patterns are reported by Berg (1996). Adding the signal (using a level equal to the mean SPL at threshold, Fig. 3B) to the masker reduces the modulation depth at the 100 Hz modulation frequency (peak) by 1.2 dB (which is equivalent to a reduction of 9%; see arrow). This reduction of modulation depth might be a sufficient cue for the signal detection, as indicated by data from Wakefield and Viemeister (1990) who showed that humans were able to detect changes in depth of 1–2 dB at high modulation depths (for low modulation rates of 100 Hz). 
If the auditory system of mice exhibits the same range of sensitivity, the mouse may be able to use the change in modulation depth provided by the stimuli of this CMR experiment as a cue for signal detection. In the correlated condition, this might lead to especially low signal detection thresholds and to a CMR of nearly 18 dB. Adding a signal to the masker hardly changes the modulation depth at low modulation frequencies, making the use of this modulation cue questionable. In the uncorrelated condition (Fig. 4C-D), there is no peak in the modulation spectrum at a modulation rate of 100 Hz. Only the lowest modulation frequencies, which are due to the inherent fluctuations of the masker, show a modulation depth that may allow the mouse to detect the modulation (see arrow). Adding a signal to the masker, however, reduces the modulation depth below the threshold of modulation detection in rats (Fig. 4D), and it is unclear whether this change can be detected by the mice and used as a cue for detection of the signal. In conditions with a frequency separation of 1 kHz (Fig. 4E-H), the change in modulation depth at low modulation frequencies after addition of the signal is similar to that of the uncorrelated condition with a frequency separation of 100 Hz (see arrow). As in the 100 Hz condition, there is a peak in the modulation spectrum of the correlated condition at the beating frequency (i.e., 1 kHz; frequency range not plotted here), but it is very unlikely that mice are able to detect modulation at rates in this frequency range (see Tan and Borst 2007). It remains unclear whether the mouse is able to use changes in modulation depth at the slow modulation rates below 20 Hz caused by the inherent fluctuation of the narrow-band noise as a cue. This cue would be present in both conditions (i.e., 100 Hz and 1 kHz frequency separation) and might be able to explain a small overall CMR. 
However, the CMR seen at frequency separations of 1 kHz might also be a result of the auditory system’s ability to detect a local increase in level of the masker after the addition of the tone (see Moore 1992).
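The decibel and percent values for modulation depth quoted above can be related with a short worked example. The sketch below assumes the convention depth in dB = 20·log10(m), where m is the modulation depth as a fraction; this convention is not stated explicitly in the text but is consistent with the pairing of −3.6 dB and 66.1%. The function names are hypothetical.

```python
import math


def depth_db_to_fraction(depth_db):
    """Convert modulation depth in dB (assumed 20*log10(m)) to a fraction m."""
    return 10.0 ** (depth_db / 20.0)


def depth_fraction_to_db(m):
    """Convert modulation depth as a fraction m to dB (assumed 20*log10(m))."""
    return 20.0 * math.log10(m)


# Values taken from the text: peak depth at the 100 Hz beating rate before
# the signal is added, and the 1.2 dB reduction caused by adding the signal.
before_db = -3.6
after_db = before_db - 1.2

before = depth_db_to_fraction(before_db)
after = depth_db_to_fraction(after_db)

print(f"masker alone:       {100 * before:.1f}% modulation depth")
print(f"masker plus signal: {100 * after:.1f}% modulation depth")
print(f"difference:         {100 * (before - after):.1f} percentage points")
```

With these inputs the depth falls from about 66% to about 58%, a difference of roughly 9 percentage points, which matches the reduction quoted in the text to within rounding of the dB values.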

An alternative explanation for the experimental data is provided by the model of Buschermöhle et al. (2007). Buschermöhle et al. suggested that the compressive nonlinearities of peripheral auditory filters can explain at least part of the CMR effect. We applied the model to our mouse data, assuming that the level-dependent compression in the auditory periphery of the mouse is similar to that used in the model for humans and adapting the auditory filter bandwidth to the mouse critical bandwidth (Ehret 1976; Weik et al. 2005). As demonstrated in Figure 3, which presents the model results together with the experimental data of the mouse, the model can explain at least part of the CMR effect. The qualitative relation between the size of the CMR and the frequency separation between the OFM and FB is reflected by the model. The quantitative model results, however, differ from the experimental data in the following way: the model underestimates the amount of CMR for a frequency separation of 100 Hz but overestimates the amount of CMR for the frequency separation of 1 kHz. A better fit between model results and experimental data would be obtained by using the auditory filter bandwidth reported for the CBA/CaJ mouse by May et al. (2006), which is about half of the bandwidth reported for the NMRI mouse. Smaller auditory filters increase the amount of CMR for a frequency separation of 100 Hz and reduce the amount of CMR for the frequency separation of 1 kHz while not changing the results for a larger frequency separation.

In summary, the CMR effect seen in both mice and men might be explained by the processing of a single envelope that is extracted from a broad range of frequencies (e.g., Berg 1996; Verhey et al. 1999; Piechowiak et al. 2007), or it may result at least partially from the compressive nonlinearities of the peripheral auditory filters processing the stimulus (Buschermöhle et al. 2007).