Introduction

Deficits in reward processing characterize a broad array of neuropsychiatric disorders, including major depressive disorder (MDD), substance abuse and schizophrenia.1 Such impairments may include deficits in hedonic capacity or pleasure (that is, anhedonia), lack of motivation to pursue rewards, lack of effective integration of reward value with planning of actions or deficits in reinforcement learning.2, 3, 4 Unfortunately, most clinical subjective assessments of anhedonia fail to capture various aspects of reward deficits. Furthermore, the use of subjective assessments that rely on an individual’s ability to recall past or imagine hypothetical pleasurable experiences is impossible to model in laboratory animals, limiting the potential impact of animal research on the discovery of new treatments for reward-related deficits in neuropsychiatric disorders.

To address this issue, clinical researchers have begun to develop objective, laboratory-based tasks to investigate discrete reward processes. One such task is the Response Bias Probabilistic Reward Task (hereafter, the probabilistic reward task, PRT), which was designed to objectively assess reward responsiveness, that is, participants’ ability to modulate behavior as a function of reward.5 The task uses a signal detection approach in which subjects must discriminate between two ambiguous stimuli (for example, mouths varying slightly in length on a cartoon face) displayed rapidly on a computer screen in order to receive a monetary reward. As such, this task differs from classic probabilistic reward tasks, in which there is no ambiguity about the identity of the stimuli. Unbeknownst to the subjects, correct identification of one stimulus is reinforced three times more frequently than correct identification of the other stimulus. Under these experimental circumstances, healthy subjects reliably develop a response bias (that is, preference) for the stimulus that is reinforced more frequently, regardless of which stimulus was actually presented. Thus, reward responsiveness assessed in this task reflects the rapid shaping of future behavioral choices based on prior reinforcement experiences.

Subjects with MDD,6, 7, 8 euthymic subjects with bipolar disorder9 and healthy subjects with elevated depressive symptoms5 fail to develop this biased response for the more frequently reinforced stimulus. Accordingly, in spite of being exposed to the same differential reinforcement schedule, these subjects respond similarly to both stimuli, reflecting decreased responsiveness to rewards. Moreover, response bias was found to predict current and future anhedonic symptoms in both nonclinical and clinical samples5, 9, 10 and predict the persistence of MDD diagnosis in MDD inpatients.8 Importantly, the objective nature of this task makes it ideal to develop and use as a translational tool to measure reward responsiveness across species.

Various key aspects of reward processing, particularly with regard to reinforcement learning and motivation, have been hypothesized to involve mesolimbic dopamine neurotransmission (for review, see refs 11–14). Using the PRT, Pizzagalli et al.15 demonstrated that a single low dose of the dopamine D2/D3 receptor agonist pramipexole—which at low doses is hypothesized to decrease extracellular dopamine levels via autoreceptor stimulation—impaired the development of response bias in psychiatrically healthy individuals. Deficits in striatal dopamine function in MDD have been described,16, 17, 18, 19 suggesting that decreased striatal dopamine function in humans with MDD may contribute to blunted reward responsiveness.

The goals of the present study were twofold. First, we aimed to develop a new behavioral task in rats that was conceptually and procedurally identical to the human version of the PRT. Once the task was developed and the rats’ behavior was characterized, the second goal was to determine whether dopaminergic manipulations would bidirectionally alter reward responsiveness. In light of prior findings in humans, we hypothesized that: (1) healthy rats would develop a response bias for a more frequently reinforced stimulus, similar to healthy human subjects;5, 6 and (2) pharmacological manipulations that reduce (that is, low doses of pramipexole) or enhance (that is, amphetamine) striatal dopamine transmission would decrease and increase this response bias, respectively.

Materials and methods

Subjects

A total of 12 male Wistar and 12 male Long–Evans rats (Charles River Laboratories, Raleigh, NC, USA) were used in experiment 1. A separate group of 24 male Wistar rats was used in experiments 2 and 3 (that is, the same rats were used for both experiments; for details, see Supplementary Methods). All procedures were conducted in accordance with the guidelines from the National Institutes of Health and the Association for the Assessment and Accreditation of Laboratory Animal Care and were approved by the Institutional Animal Care and Use Committee.

Apparatus

Behavioral training and testing were conducted in operant testing chambers equipped with two retractable metal levers, a food receptacle located between the levers and a single speaker positioned above the food receptacle (Med Associates, St Albans, VT, USA). Tones were generated using a multipurpose sound generator, and all programs and data collection were controlled by a computer running MED-PC IV software (Med Associates; see Supplementary Methods).

Procedure

Tone discrimination training

The training procedure was developed to mirror the PRT instructions presented to humans5 (see Supplementary Methods). Briefly, rats were food restricted and trained to discriminate between two tone stimuli that varied in duration (5 kHz, 60 dB, 0.5 or 2 s) by pressing one of the two levers associated with each tone. Tone durations and lever sides were counterbalanced across subjects, and tones were presented in a random order over 100 trials. Each trial was initiated with presentation of a tone, after which the levers were extended, and rats had a 5-s limited hold period to respond. In each trial, correct identification of the tone stimuli resulted in a single 45 mg food pellet (Test Diet 5TUM, Richmond, IN, USA). Both levers retracted after a correct, incorrect or omitted response, followed by a variable intertrial interval between 5 and 8 s. Rats were trained daily until they achieved at least 70% accuracy for 5 consecutive days.
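The training criterion described above (at least 70% accuracy on 5 consecutive days) can be expressed as a simple run-length check. The following is a minimal sketch, not the authors' MED-PC code; the function name and the representation of daily accuracy as a list of proportions are assumptions made for illustration.

```python
def day_criterion_met(daily_accuracy, threshold=0.70, run_length=5):
    """Return the 1-based training day on which the criterion is first met.

    daily_accuracy: one accuracy value (proportion correct) per training day.
    Returns None if the criterion is never reached.
    """
    streak = 0
    for day, acc in enumerate(daily_accuracy, start=1):
        # A day below threshold resets the count of consecutive qualifying days.
        streak = streak + 1 if acc >= threshold else 0
        if streak >= run_length:
            return day
    return None
```

A rat that never produces five qualifying days in a row returns None, which corresponds to exclusion for insufficient training accuracy.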

Experiment 1: performance of different rat strains in the response bias PRT

Optimal tone durations and reinforcement schedules were determined for testing (see Supplementary Results). During the test session, tone durations that were more ambiguous than the training tones (that is, 0.9 and 1.6 s) were reinforced for 60% and 20% of correct responses (counterbalanced across subjects) over 100 trials, which is identical to the 3:1 reinforcement ratio used in the human PRT.5 The stimulus paired with three times more frequent reward was referred to as the ‘rich stimulus’, whereas the other stimulus was referred to as the ‘lean stimulus’.
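To make the differential reinforcement schedule concrete, the sketch below simulates pellet delivery when 60% of correct rich responses and 20% of correct lean responses are reinforced. It is an illustration only: the function names, the fixed-accuracy simulated rat and the use of Python's random module are assumptions, not part of the original procedure.

```python
import random

# Reinforcement probabilities for correct responses, as stated in the text.
REWARD_PROBABILITY = {"rich": 0.60, "lean": 0.20}


def is_reinforced(stimulus, correct, rng):
    """Return True if this response earns a pellet under the schedule."""
    if not correct:
        return False  # only correct responses are eligible for reinforcement
    return rng.random() < REWARD_PROBABILITY[stimulus]


def simulate_rewards(n_trials, p_correct, seed=0):
    """Count pellets per stimulus for a hypothetical rat with fixed accuracy."""
    rng = random.Random(seed)
    pellets = {"rich": 0, "lean": 0}
    for _ in range(n_trials):
        stimulus = rng.choice(["rich", "lean"])  # tones in random order
        correct = rng.random() < p_correct
        if is_reinforced(stimulus, correct, rng):
            pellets[stimulus] += 1
    return pellets
```

With equal accuracy for both tones, the expected ratio of rich to lean pellets is 3:1. If accuracy for one tone is very low, the realized ratio departs from 3:1, which is the rationale for the accuracy-based exclusion criterion described under Data and statistical analyses.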

Experiment 2: effects of pramipexole on performance in the response bias PRT

Rats underwent tone discrimination training as described above and were habituated to the pramipexole administration procedure (see Supplementary Methods). On the testing day, half of the rats were injected with 0.1 mg kg−1 pramipexole and half with saline (1 ml kg−1, subcutaneous) 60 min before the test session. The test parameters were exactly the same as in experiment 1. After the test, training resumed for 5 days (that is, 0.5 and 2 s tone durations; 60% equal reinforcement), followed by a second test. Rats that received pramipexole during the first test were administered saline during the second test and vice versa.

Experiment 3: effects of amphetamine on performance in the response bias PRT

After the pramipexole experiment, training resumed for 9 days, followed by habituation to the amphetamine administration procedure (see Supplementary Methods). On the day of testing, half of the rats were injected with 0.5 mg kg−1 amphetamine and half with saline (1 ml kg−1, intraperitoneal) 15 min before the test. The test parameters were exactly the same as in experiment 1. After the test, training resumed for 9 days, followed by another test. Rats that received amphetamine during the first test were administered saline during the second test and vice versa.

Drugs

Rats were administered pramipexole dihydrochloride (Tocris Bioscience, Ellisville, MO, USA), D-amphetamine sulfate (Sigma, St Louis, MO, USA) or sterile 0.9% saline. The pramipexole dose used in the present study has been shown to suppress firing of ventral tegmental area dopaminergic neurons20, 21 and striatal dopamine levels.22, 23 This inhibitory effect on dopamine transmission parallels the decrease in ventral striatal activity in humans after administration of a low dose of pramipexole (that is, 0.5 mg),24 which was also used in the human PRT.15 The amphetamine dose used in the present study has been shown to enhance brain reward function25, 26 and elevate striatal dopamine levels without inducing stereotypy.27 All solutions were prepared fresh daily and administered in a volume of 1 ml kg−1.

Data and statistical analyses

Data collected by the MED-PC IV software included correct, incorrect and omitted responses and reaction times for the rich and lean stimuli for each individual trial and cumulated across blocks 1 (trials 1–33), 2 (trials 34–67) and 3 (trials 68–100). For each block, response bias, the primary dependent variable, was calculated as:

$$\log b = \frac{1}{2}\,\log\left(\frac{\mathrm{Rich}_{\mathrm{correct}} \times \mathrm{Lean}_{\mathrm{incorrect}}}{\mathrm{Rich}_{\mathrm{incorrect}} \times \mathrm{Lean}_{\mathrm{correct}}}\right)$$

exactly as in the human task. A value of 0.5 was added to each cell to allow for calculations in cases of cells with a value of 0. A response bias arises when subjects tend to correctly classify the rich stimulus (that is, the stimulus associated with three times more frequent reward) and misclassify the lean stimulus (that is, pressing the lever associated with the rich stimulus when the stimulus presented was lean). As in humans, discriminability was calculated for each block as:

$$\log d = \frac{1}{2}\,\log\left(\frac{\mathrm{Rich}_{\mathrm{correct}} \times \mathrm{Lean}_{\mathrm{correct}}}{\mathrm{Rich}_{\mathrm{incorrect}} \times \mathrm{Lean}_{\mathrm{incorrect}}}\right)$$

Discriminability captures the ability to differentiate between the stimuli, and can thus be taken as a proxy of task difficulty. In addition, accuracy (that is, number of correct responses/(number of correct + incorrect responses)) and reaction time were averaged within each block for each treatment group and stimulus type (that is, rich/lean), exactly as in the human task. Rats that could not successfully discriminate the two training tones (<70% accuracy in the training phase) were excluded from analyses and further testing (see Results). Moreover, because development of response bias is dependent on the ratio of rich vs lean reinforcements (that is, 3:1) and because rats were reinforced as a percentage of correct responses, rats were excluded if accuracy for either stimulus was <30% during either the drug or the vehicle test session, because such low accuracy resulted in insufficient reinforcements for that stimulus.
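The per-block measures can be sketched in a few lines of Python. This sketch assumes the log-ratio definitions commonly used for the human PRT (response bias as half the log of rich-correct × lean-incorrect over rich-incorrect × lean-correct; discriminability as half the log of rich-correct × lean-correct over rich-incorrect × lean-incorrect), with 0.5 added to every cell as described in the text. The base-10 logarithm and all function names are assumptions.

```python
import math


def prt_measures(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """Return (response_bias, discriminability) for one block of trials."""
    # Add 0.5 to every cell so that empty cells do not produce log(0) or x/0.
    rc, ri, lc, li = (n + 0.5 for n in
                      (rich_correct, rich_incorrect, lean_correct, lean_incorrect))
    log_b = 0.5 * math.log10((rc * li) / (ri * lc))  # response bias
    log_d = 0.5 * math.log10((rc * lc) / (ri * li))  # discriminability
    return log_b, log_d


def accuracy(correct, incorrect):
    """Proportion correct; omitted responses are not counted."""
    total = correct + incorrect
    return correct / total if total else float("nan")


def meets_test_inclusion(acc_rich, acc_lean, minimum=0.30):
    """Exclusion rule from the text: accuracy below 30% for either stimulus."""
    return acc_rich >= minimum and acc_lean >= minimum
```

For example, a block with symmetric performance (10 correct and 5 incorrect for each stimulus) gives a response bias of zero and a positive discriminability, whereas shifting errors toward the rich lever on lean trials raises the response bias without requiring any change in overall accuracy.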

Response bias and discriminability scores were analyzed using a two-way analysis of covariance (ANCOVA; see below for covariate description), with Block as a within-subject factor and either Rat Strain (experiment 1) as a between-subject factor or Drug Treatment (experiments 2 and 3) as a within-subject factor. To determine whether the order of drug/vehicle administration affected response bias in experiments 2 and 3, Order was analyzed as a between-subject factor in a separate ANCOVA. Accuracy and reaction time were analyzed using similar ANCOVAs, in which Stimulus Type (rich vs lean) was an additional within-subject factor. Some rats responded asymmetrically to one or the other stimulus when both were equally reinforced during training sessions, suggesting that some degree of inherent bias was present during test sessions for these subjects, regardless of the differential reinforcement. Thus, variability of inherent response patterns was controlled for using a covariate, defined as the change in response bias between the first and third blocks of the training session that preceded each test day, when both stimuli were equally reinforced. For experiments 2 and 3, which involved within-subject testing, the change in response bias from both pretest training sessions was averaged.
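The covariate can be illustrated with a short sketch. It assumes that "change" means block 3 minus block 1 response bias within a pretest training session, and that each session is summarized as a pair of block-level response bias values; both the sign convention and the function name are assumptions for illustration.

```python
def inherent_bias_covariate(pretest_sessions):
    """Average change in response bias (block 3 minus block 1) across sessions.

    pretest_sessions: list of (block1_bias, block3_bias) tuples, one per
    equally reinforced training session preceding a test day. Experiment 1
    would supply one session; experiments 2 and 3 would supply two.
    """
    changes = [block3 - block1 for block1, block3 in pretest_sessions]
    return sum(changes) / len(changes)
```

A value near zero indicates that a rat showed no drift toward either lever under equal reinforcement, so any bias that emerges during the test session is more plausibly attributable to the differential schedule.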

Across ANCOVAs, significant effects were followed by post hoc t-tests. The level of significance was set at 0.05. A Greenhouse–Geisser correction was used when appropriate.

Results

Experiment 1: performance of different rat strains in the response bias PRT

One Long–Evans rat was excluded because of insufficient accuracy during training (<70%), leaving 11 Long–Evans and 12 Wistar rats for analyses. The mean (±s.e.m.) days to train Wistar and Long–Evans rats were 39.42 (±1.72) and 44.82 (±2.00) days, respectively.

Response bias

The Block × Rat Strain ANCOVA (covariate: inherent bias during training) revealed only a main effect of Block (F2, 40=7.19, P<0.01; all other P>0.49), suggesting that both strains displayed a similar increase in response bias over time (Figure 1a). Relative to block 1, response bias in blocks 2 and 3 was significantly higher (all t22>3.06, P<0.01), indicating that the differential reinforcement schedule had the intended effects.

Figure 1

Response bias, discriminability, accuracy and reaction time in Wistar and Long–Evans rats. (a) Response bias gradually increased across blocks in Wistar rats (n=12) and Long–Evans rats (n=11; **P<0.01, significant difference between blocks). (b) Discriminability was consistent across blocks in both rat strains, indicating that the change in response bias was not a function of a change in the ability to differentiate the two ambiguous tone durations but rather a function of reinforcement history. The increased response bias was reflected by greater accuracy for the rich stimulus compared with the lean stimulus across blocks in both (c) Wistar and (d) Long–Evans rats (*P<0.05, **P<0.01 and ***P<0.001, significant difference between rich and lean stimuli). Consistent with the differential reinforcement schedule, across rat strains, rich accuracy increased from block 1 to block 2 (t22=2.39, P<0.05) and block 3 (t22=2.00, P=0.058), whereas lean accuracy decreased from block 1 to block 2 (t22=−2.06, P=0.051) and block 3 (t22=−2.29, P<0.05). Reaction times decreased for the rich stimulus compared with the lean stimulus in (e) Wistar, but not (f) Long–Evans, rats (**P<0.01, significant difference between rich and lean stimuli).

Discriminability

No significant effects emerged (all F<2.22, P>0.15), indicating that task difficulty remained consistent throughout blocks for both strains (Figure 1b).

Accuracy

The Block × Rat Strain × Stimulus ANCOVA revealed a main effect of Stimulus (F1,20=29.51, P<0.001) and a Block × Stimulus interaction (F2, 40=5.78, P<0.01). Consistent with the differential reinforcement schedule, accuracy of both rat strains combined was significantly higher for the rich vs lean stimulus in each block (all t22>2.26, P<0.034). Moreover, accuracy for the rich and lean stimuli increased and decreased, respectively, from early to later blocks (Figures 1c and d).

Reaction time

The three-way ANCOVA revealed significant main effects of Block (F2, 40=8.09, P<0.001), Stimulus (F1, 20=4.75, P<0.05) and Rat Strain (F1, 20=15.94, P<0.001). Post hoc tests indicated that reaction times were significantly shorter in blocks 2 and 3 relative to block 1 when data from both strains were combined (P<0.02). As expected, reaction times were shorter for the rich vs lean stimulus when data were combined across rat strains and blocks. Overall, Wistar rats had significantly slower reaction times relative to Long–Evans rats. The Block × Stimulus (F2, 40=2.75, P=0.076) and Rat Strain × Stimulus (F1, 20=3.16, P=0.091) interactions approached significance. Unlike Wistar rats (t11=−2.78, P<0.05; Figure 1e), Long–Evans rats did not show significantly shorter reaction times for the rich vs lean stimulus (t10=−0.17, P=0.87; Figure 1f). Based on the discriminability and reaction time patterns, only Wistar rats were used in subsequent experiments.

Experiment 2: effects of pramipexole on performance in the response bias PRT

Two rats were excluded because of insufficient accuracy during training (<70%). Six rats were excluded because of insufficient accuracy for either tone stimulus during the pramipexole or saline tests (that is, <30%). Thus, data from 16 rats were available. The mean (±s.e.m.) days to train Wistar rats for experiments 2 and 3 were 35.06 (±1.67) days.

Response bias

Response bias was lower in pramipexole-treated rats relative to saline-treated rats (Drug Treatment: F1, 14=5.85, P<0.05; Greenhouse–Geisser: 0.677; Figure 2a). No other effects, including order effects, emerged.

Figure 2

Effects of pramipexole on reward responsiveness. Relative to saline, pramipexole administration reduced (a) response bias and (b) discriminability (n=16; *P<0.05, significantly different from saline). (c, d) The pramipexole-induced attenuation of response bias was reflected by greater accuracy for the rich stimulus in saline-treated rats compared with pramipexole-treated rats (***P<0.001, significantly greater than saline/lean; #P<0.05, significantly greater than pramipexole/rich).

Discriminability

Discriminability was lower in pramipexole-treated rats relative to saline-treated rats (Drug Treatment: F1, 14=7.38, P<0.05; Figure 2b). No other effects were observed.

Accuracy

The Drug Treatment × Block × Stimulus Type ANCOVA revealed a Drug Treatment effect (F1, 14=9.17, P<0.01), which was attributable to significantly lower overall accuracy in the pramipexole relative to saline condition. Critically, this effect was moderated by a Drug Treatment × Stimulus Type interaction (F1, 14=4.86, P<0.05). Post hoc analyses revealed that saline-treated rats were significantly more accurate for the rich vs lean stimulus (t15=4.04, P<0.001; Figure 2c), whereas pramipexole-treated rats had similar accuracy for rich and lean stimuli (Figure 2d). In addition, relative to saline, pramipexole induced lower rich stimulus accuracy (t15=2.87, P<0.05).

Reaction time

The three-way ANCOVA revealed a main effect of Drug Treatment (F1, 14=33.24, P<0.001), because of overall significantly slower reaction times for the pramipexole condition relative to the saline condition. A significant Drug Treatment × Block × Stimulus Type interaction was found (F2, 28=3.64, P<0.05). Follow-up analyses revealed, however, no further significant differences (data not shown).

Experiment 3: effects of amphetamine on performance in the response bias PRT

Two rats were excluded because of insufficient accuracy during training (<70%), leaving 22 rats for statistical analyses.

Response bias

There was a significant main effect of Drug Treatment (F1, 20=4.40, P<0.05), which was attributable to overall higher response bias in amphetamine-treated rats relative to saline-treated rats (Figure 3a), as well as a main effect of Block (F2, 40=8.25, P<0.001). Response bias systematically increased from blocks 1 to 3 (all t21>2.24, all P<0.036). No other effects, including order effects, emerged.

Figure 3

Effects of amphetamine on reward responsiveness. Relative to saline, amphetamine administration (a) potentiated response bias (n=22; *P<0.05, significantly greater than saline) (b) without affecting discriminability (*P<0.05, significant difference between blocks). (c, d) The amphetamine-induced potentiation of response bias was reflected by greater accuracy for the rich stimulus in amphetamine-treated rats compared with saline-treated rats (*P<0.05, significantly greater than saline/lean; ***P<0.001, significantly greater than amphetamine/lean; #P<0.05, significantly greater than saline/rich).

Discriminability

There was only a main effect of Block (F2, 40=4.69, P<0.05; Figure 3b), which was because of significantly higher discriminability in blocks 2 and 3 relative to block 1 (all t21>2.35, P<0.029).

Accuracy

There was a significant main effect of Stimulus Type (F1, 20=31.87, P<0.001) and significant Drug Treatment × Stimulus Type (F1, 20=4.52, P<0.05) and Block × Stimulus Type (F2, 40=7.38, P<0.01; Greenhouse–Geisser: 0.787) interactions. Post hoc analyses revealed that although both treatment groups were significantly more accurate for the rich vs lean stimulus, amphetamine-treated rats were significantly more accurate for the rich stimulus than saline-treated rats (all t21>2.19, all P<0.040; Figures 3c and d). With regard to the Block × Stimulus Type interaction, post hoc tests indicated that accuracy for the rich stimulus increased systematically from blocks 1 to 3 (all t21>2.27, P<0.034), whereas lean accuracy did not differ across blocks (all P>0.23). Moreover, for blocks 2 and 3, rich stimulus accuracy was significantly higher than lean stimulus accuracy (all t21>5.77, P<0.001).

Reaction time

There was only a significant Drug Treatment × Block interaction (F2, 40=7.29, P<0.01). Post hoc analyses revealed that amphetamine-treated rats were significantly slower to respond than saline-treated rats during block 1 (t21=−2.71, P<0.05), but not during blocks 2 and 3. Moreover, for amphetamine-treated rats, reaction times were significantly faster in blocks 2 and 3 relative to block 1 (all t21>2.16, P<0.042; data not shown).

Discussion

Using procedures that are analogous to the human Response Bias PRT,5 we developed a new behavioral task to assess reward responsiveness in rats. Reward responsiveness reflects the modulation of a behavioral choice as a function of prior reinforcement history. Under baseline conditions, rats, like healthy human subjects,5 developed a response bias for the more frequently reinforced of two ambiguous stimuli, reflecting robust reward responsiveness. Furthermore, similar to the effects of pramipexole in humans tested with the PRT,15 administration of a low dose of pramipexole in rats attenuated response bias, which demonstrates important cross-species concordance. In addition, amphetamine treatment potentiated response bias in rats. Taken together, these data indicate that reward responsiveness can be quantified in rats and bidirectionally modulated by pharmacological manipulations that alter striatal dopamine neurotransmission, and monoamine neurotransmission in general in the case of amphetamine.

Consistent with data from healthy human subjects,5 both rat strains (Wistar and Long–Evans) displayed a positive response bias and comparable discriminability, reflecting a similar ability to differentiate between the two stimuli. In addition, consistent with the imposed differential reinforcement schedule, accuracy for the rich stimulus improved and accuracy for the lean stimulus declined over the course of the test session in both Wistar and Long–Evans rats, a pattern that is observed in healthy human subjects as well.5 In contrast to Long–Evans rats, reaction times in Wistar rats decreased and increased when responding to the rich and lean stimuli, respectively, also similar to the pattern of responding in healthy human subjects.5 Thus, the four measures (response bias, discriminability, accuracy and reaction time) collected from Wistar rats are virtually identical to those of human subjects,5 suggesting that the two tests are analogous, and Wistar rats use similar strategies and patterns of responding as healthy human subjects. After having established these important psychometric properties, the overarching goal of the next two experiments was to determine whether reward responsiveness could be modulated by pharmacological challenges hypothesized to affect dopaminergic/monoaminergic neurotransmission.

Several lines of evidence suggest that reward responsiveness is partially regulated by mesocorticolimbic dopaminergic circuits.28, 29 First, healthy subjects who developed a response bias in the PRT showed increased striatal activation after reward feedback in a different task.30 Second, subjects with MDD were characterized by both blunted response bias6 and reduced striatal responses to monetary rewards in a different task.31 Third, blunted reward responsiveness in humans could be reproduced in a computational model of frontostriatal circuitry by postulating reduced phasic dopaminergic bursts in response to rewards.28 Fourth, and directly relevant to the present findings, putatively reduced dopamine transmission (achieved by means of low pramipexole doses hypothesized to reduce dopaminergic transmission through autoreceptor activation) was found to reduce response bias in the PRT in healthy subjects compared with placebo-treated controls.15

Consistent with the above results, pramipexole attenuated response bias in rats. In light of prior evidence that similar pramipexole doses suppressed firing of ventral tegmental area dopaminergic neurons20, 21 and striatal dopamine levels22, 23 in rats, we speculate that the reduced reward responsiveness that emerged from the current study resulted from decreased striatal dopamine function. Although the clinical evidence of a dopaminergic mechanism underlying reward responsiveness described above is largely correlational, development of the PRT in rats will allow for a direct test of this hypothesis.

Pramipexole also impaired discriminability compared with saline treatment. The analyses of accuracy data clarified, however, that this finding was driven by the fact that pramipexole-treated rats were less accurate than saline-treated rats in identifying the rich stimulus. In addition, unlike saline-treated rats, pramipexole-treated rats failed to show the expected higher accuracy for the more frequently reinforced rich stimulus relative to the lean stimulus. Together, these selective deficits in responding for the rich stimulus indicate that blunted response bias in the pramipexole-treated group was not due to general difficulties with the task. Finally, consistent with prior reports that low doses of pramipexole suppressed locomotor activity in rats,32 pramipexole-treated rats had slower reaction times than saline-treated rats. These results mirror data showing that healthy humans who received low doses of pramipexole15 and subjects with MDD6 had slower reaction times during the PRT compared with placebo-treated and healthy controls, respectively. Collectively, these findings suggest that a low dose of pramipexole blunted the animals’ ability to modulate behavior as a function of the differential reinforcement schedule.

We further hypothesized that a pharmacological manipulation known to increase striatal dopamine levels would enhance reward responsiveness. In support of this hypothesis, acute amphetamine administration, which elevates striatal dopamine levels27 and enhances the sensitivity of brain reward activity in the intracranial self-stimulation procedure in rats,26 potentiated response bias compared with saline administration. Consistent with these results, acute administration via transdermal patches of another psychomotor stimulant, nicotine, similarly increased response bias in humans.33 Nicotine administration also elevates striatal dopamine levels34 and enhances brain reward function in rats.35 Nonetheless, as amphetamine and nicotine have multiple functions beyond increasing synaptic dopamine levels, further studies are warranted to elucidate the role of dopamine specifically in psychostimulant-induced potentiation of reward responsiveness.

Notably, although discriminability increased across blocks, it did so equally in both saline- and amphetamine-treated rats, indicating that amphetamine-induced potentiation of response bias was not a function of improved discriminability. In addition, in contrast to pramipexole-treated rats, amphetamine-treated rats were significantly more accurate for the rich, but not the lean, stimulus than saline-treated rats, highlighting a preference for the stimulus paired with more frequent reward. Thus, converging evidence from computational modeling,28 human imaging studies,30, 31 human pharmacological studies15 and the present rodent experiments strongly suggests that reward responsiveness is at least partially mediated by striatal dopaminergic mechanisms.

Although it is expected that the differential reinforcement schedule alone should bias responding for the more frequently reinforced stimulus in control subjects,36 the signal detection aspect of the task combined with a moderate degree of ambiguity between the target stimuli likely plays an important role in determining the strength of the bias. Little to no ambiguity between target stimuli is expected to reduce response bias irrespective of reward responsiveness, because subjects would be able to accurately identify each stimulus as they are instructed (humans) or trained (rats) to do. However, when presented with two moderately ambiguous stimuli associated with differential and partial reinforcement, healthy controls quickly develop a robust response bias (that is, preference) for the stimulus paired with more frequent rewards in the past. Faced with the same contingencies, subjects with deficits in reward responsiveness respond with similar accuracy to both stimuli, indicating that their behavior is not modulated by reinforcement history. Critically, such differences emerge even if experimental and control groups (for example, MDD vs healthy control subjects) are exposed to identical numbers of reward feedback and rich/lean reward ratio. Furthermore, it is possible that the differential omission of rewards after correct rich and lean trials would lead to a differential extinction of responding. However, it is unlikely that differential extinction would influence responding because partial or intermittent reinforcement schedules, like the one used in the PRT, are generally resistant to extinction, an effect known as the partial reinforcement extinction effect.37, 38 Thus, the task parameters used in the present study were sufficient to allow for the development of a response bias that was mediated by the differential and partial reinforcement schedules.

As the differential reinforcement schedule is introduced only during the test session, it is expected that intrasession learning of the reinforcement schedule occurs if rats develop a response bias. However, the rate of learning may vary between rats and experiments. For example, in the task development and amphetamine experiments, Wistar, Long–Evans and saline-treated rats displayed a gradual increase in response bias from blocks 1 to 3, whereas saline-treated Wistar rats in the pramipexole experiment displayed a consistently elevated response bias across all three blocks. Thus, during the task development and amphetamine experiments, rats likely learned the reinforcement schedule gradually throughout the three blocks, whereas saline-treated rats in the pramipexole experiment learned the reinforcement schedule more rapidly during the first block. Indeed, in the pramipexole experiment, accuracy for the rich stimulus peaked and was greater than accuracy for the lean stimulus during block 1 in saline-treated rats, reflecting learning of the differential reinforcement schedule during the initial block of testing. It is noteworthy that the rate of learning varies among human subjects and experiments in the human PRT as well. Although some healthy control subjects develop a response bias gradually throughout the test session,5, 8, 15, 39 others appear to learn the reinforcement contingencies within the first block of trials and display a consistently elevated response bias throughout the test session.6, 7, 9, 33

One limitation of the current studies is the use of food restriction, which is often required in rodents to ensure task performance. Potentiated responding for a more frequently reinforced stimulus may thus reflect increased motivation to obtain the food reward. However, participants in the human PRT are likely similarly motivated by the opportunity to earn money. Still, differences in values attributed to the rewards may also influence response patterns during testing. Furthermore, motivation and reward responsiveness are both dependent, at least partially, on dopaminergic neurotransmission.2, 11, 12, 29 Thus, the impact of motivation and potentially reward valuation on the development of response bias cannot be excluded based on the present studies. Conversely, the fact that response bias in humans correlates with subjective anhedonia measures does not necessarily imply that reward responsiveness is mediated by hedonic capacity or pleasure. Rather, most self-report measures of human anhedonia do not discriminate among impairments in discrete reward processes, such as reward responsiveness, motivation and valuation, yet deficits in these processes may be expressed during subjective assessments and imprecisely labeled as anhedonia. Indeed, it is argued that hedonic capacity may be preserved in MDD and that our understanding and assessment of reward-related processes in psychiatric disorders should be expanded beyond hedonic capacity.2, 4 It should be noted, however, that anhedonia is defined as a ‘lack of reactivity to usually pleasurable stimuli’,1 a definition that may be interpreted as decreased responsiveness to a monetary or food reward as in the human and rat PRT, respectively. Nevertheless, the impact of anhedonia, amotivation and other deficits in reward processing on reward responsiveness cannot be excluded and should be separately identified. It is in this spirit that the PRT was developed, with the goal of objectively and reliably assessing a key component of anhedonic/amotivated behavior, namely, a reduced ability to modulate behavior as a function of rewards.

Recently, promising translational behavioral assessments aimed at characterizing such discrete reward-related processes have been developed. For example, Treadway et al.40 have developed a human version of an effort-based decision-making task that is based on a procedure previously developed by Salamone et al.41 to assess motivational processes in rats. Along similar lines, Anderson et al.42 have recently developed corresponding human and rat43 versions of a tone discrimination task to assess emotional biases based on a similar procedure previously developed in rats by Enkel et al.44 This latter task also utilizes a signal-detection approach, but unlike the Response Bias PRT, it assesses responding for both rewards and the avoidance of punishment. Furthermore, the Response Bias PRT assesses implicit learning of reward contingencies during testing, whereas the emotional bias task assesses emotional responding in subjects already trained to differentiate reinforcement contingencies. Thus, each of these tasks assesses different reward-related processes that are likely mediated by different neurobiological mechanisms. Together, these tasks provide a new armamentarium to dissect deficits in reward and motivational processes seen in several neuropsychiatric disorders into homologous psychological and neurobiological components and provide a powerful platform for translational studies across species.

In conclusion, our results highlight the development of a new behavioral assessment of reward responsiveness in rats that is conceptually and procedurally analogous to the Response Bias PRT used in humans and will allow for more direct and unconfounded investigation of neurobiological mechanisms underlying reward responsiveness. Along with similar efforts described above, this approach is expected to bridge the translational gap that currently exists in psychiatric research45 and promote a better understanding of discrete reward-related deficits in neuropsychiatric disorders beyond the traditional definition of anhedonia.