Abstract
Correlative studies have strongly linked phasic changes in dopamine activity with reward prediction error signaling. But causal evidence that these brief changes in firing actually serve as error signals to drive associative learning is more tenuous. Although there is direct evidence that brief increases can substitute for positive prediction errors, there is no comparable evidence that similarly brief pauses can substitute for negative prediction errors. In the absence of such evidence, the effect of increases in firing could reflect novelty or salience, variables also correlated with dopamine activity. Here we provide evidence in support of the proposed linkage, showing in a modified Pavlovian over-expectation task that brief pauses in the firing of dopamine neurons in rat ventral tegmental area at the time of reward are sufficient to mimic the effects of endogenous negative prediction errors. These results support the proposal that brief changes in the firing of dopamine neurons serve as full-fledged bidirectional prediction error signals.
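The over-expectation logic underlying the task can be sketched with a minimal Rescorla–Wagner simulation (ref. 15): when two separately trained cues are presented in compound but followed by only a single reward, the summed prediction exceeds the outcome, producing a negative prediction error that weakens both cues. The parameter values and cue names below are illustrative assumptions, not taken from the paper.

```python
# Minimal Rescorla-Wagner sketch of the over-expectation effect.
# Learning rate, trial counts and asymptote (lambda) are illustrative.

def rw_update(V, lam, alpha=0.1):
    """One trial: each cue's strength moves by alpha * (lambda - summed
    prediction); the bracketed term is the prediction error."""
    error = lam - sum(V.values())  # negative when reward is over-predicted
    return {cue: v + alpha * error for cue, v in V.items()}, error

# Phase 1: train cues A and X separately, each predicting one reward.
V_A, V_X = {"A": 0.0}, {"X": 0.0}
for _ in range(100):
    V_A, _ = rw_update(V_A, lam=1.0)
    V_X, _ = rw_update(V_X, lam=1.0)

# Phase 2: compound AX is still followed by the same single reward.
# The summed prediction (~2) exceeds the outcome (1), so the error is
# negative and both cues lose strength -- the over-expectation effect.
V = {"A": V_A["A"], "X": V_X["X"]}
for _ in range(20):
    V, err = rw_update(V, lam=1.0)

print(err)      # negative prediction error on the final compound trial
print(V["A"])   # below its phase-1 asymptote of ~1
```

In this framework, the paper's manipulation amounts to imposing the negative `err` term artificially, via brief optogenetic pauses in dopamine firing, rather than via an actual mismatch between predicted and delivered reward.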
References
Rescorla, R.A. & Wagner, A.R. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. in Classical Conditioning II: Current Research and Theory (eds. Black, A.H. & Prokasy, W.F.) 64–99 (Appleton-Century-Crofts, New York, 1972).
Sutton, R.S. Learning to predict by the methods of temporal differences. Mach. Learn. 3, 9–44 (1988).
Mirenowicz, J. & Schultz, W. Importance of unpredictability for reward responses in primate dopamine neurons. J. Neurophysiol. 72, 1024–1027 (1994).
Roesch, M.R., Calu, D.J. & Schoenbaum, G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat. Neurosci. 10, 1615–1624 (2007).
Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001).
Pan, W.-X., Schmidt, R., Wickens, J.R. & Hyland, B.I. Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. J. Neurosci. 25, 6235–6242 (2005).
D'Ardenne, K., McClure, S.M., Nystrom, L.E. & Cohen, J.D. BOLD responses reflecting dopaminergic signals in the human ventral tegmental area. Science 319, 1264–1267 (2008).
Day, J.J., Roitman, M.F., Wightman, R.M. & Carelli, R.M. Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens. Nat. Neurosci. 10, 1020–1028 (2007).
Hart, A.S., Rutledge, R.B., Glimcher, P.W. & Phillips, P.E. Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J. Neurosci. 34, 698–704 (2014).
Schultz, W., Dayan, P. & Montague, P.R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).
Redgrave, P., Gurney, K. & Reynolds, J. What is reinforced by phasic dopamine signals? Brain Res. Rev. 58, 322–339 (2008).
Zweifel, L.S. et al. Disruption of NMDAR-dependent burst firing by dopamine neurons provides selective assessment of phasic dopamine-dependent behavior. Proc. Natl. Acad. Sci. USA 106, 7281–7288 (2009).
Steinberg, E.E. et al. A causal link between prediction errors, dopamine neurons and learning. Nat. Neurosci. 16, 966–973 (2013).
Tsai, H.C. et al. Phasic firing in dopaminergic neurons is sufficient for behavioral conditioning. Science 324, 1080–1084 (2009).
Frank, M.J., Moustafa, A.A., Haughey, H.M., Curran, T. & Hutchison, K.E. Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proc. Natl. Acad. Sci. USA 104, 16311–16316 (2007).
Kim, K.M. et al. Optogenetic mimicry of the transient activation of dopamine neurons by natural reward is sufficient for operant reinforcement. PLoS One 7, e33612 (2012).
Stopper, C.M., Tse, M.T., Montes, D.R., Wiedman, C.R. & Floresco, S.B. Overriding phasic dopamine signals redirects action selection during risk/reward decision making. Neuron 84, 177–189 (2014).
Shumake, J., Ilango, A., Scheich, H., Wetzel, W. & Ohl, F.W. Differential neuromodulation of acquisition and retrieval of avoidance learning by the lateral habenula and ventral tegmental area. J. Neurosci. 30, 5876–5883 (2010).
Stamatakis, A.M. & Stuber, G.D. Activation of lateral habenula inputs to the ventral midbrain promotes behavioral avoidance. Nat. Neurosci. 15, 1105–1107 (2012).
Danna, C.L., Shepard, P.D. & Elmer, G.I. The habenula governs the attribution of incentive salience to reward predictive cues. Front. Hum. Neurosci. 7, 781 (2013).
Bayer, H.M., Lau, B. & Glimcher, P.W. Statistics of midbrain dopamine neuron spike trains in the awake primate. J. Neurophysiol. 98, 1428–1439 (2007).
Glimcher, P.W. Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc. Natl. Acad. Sci. USA 108 (suppl. 3), 15647–15654 (2011).
Kakade, S. & Dayan, P. Dopamine: generalization and bonuses. Neural Netw. 15, 549–559 (2002).
Matsumoto, M. & Hikosaka, O. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459, 837–841 (2009).
Pearce, J.M., Kaye, H. & Hall, G. Predictive accuracy and stimulus associability: development of a model for Pavlovian learning. in Quantitative Analyses of Behavior (eds. Commons, M.L., Herrnstein, R.J. & Wagner, A.R.) 241–255 (Ballinger, Cambridge, Massachusetts, USA, 1982).
Esber, G.R. & Haselgrove, M. Reconciling the influence of predictiveness and uncertainty on stimulus salience: a model of attention in associative learning. Proc. R. Soc. Lond. B 278, 2553–2561 (2011).
Rescorla, R.A. Renewal after overexpectation. Learn. Behav. 35, 19–26 (2007).
Rescorla, R.A. Spontaneous recovery from overexpectation. Learn. Behav. 34, 13–20 (2006).
Niv, Y., Daw, N.D., Joel, D. & Dayan, P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl.) 191, 507–520 (2007).
Salamone, J.D. & Correa, M. The mysterious motivational functions of mesolimbic dopamine. Neuron 76, 470–485 (2012).
Berridge, K.C. & Robinson, T.E. What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? Brain Res. Brain Res. Rev. 28, 309–369 (1998).
Ikemoto, S., Yang, C. & Tan, A. Basal ganglia circuit loops, dopamine and motivation: a review and enquiry. Behav. Brain Res. 290, 17–31 (2015).
Takahashi, Y.K. et al. The orbitofrontal cortex and ventral tegmental area are necessary for learning from unexpected outcomes. Neuron 62, 269–280 (2009).
Takahashi, Y.K. et al. Neural estimates of imagined outcomes in the orbitofrontal cortex drive behavior and learning. Neuron 80, 507–518 (2013).
Rescorla, R.A. Reduction in the effectiveness of reinforcement after prior excitatory conditioning. Learn. Motiv. 1, 372–381 (1970).
Witten, I.B. et al. Recombinase-driver rat lines: tools, techniques, and optogenetic application to dopamine-mediated reinforcement. Neuron 72, 721–733 (2011).
Lammel, S. et al. Diversity of transgenic mouse models for selective targeting of midbrain dopamine neurons. Neuron 85, 429–438 (2015).
Stamatakis, A.M. et al. A unique population of ventral tegmental area neurons inhibits the lateral habenula to promote reward. Neuron 80, 1039–1053 (2013).
Stuber, G.D., Stamatakis, A.M. & Kantak, P.A. Considerations when using cre-driver rodent lines for studying ventral tegmental area circuitry. Neuron 85, 439–445 (2015).
Yamaguchi, T., Qi, J., Wang, H.-L., Zhang, S. & Morales, M. Glutamatergic and dopaminergic neurons in the mouse ventral tegmental area. Eur. J. Neurosci. 41, 760–772 (2015).
Li, X., Qi, J., Yamaguchi, T., Wang, H.-L. & Morales, M. Heterogeneous composition of dopamine neurons of the rat A10 region: molecular evidence for diverse signaling properties. Brain Struct. Funct. 218, 1159–1176 (2013).
Root, D.H. et al. Norepinephrine activates dopamine D4 receptors in the rat lateral habenula. J. Neurosci. 35, 3460–3469 (2015).
Tecuapetla, F. et al. Glutamatergic signaling by mesolimbic dopamine neurons in the nucleus accumbens. J. Neurosci. 30, 7105–7110 (2010).
Stuber, G.D., Hnasko, T.S., Britt, J.P., Edwards, R.H. & Bonci, A. Dopaminergic terminals in the nucleus accumbens but not the dorsal striatum corelease glutamate. J. Neurosci. 30, 8229–8233 (2010).
Zhang, S. et al. Dopaminergic and glutamatergic microdomains in a subset of rodent mesoaccumbens axons. Nat. Neurosci. 18, 386–392 (2015).
Matsumoto, M. & Hikosaka, O. Lateral habenula as a source of negative reward signals in dopamine neurons. Nature 447, 1111–1115 (2007).
Hong, S., Jhou, T.C., Smith, M., Saleem, K.S. & Hikosaka, O. Negative reward signals from the lateral habenula to dopamine neurons are mediated by rostromedial tegmental nucleus in primates. J. Neurosci. 31, 11457–11471 (2011).
Ji, H. & Shepard, P.D. Lateral habenula stimulation inhibits rat midbrain dopamine neurons through a GABAA receptor-mediated mechanism. J. Neurosci. 27, 6923–6930 (2007).
Mileykovskiy, B. & Morales, M. Duration of inhibition of ventral tegmental area dopamine neurons encodes a level of conditioned fear. J. Neurosci. 31, 7471–7476 (2011).
Ilango, A. et al. Similar roles of substantia nigra and ventral tegmental dopamine neurons in reward and aversion. J. Neurosci. 34, 817–822 (2014).
Acknowledgements
This work was supported by the Intramural Research Program at the US National Institute on Drug Abuse (NIDA). The authors would like to thank K. Deisseroth and the Gene Therapy Center at the University of North Carolina at Chapel Hill for providing viral reagents, and G. Stuber for technical advice on their use. We would also like to thank B. Harvey and the NIDA Optogenetic and Transgenic Core and M. Morales and the NIDA Histology Core for their assistance. The opinions expressed in this article are the authors' own and do not reflect the views of the US National Institutes of Health/Department of Health and Human Services.
Author information
Authors and Affiliations
Contributions
C.Y.C. and G.S. conceived the experiment; C.Y.C. carried out the experiment, with help from G.R.E. and Y.M.-G. on the behavioral design and histology and from H.-J.Y. and A.B. on the slice physiology; C.Y.C. and G.S. analyzed the data and prepared the manuscript, in consultation with the other authors, particularly G.R.E., whose input on learning theory issues was invaluable.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Targeting of eYFP expression to the VTA.
eYFP expression (green) resided within the border of TH expression (red) and showed relatively high specificity of co-localization with TH within individual cells. Scale bar: 1 mm. Note that because the images were taken with large-field scanning, the signal intensity during acquisition was adjusted to capture the overall brightness of the entire field without losing positive signal that was relatively weak by comparison. This inevitably makes the co-localization in some merged images appear dominated by one color over the other (e.g., the −6.9 mm panels). However, cell counting on high-magnification images showed that ~85% of the eYFP-positive neurons were also TH+ (see main text).
Supplementary Figure 2 Responding to the visual cue.
Rats learned to respond to the visual cue, and there were no main effects of, or interactions with, group during either conditioning or compound training (F's < 1.2, p's > 0.93). Note that responding to the visual cue was somewhat lower than to the auditory cues; this is a normal difference in the strength and form of conditioned responding between visual and auditory cues, seen in our lab and others. In addition, during the compound sessions the visual cue was presented alone and reinforced, without light delivery, in order to push any effect of stimulation onto the auditory cue. Thus we did not expect (nor did we look for) any changes in responding to this cue in the probe test; the relevant comparisons are between the auditory cues.
Supplementary Figure 3 Rearing behavior.
All rats showed low levels of rearing during cue presentation, with no main effects of group nor any differences between groups in any phase of training (F's < 0.25, p's > 0.92), as we have reported previously6. We typically normalize for rearing because we have found that its removal reduces the variability in our measures.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–3 (PDF 492 kb)
Cite this article
Chang, C., Esber, G., Marrero-Garcia, Y. et al. Brief optogenetic inhibition of dopamine neurons mimics endogenous negative reward prediction errors. Nat Neurosci 19, 111–116 (2016). https://doi.org/10.1038/nn.4191
This article is cited by
- Striatal dopamine signals reflect perceived cue–action–outcome associations in mice. Nature Neuroscience (2024)
- Spontaneous recovery from overexpectation in an insect. Scientific Reports (2022)
- Dopamine encodes real-time reward availability and transitions between reward availability states on different timescales. Nature Communications (2022)
- The Role of the Striatum in Motor Learning. Neuroscience and Behavioral Physiology (2022)
- The Role of the Striatum in Organizing Voluntary Behavior. Neuroscience and Behavioral Physiology (2021)