Deep and beautiful. The reward prediction error hypothesis of dopamine
Introduction
According to the reward-prediction error hypothesis of dopamine (RPEH), the phasic activity of dopaminergic neurons in specific regions of the midbrain signals a discrepancy between the predicted and currently experienced reward of a particular event. The RPEH is widely regarded as one of the greatest successes of computational neuroscience. Terrence Sejnowski, a pioneer in computational neuroscience and prominent cognitive scientist, pointed to the RPEH when, in 2012, he was invited by the online magazine Edge.org to answer the question “What is your favorite deep, elegant, or beautiful explanation?” Many researchers in cognitive and brain sciences would agree that this hypothesis “has become the standard model [for explaining dopaminergic activity and reward-based learning] within neuroscience” (Caplin & Dean, 2008, p. 663). Even among critics, the “stunning elegance” and the “beautiful rigor” of the RPEH are recognized (Berridge, 2007, pp. 399, 403).
However, the type of information coded by dopaminergic transmission, along with its functional role in cognition and behaviour, very likely goes beyond reward-prediction error. The RPEH is not the only available hypothesis about what type of information is encoded by dopaminergic activity in the midbrain (cf. Berridge, 2007, Friston et al., 2012, Graybiel, 2008, Wise, 2004). Current evidence does not speak univocally in favour of this hypothesis, and disagreement remains about the extent to which the RPEH is supported by the available evidence (Dayan and Niv, 2008, O’Doherty, 2012, Redgrave and Gurney, 2006). On the one hand, it has been claimed that “to date no alternative has mustered as convincing and multidirectional experimental support as the prediction-error theory of dopamine” (Niv & Montague, 2009, p. 342; see also Glimcher, 2011, Niv, 2009); on the other hand, counter-claims have been put forward that the RPEH is an “elegant illusion” and that “[s]o far, incentive salience predictions [that is, predictions of an alternative hypothesis about dopamine] appear to best fit the data from situations that explicitly pit the dopamine hypotheses against each other” (Berridge, 2007, p. 424).
How, then, has the RPEH become so successful? What exactly does it explain? And, granted that it is at least intuitively uncontroversial that the RPEH is beautiful and elegant, in which sense can it be justifiably deemed deeper than its alternatives? The present paper addresses these questions by first reconstructing the main historical events that led to the formulation and subsequent success of the RPEH (Section 2).
Against this historical background, the paper elucidates what the RPEH explains and how it does so, contrasting it with the incentive salience hypothesis, arguably its most prominent current alternative. It is clarified that both hypotheses are concerned only with what type of information is encoded by dopaminergic activity. Specifically, the RPEH has the dual role of accurately describing the dynamic profile of phasic dopaminergic activity in the midbrain during reward-based learning and decision-making, and of explaining this profile by citing the representational role of dopaminergic phasic activity. If the RPEH is true, then a mechanism composed of midbrain dopaminergic neurons and their phasic activity carries out the task of learning what to do in the face of expected rewards, generating decisions accordingly (Section 3).
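The learning dynamics that the RPEH describes can be illustrated with a minimal temporal-difference (TD) sketch, the formalism behind the hypothesis since Montague et al. (1996). The trial structure (a cue followed two steps later by a reward), the parameters, and the function name below are illustrative assumptions, not a model taken from the paper:

```python
# Minimal TD(0) sketch of the reward-prediction error ("delta") that the
# RPEH takes phasic dopaminergic activity to encode. All parameters and
# the cue-delay-reward trial structure are illustrative assumptions.

def run_trials(n_trials, alpha=0.3, gamma=1.0):
    T = 3                       # steps within a trial: cue, delay, reward
    V = [0.0] * (T + 1)         # learned value of each step; V[T] is terminal
    history = []
    for _ in range(n_trials):
        trial = []
        # Error at cue onset: the cue itself is unpredicted, so the
        # pre-cue baseline value is fixed at zero.
        trial.append(gamma * V[0] - 0.0)
        for t in range(T):
            r = 1.0 if t == T - 1 else 0.0       # reward at the last step
            delta = r + gamma * V[t + 1] - V[t]  # reward-prediction error
            V[t] += alpha * delta                # TD(0) value update
            trial.append(delta)
        history.append(trial)
    return history

history = run_trials(200)
# Early in training the prediction error occurs at reward delivery; after
# learning it migrates to cue onset, mirroring the shift in phasic
# dopaminergic firing reported by Schultz et al. (1997).
print("first trial:", [round(d, 2) for d in history[0]])
print("last trial: ", [round(d, 2) for d in history[-1]])
```

On this sketch, a fully predicted reward elicits no error, while an unpredicted cue does, which is the dynamic profile the RPEH takes phasic dopaminergic activity to track.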
The paper finally explicates under which conditions some explanation of learning, motivation or decision-making phenomena based on the RPEH can be justifiably deemed deeper than some alternative explanation based on the incentive salience hypothesis. Two accounts of explanatory depth are considered. According to one account, deeper explanatory generalizations have wider scope (e.g., Hempel, 1959); according to the other, deeper explanatory generalizations show more degrees of invariance (e.g., Woodward & Hitchcock, 2003). It is argued that, although it is premature to maintain that explanations based on the RPEH are actually deeper—in either of these two senses of explanatory depth—than alternative explanations based on the incentive salience hypothesis, relevant available evidence indicates that they may well be (Section 4). The contribution of the paper to existing literature is summarised in the conclusion.
Section snippets
Reward-prediction error meets dopamine
Dopamine is a neurotransmitter in the brain.
Reward-prediction error and incentive salience: what do they explain?
In light of Montague et al., 1996, Schultz et al., 1997, the RPEH can now be more precisely characterised. The hypothesis states that the phasic firing of dopaminergic neurons in the ventral tegmental area and substantia nigra “in part” encodes reward-prediction errors. Montague and colleagues did not claim that all types of activity in all dopaminergic neurons encode only (or in all circumstances) reward-prediction errors. Their hypothesis is about “a particular relationship between the causes
Explanatory depth, reward-prediction error and incentive salience
A number of accounts of explanatory depth have recently been proposed in philosophy of science (e.g., Woodward and Hitchcock, 2003, Strevens, 2009, Weslake, 2010). While significantly different, these accounts agree that explanatory depth is a feature of generalizations that express the relationship between an explanans and an explanandum.
According to Woodward and Hitchcock (2003), in order to be genuinely explanatory, a generalization should exhibit patterns of counterfactual dependence
Conclusion
This paper has made two types of contributions to existing literature, which should be of interest to both historians and philosophers of cognitive science. First, the paper has provided a comprehensive historical overview of the main steps that have led to the formulation of the RPEH. Second, in light of this historical overview, it has made explicit what precisely the RPEH and the ISH explain, and under which circumstances neurocomputational explanations of learning and decision-making
Acknowledgements
I am sincerely grateful to Aistis Stankevicius, Charles Rathkopf, Peter Dayan, and especially to Gregory Radick, editor of this journal, and to two anonymous referees, for their encouragement, constructive criticisms and helpful suggestions. The work on this project was supported by the Deutsche Forschungsgemeinschaft (DFG) as part of the priority program “New Frameworks of Rationality” ([SPP 1516]). The usual disclaimers about any remaining error or misconception in the paper apply.
References (110)
- Theoretical neuroscience rising. Neuron (2008).
- et al. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron (2005).
- et al. What is the role of dopamine in reward: Hedonic impact, reward learning, or incentive salience? Brain Research Reviews (1998).
- et al. Dopamine neuron systems in the brain: An update. Trends in Neurosciences (2007).
- A map of the rat mesencephalon for electrical self-stimulation. Brain Research (1972).
- Computational modelling. Current Opinion in Neurobiology (1994).
- Twenty-five lessons from computational neuromodulation. Neuron (2012).
- et al. Reinforcement learning: The good, the bad and the ugly. Current Opinion in Neurobiology (2008).
- Value-dependent selection in the brain: Simulation in a synthetic neural model. Neuroscience (1994).
- Phasic versus tonic dopamine release and the modulation of dopamine system responsivity: A hypothesis for the etiology of schizophrenia. Neuroscience (1991).