1
Gershman SJ, Assad JA, Datta SR, Linderman SW, Sabatini BL, Uchida N, Wilbrecht L. Explaining dopamine through prediction errors and beyond. Nat Neurosci 2024; 27:1645-1655. PMID: 39054370; DOI: 10.1038/s41593-024-01705-4.
Abstract
The most influential account of phasic dopamine holds that it reports reward prediction errors (RPEs). The RPE-based interpretation of dopamine signaling is, in its original form, probably too simple and fails to explain all the properties of phasic dopamine observed in behaving animals. This Perspective helps to resolve some of the conflicting interpretations of dopamine that currently exist in the literature. We focus on the following three empirical challenges to the RPE theory of dopamine: why does dopamine (1) ramp up as animals approach rewards, (2) respond to sensory and motor features and (3) influence action selection? We argue that the prediction error concept, once it has been suitably modified and generalized based on an analysis of each computational problem, answers each challenge. Nonetheless, there are a number of additional empirical findings that appear to demand fundamentally different theoretical explanations beyond encoding RPE. Therefore, looking forward, we discuss the prospects for a unifying theory that respects the diversity of dopamine signaling and function as well as the complex circuitry that both underlies and responds to dopaminergic transmission.
Affiliation(s)
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
- John A Assad
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Scott W Linderman
- Department of Statistics and Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Bernardo L Sabatini
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Naoshige Uchida
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Linda Wilbrecht
- Department of Psychology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
2
Runyon K, Bui T, Mazanek S, Hartle A, Marschalko K, Howe WM. Distinct cholinergic circuits underlie discrete effects of reward on attention. Front Mol Neurosci 2024; 17:1429316. PMID: 39268248; PMCID: PMC11390659; DOI: 10.3389/fnmol.2024.1429316.
Abstract
Attention and reward are functions that are critical for the control of behavior, and massive multi-region neural systems have evolved to support the discrete computations associated with each. Previous research has also identified that attention and reward interact, though our understanding of the neural mechanisms that mediate this interplay is incomplete. Here, we review the basic neuroanatomy of attention, reward, and cholinergic systems. We then examine specific contexts in which attention and reward computations interact. Building on this work, we propose two discrete neural circuits whereby acetylcholine, released from cell groups located in different parts of the brain, mediates the impact of stimulus-reward associations as well as motivation on attentional control. We conclude by examining these circuits as potential shared loci of dysfunction across disease states associated with deficits in attention and reward.
Affiliation(s)
- Kelly Runyon
- School of Neuroscience at Virginia Tech, Blacksburg, VA, United States
- Tung Bui
- School of Neuroscience at Virginia Tech, Blacksburg, VA, United States
- Sarah Mazanek
- School of Neuroscience at Virginia Tech, Blacksburg, VA, United States
- Alec Hartle
- School of Neuroscience at Virginia Tech, Blacksburg, VA, United States
- Katie Marschalko
- School of Neuroscience at Virginia Tech, Blacksburg, VA, United States
3
Basu A, Yang JH, Yu A, Glaeser-Khan S, Rondeau JA, Feng J, Krystal JH, Li Y, Kaye AP. Frontal Norepinephrine Represents a Threat Prediction Error Under Uncertainty. Biol Psychiatry 2024; 96:256-267. PMID: 38316333; PMCID: PMC11269024; DOI: 10.1016/j.biopsych.2024.01.025.
Abstract
BACKGROUND: To adapt to threats in the environment, animals must predict them and engage in defensive behavior. While the representation of a prediction error signal for reward has been linked to dopamine, a neuromodulatory prediction error for aversive learning has not been identified.
METHODS: We measured and manipulated norepinephrine release during threat learning using optogenetics and a novel fluorescent norepinephrine sensor.
RESULTS: We found that norepinephrine response to conditioned stimuli reflects aversive memory strength. When delays between auditory stimuli and footshock are introduced, norepinephrine acts as a prediction error signal. However, temporal difference prediction errors do not fully explain norepinephrine dynamics. To explain noradrenergic signaling, we used an updated reinforcement learning model with uncertainty about time and found that it explained norepinephrine dynamics across learning and variations in temporal and auditory task structure.
CONCLUSIONS: Norepinephrine thus combines cognitive and affective information into a predictive signal and links time with the anticipation of danger.
Affiliation(s)
- Aakash Basu
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut; Interdepartmental Neuroscience Program, Yale School of Medicine, New Haven, Connecticut
- Jen-Hau Yang
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut
- Abigail Yu
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut
- Jocelyne A Rondeau
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut
- Jiesi Feng
- State Key Laboratory of Membrane Biology, Peking University School of Life Sciences, Beijing, China
- John H Krystal
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut; Clinical Neuroscience Division, Veterans Administration National Center for PTSD, West Haven, Connecticut
- Yulong Li
- State Key Laboratory of Membrane Biology, Peking University School of Life Sciences, Beijing, China; Peking University-IDG/McGovern Institute for Brain Research, Beijing, China; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China; Chinese Institute for Brain Research, Beijing, China
- Alfred P Kaye
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut; Clinical Neuroscience Division, Veterans Administration National Center for PTSD, West Haven, Connecticut; Wu Tsai Institute, Yale University, New Haven, Connecticut
4
Cone I, Clopath C, Shouval HZ. Learning to express reward prediction error-like dopaminergic activity requires plastic representations of time. Nat Commun 2024; 15:5856. PMID: 38997276; PMCID: PMC11245539; DOI: 10.1038/s41467-024-50205-3.
Abstract
The dominant theoretical framework to account for reinforcement learning in the brain is temporal difference (TD) learning, whereby certain units signal reward prediction errors (RPEs). The TD algorithm has traditionally been mapped onto the dopaminergic system, as the firing properties of dopamine neurons can resemble RPEs. However, certain predictions of TD learning are inconsistent with experimental results, and previous implementations of the algorithm have made unscalable assumptions regarding stimulus-specific fixed temporal bases. We propose an alternative framework to describe dopamine signaling in the brain, FLEX (Flexibly Learned Errors in Expected Reward). In FLEX, dopamine release is similar, but not identical, to RPE, leading to predictions that contrast with those of TD. While FLEX itself is a general theoretical framework, we describe a specific, biophysically plausible implementation whose results are consistent with a preponderance of both existing and reanalyzed experimental data.
Affiliation(s)
- Ian Cone
- Department of Bioengineering, Imperial College London, London, UK
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX, USA
- Applied Physics Program, Rice University, Houston, TX, USA
- Claudia Clopath
- Department of Bioengineering, Imperial College London, London, UK
- Harel Z Shouval
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX, USA
- Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
5
Song MR, Lee SW. Rethinking dopamine-guided action sequence learning. Eur J Neurosci 2024; 60:3447-3465. PMID: 38798086; DOI: 10.1111/ejn.16426.
Abstract
As opposed to those requiring a single action for reward acquisition, tasks necessitating action sequences demand that animals learn action elements and their sequential order and sustain the behaviour until the sequence is completed. With repeated learning, animals not only exhibit precise execution of these sequences but also demonstrate enhanced smoothness and efficiency. Previous research has demonstrated that midbrain dopamine and its major projection target, the striatum, play crucial roles in these processes. Recent studies have shown that dopamine from the substantia nigra pars compacta (SNc) and the ventral tegmental area (VTA) serve distinct functions in action sequence learning. The distinct contributions of dopamine also depend on the striatal subregions, namely the ventral, dorsomedial and dorsolateral striatum. Here, we have reviewed recent findings on the role of striatal dopamine in action sequence learning, with a focus on recent rodent studies.
Affiliation(s)
- Minryung R Song
- Department of Brain and Cognitive Sciences, KAIST, Daejeon, South Korea
- Sang Wan Lee
- Department of Brain and Cognitive Sciences, KAIST, Daejeon, South Korea
- Kim Jaechul Graduate School of AI, KAIST, Daejeon, South Korea
- KI for Health Science and Technology, KAIST, Daejeon, South Korea
- Center for Neuroscience-inspired AI, KAIST, Daejeon, South Korea
6
Pellegrino A, Stein H, Cayco-Gajic NA. Dimensionality reduction beyond neural subspaces with slice tensor component analysis. Nat Neurosci 2024; 27:1199-1210. PMID: 38710876; DOI: 10.1038/s41593-024-01626-2.
Abstract
Recent work has argued that large-scale neural recordings are often well described by patterns of coactivation across neurons. Yet the view that neural variability is constrained to a fixed, low-dimensional subspace may overlook higher-dimensional structure, including stereotyped neural sequences or slowly evolving latent spaces. Here we argue that task-relevant variability in neural data can also cofluctuate over trials or time, defining distinct 'covariability classes' that may co-occur within the same dataset. To demix these covariability classes, we develop sliceTCA (slice tensor component analysis), a new unsupervised dimensionality reduction method for neural data tensors. In three example datasets, including motor cortical activity during a classic reaching task in primates and recent multiregion recordings in mice, we show that sliceTCA can capture more task-relevant structure in neural data using fewer components than traditional methods. Overall, our theoretical framework extends the classic view of low-dimensional population activity by incorporating additional classes of latent variables capturing higher-dimensional structure.
Affiliation(s)
- Arthur Pellegrino
- Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Département D'Etudes Cognitives, Ecole Normale Supérieure, PSL University, Paris, France
- Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Edinburgh, UK
- Heike Stein
- Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Département D'Etudes Cognitives, Ecole Normale Supérieure, PSL University, Paris, France
- N Alex Cayco-Gajic
- Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Département D'Etudes Cognitives, Ecole Normale Supérieure, PSL University, Paris, France
7
Abe K, Kambe Y, Majima K, Hu Z, Ohtake M, Momennezhad A, Izumi H, Tanaka T, Matunis A, Stacy E, Itokazu T, Sato TR, Sato T. Functional diversity of dopamine axons in prefrontal cortex during classical conditioning. eLife 2024; 12:RP91136. PMID: 38747563; PMCID: PMC11095940; DOI: 10.7554/elife.91136.
Abstract
Midbrain dopamine neurons impact neural processing in the prefrontal cortex (PFC) through mesocortical projections. However, the signals conveyed by dopamine projections to the PFC remain unclear, particularly at the single-axon level. Here, we investigated dopaminergic axonal activity in the medial PFC (mPFC) during reward and aversive processing. By optimizing microprism-mediated two-photon calcium imaging of dopamine axon terminals, we found diverse activity in dopamine axons responsive to both reward and aversive stimuli. Some axons exhibited a preference for reward, while others favored aversive stimuli, and there was a strong bias for the latter at the population level. Long-term longitudinal imaging revealed that the preference was maintained in reward- and aversive-preferring axons throughout classical conditioning in which rewarding and aversive stimuli were paired with preceding auditory cues. However, as mice learned to discriminate reward or aversive cues, a cue activity preference gradually developed only in aversive-preferring axons. We inferred the trial-by-trial cue discrimination based on machine learning using anticipatory licking or facial expressions, and found that successful discrimination was accompanied by sharper selectivity for the aversive cue in aversive-preferring axons. Our findings indicate that a group of mesocortical dopamine axons encodes aversive-related signals, which are modulated by both classical conditioning across days and trial-by-trial discrimination within a day.
Affiliation(s)
- Kenta Abe
- Department of Neuroscience, Medical University of South Carolina, Charleston, United States
- Yuki Kambe
- Department of Pharmacology, Kagoshima University, Kagoshima, Japan
- Kei Majima
- Institute for Quantum Life Science, National Institutes for Quantum Science and Technology, Chiba, Japan
- Japan Science and Technology PRESTO, Saitama, Japan
- Zijing Hu
- Department of Physiology, Monash University, Clayton, Australia
- Neuroscience Program, Biomedicine Discovery Institute, Monash University, Clayton, Australia
- Makoto Ohtake
- Department of Neuroscience, Medical University of South Carolina, Charleston, United States
- Ali Momennezhad
- Department of Pharmacology, Kagoshima University, Kagoshima, Japan
- Hideki Izumi
- Faculty of Data Science, Shiga University, Shiga, Japan
- Ashley Matunis
- Department of Neuroscience, Medical University of South Carolina, Charleston, United States
- Department of Biology, College of Charleston, Charleston, United States
- Department of Neuro-Medical Science, Osaka University, Osaka, Japan
- Emma Stacy
- Department of Neuroscience, Medical University of South Carolina, Charleston, United States
- Department of Biology, College of Charleston, Charleston, United States
- Takashi R Sato
- Department of Neuroscience, Medical University of South Carolina, Charleston, United States
- Tatsuo Sato
- Department of Pharmacology, Kagoshima University, Kagoshima, Japan
- Japan Science and Technology PRESTO, Saitama, Japan
- Department of Physiology, Monash University, Clayton, Australia
- Neuroscience Program, Biomedicine Discovery Institute, Monash University, Clayton, Australia
- Japan Science and Technology FOREST, Saitama, Japan
8
Schultz W. A dopamine mechanism for reward maximization. Proc Natl Acad Sci U S A 2024; 121:e2316658121. PMID: 38717856; PMCID: PMC11098095; DOI: 10.1073/pnas.2316658121.
Abstract
Individual survival and evolutionary selection require biological organisms to maximize reward. Economic choice theories define the necessary and sufficient conditions, and neuronal signals of decision variables provide mechanistic explanations. Reinforcement learning (RL) formalisms use predictions, actions, and policies to maximize reward. Midbrain dopamine neurons code reward prediction errors (RPEs) of subjective reward value suitable for RL. Electrical and optogenetic self-stimulation experiments demonstrate that monkeys and rodents repeat behaviors that result in dopamine excitation. Dopamine excitations reflect positive RPEs that increase reward predictions via RL; against these higher predictions, obtaining similar dopamine RPE signals again requires better rewards than before. The positive RPEs drive predictions higher again and thus advance a recursive reward-RPE-prediction iteration toward better and better rewards. Agents also avoid dopamine inhibitions that lower reward predictions via RL, which allows smaller rewards than before to elicit positive dopamine RPE signals and resume the iteration toward better rewards. In this way, dopamine RPE signals serve as a causal mechanism that attracts agents via RL to the best rewards. The mechanism improves daily life and benefits evolutionary selection but may also induce restlessness and greed.
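The recursive reward-RPE-prediction iteration described in this abstract can be caricatured in a few lines of code (an illustrative sketch under assumptions of my own: a two-option bandit with an epsilon-greedy policy, which is not the paper's model). Positive RPEs raise the prediction for the chosen option, so only better rewards keep producing positive errors, and the agent is progressively attracted to the best available reward:

```python
import random

random.seed(0)

rewards = {"A": 0.5, "B": 1.0}   # option B is objectively better
Q = {"A": 0.0, "B": 0.0}         # learned reward predictions
alpha, epsilon = 0.2, 0.1

choices = []
for _ in range(500):
    # mostly exploit current predictions, occasionally explore
    a = random.choice(["A", "B"]) if random.random() < epsilon else max(Q, key=Q.get)
    rpe = rewards[a] - Q[a]      # dopamine-like prediction error
    Q[a] += alpha * rpe          # a positive RPE raises the prediction,
    choices.append(a)            # so only better rewards keep exciting "dopamine"

print(choices[-50:].count("B"))  # late in learning, mostly the better option
```

The greedy policy plus RPE-driven updates reproduce the attractor dynamics the abstract describes: once predictions catch up to an option's reward, its RPE vanishes, and only a better option can still generate positive errors.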
Affiliation(s)
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
9
Bernklau TW, Righetti B, Mehrke LS, Jacob SN. Striatal dopamine signals reflect perceived cue-action-outcome associations in mice. Nat Neurosci 2024; 27:747-757. PMID: 38291283; PMCID: PMC11001585; DOI: 10.1038/s41593-023-01567-2.
Abstract
Striatal dopamine drives associative learning by acting as a teaching signal. Much work has focused on simple learning paradigms, including Pavlovian and instrumental learning. However, higher cognition requires that animals generate internal concepts of their environment, where sensory stimuli, actions and outcomes become flexibly associated. Here, we performed fiber photometry dopamine measurements across the striatum of male mice as they learned cue-action-outcome associations based on implicit and changing task rules. Reinforcement learning models of the behavioral and dopamine data showed that rule changes lead to adjustments of learned cue-action-outcome associations. After rule changes, mice discarded learned associations and reset outcome expectations. Cue- and outcome-triggered dopamine signals became uncoupled and dependent on the adopted behavioral strategy. As mice learned the new association, coupling between cue- and outcome-triggered dopamine signals and task performance re-emerged. Our results suggest that dopaminergic reward prediction errors reflect an agent's perceived locus of control.
Affiliation(s)
- Tobias W Bernklau
- Translational Neurotechnology Laboratory, Department of Neurosurgery, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Graduate School of Systemic Neurosciences, Ludwig-Maximilians-University Munich, Munich, Germany
- Beatrice Righetti
- Translational Neurotechnology Laboratory, Department of Neurosurgery, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Leonie S Mehrke
- Translational Neurotechnology Laboratory, Department of Neurosurgery, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Simon N Jacob
- Translational Neurotechnology Laboratory, Department of Neurosurgery, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
10
Amo R, Uchida N, Watabe-Uchida M. Glutamate inputs send prediction error of reward, but not negative value of aversive stimuli, to dopamine neurons. Neuron 2024; 112:1001-1019.e6. PMID: 38278147; PMCID: PMC10957320; DOI: 10.1016/j.neuron.2023.12.019.
Abstract
Midbrain dopamine neurons are thought to signal reward prediction errors (RPEs), but the mechanisms underlying RPE computation, particularly the contributions of different neurotransmitters, remain poorly understood. Here, we used a genetically encoded glutamate sensor to examine the pattern of glutamate inputs to dopamine neurons in mice. We found that glutamate inputs exhibit virtually all of the characteristics of RPE, rather than conveying a specific component of RPE computation such as reward or expectation. Notably, whereas glutamate inputs were transiently inhibited by reward omission, they were excited by aversive stimuli. Opioid analgesics shifted dopamine responses to aversive stimuli from negative toward more positive, whereas the excitatory responses of glutamate inputs remained unchanged. Our findings uncover previously unknown synaptic mechanisms underlying RPE computation: dopamine responses are shaped by both synergistic and competitive interactions between glutamatergic and GABAergic inputs to dopamine neurons depending on valence, with competitive interactions playing a role in responses to aversive stimuli.
Affiliation(s)
- Ryunosuke Amo
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Naoshige Uchida
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Mitsuko Watabe-Uchida
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
11
Abe K, Kambe Y, Majima K, Hu Z, Ohtake M, Momennezhad A, Izumi H, Tanaka T, Matunis A, Stacy E, Itokazu T, Sato TR, Sato TK. Functional Diversity of Dopamine Axons in Prefrontal Cortex During Classical Conditioning. bioRxiv [Preprint] 2024:2023.08.23.554475. PMID: 37662305; PMCID: PMC10473671; DOI: 10.1101/2023.08.23.554475.
Abstract
Midbrain dopamine neurons impact neural processing in the prefrontal cortex (PFC) through mesocortical projections. However, the signals conveyed by dopamine projections to the PFC remain unclear, particularly at the single-axon level. Here, we investigated dopaminergic axonal activity in the medial PFC (mPFC) during reward and aversive processing. By optimizing microprism-mediated two-photon calcium imaging of dopamine axon terminals, we found diverse activity in dopamine axons responsive to both reward and aversive stimuli. Some axons exhibited a preference for reward, while others favored aversive stimuli, and there was a strong bias for the latter at the population level. Long-term longitudinal imaging revealed that the preference was maintained in reward- and aversive-preferring axons throughout classical conditioning in which rewarding and aversive stimuli were paired with preceding auditory cues. However, as mice learned to discriminate reward or aversive cues, a cue activity preference gradually developed only in aversive-preferring axons. We inferred the trial-by-trial cue discrimination based on machine learning using anticipatory licking or facial expressions, and found that successful discrimination was accompanied by sharper selectivity for the aversive cue in aversive-preferring axons. Our findings indicate that a group of mesocortical dopamine axons encodes aversive-related signals, which are modulated by both classical conditioning across days and trial-by-trial discrimination within a day.
12
Qian L, Burrell M, Hennig JA, Matias S, Murthy VN, Gershman SJ, Uchida N. The role of prospective contingency in the control of behavior and dopamine signals during associative learning. bioRxiv [Preprint] 2024:2024.02.05.578961. PMID: 38370735; PMCID: PMC10871210; DOI: 10.1101/2024.02.05.578961.
Abstract
Associative learning depends on contingency, the degree to which a stimulus predicts an outcome. Despite its importance, the neural mechanisms linking contingency to behavior remain elusive. Here we examined dopamine activity in the ventral striatum, a signal implicated in associative learning, in a Pavlovian contingency degradation task in mice. We show that both anticipatory licking and dopamine responses to a conditioned stimulus decreased when additional rewards were delivered uncued, but remained unchanged if the additional rewards were cued. These results conflict with contingency-based accounts that use a traditional definition of contingency or a novel causal learning model (ANCCR), but they can be explained by temporal difference (TD) learning models equipped with an appropriate inter-trial-interval (ITI) state representation. Recurrent neural networks trained within a TD framework develop state representations like those of our best 'handcrafted' model. Our findings suggest that the TD error can serve as a measure that describes both contingency and dopaminergic activity.
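The role of the ITI state can be illustrated with a toy tabular TD(0) simulation (entirely my own construction for intuition; the transition probability, learning rate, discount, and reward schedule are arbitrary assumptions, not the authors' model). Delivering extra rewards uncued during the inter-trial interval raises the value of the ITI state, so the TD error at cue onset, the putative dopamine response, collapses even though the cue-reward pairing itself is unchanged:

```python
import random

def simulate(q_uncued, steps=100_000, alpha=0.02, gamma=0.9, seed=1):
    """Cyclic toy task: ITI --(p=0.2)--> cue --> cued reward --> ITI.
    q_uncued: probability of a free, uncued reward on each ITI step."""
    random.seed(seed)
    V = {"iti": 0.0, "cue": 0.0}
    cue_deltas, s = [], "iti"
    for _ in range(steps):
        if s == "iti":
            s_next = "cue" if random.random() < 0.2 else "iti"
            r = 1.0 if (s_next == "iti" and random.random() < q_uncued) else 0.0
        else:                                 # cue step: reward, then back to ITI
            r, s_next = 1.0, "iti"
        delta = r + gamma * V[s_next] - V[s]  # TD error
        V[s] += alpha * delta
        if s == "iti" and s_next == "cue":
            cue_deltas.append(delta)          # dopamine-like response at cue onset
        s = s_next
    return sum(cue_deltas[-2000:]) / 2000

print(simulate(q_uncued=0.0))  # clearly positive: the cue adds predictive value
print(simulate(q_uncued=0.5))  # near zero: uncued rewards degrade the contingency
```

The key design choice is that the ITI is a real state with learnable value rather than a gap between trials; that is exactly what lets free rewards degrade the cue response in a TD account.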
Affiliation(s)
- Lechen Qian
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- These authors contributed equally
- Mark Burrell
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- These authors contributed equally
- Jay A. Hennig
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Sara Matias
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Venkatesh N. Murthy
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Samuel J. Gershman
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Naoshige Uchida
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
13
Pierce AF, Protter DSW, Watanabe YL, Chapel GD, Cameron RT, Donaldson ZR. Nucleus accumbens dopamine release reflects the selective nature of pair bonds. Curr Biol 2024; 34:519-530.e5. PMID: 38218185; PMCID: PMC10978070; DOI: 10.1016/j.cub.2023.12.041.
Abstract
In monogamous species, prosocial behaviors directed toward partners are dramatically different from those directed toward unknown individuals and potential threats. Dopamine release in the nucleus accumbens has a well-established role in social reward and motivation, but how this mechanism may be engaged to drive the highly divergent social behaviors directed at a partner or unfamiliar conspecific remains unknown. Using monogamous prairie voles, we first employed receptor pharmacology in partner preference and social operant tasks to show that dopamine is critical for the appetitive drive for social interaction but not for low-effort, unconditioned consummatory behaviors. We then leveraged the subsecond temporal resolution of the fluorescent biosensor, GRABDA, to ask whether differential dopamine release might distinguish between partner and novel social access and interaction. We found that partner seeking, anticipation, and interaction resulted in more accumbal dopamine release than the same events directed toward a novel vole. Further, partner-associated dopamine release decreased after prolonged partner separation. Our results are consistent with a model in which dopamine signaling plays a prominent role in the appetitive aspects of social interactions. Within this framework, differences in partner- and novel-associated dopamine release reflect the selective nature of pair bonds and may drive the partner- and novel-directed social behaviors that reinforce and cement bonds over time. This provides a potential mechanism by which highly conserved reward systems can enable selective, species-appropriate social behaviors.
Collapse
Affiliation(s)
- Anne F Pierce
- Department of Psychology & Neuroscience, University of Colorado Boulder, 1945 Colorado Ave, Boulder, CO 80309, USA.
| | - David S W Protter
- Department of Molecular, Cellular, and Developmental Biology, University of Colorado Boulder, 1945 Colorado Ave, Boulder, CO 80309, USA
| | - Yurika L Watanabe
- Department of Molecular, Cellular, and Developmental Biology, University of Colorado Boulder, 1945 Colorado Ave, Boulder, CO 80309, USA
| | - Gabriel D Chapel
- Department of Molecular, Cellular, and Developmental Biology, University of Colorado Boulder, 1945 Colorado Ave, Boulder, CO 80309, USA
| | - Ryan T Cameron
- Department of Molecular, Cellular, and Developmental Biology, University of Colorado Boulder, 1945 Colorado Ave, Boulder, CO 80309, USA
| | - Zoe R Donaldson
- Department of Psychology & Neuroscience, University of Colorado Boulder, 1945 Colorado Ave, Boulder, CO 80309, USA; Department of Molecular, Cellular, and Developmental Biology, University of Colorado Boulder, 1945 Colorado Ave, Boulder, CO 80309, USA.
14
Amo R. Prediction error in dopamine neurons during associative learning. Neurosci Res 2024; 199:12-20. [PMID: 37451506 DOI: 10.1016/j.neures.2023.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2023] [Revised: 06/18/2023] [Accepted: 07/07/2023] [Indexed: 07/18/2023]
Abstract
Dopamine neurons have long been thought to facilitate learning by broadcasting reward prediction error (RPE), a teaching signal used in machine learning, but more recent work has advanced alternative models of dopamine's computational role. Here, I revisit this critical issue and review new experimental evidence that tightens the link between dopamine activity and RPE. First, I introduce the recent observation of a gradual backward shift of dopamine activity that had eluded researchers for over a decade. I also discuss several other findings, such as dopamine ramping, that were initially interpreted to conflict with RPE but were later found to be consistent with it. These findings improve our understanding of neural computation in dopamine neurons.
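The gradual backward shift reviewed here is a classic signature of temporal-difference (TD) learning. As an illustration only (a textbook TD(0) simulation with arbitrary parameters, not code from the paper), the prediction error that initially sits at the reward time migrates back toward the cue as the value function is learned:

```python
import numpy as np

# Textbook TD(0) simulation (illustrative parameters): a cue at t=0
# predicts a reward at t=10. The TD error delta -- a stand-in for
# phasic dopamine -- starts at the reward time and shifts gradually
# backward toward the cue as the value function V is learned.
n_states, reward_time, alpha, gamma = 11, 10, 0.2, 1.0
V = np.zeros(n_states + 1)              # V[t] per time step; terminal = 0
delta_by_trial = []

for trial in range(200):
    deltas = np.zeros(n_states)
    for t in range(n_states):
        r = 1.0 if t == reward_time else 0.0
        delta = r + gamma * V[t + 1] - V[t]   # TD error
        V[t] += alpha * delta                  # value update
        deltas[t] = delta
    delta_by_trial.append(deltas)

# On trial 0 the error sits entirely at the reward time; on intermediate
# trials its peak has moved backward; after learning it has vanished.
```

Because each state is updated from the not-yet-updated value of its successor, the error front propagates back roughly one step per trial, which is what produces the gradual (rather than instantaneous) shift.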
Affiliation(s)
- Ryunosuke Amo
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA.
15
Tolooshams B, Matias S, Wu H, Temereanca S, Uchida N, Murthy VN, Masset P, Ba D. Interpretable deep learning for deconvolutional analysis of neural signals. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.05.574379. [PMID: 38260512 PMCID: PMC10802267 DOI: 10.1101/2024.01.05.574379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/24/2024]
Abstract
The widespread adoption of deep learning to build models that capture the dynamics of neural populations is typically based on "black-box" approaches that lack an interpretable link between neural activity and function. Here, we propose to apply algorithm unrolling, a method for interpretable deep learning, to design the architecture of sparse deconvolutional neural networks and obtain a direct interpretation of network weights in relation to stimulus-driven single-neuron activity through a generative model. We characterize our method, referred to as deconvolutional unrolled neural learning (DUNL), and show its versatility by applying it to deconvolve single-trial local signals across multiple brain areas and recording modalities. To exemplify use cases of our decomposition method, we uncover multiplexed salience and reward prediction error signals from midbrain dopamine neurons in an unbiased manner, perform simultaneous event detection and characterization in somatosensory thalamus recordings, and characterize the responses of neurons in the piriform cortex. Our work leverages the advances in interpretable deep learning to gain a mechanistic understanding of neural dynamics.
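Algorithm unrolling treats each iteration of an optimization algorithm as one layer of a network, so the learned weights inherit the meaning of the algorithm's variables. A minimal sketch of the underlying idea (plain ISTA sparse deconvolution on synthetic data, not the DUNL implementation):

```python
import numpy as np

# One ISTA iteration per "layer": unrolling makes the network weights
# correspond directly to the generative kernel A, which is what gives
# the approach its interpretability.
def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def unrolled_ista(y, A, lam, n_layers):
    # solves min_x 0.5*||y - A x||^2 + lam*||x||_1 by iterating
    # x <- soft_threshold(x + A^T (y - A x) / L, lam / L)
    L = np.linalg.norm(A, 2) ** 2       # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):           # each iteration = one network layer
        x = soft_threshold(x + A.T @ (y - A @ x) / L, lam / L)
    return x

# Toy deconvolution: columns of A are shifted copies of a response
# kernel, so recovering x means locating sparse "events" in signal y.
kernel = np.array([1.0, 0.6, 0.3])
n = 30
A = np.zeros((n + 2, n))
for i in range(n):
    A[i:i + 3, i] = kernel
x_true = np.zeros(n)
x_true[[5, 17]] = 1.0                   # two sparse events
y = A @ x_true
x_hat = unrolled_ista(y, A, lam=0.05, n_layers=200)
# x_hat concentrates on the true event times (indices 5 and 17)
```

In the learned (DUNL-style) variant, the kernel entries of `A` become trainable parameters; the sketch above fixes them to show the correspondence between layers and iterations.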
Affiliation(s)
- Bahareh Tolooshams
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge MA, 02138
- Computing + Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125
| | - Sara Matias
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
| | - Hao Wu
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
| | - Simona Temereanca
- Carney Institute for Brain Science, Brown University, Providence, RI, 02906
| | - Naoshige Uchida
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
| | - Venkatesh N. Murthy
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
| | - Paul Masset
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
- Department of Psychology, McGill University, Montréal QC, H3A 1G1
| | - Demba Ba
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge MA, 02138
- Kempner Institute for the Study of Natural & Artificial Intelligence, Harvard University, Cambridge MA, 02138
16
Zhou ZC, Gordon-Fennell A, Piantadosi SC, Ji N, Smith SL, Bruchas MR, Stuber GD. Deep-brain optical recording of neural dynamics during behavior. Neuron 2023; 111:3716-3738. [PMID: 37804833 PMCID: PMC10843303 DOI: 10.1016/j.neuron.2023.09.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2023] [Revised: 08/24/2023] [Accepted: 09/06/2023] [Indexed: 10/09/2023]
Abstract
In vivo fluorescence recording techniques have produced landmark discoveries in neuroscience, providing insight into how single-cell and circuit-level computations mediate sensory processing and generate complex behaviors. While much attention has been given to recording from cortical brain regions, deep-brain fluorescence recording is more complex because it requires additional measures to gain optical access to harder-to-reach brain nuclei. Here we discuss detailed considerations and tradeoffs regarding deep-brain fluorescence recording techniques and provide a comprehensive guide for all major steps involved, from project planning to data analysis. The goal is to impart guidance for new and experienced investigators seeking to use in vivo deep-brain fluorescence optical recordings in awake, behaving rodent models.
Affiliation(s)
- Zhe Charles Zhou
- Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA 98195, USA; Center for Neurobiology of Addiction, Pain, and Emotion, University of Washington, Seattle, WA 98195, USA
| | - Adam Gordon-Fennell
- Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA 98195, USA; Center for Neurobiology of Addiction, Pain, and Emotion, University of Washington, Seattle, WA 98195, USA
| | - Sean C Piantadosi
- Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA 98195, USA; Center for Neurobiology of Addiction, Pain, and Emotion, University of Washington, Seattle, WA 98195, USA
| | - Na Ji
- Department of Physics, University of California, Berkeley, Berkeley, CA 94720, USA; Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA; Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720, USA; Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Spencer LaVere Smith
- Department of Electrical and Computer Engineering, University of California Santa Barbara, Santa Barbara, CA 93106, USA
| | - Michael R Bruchas
- Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA 98195, USA; Center for Neurobiology of Addiction, Pain, and Emotion, University of Washington, Seattle, WA 98195, USA; Department of Pharmacology, University of Washington, Seattle, WA 98195, USA; Department of Bioengineering, University of Washington, Seattle, WA 98195, USA.
| | - Garret D Stuber
- Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA 98195, USA; Center for Neurobiology of Addiction, Pain, and Emotion, University of Washington, Seattle, WA 98195, USA; Department of Pharmacology, University of Washington, Seattle, WA 98195, USA.
17
Shikano Y, Yagishita S, Tanaka KF, Takata N. Slow-rising and fast-falling dopaminergic dynamics jointly adjust negative prediction error in the ventral striatum. Eur J Neurosci 2023; 58:4502-4522. [PMID: 36843200 DOI: 10.1111/ejn.15945] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2023] [Accepted: 02/22/2023] [Indexed: 02/28/2023]
Abstract
The greater the reward expectation, the stronger the brain's physiological response to the outcome. Although it is well documented that better-than-expected outcomes are encoded quantitatively via midbrain dopaminergic (DA) activity, whether worse-than-expected outcomes are expressed quantitatively as well has been less thoroughly addressed experimentally. We show that larger reward expectations upon unexpected reward omission are associated with a slower preceding rise and a larger subsequent decrease (DA dip) in DA concentration in the ventral striatum of mice. We set up a lever-press task on a fixed-ratio (FR) schedule requiring five lever presses as the effort for a food reward (FR5). The mice occasionally checked the food magazine without a reward before completing the task. The percentage of these premature magazine entries (PMEs) increased as the number of lever presses approached five, showing rising expectations with increasing proximity to task completion, and hence greater reward expectations. Fibre photometry of extracellular DA dynamics in the ventral striatum using a genetically encoded GPCR-activation-based DA sensor (GRAB-DA2m) revealed that the slow increase and fast decrease in DA levels around PMEs were correlated with the PME percentage, demonstrating a monotonic relationship between DA dip amplitude and the degree of expectation. Computational modelling of the lever-press task implementing temporal difference errors and state transitions replicated the observed correlation between PME frequency and DA dip amplitude in the FR5 task. Taken together, these findings indicate that the DA dip amplitude represents the degree of reward expectation monotonically, which may guide behavioural adjustment.
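The logic of the temporal-difference account can be sketched in a few lines (illustrative numbers, not the authors' fitted model): the value of the task state rises as the mouse nears the fifth press, and an unrewarded magazine check produces a negative prediction error whose magnitude equals that value, so the dip deepens with proximity to completion.

```python
# Why the DA dip should scale with expectation under TD learning
# (gamma and reward are illustrative, not measured values).
gamma, reward = 0.9, 1.0

def state_value(presses_done, presses_required=5):
    # discounted expected reward, rising with task progress
    return (gamma ** (presses_required - presses_done)) * reward

def dip_on_omission(presses_done):
    # premature magazine entry finds no reward: delta = 0 - V(state)
    return -state_value(presses_done)

dips = [dip_on_omission(k) for k in range(6)]
# the dip deepens monotonically as the press count approaches five
```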
Affiliation(s)
- Yu Shikano
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, Tokyo, Japan
| | - Sho Yagishita
- Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, Tokyo, Japan
| | - Kenji F Tanaka
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
| | - Norio Takata
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
18
Sands LP, Jiang A, Liebenow B, DiMarco E, Laxton AW, Tatter SB, Montague PR, Kishida KT. Subsecond fluctuations in extracellular dopamine encode reward and punishment prediction errors in humans. SCIENCE ADVANCES 2023; 9:eadi4927. [PMID: 38039368 PMCID: PMC10691773 DOI: 10.1126/sciadv.adi4927] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/28/2023] [Accepted: 11/03/2023] [Indexed: 12/03/2023]
Abstract
In the mammalian brain, midbrain dopamine neuron activity is hypothesized to encode reward prediction errors that promote learning and guide behavior by causing rapid changes in dopamine levels in target brain regions. This hypothesis (and alternatives regarding dopamine's role in punishment learning) has limited direct evidence in humans. We report intracranial, subsecond measurements of dopamine release in the human striatum while volunteers (i.e., patients undergoing deep brain stimulation surgery) performed a probabilistic reward and punishment learning choice task designed to test whether dopamine release encodes only reward prediction errors or whether it may also encode adaptive punishment-learning signals. Results demonstrate that extracellular dopamine levels can encode both reward and punishment prediction errors within distinct time intervals via independent valence-specific pathways in the human brain.
Affiliation(s)
- L. Paul Sands
- Neuroscience Graduate Program, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Angela Jiang
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Brittany Liebenow
- Neuroscience Graduate Program, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Emily DiMarco
- Neuroscience Graduate Program, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Adrian W. Laxton
- Department of Neurosurgery, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Stephen B. Tatter
- Department of Neurosurgery, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - P. Read Montague
- Wellcome Centre for Human Neuroimaging, University College London, WC1N 3BG London, UK
- Fralin Biomedical Research Institute, Virginia Tech, Roanoke, VA 24016, USA
- Department of Physics, Virginia Tech, Blacksburg, VA 24061, USA
| | - Kenneth T. Kishida
- Neuroscience Graduate Program, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Neurosurgery, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
19
Amo R, Uchida N, Watabe-Uchida M. Glutamate inputs send prediction error of reward but not negative value of aversive stimuli to dopamine neurons. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.09.566472. [PMID: 37986868 PMCID: PMC10659341 DOI: 10.1101/2023.11.09.566472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/22/2023]
Abstract
Midbrain dopamine neurons are thought to signal reward prediction errors (RPEs), but the mechanisms underlying RPE computation, particularly the contributions of different neurotransmitters, remain poorly understood. Here we used a genetically encoded glutamate sensor to examine the pattern of glutamate inputs to dopamine neurons. We found that glutamate inputs exhibit virtually all of the characteristics of RPE, rather than conveying a specific component of RPE computation such as reward or expectation. Notably, while glutamate inputs were transiently inhibited by reward omission, they were excited by aversive stimuli. Opioid analgesics shifted dopamine neurons' negative responses to aversive stimuli toward more positive responses, while the excitatory responses of glutamate inputs remained unchanged. Our findings uncover previously unknown synaptic mechanisms underlying RPE computation: dopamine responses are shaped by both synergistic and competitive interactions between glutamatergic and GABAergic inputs to dopamine neurons depending on stimulus valence, with competitive interactions playing a role in responses to aversive stimuli.
Affiliation(s)
- Ryunosuke Amo
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
| | - Naoshige Uchida
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
| | - Mitsuko Watabe-Uchida
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
20
Fraser KM, Collins VL, Wolff AR, Ottenheimer DJ, Bornhoft KN, Pat F, Chen BJ, Janak PH, Saunders BT. Contexts facilitate dynamic value encoding in the mesolimbic dopamine system. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.05.565687. [PMID: 37961363 PMCID: PMC10635154 DOI: 10.1101/2023.11.05.565687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/15/2023]
Abstract
Adaptive behavior in a dynamic environment often requires rapid revaluation of stimuli that deviate from well-learned associations. The divergence between stable value encoding and appropriate behavioral output remains a critical test of theories of dopamine's function in learning, motivation, and motor control. Yet how dopamine neurons are involved in the revaluation of cues when the world changes to alter our behavior remains unclear. Here we make use of pharmacology, in vivo electrophysiology, fiber photometry, and optogenetics to resolve the contributions of the mesolimbic dopamine system to the dynamic reorganization of reward-seeking. Male and female rats were trained to discriminate when a conditioned stimulus would be followed by sucrose reward by exploiting the prior, non-overlapping presentation of a separate discrete cue - an occasion setter. Only when the occasion setter's presentation preceded the conditioned stimulus did the conditioned stimulus predict sucrose delivery. As a result, in this task we were able to dissociate the average value of the conditioned stimulus from its immediate expected value on a trial-to-trial basis. Both the activity of ventral tegmental area dopamine neurons and dopamine signaling in the nucleus accumbens were essential for rats to successfully update behavioral responding in response to the occasion setter. Moreover, dopamine release in the nucleus accumbens following the conditioned stimulus occurred only when the occasion setter indicated that it would predict reward. Downstream of dopamine release, we found that single neurons in the nucleus accumbens dynamically tracked the value of the conditioned stimulus. Together these results reveal a novel mechanism within the mesolimbic dopamine system for the rapid revaluation of motivation.
Affiliation(s)
- Kurt M Fraser
- Department of Psychological and Brain Sciences, Johns Hopkins University
| | | | - Amy R Wolff
- Department of Neuroscience, University of Minnesota
| | | | | | - Fiona Pat
- Department of Psychological and Brain Sciences, Johns Hopkins University
| | - Bridget J Chen
- Department of Psychological and Brain Sciences, Johns Hopkins University
| | - Patricia H Janak
- Department of Psychological and Brain Sciences, Johns Hopkins University
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins University
| | - Benjamin T Saunders
- Department of Neuroscience, University of Minnesota
- Medical Discovery Team on Addiction, University of Minnesota
21
Krausz TA, Comrie AE, Kahn AE, Frank LM, Daw ND, Berke JD. Dual credit assignment processes underlie dopamine signals in a complex spatial environment. Neuron 2023; 111:3465-3478.e7. [PMID: 37611585 PMCID: PMC10841332 DOI: 10.1016/j.neuron.2023.07.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2023] [Revised: 06/23/2023] [Accepted: 07/25/2023] [Indexed: 08/25/2023]
Abstract
Animals frequently make decisions based on expectations of future reward ("values"). Values are updated by ongoing experience: places and choices that result in reward are assigned greater value. Yet, the specific algorithms used by the brain for such credit assignment remain unclear. We monitored accumbens dopamine as rats foraged for rewards in a complex, changing environment. We observed brief dopamine pulses both at reward receipt (scaling with prediction error) and at novel path opportunities. Dopamine also ramped up as rats ran toward reward ports, in proportion to the value at each location. By examining the evolution of these dopamine place-value signals, we found evidence for two distinct update processes: progressive propagation of value along taken paths, as in temporal difference learning, and inference of value throughout the maze, using internal models. Our results demonstrate that within rich, naturalistic environments dopamine conveys place values that are updated via multiple, complementary learning algorithms.
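The two update processes inferred here can be caricatured in a toy maze (an illustration of the general distinction, not the paper's analysis): temporal-difference learning propagates value only along the path actually taken, while model-based inference assigns value to unvisited places through knowledge of the maze's structure.

```python
import numpy as np

# Toy maze with illustrative parameters. Place 4 also leads to the
# reward port (place 3) but is never visited: TD learning leaves it
# valueless, while model-based inference values it via the known
# transition structure.
alpha, gamma = 0.5, 0.9
next_place = {0: 1, 1: 2, 2: 3, 4: 3}   # 3 is the reward port
reward_at = {3: 1.0}                     # reward on arriving at place 3

# (1) TD learning along the taken path 0 -> 1 -> 2 -> 3
V_td = np.zeros(5)
for _ in range(50):
    for s in [0, 1, 2]:                  # only visited places update
        s2 = next_place[s]
        r = reward_at.get(s2, 0.0)
        V_td[s] += alpha * (r + gamma * V_td[s2] - V_td[s])

# (2) model-based inference over the known transition structure
V_mb = np.zeros(5)
for _ in range(50):
    for s, s2 in next_place.items():     # sweeps the whole maze
        V_mb[s] = reward_at.get(s2, 0.0) + gamma * V_mb[s2]
```

After learning, both agents value the traversed places identically, but only the model-based sweep assigns value to the never-visited place 4 - the kind of maze-wide inference the authors report alongside TD-like propagation.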
Affiliation(s)
- Timothy A Krausz
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Alison E Comrie
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Ari E Kahn
- Department of Psychology and Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
| | - Loren M Frank
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA; Department of Physiology, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Nathaniel D Daw
- Department of Psychology and Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
| | - Joshua D Berke
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA; Kavli Institute for Fundamental Neuroscience, and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Neurology and Department of Psychiatry and Behavioral Science, University of California, San Francisco, San Francisco, CA 94158, USA.
22
Kim HY, Lee J, Kim HJ, Lee BE, Jeong J, Cho EJ, Jang HJ, Shin KJ, Kim MJ, Chae YC, Lee SE, Myung K, Baik JH, Suh PG, Kim JI. PLCγ1 in dopamine neurons critically regulates striatal dopamine release via VMAT2 and synapsin III. Exp Mol Med 2023; 55:2357-2375. [PMID: 37907739 PMCID: PMC10689754 DOI: 10.1038/s12276-023-01104-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2023] [Revised: 08/05/2023] [Accepted: 08/06/2023] [Indexed: 11/02/2023] Open
Abstract
Dopamine neurons are essential for voluntary movement, reward learning, and motivation, and their dysfunction is closely linked to various psychological and neurodegenerative diseases. Hence, understanding the detailed signaling mechanisms that functionally modulate dopamine neurons is crucial for the development of better therapeutic strategies against dopamine-related disorders. Phospholipase Cγ1 (PLCγ1) is a key enzyme in intracellular signaling that regulates diverse neuronal functions in the brain. PLCγ1 has been proposed to be involved in the development of dopaminergic neurons, but its physiological function there remains to be determined. In this study, we investigated the physiological role of PLCγ1 in regulating dopaminergic function in vivo. We found that cell type-specific deletion of PLCγ1 does not adversely affect the development and cellular morphology of midbrain dopamine neurons but does facilitate dopamine release from dopaminergic axon terminals in the striatum. The enhancement of dopamine release was accompanied by increased colocalization of vesicular monoamine transporter 2 (VMAT2) at dopaminergic axon terminals. Notably, dopamine neuron-specific knockout of PLCγ1 also led to heightened expression and colocalization of synapsin III, which controls the trafficking of synaptic vesicles. Furthermore, knockdown of VMAT2 and synapsin III in dopamine neurons resulted in a significant attenuation of dopamine release, and this attenuation was less severe in PLCγ1 cKO mice. Our findings suggest that PLCγ1 in dopamine neurons could critically modulate dopamine release at axon terminals by directly or indirectly interacting with synaptic machinery, including VMAT2 and synapsin III.
Affiliation(s)
- Hye Yun Kim
- Department of Biological Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Jieun Lee
- Department of Biological Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Hyun-Jin Kim
- Department of Biological Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Byeong Eun Lee
- Department of Biological Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Jaewook Jeong
- Department of Biological Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Eun Jeong Cho
- Department of Biological Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Hyun-Jun Jang
- Herbal Medicine Resources Research Center, Korea Institute of Oriental Medicine, Naju, 58245, Republic of Korea
| | - Kyeong Jin Shin
- Department of Biological Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Min Ji Kim
- Department of Life Sciences, Korea University, Seoul, 02841, Korea
| | - Young Chan Chae
- Department of Biological Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Seung Eun Lee
- Research Animal Resource Center, Korea Institute of Science and Technology (KIST), Seoul, 02792, Republic of Korea
| | - Kyungjae Myung
- Center for Genomic Integrity, Institute for Basic Science (IBS), Ulsan, 44919, Republic of Korea
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Ja-Hyun Baik
- Department of Life Sciences, Korea University, Seoul, 02841, Korea
| | - Pann-Ghill Suh
- Korea Brain Research Institute (KBRI), Daegu, 41062, Republic of Korea
| | - Jae-Ick Kim
- Department of Biological Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea.
23
Konishi M, Igarashi KM, Miura K. Biologically plausible local synaptic learning rules robustly implement deep supervised learning. Front Neurosci 2023; 17:1160899. [PMID: 37886676 PMCID: PMC10598703 DOI: 10.3389/fnins.2023.1160899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Accepted: 08/31/2023] [Indexed: 10/28/2023] Open
Abstract
In deep neural networks, representational learning in the middle layer is essential for achieving efficient learning. However, the currently prevailing backpropagation learning rules (BP) are not necessarily biologically plausible and cannot be implemented in the brain in their current form. Therefore, to elucidate the learning rules used by the brain, it is critical to establish biologically plausible learning rules for practical memory tasks. For example, learning rules that result in a learning performance worse than that of animals observed in experimental studies may not be computations used in real brains and should be ruled out. Using numerical simulations, we developed biologically plausible learning rules to solve a task that replicates a laboratory experiment where mice learned to predict the correct reward amount. Although the extreme learning machine (ELM) and weight perturbation (WP) learning rules performed worse than the mice, the feedback alignment (FA) rule achieved a performance equal to that of BP. To obtain a more biologically plausible model, we developed a variant of FA, FA_Ex-100%, which implements direct dopamine inputs that provide error signals locally in the layer of focus, as found in the mouse entorhinal cortex. The performance of FA_Ex-100% was comparable to that of conventional BP. Finally, we tested whether FA_Ex-100% was robust against rule perturbations and biologically inevitable noise. FA_Ex-100% worked even when subjected to perturbations, presumably because it could calibrate the correct prediction error (e.g., dopaminergic signals) in the next step as a teaching signal if the perturbation created a deviation. These results suggest that simplified and biologically plausible learning rules, such as FA_Ex-100%, can robustly facilitate deep supervised learning when the error signal, possibly conveyed by dopaminergic neurons, is accurate.
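Feedback alignment, the core of the FA and FA_Ex-100% rules discussed here, replaces the transposed forward weights in backpropagation's backward pass with a fixed random feedback matrix. A minimal sketch on a toy regression task (the architecture, task, and parameters are illustrative choices, not the paper's network):

```python
import numpy as np

# Two-layer network trained with feedback alignment (FA): the backward
# pass uses a fixed random matrix B in place of W2.T, yet the error
# signal it delivers still supports learning.
rng = np.random.default_rng(0)
n_in, n_hid, n_out, lr = 4, 16, 2, 0.05
W1 = rng.normal(0.0, 0.5, (n_hid, n_in))
W2 = rng.normal(0.0, 0.5, (n_out, n_hid))
B = rng.normal(0.0, 0.5, (n_hid, n_out))  # fixed feedback weights
T = rng.normal(0.0, 1.0, (n_out, n_in))   # teacher (target linear map)

losses = []
for step in range(2000):
    x = rng.normal(0.0, 1.0, (n_in, 1))
    y = T @ x                              # teacher output
    h = np.tanh(W1 @ x)                    # hidden layer
    y_hat = W2 @ h                         # network output
    e = y_hat - y                          # output error (teaching signal)
    losses.append(float((e ** 2).sum()))
    dh = (B @ e) * (1.0 - h ** 2)          # FA: B replaces W2.T
    W2 -= lr * e @ h.T
    W1 -= lr * dh @ x.T
# the loss falls even though the feedback pathway never changes
```

In the FA_Ex-100% variant described above, the role of `B @ e` would be played by direct dopaminergic error inputs delivered locally to the layer of focus; the sketch shows only the generic FA mechanism.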
Affiliation(s)
- Masataka Konishi
- Department of Biosciences, School of Biological and Environmental Sciences, Kwansei Gakuin University, Sanda, Hyogo, Japan
| | - Kei M. Igarashi
- Department of Anatomy and Neurobiology, School of Medicine, University of California, Irvine, Irvine, CA, United States
| | - Keiji Miura
- Department of Biosciences, School of Biological and Environmental Sciences, Kwansei Gakuin University, Sanda, Hyogo, Japan
24
Bech P, Crochet S, Dard R, Ghaderi P, Liu Y, Malekzadeh M, Petersen CCH, Pulin M, Renard A, Sourmpis C. Striatal Dopamine Signals and Reward Learning. FUNCTION 2023; 4:zqad056. [PMID: 37841525 PMCID: PMC10572094 DOI: 10.1093/function/zqad056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Revised: 09/28/2023] [Accepted: 09/29/2023] [Indexed: 10/17/2023] Open
Abstract
We are constantly bombarded by sensory information and constantly making decisions on how to act. In order to adapt behavior optimally, we must judge which sequences of sensory inputs and actions lead to successful outcomes in specific circumstances. Neuronal circuits of the basal ganglia have been strongly implicated in action selection, as well as the learning and execution of goal-directed behaviors, with accumulating evidence supporting the hypothesis that midbrain dopamine neurons might encode a reward signal useful for learning. Here, we review evidence suggesting that midbrain dopaminergic neurons signal reward prediction error, driving synaptic plasticity in the striatum underlying learning. We focus on phasic increases in action potential firing of midbrain dopamine neurons in response to unexpected rewards. These dopamine neurons prominently innervate the dorsal and ventral striatum. In the striatum, the released dopamine binds to dopamine receptors, where it regulates the plasticity of glutamatergic synapses. The increase in striatal dopamine accompanying an unexpected reward activates dopamine type 1 receptors (D1Rs), initiating a signaling cascade that promotes long-term potentiation of recently active glutamatergic input onto striatonigral neurons. Sensorimotor-evoked glutamatergic input that is active immediately before reward delivery will thus be strengthened onto neurons in the striatum expressing D1Rs. In turn, these neurons cause disinhibition of brainstem motor centers and of the motor thalamus, thus promoting motor output to reinforce rewarded stimulus-action outcomes. Although many details of the hypothesis need further investigation, altogether it seems likely that dopamine signals in the striatum might underlie important aspects of goal-directed reward-based learning.
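The cascade described above is often summarized as a three-factor plasticity rule: presynaptic/postsynaptic coactivity leaves an eligibility trace at the corticostriatal synapse, and a subsequent dopamine transient (D1R activation) converts that trace into a lasting weight change. A minimal sketch (`eta` and `trace_decay` are illustrative values, not measurements from the review):

```python
# Three-factor plasticity sketch: pre/post coactivity sets a decaying
# eligibility trace; a later phasic dopamine signal gates the update.
eta, trace_decay = 0.5, 0.8

def run_trial(coactive_steps, dopamine_step, n_steps=10):
    w, e = 1.0, 0.0                          # synaptic weight, trace
    for t in range(n_steps):
        pre_post = 1.0 if t in coactive_steps else 0.0
        e = trace_decay * e + pre_post       # eligibility trace
        da = 1.0 if t == dopamine_step else 0.0
        w += eta * da * e                    # dopamine gates the update
    return w

# input active just before the dopamine burst: strongly potentiated
w_paired = run_trial(coactive_steps={3}, dopamine_step=4)
# same input long before the burst: the trace has mostly decayed
w_unpaired = run_trial(coactive_steps={0}, dopamine_step=9)
```

The trace is what bridges the gap between the sensorimotor input and the later reward, so only recently active synapses are strengthened - the selectivity the review attributes to D1R-bearing striatonigral neurons.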
Collapse
Affiliation(s)
- Pol Bech
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Sylvain Crochet
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Robin Dard
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Parviz Ghaderi
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Yanqi Liu
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Meriam Malekzadeh
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Carl C H Petersen
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Mauro Pulin
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Anthony Renard
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| | - Christos Sourmpis
- Laboratory of Sensory Processing, Brain Mind Institute, Faculty of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| |
Collapse
|
25
|
Cone I, Clopath C, Shouval HZ. Learning to Express Reward Prediction Error-like Dopaminergic Activity Requires Plastic Representations of Time. RESEARCH SQUARE 2023:rs.3.rs-3289985. [PMID: 37790466 PMCID: PMC10543312 DOI: 10.21203/rs.3.rs-3289985/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/05/2023]
Abstract
The dominant theoretical framework to account for reinforcement learning in the brain is temporal difference (TD) reinforcement learning. The TD framework predicts that some neuronal elements should represent the reward prediction error (RPE), meaning they signal the difference between expected future rewards and actual rewards. The prominence of the TD theory arises from the observation that firing properties of dopaminergic neurons in the ventral tegmental area appear similar to those of RPE model-neurons in TD learning. Previous implementations of TD learning assume a fixed temporal basis for each stimulus that might eventually predict a reward. Here we show that such a fixed temporal basis is implausible and that certain predictions of TD learning are inconsistent with experiments. We propose instead an alternative theoretical framework, termed FLEX (Flexibly Learned Errors in Expected Reward). In FLEX, feature-specific representations of time are learned, allowing neural representations of stimuli to adjust their timing and relation to rewards in an online manner. In FLEX, dopamine acts as an instructive signal that helps build temporal models of the environment. FLEX is a general theoretical framework that has many possible biophysical implementations. To show that FLEX is a feasible approach, we present a specific biophysically plausible model which implements its principles. We show that this implementation can account for various reinforcement learning paradigms, and that its results and predictions are consistent with a preponderance of both existing and reanalyzed experimental data.
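The RPE at the heart of the TD framework this paper critiques is delta_t = r_t + gamma * V(s_{t+1}) - V(s_t). The snippet below is a textbook TD(0) sketch using exactly the kind of fixed temporal basis the abstract questions; it is not an implementation of FLEX, and all parameter values are illustrative.

```python
# Textbook TD(0) sketch (illustrative; not the FLEX model): with a fixed
# temporal basis (one state per time step after the cue), repeated trials
# shift the prediction error from reward time back to cue time.

GAMMA, LR = 1.0, 0.5
N_STEPS = 3                       # cue at t=0, reward delivered at t=2
V = [0.0] * (N_STEPS + 1)         # value per time step; terminal value 0

for trial in range(50):
    for t in range(N_STEPS):
        reward = 1.0 if t == N_STEPS - 1 else 0.0
        delta = reward + GAMMA * V[t + 1] - V[t]   # reward prediction error
        V[t] += LR * delta

print([round(v, 2) for v in V])   # cue-time value approaches the reward value
```

This backward migration of value toward the cue is the classic TD prediction; FLEX differs precisely in replacing the fixed per-time-step basis with learned, feature-specific representations of time.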
Collapse
Affiliation(s)
- Ian Cone
- Department of Bioengineering, Imperial College London, London, United Kingdom
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX
- Applied Physics Program, Rice University, Houston, TX
| | - Claudia Clopath
- Department of Bioengineering, Imperial College London, London, United Kingdom
| | - Harel Z Shouval
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX
- Department of Electrical and Computer Engineering, Rice University, Houston, TX
| |
Collapse
|
26
|
Kim MJ, Gibson DJ, Hu D, Mahar A, Schofield CJ, Sompolpong P, Yoshida T, Tran KT, Graybiel AM. Dopamine Release Plateau and Outcome Signals in Dorsal Striatum Contrast with Classic Reinforcement Learning Formulations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.15.553421. [PMID: 37645888 PMCID: PMC10462077 DOI: 10.1101/2023.08.15.553421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/31/2023]
Abstract
We recorded dopamine release signals in medial and lateral sectors of the striatum as mice learned consecutive visual cue-outcome conditioning tasks including cue association, cue discrimination, reversal, and probabilistic discrimination task versions. Dopamine release responses in medial and lateral sites exhibited learning-related changes within and across phases of acquisition. These changes differed between the medial and lateral sites, and in neither sector could they be accounted for by classic reinforcement learning as applied to dopamine-containing neuron activity. Cue responses ranged from initial sharp peaks to modulated plateau responses. In the medial sector, outcome (reward) responses during cue conditioning were minimal or, initially, negative. By contrast, in lateral sites, strong, transient dopamine release responses occurred at both cue and outcome. Prolonged, plateau release responses to cues emerged in both regions when discriminative behavioral responses became required. In most sites, we found no evidence for a transition from outcome to cue signaling, a hallmark of temporal difference reinforcement learning as applied to midbrain dopamine activity. These findings delineate reshaping of dopamine release activity during learning and suggest that current views of reward prediction error encoding need review to accommodate distinct learning-related spatial and temporal patterns of striatal dopamine release in the dorsal striatum.
Collapse
|
27
|
Krausz TA, Comrie AE, Frank LM, Daw ND, Berke JD. Dual credit assignment processes underlie dopamine signals in a complex spatial environment. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.02.15.528738. [PMID: 36993482 PMCID: PMC10054934 DOI: 10.1101/2023.02.15.528738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Dopamine in the nucleus accumbens helps motivate behavior based on expectations of future reward ("values"). These values need to be updated by experience: after receiving reward, the choices that led to reward should be assigned greater value. There are multiple theoretical proposals for how this credit assignment could be achieved, but the specific algorithms that generate updated dopamine signals remain uncertain. We monitored accumbens dopamine as freely behaving rats foraged for rewards in a complex, changing environment. We observed brief pulses of dopamine both when rats received reward (scaling with prediction error) and when they encountered novel path opportunities. Furthermore, dopamine ramped up as rats ran towards reward ports, in proportion to the value at each location. By examining the evolution of these dopamine place-value signals, we found evidence for two distinct update processes: progressive propagation along taken paths, as in temporal-difference learning, and inference of value throughout the maze, using internal models. Our results demonstrate that, within rich, naturalistic environments, dopamine conveys place values that are updated via multiple, complementary learning algorithms.
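The two credit-assignment processes this study distinguishes can be contrasted in a small sketch. This is an assumed illustration, not the authors' analysis code; the toy maze, state names, and parameters are hypothetical.

```python
# Hypothetical contrast between the two credit-assignment processes
# described: TD-style propagation along the path just taken, versus
# model-based inference that assigns value throughout the maze.
from collections import deque

GAMMA = 0.9

def td_backup(values, path, lr=0.5):
    """One traversal: each visited state moves toward its successor's value."""
    for s, s_next in zip(reversed(path[:-1]), reversed(path[1:])):
        values[s] += lr * (GAMMA * values[s_next] - values[s])

def model_based_values(model, reward_state):
    """Infer a value for every state from an internal map (distance to reward)."""
    dist, queue = {reward_state: 0}, deque([reward_state])
    while queue:
        s = queue.popleft()
        for neighbor in model[s]:
            if neighbor not in dist:
                dist[neighbor] = dist[s] + 1
                queue.append(neighbor)
    return {s: GAMMA ** d for s, d in dist.items()}

maze = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}   # toy three-state maze
values = {"A": 0.0, "B": 0.0, "C": 1.0}            # reward port at C
td_backup(values, ["A", "B", "C"])        # credit flows only along the taken path
inferred = model_based_values(maze, "C")  # credit inferred for all states at once
```

The first process updates only states the animal actually traversed, one experience at a time; the second uses an internal model to revalue every location at once, matching the paper's description of the two complementary algorithms.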
Collapse
Affiliation(s)
- Timothy A Krausz
- Neuroscience Graduate Program, University of California, San Francisco
| | - Alison E Comrie
- Neuroscience Graduate Program, University of California, San Francisco
| | - Loren M Frank
- Neuroscience Graduate Program, University of California, San Francisco
- Kavli Institute for Fundamental Neuroscience, and Weill Institute for Neurosciences, UCSF
- Howard Hughes Medical Institute
- Department of Physiology, UCSF
| | - Nathaniel D Daw
- Department of Psychology, and Princeton Neuroscience Institute, Princeton University, NJ
| | - Joshua D Berke
- Neuroscience Graduate Program, University of California, San Francisco
- Kavli Institute for Fundamental Neuroscience, and Weill Institute for Neurosciences, UCSF
- Department of Neurology, and Department of Psychiatry and Behavioral Science, UCSF
| |
Collapse
|
28
|
Coddington LT, Lindo SE, Dudman JT. Mesolimbic dopamine adapts the rate of learning from action. Nature 2023; 614:294-302. [PMID: 36653450 PMCID: PMC9908546 DOI: 10.1038/s41586-022-05614-z] [Citation(s) in RCA: 27] [Impact Index Per Article: 27.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2021] [Accepted: 11/30/2022] [Indexed: 01/20/2023]
Abstract
Recent success in training artificial agents and robots derives from a combination of direct learning of behavioural policies and indirect learning through value functions [1-3]. Policy learning and value learning use distinct algorithms that optimize behavioural performance and reward prediction, respectively. In animals, behavioural learning and the role of mesolimbic dopamine signalling have been extensively evaluated with respect to reward prediction [4]; however, so far there has been little consideration of how direct policy learning might inform our understanding [5]. Here we used a comprehensive dataset of orofacial and body movements to understand how behavioural policies evolved as naive, head-restrained mice learned a trace conditioning paradigm. Individual differences in initial dopaminergic reward responses correlated with the emergence of learned behavioural policy, but not the emergence of putative value encoding for a predictive cue. Likewise, physiologically calibrated manipulations of mesolimbic dopamine produced several effects inconsistent with value learning but predicted by a neural-network-based model that used dopamine signals to set an adaptive rate, not an error signal, for behavioural policy learning. This work provides strong evidence that phasic dopamine activity can regulate direct learning of behavioural policies, expanding the explanatory power of reinforcement learning models for animal learning [6].
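The distinction this study draws between dopamine as an error signal and dopamine as an adaptive learning rate can be sketched as follows. This is a hypothetical illustration of the two roles, not the paper's neural-network model; function names and values are invented.

```python
# Hypothetical contrast between two roles for a dopamine signal in
# updating a scalar behavioural-policy parameter.

def update_as_error(policy, dopamine, lr=0.2):
    """Dopamine is itself the teaching (error) signal."""
    return policy + lr * dopamine

def update_as_rate(policy, target, dopamine):
    """Dopamine sets the step size toward the experienced behaviour,
    without itself being an error term."""
    rate = max(0.0, min(1.0, dopamine))   # clip to a valid learning rate
    return policy + rate * (target - policy)

p = update_as_rate(0.0, target=1.0, dopamine=0.5)
print(p)  # moves halfway to the target: the magnitude of the dopamine
          # signal determines how fast the policy changes, not its direction
```

In the rate role, a larger dopamine signal accelerates learning of whatever behaviour was just experienced, which is the property the authors report as inconsistent with pure value learning.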
Collapse
|
29
|
Jeong H, Taylor A, Floeder JR, Lohmann M, Mihalas S, Wu B, Zhou M, Burke DA, Namboodiri VMK. Mesolimbic dopamine release conveys causal associations. Science 2022; 378:eabq6740. [PMID: 36480599 PMCID: PMC9910357 DOI: 10.1126/science.abq6740] [Citation(s) in RCA: 57] [Impact Index Per Article: 28.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
Learning to predict rewards based on environmental cues is essential for survival. It is believed that animals learn to predict rewards by updating predictions whenever the outcome deviates from expectations, and that such reward prediction errors (RPEs) are signaled by the mesolimbic dopamine system, a key controller of learning. However, instead of learning prospective predictions from RPEs, animals can infer predictions by learning the retrospective cause of rewards. Hence, whether mesolimbic dopamine instead conveys a causal associative signal that sometimes resembles RPE remains unknown. We developed an algorithm for retrospective causal learning and found that mesolimbic dopamine release conveys causal associations but not RPE, thereby challenging the dominant theory of reward learning. Our results reshape the conceptual and biological framework for associative learning.
Collapse
Affiliation(s)
- Huijeong Jeong
- Department of Neurology, University of California, San Francisco, CA, USA
| | - Annie Taylor
- Neuroscience Graduate Program, University of California, San Francisco, CA, USA
| | - Joseph R Floeder
- Neuroscience Graduate Program, University of California, San Francisco, CA, USA
| | | | - Stefan Mihalas
- Allen Institute for Brain Science, Seattle, WA, USA
- Department of Applied Mathematics, University of Washington, Seattle, WA, USA
| | - Brenda Wu
- Department of Neurology, University of California, San Francisco, CA, USA
| | - Mingkang Zhou
- Department of Neurology, University of California, San Francisco, CA, USA
- Neuroscience Graduate Program, University of California, San Francisco, CA, USA
| | - Dennis A Burke
- Department of Neurology, University of California, San Francisco, CA, USA
| | - Vijay Mohan K Namboodiri
- Department of Neurology, University of California, San Francisco, CA, USA
- Neuroscience Graduate Program, University of California, San Francisco, CA, USA
- Weill Institute for Neuroscience, Kavli Institute for Fundamental Neuroscience, Center for Integrative Neuroscience, University of California, San Francisco, CA, USA
| |
Collapse
|
30
|
Yamada K, Toda K. Pupillary dynamics of mice performing a Pavlovian delay conditioning task reflect reward-predictive signals. Front Syst Neurosci 2022; 16:1045764. [PMID: 36567756 PMCID: PMC9772849 DOI: 10.3389/fnsys.2022.1045764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2022] [Accepted: 11/21/2022] [Indexed: 12/13/2022] Open
Abstract
Pupils can signify various internal processes and states, such as attention, arousal, and working memory. Changes in pupil size have been associated with learning speed, prediction of future events, and deviations from the prediction in human studies. However, the detailed relationships between pupil size changes and prediction are unclear. We explored pupil size dynamics in mice performing a Pavlovian delay conditioning task. A head-fixed experimental setup combined with deep-learning-based image analysis enabled us to reduce spontaneous locomotor activity and to track the precise dynamics of pupil size of behaving mice. By setting up two experimental groups, one for which mice were able to predict reward in the Pavlovian delay conditioning task and the other for which mice were not, we demonstrated that the pupil size of mice is modulated by reward prediction and consumption, as well as body movements, but not by unpredicted reward delivery. Furthermore, we clarified that pupil size is still modulated by reward prediction even after the disruption of body movements by intraperitoneal injection of haloperidol, a dopamine D2 receptor antagonist. These results suggest that changes in pupil size reflect reward prediction signals. Thus, we provide important evidence to reconsider the neuronal circuit involved in computing reward prediction error. This integrative approach of behavioral analysis, image analysis, pupillometry, and pharmacological manipulation will pave the way for understanding the psychological and neurobiological mechanisms of reward prediction and the prediction errors essential to learning and behavior.
Collapse
Affiliation(s)
- Kota Yamada
- Department of Psychology, Keio University, Tokyo, Japan
- Japan Society for the Promotion of Science, Tokyo, Japan
| | - Koji Toda
- Department of Psychology, Keio University, Tokyo, Japan
| |
Collapse
|
31
|
De Corte BJ, Akdoğan B, Balsam PD. Temporal scaling and computing time in neural circuits: Should we stop watching the clock and look for its gears? Front Behav Neurosci 2022; 16:1022713. [PMID: 36570701 PMCID: PMC9773401 DOI: 10.3389/fnbeh.2022.1022713] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Accepted: 10/31/2022] [Indexed: 12/13/2022] Open
Abstract
Timing underlies a variety of functions, from walking to perceiving causality. Neural timing models typically fall into one of two categories: "ramping" and "population-clock" theories. According to ramping models, individual neurons track time by gradually increasing or decreasing their activity as an event approaches. To time different intervals, ramping neurons adjust their slopes, ramping steeply for short intervals and vice versa. In contrast, according to "population-clock" models, multiple neurons track time as a group, and each neuron can fire nonlinearly. As each neuron changes its rate at each point in time, a distinct pattern of activity emerges across the population. To time different intervals, the brain learns the population patterns that coincide with key events. Both model categories have empirical support. However, they often differ in plausibility when applied to certain behavioral effects. Specifically, behavioral data indicate that the timing system has a rich computational capacity, allowing observers to spontaneously compute novel intervals from previously learned ones. In population-clock theories, population patterns map to time arbitrarily, making it difficult to explain how different patterns can be computationally combined. Ramping models are viewed as more plausible, assuming upstream circuits can set the slope of ramping neurons according to a given computation. Critically, recent studies suggest that neurons with nonlinear firing profiles often scale to time different intervals, compressing for shorter intervals and stretching for longer ones. This "temporal scaling" effect has led to a hybrid theory where, like a population-clock model, population patterns encode time, yet like a ramping neuron adjusting its slope, the speed of each neuron's firing adapts to different intervals.
Here, we argue that these "relative" population-clock models are as computationally plausible as ramping theories, viewing population-speed and ramp-slope adjustments as equivalent. Therefore, we view identifying these "speed-control" circuits as a key direction for evaluating how the timing system performs computations. Furthermore, temporal scaling highlights that a key distinction between different neural models is whether they propose an absolute or relative time representation. However, we note that several behavioral studies suggest the brain processes both scales, cautioning against a dichotomy.
Collapse
Affiliation(s)
- Benjamin J. De Corte
- Department of Psychology, Columbia University, New York, NY, United States
- Division of Developmental Neuroscience, New York State Psychiatric Institute, New York, NY, United States
| | - Başak Akdoğan
- Department of Psychology, Columbia University, New York, NY, United States
- Division of Developmental Neuroscience, New York State Psychiatric Institute, New York, NY, United States
| | - Peter D. Balsam
- Department of Psychology, Columbia University, New York, NY, United States
- Division of Developmental Neuroscience, New York State Psychiatric Institute, New York, NY, United States
- Department of Neuroscience and Behavior, Barnard College, New York, NY, United States
| |
Collapse
|