1
Zimmerman CA, Bolkan SS, Pan-Vazquez A, Wu B, Keppler EF, Meares-Garcia JB, Guthman EM, Fetcho RN, McMannon B, Lee J, Hoag AT, Lynch LA, Janarthanan SR, López Luna JF, Bondy AG, Falkner AL, Wang SSH, Witten IB. A neural mechanism for learning from delayed postingestive feedback. Nature 2025. doi:10.1038/s41586-025-08828-z. PMID: 40175547.
Abstract
Animals learn the value of foods on the basis of their postingestive effects and thereby develop aversions to foods that are toxic1-10 and preferences to those that are nutritious11-13. However, it remains unclear how the brain is able to assign credit to flavours experienced during a meal with postingestive feedback signals that can arise after a substantial delay. Here we reveal an unexpected role for the postingestive reactivation of neural flavour representations in this temporal credit-assignment process. To begin, we leverage the fact that mice learn to associate novel14,15, but not familiar, flavours with delayed gastrointestinal malaise signals to investigate how the brain represents flavours that support aversive postingestive learning. Analyses of brain-wide activation patterns reveal that a network of amygdala regions is unique in being preferentially activated by novel flavours across every stage of learning (consumption, delayed malaise and memory retrieval). By combining high-density recordings in the amygdala with optogenetic stimulation of malaise-coding hindbrain neurons, we show that delayed malaise signals selectively reactivate flavour representations in the amygdala from a recent meal. The degree of malaise-driven reactivation of individual neurons predicts the strengthening of flavour responses upon memory retrieval, which in turn leads to stabilization of the population-level representation of the recently consumed flavour. By contrast, flavour representations in the amygdala degrade in the absence of unexpected postingestive consequences. Thus, we demonstrate that postingestive reactivation and plasticity of neural flavour representations may support learning from delayed feedback.
Affiliation(s)
- Scott S Bolkan
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Bichan Wu
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Emma F Keppler
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Eartha Mae Guthman
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Robert N Fetcho
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Brenna McMannon
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Junuk Lee
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Austin T Hoag
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Laura A Lynch
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Juan F López Luna
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Adrian G Bondy
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Annegret L Falkner
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Samuel S-H Wang
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Ilana B Witten
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Howard Hughes Medical Institute, Princeton University, Princeton, NJ, USA
2
Faust TW, Mohebi A, Berke JD. Reward expectation and receipt differentially modulate the spiking of accumbens D1+ and D2+ neurons. Curr Biol 2025; 35:1285-1297.e3. doi:10.1016/j.cub.2025.02.007. PMID: 40020662; PMCID: PMC11968066.
Abstract
The nucleus accumbens (NAc) helps govern motivation to pursue reward. Two distinct sets of NAc projection neurons-expressing dopamine D1 vs. D2 receptors-are thought to promote and suppress motivated behaviors, respectively. However, support for this conceptual framework is limited: in particular, the spiking patterns of these distinct cell types during motivated behavior have been largely unknown. Using optogenetic tagging, we recorded the spiking of identified D1+ and D2+ neurons in the NAc core as unrestrained rats performed an operant task in which motivation to initiate work tracks recent reward rate. D1+ neurons preferentially increased firing as rats initiated trials and fired more when reward expectation was higher. By contrast, D2+ cells preferentially increased firing later in the trial, especially in response to reward delivery-a finding not anticipated from current theoretical models. Our results provide new evidence for the specific contribution of NAc D1+ cells to self-initiated approach behavior and will spur updated models of how D2+ cells contribute to learning.
Affiliation(s)
- T W Faust
- Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA
- A Mohebi
- Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA
- J D Berke
- Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA; Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, San Francisco, CA 94158, USA; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
3
Lakshminarasimhan K, Buck J, Kellendonk C, Horga G. A corticostriatal learning mechanism linking excess striatal dopamine and auditory hallucinations. bioRxiv 2025. doi:10.1101/2025.03.18.643990. PMID: 40166304; PMCID: PMC11956939.
Abstract
Auditory hallucinations are linked to elevated striatal dopamine, but their underlying computational mechanisms have been obscured by regional heterogeneity in striatal dopamine signaling. To address this, we developed a normative circuit model in which corticostriatal plasticity in the ventral striatum is modulated by reward prediction errors to drive reinforcement learning while that in the sensory-dorsal striatum is modulated by sensory prediction errors derived from internal belief to drive self-supervised learning. We then validate the key predictions of this model using dopamine recordings across striatal regions in mice, as well as human behavior in a hybrid learning task. Finally, we find that changes in learning resulting from optogenetic stimulation of the sensory striatum in mice and individual variability in hallucination proneness in humans are best explained by selectively enhancing dopamine levels in the model sensory striatum. These findings identify plasticity mechanisms underlying biased learning of sensory expectations as a biologically plausible link between excess dopamine and hallucinations.
Affiliation(s)
- Kaushik Lakshminarasimhan
- Zuckerman Mind Brain Behavior Institute, Department of Neuroscience, Columbia University, New York, NY, USA
- Justin Buck
- Zuckerman Mind Brain Behavior Institute, Department of Neuroscience, Columbia University, New York, NY, USA
- Department of Psychiatry, Columbia University, New York, NY, USA
- Christoph Kellendonk
- Department of Psychiatry, Columbia University, New York, NY, USA
- Department of Molecular Pharmacology and Therapeutics, Columbia University, New York, NY, USA
- Guillermo Horga
- Zuckerman Mind Brain Behavior Institute, Department of Neuroscience, Columbia University, New York, NY, USA
- Department of Psychiatry, Columbia University, New York, NY, USA
4
Ding M, Tomsick PL, Young RA, Jadhav SP. Ventral tegmental area dopamine neural activity switches simultaneously with rule representations in the prefrontal cortex and hippocampus. bioRxiv 2025. doi:10.1101/2024.09.09.611811. PMID: 39314328; PMCID: PMC11419070.
Abstract
Multiple brain regions need to coordinate activity to support cognitive flexibility and behavioral adaptation. Neural activity in both the hippocampus (HPC) and prefrontal cortex (PFC) is known to represent spatial context and is sensitive to reward and rule alterations, and midbrain dopamine (DA) activity is key in reward-seeking behavior and learning. There is abundant evidence that midbrain DA modulates HPC and PFC activity. However, it remains underexplored how these networks engage dynamically and coordinate temporally when animals must adjust their behavior according to changing reward contingencies. In particular, is there any relationship between changes in DA reward prediction during rule switching and rule representation changes in PFC and CA1? We addressed these questions using simultaneous recordings of neuronal population activity from hippocampal area CA1, PFC and the ventral tegmental area (VTA) in male TH-Cre rats performing two spatial working memory tasks with frequent rule switches in blocks of trials. CA1 and PFC ensembles showed rule-specific activity both during maze running and at reward locations, with PFC rule coding more consistent across animals than CA1. Optogenetically tagged VTA DA neuron firing responded to and predicted reward outcome. We found that correct reward prediction in DA emerged gradually over trials after a rule switch, in coordination with transitions in the PFC and CA1 ensemble representations of the current rule, followed by behavioral adaptation to the correct rule sequence. Our study therefore demonstrates a crucial temporal coordination between rule representations in PFC/CA1, the dopamine reward signal and behavioral strategy.
Affiliation(s)
- Mingxin Ding
- Graduate Program in Neuroscience, Brandeis University, Waltham, MA 02453, USA
- Porter L. Tomsick
- Undergraduate Program in Neuroscience, Brandeis University, Waltham, MA 02453, USA
- Department of Neuroscience, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
- Ryan A. Young
- Department of Psychology, Brandeis University, Waltham, MA 02453, USA
- Shantanu P. Jadhav
- Graduate Program in Neuroscience, Brandeis University, Waltham, MA 02453, USA
- Department of Psychology, Brandeis University, Waltham, MA 02453, USA
- Volen National Center for Complex Systems, Brandeis University, Waltham, MA 02453, USA
5
Kamath T, Lodder B, Bilsel E, Green I, Dalangin R, Capelli P, Raghubardayal M, Legister J, Hulshof L, Wallace JB, Tian L, Uchida N, Watabe-Uchida M, Sabatini BL. Hunger modulates exploration through suppression of dopamine signaling in the tail of striatum. bioRxiv 2024. doi:10.1101/2024.11.11.622990. PMID: 39713287; PMCID: PMC11661229.
Abstract
Caloric depletion leads to behavioral changes that help an animal find food and restore its homeostatic balance. Hunger increases exploration and risk-taking behavior, allowing an animal to forage for food despite risks; however, the neural circuitry underlying this change is unknown. Here, we characterize how hunger restructures an animal's spontaneous behavior as well as its directed exploration of a novel object. We show that hunger-induced changes in exploration are accompanied by and result from modulation of dopamine signaling in the tail of the striatum (TOS). Dopamine signaling in the TOS is modulated by internal hunger state through the activity of agouti-related peptide (AgRP) neurons, putative "hunger neurons" in the arcuate nucleus of the hypothalamus. These AgRP neurons are poly-synaptically connected to TOS-projecting dopaminergic neurons through the lateral hypothalamus, the central amygdala, and the periaqueductal grey. We thus delineate a hypothalamic-midbrain circuit that coordinates changes in exploration behavior in the hungry state.
6
Duhne M, Mohebi A, Kim K, Pelattini L, Berke JD. A mismatch between striatal cholinergic pauses and dopaminergic reward prediction errors. Proc Natl Acad Sci U S A 2024; 121:e2410828121. doi:10.1073/pnas.2410828121. PMID: 39365823; PMCID: PMC11474027.
Abstract
Striatal acetylcholine and dopamine critically regulate movement, motivation, and reward-related learning. Pauses in cholinergic interneuron (CIN) firing are thought to coincide with dopamine pulses encoding reward prediction errors (RPE) to jointly enable synaptic plasticity. Here, we examine the firing of identified CINs during reward-guided decision-making in freely moving rats and compare this firing to dopamine release. Relationships between CINs, dopamine, and behavior varied strongly by subregion. In the dorsal-lateral striatum, a Go! cue evoked burst-pause CIN spiking, followed by a brief dopamine pulse that was unrelated to RPE. In the dorsal-medial striatum, this cue evoked only a CIN pause, that was curtailed by a movement-selective rebound in firing. Finally, in the ventral striatum, a reward cue evoked RPE-coding increases in both dopamine and CIN firing, without a consistent pause. Our results demonstrate a spatial and temporal dissociation between CIN pauses and dopamine RPE signals and will inform future models of striatal information processing under both normal and pathological conditions.
Affiliation(s)
- Mariana Duhne
- Department of Neurology, University of California, San Francisco, CA 94158
- Ali Mohebi
- Department of Neurology, University of California, San Francisco, CA 94158
- Kyoungjun Kim
- Department of Neurology, University of California, San Francisco, CA 94158
- Lilian Pelattini
- Department of Neurology, University of California, San Francisco, CA 94158
- Joshua D. Berke
- Department of Neurology, University of California, San Francisco, CA 94158
- Department of Psychiatry and Behavioral Science, University of California, San Francisco, CA 94107
- Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, CA 94158
- Weill Institute for Neurosciences, University of California, San Francisco, CA 94158
7
Zimmerman CA, Bolkan SS, Pan-Vazquez A, Wu B, Keppler EF, Meares-Garcia JB, Guthman EM, Fetcho RN, McMannon B, Lee J, Hoag AT, Lynch LA, Janarthanan SR, López Luna JF, Bondy AG, Falkner AL, Wang SSH, Witten IB. A neural mechanism for learning from delayed postingestive feedback. bioRxiv 2024. doi:10.1101/2023.10.06.561214. PMID: 37873112; PMCID: PMC10592633.
Abstract
Animals learn the value of foods based on their postingestive effects and thereby develop aversions to foods that are toxic1-6 and preferences to those that are nutritious7-14. However, it remains unclear how the brain is able to assign credit to flavors experienced during a meal with postingestive feedback signals that can arise after a substantial delay. Here, we reveal an unexpected role for postingestive reactivation of neural flavor representations in this temporal credit assignment process. To begin, we leverage the fact that mice learn to associate novel15-18, but not familiar, flavors with delayed gastric malaise signals to investigate how the brain represents flavors that support aversive postingestive learning. Surveying cellular resolution brainwide activation patterns reveals that a network of amygdala regions is unique in being preferentially activated by novel flavors across every stage of the learning process: the initial meal, delayed malaise, and memory retrieval. By combining high-density recordings in the amygdala with optogenetic stimulation of genetically defined hindbrain malaise cells, we find that postingestive malaise signals potently and specifically reactivate amygdalar novel flavor representations from a recent meal. The degree of malaise-driven reactivation of individual neurons predicts strengthening of flavor responses upon memory retrieval, leading to stabilization of the population-level representation of the recently consumed flavor. In contrast, meals without postingestive consequences degrade neural flavor representations as flavors become familiar and safe. Thus, our findings demonstrate that interoceptive reactivation of amygdalar flavor representations provides a neural mechanism to resolve the temporal credit assignment problem inherent to postingestive learning.
Affiliation(s)
- Scott S Bolkan
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Bichan Wu
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Emma F Keppler
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Eartha Mae Guthman
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Robert N Fetcho
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Brenna McMannon
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Junuk Lee
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Austin T Hoag
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Laura A Lynch
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Juan F López Luna
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Adrian G Bondy
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Annegret L Falkner
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Samuel S-H Wang
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Ilana B Witten
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Howard Hughes Medical Institute, Princeton University, Princeton, NJ, USA
8
Gershman SJ, Assad JA, Datta SR, Linderman SW, Sabatini BL, Uchida N, Wilbrecht L. Explaining dopamine through prediction errors and beyond. Nat Neurosci 2024; 27:1645-1655. doi:10.1038/s41593-024-01705-4. PMID: 39054370.
Abstract
The most influential account of phasic dopamine holds that it reports reward prediction errors (RPEs). The RPE-based interpretation of dopamine signaling is, in its original form, probably too simple and fails to explain all the properties of phasic dopamine observed in behaving animals. This Perspective helps to resolve some of the conflicting interpretations of dopamine that currently exist in the literature. We focus on the following three empirical challenges to the RPE theory of dopamine: why does dopamine (1) ramp up as animals approach rewards, (2) respond to sensory and motor features and (3) influence action selection? We argue that the prediction error concept, once it has been suitably modified and generalized based on an analysis of each computational problem, answers each challenge. Nonetheless, there are a number of additional empirical findings that appear to demand fundamentally different theoretical explanations beyond encoding RPE. Therefore, looking forward, we discuss the prospects for a unifying theory that respects the diversity of dopamine signaling and function as well as the complex circuitry that both underlies and responds to dopaminergic transmission.
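The RPE at the center of this debate has a compact standard form, δ = r + γV(s′) − V(s). As a minimal generic sketch of that textbook computation (not the authors' model; the linear track, reward size and learning parameters are invented for illustration), a TD(0) learner's RPEs shrink toward zero once reward is fully predicted:

```python
# Generic TD(0) illustration of the reward prediction error (RPE):
# delta = r + gamma * V(s') - V(s). Illustrative parameters only.

def td_learn(n_states=5, reward=1.0, gamma=0.9, alpha=0.1, episodes=500):
    """Learn state values on a linear track ending in reward; return the
    learned values and the RPEs from the final episode."""
    V = [0.0] * (n_states + 1)  # index n_states is the terminal state (value 0)
    deltas = []
    for _ in range(episodes):
        deltas = []
        for s in range(n_states):
            r = reward if s == n_states - 1 else 0.0
            delta = r + gamma * V[s + 1] - V[s]  # the RPE
            V[s] += alpha * delta
            deltas.append(delta)
    return V[:n_states], deltas

values, rpes = td_learn()
# After learning, values rise monotonically toward the reward location and the
# RPEs are near zero: a fully predicted reward evokes little phasic error.
```

The three challenges the Perspective raises (ramps, sensory/motor responses, action selection) concern dynamics that this plain form, by itself, does not produce.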
Affiliation(s)
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
- John A Assad
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Scott W Linderman
- Department of Statistics and Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Bernardo L Sabatini
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Naoshige Uchida
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Linda Wilbrecht
- Department of Psychology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
9
Taira M, Millard SJ, Verghese A, DiFazio LE, Hoang IB, Jia R, Sias A, Wikenheiser A, Sharpe MJ. Dopamine release in the nucleus accumbens core encodes the general excitatory components of learning. J Neurosci 2024; 44:e0120242024. doi:10.1523/jneurosci.0120-24.2024. PMID: 38969504; PMCID: PMC11358529.
Abstract
Dopamine release in the nucleus accumbens core (NAcC) is generally considered to be a proxy for phasic firing of the ventral tegmental area dopamine (VTADA) neurons. Thus, dopamine release in NAcC is hypothesized to reflect a unitary role in reward prediction error signaling. However, recent studies reveal more diverse roles of dopamine neurons, which support an emerging idea that dopamine regulates learning differently in distinct circuits. To understand whether the NAcC might regulate a unique component of learning, we recorded dopamine release in NAcC while male rats performed a backward conditioning task where a reward is followed by a neutral cue. We used this task because we can delineate different components of learning, which include sensory-specific inhibitory and general excitatory components. Furthermore, we have shown that VTADA neurons are necessary for both the specific and general components of backward associations. Here, we found that dopamine release in NAcC increased to the reward across learning while reducing to the cue that followed as it became more expected. This mirrors the dopamine prediction error signal seen during forward conditioning and cannot be accounted for by temporal-difference reinforcement learning. Subsequent tests allowed us to dissociate these learning components and revealed that dopamine release in NAcC reflects the general excitatory component of backward associations, but not their sensory-specific component. These results emphasize the importance of examining distinct functions of different dopamine projections in reinforcement learning.
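The incompatibility with temporal-difference learning can be made concrete: because the backward cue never precedes reward, no TD error ever propagates to it. A generic sketch (the two-state trial structure and all parameters are invented for illustration, not the paper's task):

```python
# Why standard TD learning assigns no value to a backward cue: value flows
# backward in time, from reward to the states that precede it, never forward.
# Illustrative parameters only.

def td_backward(alpha=0.1, gamma=0.9, trials=200):
    # Each trial: reward is delivered in reward_state, then the cue appears
    # (cue_state), then the trial ends. V maps each state to its learned value.
    V = {"reward_state": 0.0, "cue_state": 0.0, "end": 0.0}
    for _ in range(trials):
        # reward_state -> cue_state transition, with reward r = 1
        d = 1.0 + gamma * V["cue_state"] - V["reward_state"]
        V["reward_state"] += alpha * d
        # cue_state -> end transition; nothing follows the cue
        d = 0.0 + gamma * V["end"] - V["cue_state"]
        V["cue_state"] += alpha * d
    return V

V = td_backward()
# V["reward_state"] converges to ~1, but the backward cue stays at exactly 0;
# plain TD predicts no learning about it, whereas the recordings above show
# cue responses that change with learning.
```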
Affiliation(s)
- Masakazu Taira
- Department of Psychology, University of Sydney, Camperdown, New South Wales 2006, Australia
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
- Samuel J Millard
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
- Anna Verghese
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
- Lauren E DiFazio
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
- Ivy B Hoang
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
- Ruiting Jia
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
- Ana Sias
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
- Andrew Wikenheiser
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
- Melissa J Sharpe
- Department of Psychology, University of Sydney, Camperdown, New South Wales 2006, Australia
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
10
Oesch LT, Ryan MB, Churchland AK. From innate to instructed: A new look at perceptual decision-making. Curr Opin Neurobiol 2024; 86:102871. doi:10.1016/j.conb.2024.102871. PMID: 38569230; PMCID: PMC11162954.
Abstract
Understanding how subjects perceive sensory stimuli in their environment and use this information to guide appropriate actions is a major challenge in neuroscience. To study perceptual decision-making in animals, researchers use tasks that either probe spontaneous responses to stimuli (often described as "naturalistic") or train animals to associate stimuli with experimenter-defined responses. Spontaneous decisions rely on animals' pre-existing knowledge, while trained tasks offer greater versatility, albeit often at the cost of extensive training. Here, we review emerging approaches to investigate perceptual decision-making using both spontaneous and trained behaviors, highlighting their strengths and limitations. Additionally, we propose how trained decision-making tasks could be improved to achieve faster learning and a more generalizable understanding of task rules.
Affiliation(s)
- Lukas T Oesch
- Department of Neurobiology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Michael B Ryan
- Department of Neurobiology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Anne K Churchland
- Department of Neurobiology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
11
Alejandro RJ, Holroyd CB. Hierarchical control over foraging behavior by anterior cingulate cortex. Neurosci Biobehav Rev 2024; 160:105623. doi:10.1016/j.neubiorev.2024.105623. PMID: 38490499.
Abstract
Foraging is a natural behavior that involves making sequential decisions to maximize rewards while minimizing the costs incurred when doing so. The prevalence of foraging across species suggests that a common brain computation underlies its implementation. Although anterior cingulate cortex is believed to contribute to foraging behavior, its specific role has been contentious, with predominant theories arguing either that it encodes environmental value or choice difficulty. Additionally, recent attempts to characterize foraging have taken place within the reinforcement learning framework, with increasingly complex models scaling with task complexity. Here we review reinforcement learning foraging models, highlighting the hierarchical structure of many foraging problems. We extend this literature by proposing that ACC guides foraging according to principles of model-based hierarchical reinforcement learning. This idea holds that ACC function is organized hierarchically along a rostral-caudal gradient, with rostral structures monitoring the status and completion of high-level task goals (like finding food), and midcingulate structures overseeing the execution of task options (subgoals, like harvesting fruit) and lower-level actions (such as grabbing an apple).
Affiliation(s)
- Clay B Holroyd
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
12
Floeder JR, Jeong H, Mohebi A, Namboodiri VMK. Mesolimbic dopamine ramps reflect environmental timescales. bioRxiv 2024. doi:10.1101/2024.03.27.587103. PMID: 38659749; PMCID: PMC11042231.
Abstract
Mesolimbic dopamine activity occasionally exhibits ramping dynamics, reigniting debate on theories of dopamine signaling. This debate is ongoing partly because the experimental conditions under which dopamine ramps emerge remain poorly understood. Here, we show that during Pavlovian and instrumental conditioning, mesolimbic dopamine ramps are only observed when the inter-trial interval is short relative to the trial period. These results constrain theories of dopamine signaling and identify a critical variable determining the emergence of dopamine ramps.
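For intuition on why value-based accounts produce ramps at all (a sketch of the standard discounting picture, not this preprint's analysis; T, R and gamma below are invented): under exponential discounting, the value of an approaching reward rises steadily as it nears.

```python
# Under exponential discounting, V(t) = gamma**(T - t) * R for a reward of
# size R arriving at time T: value "ramps" upward as the reward approaches.
# Illustrative numbers only.

def value_ramp(T=10, R=1.0, gamma=0.8):
    return [gamma ** (T - t) * R for t in range(T + 1)]

ramp = value_ramp()
# ramp starts near 0.11 and climbs monotonically to exactly R at reward time.
```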
Affiliation(s)
- Joseph R Floeder
- Neuroscience Graduate Program, University of California, San Francisco, CA, USA
- Huijeong Jeong
- Department of Neurology, University of California, San Francisco, CA, USA
- Ali Mohebi
- Department of Neurology, University of California, San Francisco, CA, USA
- Vijay Mohan K Namboodiri
- Neuroscience Graduate Program, University of California, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, CA, USA
- Weill Institute for Neurosciences, Kavli Institute for Fundamental Neuroscience, Center for Integrative Neuroscience, University of California, San Francisco, CA, USA
13
Mohebi A, Wei W, Pelattini L, Kim K, Berke JD. Dopamine transients follow a striatal gradient of reward time horizons. Nat Neurosci 2024; 27:737-746. doi:10.1038/s41593-023-01566-3. PMID: 38321294; PMCID: PMC11001583.
Abstract
Animals make predictions to guide their behavior and update those predictions through experience. Transient increases in dopamine (DA) are thought to be critical signals for updating predictions. However, it is unclear how this mechanism handles a wide range of behavioral timescales-from seconds or less (for example, if singing a song) to potentially hours or more (for example, if hunting for food). Here we report that DA transients in distinct rat striatal subregions convey prediction errors based on distinct time horizons. DA dynamics systematically accelerated from ventral to dorsomedial to dorsolateral striatum, in the tempo of spontaneous fluctuations, the temporal integration of prior rewards and the discounting of future rewards. This spectrum of timescales for evaluative computations can help achieve efficient learning and adaptive motivation for a broad range of behaviors.
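In reinforcement-learning terms, a "reward time horizon" corresponds to the discount factor γ. As a hedged illustration (the half-life definition and the γ values are our own, not taken from the paper), a larger γ weights future rewards over proportionally longer spans:

```python
import math

# How the discount factor gamma sets an effective time horizon: count the
# steps before a future reward's weight gamma**t falls below one half.
# The gammas below are illustrative, not fitted to any striatal subregion.

def half_life_steps(gamma):
    return math.ceil(math.log(0.5) / math.log(gamma))

# Faster vs. slower evaluative timescales:
for gamma in (0.5, 0.9, 0.99):
    print(f"gamma={gamma}: horizon ~ {half_life_steps(gamma)} steps")
```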
Affiliation(s)
- Ali Mohebi
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Wei Wei
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Lilian Pelattini
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Kyoungjun Kim
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Joshua D Berke
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Department of Psychiatry and Behavioral Sciences, University of California San Francisco, San Francisco, CA, USA
- Neuroscience Graduate Program, University of California San Francisco, San Francisco, CA, USA
- Kavli Institute for Fundamental Neuroscience, University of California San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
14
Sagiv Y, Akam T, Witten IB, Daw ND. Prioritizing replay when future goals are unknown. bioRxiv 2024. doi:10.1101/2024.02.29.582822. PMID: 38496674; PMCID: PMC10942393.
Abstract
Although hippocampal place cells replay nonlocal trajectories, the computational function of these events remains controversial. One hypothesis, formalized in a prominent reinforcement learning account, holds that replay plans routes to current goals. However, recent puzzling data appear to contradict this perspective by showing that replayed destinations lag current goals. These results may support an alternative hypothesis that replay updates route information to build a "cognitive map." Yet no similar theory exists to formalize this view, and it is unclear how such a map is represented or what role replay plays in computing it. We address these gaps by introducing a theory of replay that learns a map of routes to candidate goals, before reward is available or when its location may change. Our work extends the planning account to capture a general map-building function for replay, reconciling it with data, and revealing an unexpected relationship between the seemingly distinct hypotheses.
Affiliation(s)
- Yotam Sagiv
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA
- Thomas Akam
- Department of Experimental Psychology, Oxford University, Oxford, UK
- Ilana B Witten
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA
- Nathaniel D Daw
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA