1
Muller TH, Butler JL, Veselic S, Miranda B, Wallis JD, Dayan P, Behrens TEJ, Kurth-Nelson Z, Kennerley SW. Distributional reinforcement learning in prefrontal cortex. Nat Neurosci 2024; 27:403-408. [PMID: 38200183 PMCID: PMC10917656 DOI: 10.1038/s41593-023-01535-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Accepted: 11/29/2023] [Indexed: 01/12/2024]
Abstract
The prefrontal cortex is crucial for learning and decision-making. Classic reinforcement learning (RL) theories center on learning the expectation of potential rewarding outcomes and explain a wealth of neural data in the prefrontal cortex. Distributional RL, on the other hand, learns the full distribution of rewarding outcomes and better explains dopamine responses. In the present study, we show that distributional RL also better explains macaque anterior cingulate cortex neuronal responses, suggesting that it is a common mechanism for reward-guided learning.
Affiliation(s)
- Timothy H Muller
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
  - Department of Clinical and Movement Neurosciences, University College London, London, UK
- James L Butler
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
  - Department of Clinical and Movement Neurosciences, University College London, London, UK
- Sebastijan Veselic
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
  - Department of Clinical and Movement Neurosciences, University College London, London, UK
  - Wellcome Trust Centre for Human Neuroimaging, University College London, London, UK
- Bruno Miranda
  - Department of Clinical and Movement Neurosciences, University College London, London, UK
  - Institute of Physiology and Institute of Molecular Medicine, Lisbon School of Medicine, University of Lisbon, Lisbon, Portugal
- Joni D Wallis
  - Department of Psychology and Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
- Peter Dayan
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany
  - University of Tübingen, Tübingen, Germany
- Timothy E J Behrens
  - Wellcome Trust Centre for Human Neuroimaging, University College London, London, UK
  - Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Oxford, UK
  - Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, UK
- Zeb Kurth-Nelson
  - Google DeepMind, London, UK
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
- Steven W Kennerley
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
  - Department of Clinical and Movement Neurosciences, University College London, London, UK
  - Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Oxford, UK
2
Veselic S, Muller TH, Gutierrez E, Behrens TEJ, Hunt LT, Butler JL, Kennerley SW. A cognitive map for value-guided choice in ventromedial prefrontal cortex. bioRxiv 2023:2023.12.15.571895. [PMID: 38168410 PMCID: PMC10760117 DOI: 10.1101/2023.12.15.571895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
The prefrontal cortex is crucial for economic decision-making and representing the value of options. However, how such representations facilitate flexible decisions remains unknown. We reframe economic decision-making in prefrontal cortex in line with representations of structure within the medial temporal lobe because such cognitive map representations are known to facilitate flexible behaviour. Specifically, we framed choice between different options as a navigation process in value space. Here we show that choices in a 2D value space defined by reward magnitude and probability were represented with a grid-like code, analogous to that found in spatial navigation. The grid-like code was present in ventromedial prefrontal cortex (vmPFC) local field potential theta frequency and the result replicated in an independent dataset. Neurons in vmPFC similarly contained a grid-like code, in addition to encoding the linear value of the chosen option. Importantly, both signals were modulated by theta frequency - occurring at theta troughs but on separate theta cycles. Furthermore, we found sharp-wave ripples - a key neural signature of planning and flexible behaviour - in vmPFC, which were modulated by accuracy and reward. These results demonstrate that multiple cognitive map-like computations are deployed in vmPFC during economic decision-making, suggesting a new framework for the implementation of choice in prefrontal cortex.
Affiliation(s)
- Sebastijan Veselic
  - Department of Experimental Psychology, University of Oxford, UK
  - Clinical and Movement Neurosciences, Department of Motor Neuroscience, University College London, London, UK
- Timothy H Muller
  - Department of Experimental Psychology, University of Oxford, UK
  - Clinical and Movement Neurosciences, Department of Motor Neuroscience, University College London, London, UK
- Elena Gutierrez
  - Department of Experimental Psychology, University of Oxford, UK
  - Clinical and Movement Neurosciences, Department of Motor Neuroscience, University College London, London, UK
- Timothy E J Behrens
  - Wellcome Centre for Human Neuroimaging, University College London, London, UK
  - Wellcome Centre for Integrative Neuroimaging, University of Oxford, FMRIB, John Radcliffe Hospital, Oxford, UK
  - Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, UK
- Laurence T Hunt
  - Department of Experimental Psychology, University of Oxford, UK
  - Department of Psychiatry, University of Oxford, Oxford, UK
- James L Butler
  - Department of Experimental Psychology, University of Oxford, UK
- Steven W Kennerley
  - Department of Experimental Psychology, University of Oxford, UK
  - Clinical and Movement Neurosciences, Department of Motor Neuroscience, University College London, London, UK
3
Whittington JCR, Muller TH, Mark S, Chen G, Barry C, Burgess N, Behrens TEJ. The Tolman-Eichenbaum Machine: Unifying Space and Relational Memory through Generalization in the Hippocampal Formation. Cell 2020; 183:1249-1263.e23. [PMID: 33181068 PMCID: PMC7707106 DOI: 10.1016/j.cell.2020.10.024] [Citation(s) in RCA: 150] [Impact Index Per Article: 37.5] [Received: 10/14/2019] [Revised: 06/11/2020] [Accepted: 10/13/2020] [Indexed: 12/19/2022]
Abstract
The hippocampal-entorhinal system is important for spatial and relational memory tasks. We formally link these domains, provide a mechanistic understanding of the hippocampal role in generalization, and offer unifying principles underlying many entorhinal and hippocampal cell types. We propose medial entorhinal cells form a basis describing structural knowledge, and hippocampal cells link this basis with sensory representations. Adopting these principles, we introduce the Tolman-Eichenbaum machine (TEM). After learning, TEM entorhinal cells display diverse properties resembling apparently bespoke spatial responses, such as grid, band, border, and object-vector cells. TEM hippocampal cells include place and landmark cells that remap between environments. Crucially, TEM also aligns with empirically recorded representations in complex non-spatial tasks. TEM also generates predictions that hippocampal remapping is not random as previously believed; rather, structural knowledge is preserved across environments. We confirm this structural transfer over remapping in simultaneously recorded place and grid cells.
Affiliation(s)
- James C R Whittington
  - Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX3 9DU, UK
- Timothy H Muller
  - Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX3 9DU, UK; Institute of Neurology, UCL, London WC1N 3BG, UK
- Shirley Mark
  - Wellcome Centre for Human Neuroimaging, UCL, London WC1N 3AR, UK
- Guifen Chen
  - Institute of Cognitive Neuroscience, UCL, London WC1N 3AZ, UK; School of Biological and Chemical Sciences, QMUL, London E1 4NS, UK
- Caswell Barry
  - Sainsbury Wellcome Centre for Neural Circuits and Behaviour, UCL, London W1T 4JG, UK; Research Department of Cell and Developmental Biology, UCL, London WC1E 6BT, UK
- Neil Burgess
  - Institute of Neurology, UCL, London WC1N 3BG, UK; Wellcome Centre for Human Neuroimaging, UCL, London WC1N 3AR, UK; Institute of Cognitive Neuroscience, UCL, London WC1N 3AZ, UK; Sainsbury Wellcome Centre for Neural Circuits and Behaviour, UCL, London W1T 4JG, UK
- Timothy E J Behrens
  - Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX3 9DU, UK; Wellcome Centre for Human Neuroimaging, UCL, London WC1N 3AR, UK; Sainsbury Wellcome Centre for Neural Circuits and Behaviour, UCL, London W1T 4JG, UK
4
Bakermans JJ, Muller TH, Behrens TE. Reinforcement Learning: Full Glass or Empty — Depends Who You Ask. Curr Biol 2020; 30:R321-R324. [DOI: 10.1016/j.cub.2020.02.062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
5
Abstract
Humans and animals construct internal models of their environment in order to select appropriate courses of action. The representation of uncertainty about the current state of the environment is a key feature of these models that controls the rate of learning as well as directly affecting choice behaviour. To maintain flexibility, given that uncertainty naturally decreases over time, most theoretical inference models include a dedicated mechanism to drive up model uncertainty. Here we probe the long-standing hypothesis that noradrenaline is involved in determining the uncertainty, or entropy, and thus flexibility, of neural models. Pupil diameter, which indexes neuromodulatory state including noradrenaline release, predicted increases (but not decreases) in entropy in a neural state model encoded in human medial orbitofrontal cortex, as measured using multivariate functional MRI. Activity in anterior cingulate cortex predicted pupil diameter. These results provide evidence for top-down, neuromodulatory control of entropy in neural state models.
Affiliation(s)
- Timothy H Muller
  - Wellcome Centre for Integrative Neuroimaging, Centre for Functional Magnetic Resonance Imaging of the Brain, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
- Rogier B Mars
  - Wellcome Centre for Integrative Neuroimaging, Centre for Functional Magnetic Resonance Imaging of the Brain, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Timothy E Behrens
  - Wellcome Centre for Integrative Neuroimaging, Centre for Functional Magnetic Resonance Imaging of the Brain, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
- Jill X O'Reilly
  - Wellcome Centre for Integrative Neuroimaging, Centre for Functional Magnetic Resonance Imaging of the Brain, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
  - Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
6
Behrens TE, Muller TH, Whittington JC, Mark S, Baram AB, Stachenfeld KL, Kurth-Nelson Z. What Is a Cognitive Map? Organizing Knowledge for Flexible Behavior. Neuron 2018; 100:490-509. [DOI: 10.1016/j.neuron.2018.10.002] [Citation(s) in RCA: 219] [Impact Index Per Article: 36.5] [Received: 07/07/2018] [Revised: 09/26/2018] [Accepted: 09/28/2018] [Indexed: 12/27/2022]
7
Eldor A, Vlodavsky I, Fuks Z, Muller TH, Eisert WG. Different Effects of Aspirin, Dipyridamole and UD-CG 115 on Platelet Activation in a Model of Vascular Injury: Studies with Extracellular Matrix Covered with Endothelial Cells. Thromb Haemost 2018. [DOI: 10.1055/s-0038-1661678] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 10/28/2022]
Abstract
Summary: Cultured endothelial cells produce an extracellular matrix (ECM) which activates platelets, similarly to deendothelialized vascular segments. Platelet-rich plasma (PRP) was incubated with endothelial cell cultures seeded in various densities on ECM. The interaction of the platelets with this artificial intima was evaluated by phase microscopy and by thromboxane A2 (TXA2) and prostacyclin (PGI2) measurement. Large platelet aggregates were formed on exposed ECM. Platelet aggregation, but not adhesion, on the ECM was markedly inhibited by the presence of endothelial cells. Pretreatment of the endothelial cells with 0.1 mM aspirin reduced their PGI2 synthesis and was associated with platelet aggregation on the ECM. 10 μM dipyridamole markedly inhibited platelet activation by ECM when the drug was added to citrated whole blood before PRP preparation. UD-CG 115, which elevates cyclic AMP in cardiac muscle, inhibited platelet aggregation and TXA2 production induced by ECM, in the presence as well as in the absence of endothelial cells, without any effect on endothelial PGI2 production.
Affiliation(s)
- A Eldor
  - The Department of Hematology, Hadassah University Hospital, Jerusalem, Israel
- I Vlodavsky
  - The Radiation and Clinical Oncology, Hadassah University Hospital, Jerusalem, Israel
- Z Fuks
  - The Radiation and Clinical Oncology, Hadassah University Hospital, Jerusalem, Israel
- T H Muller
  - The Department of Biological Research, Dr. Karl Thomas GmbH, Biberach/Riss, Federal Republic of Germany
- W G Eisert
  - The Department of Biological Research, Dr. Karl Thomas GmbH, Biberach/Riss, Federal Republic of Germany
8
Wagner FF, Gassner C, Muller TH, Schonitzer D, Schunter F, Flegel WA. Three molecular structures cause rhesus D category VI phenotypes with distinct immunohematologic features. Blood 1998; 91:2157-68. [PMID: 9490704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023]
Abstract
Rhesus D category VI (DVI) is the clinically most important partial D. DVI red blood cells were assumed to possess very low RhD antigen density and to be caused by two RHD-CE-D hybrid alleles. Because there was no population-based work-up, we screened three populations in central Europe for DVI. Twenty-six DVI samples were detected and examined by exon-specific RHD polymerase chain reaction with sequence-specific primers (PCR-SSP). A new genotype, hereby designated D category VI type III, was characterized as a RHD-Ce(3-6)-D hybrid allele by sequencing of the cDNA, parts of intron 1, and by PCR-restriction fragment length polymorphism (PCR-RFLP) of intron 2. Rhesus introns 5 and 6 were sequenced and the 3' breakpoints of all known DVI types shown to be distinct. We differentiated the 5' breakpoints of DVI type I and DVI type II by a newly devised RHD-PCR. Thus, the DVI phenotype originated in at least three independent molecular events. Each DVI type showed distinct immunohematologic features in flow cytometry. The number of RhD proteins accessible on the red blood cells' surface of DVI type III was normal (about 12,000 antigens/cell; DVI type I, 500; DVI type II, 2,400) based on the determination of an RhD epitope density profile. DVI type II and DVI type III occurred as CDe haplotypes, and DVI type I as a cDE haplotype. The distribution of the DVI types varied significantly in three German-speaking populations. Genotyping strategies should take account of allelic variations in partial RhD. The reconsideration of previous serologic and clinical data for partial D in view of the underlying molecular structures may be worthwhile.
Affiliation(s)
- F F Wagner
  - Abteilung Transfusionsmedizin, Universitat Ulm and DRK-Blutspendezentrale Ulm, Ulm, Germany
9
Weber E, Haas TA, Muller TH, Eisert WG, Hirsh J, Richardson M, Buchanan MR. Relationship between vessel wall 13-HODE synthesis and vessel wall thrombogenicity following injury: influence of salicylate and dipyridamole treatment. Thromb Res 1990; 57:383-92. [PMID: 2315893 DOI: 10.1016/0049-3848(90)90254-a] [Citation(s) in RCA: 27] [Impact Index Per Article: 0.8] [Indexed: 12/31/2022]
Abstract
We performed studies to determine the relationship between injured vessel wall thrombogenicity, vessel wall 13-hydroxyoctadecadienoic acid (13-HODE) synthesis and cAMP levels in rabbits treated with salicylate or dipyridamole. Injured vessel wall thrombogenicity was measured as the number of 3H-adenine labelled platelets adhered to the subendothelial basement membrane exposed by air injury in carotid arteries of rabbits treated orally with salicylate or dipyridamole. Vessel wall 13-HODE was measured by HPLC and vessel wall cAMP was measured by RIA. Vessel wall thrombogenicity was increased two-fold in rabbits treated with salicylate and decreased by half in rabbits treated with dipyridamole. Vessel wall cAMP levels were correlated both with the plasma dipyridamole levels and with increases in 13-HODE synthesis. cAMP levels were unaffected by salicylate treatment, but 13-HODE synthesis was decreased. We conclude that there is a significant relationship between vessel wall cAMP levels and 13-HODE synthesis, which, in turn, influences subsequent vessel wall thrombogenicity.
Affiliation(s)
- E Weber
  - Department of Pathology, McMaster University, Hamilton, Canada
10
Eldor A, Vlodavsky I, Fuks Z, Muller TH, Eisert WG. Different effects of aspirin, dipyridamole and UD-CG 115 on platelet activation in a model of vascular injury: studies with extracellular matrix covered with endothelial cells. Thromb Haemost 1986; 56:333-9. [PMID: 3551181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023]
Abstract
Cultured endothelial cells produce an extracellular matrix (ECM) which activates platelets, similarly to deendothelialized vascular segments. Platelet-rich plasma (PRP) was incubated with endothelial cell cultures seeded in various densities on ECM. The interaction of the platelets with this artificial intima was evaluated by phase microscopy and by thromboxane A2 (TXA2) and prostacyclin (PGI2) measurement. Large platelet aggregates were formed on exposed ECM. Platelet aggregation, but not adhesion, on the ECM was markedly inhibited by the presence of endothelial cells. Pretreatment of the endothelial cells with 0.1 mM aspirin reduced their PGI2 synthesis and was associated with platelet aggregation on the ECM. 10 μM dipyridamole markedly inhibited platelet activation by ECM when the drug was added to citrated whole blood before PRP preparation. UD-CG 115, which elevates cyclic AMP in cardiac muscle, inhibited platelet aggregation and TXA2 production induced by ECM, in the presence as well as in the absence of endothelial cells, without any effect on endothelial PGI2 production.