1
Hunt LT, Daw ND, Kaanders P, MacIver MA, Mugan U, Procyk E, Redish AD, Russo E, Scholl J, Stachenfeld K, Wilson CRE, Kolling N. Formalizing planning and information search in naturalistic decision-making. Nat Neurosci 2021; 24:1051-1064. [PMID: 34155400 DOI: 10.1038/s41593-021-00866-w] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 07/23/2020] [Accepted: 03/23/2021] [Indexed: 02/05/2023]
Abstract
Decisions made by mammals and birds are often temporally extended. They require planning and sampling of decision-relevant information. Our understanding of such decision-making remains in its infancy compared with simpler, forced-choice paradigms. However, recent advances in algorithms supporting planning and information search provide a lens through which we can explain neural and behavioral data in these tasks. We review these advances to obtain a clearer understanding for why planning and curiosity originated in certain species but not others; how activity in the medial temporal lobe, prefrontal and cingulate cortices may support these behaviors; and how planning and information search may complement each other as means to improve future action selection.
Affiliation(s)
- L T Hunt
  - Department of Psychiatry, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
- N D Daw
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, USA
- P Kaanders
  - Department of Experimental Psychology, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
- M A MacIver
  - Center for Robotics and Biosystems, Department of Neurobiology, Department of Biomedical Engineering, Department of Mechanical Engineering, Northwestern University, Evanston, IL, USA
- U Mugan
  - Center for Robotics and Biosystems, Department of Neurobiology, Department of Biomedical Engineering, Department of Mechanical Engineering, Northwestern University, Evanston, IL, USA
- E Procyk
  - Univ Lyon, Université Claude Bernard Lyon 1, INSERM, Stem Cell and Brain Research Institute U1208, Bron, France
- A D Redish
  - Department of Neuroscience, University of Minnesota, Minneapolis, MN, USA
- E Russo
  - Department of Theoretical Neuroscience, Central Institute of Mental Health, Mannheim, Germany
  - Department of Psychiatry and Psychotherapy, University Medical Center, Johannes Gutenberg University, Mainz, Germany
- J Scholl
  - Department of Experimental Psychology, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
- C R E Wilson
  - Univ Lyon, Université Claude Bernard Lyon 1, INSERM, Stem Cell and Brain Research Institute U1208, Bron, France
- N Kolling
  - Department of Psychiatry, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
2
Momennejad I, Russek EM, Cheong JH, Botvinick MM, Daw ND, Gershman SJ. The successor representation in human reinforcement learning. Nat Hum Behav 2017; 1:680-692. [PMID: 31024137 PMCID: PMC6941356 DOI: 10.1038/s41562-017-0180-8] [Citation(s) in RCA: 133] [Impact Index Per Article: 19.0] [Received: 10/17/2016] [Accepted: 07/07/2017] [Indexed: 11/08/2022]
Abstract
Theories of reward learning in neuroscience have focused on two families of algorithms thought to capture deliberative versus habitual choice. 'Model-based' algorithms compute the value of candidate actions from scratch, whereas 'model-free' algorithms make choice more efficient but less flexible by storing pre-computed action values. We examine an intermediate algorithmic family, the successor representation, which balances flexibility and efficiency by storing partially computed action values: predictions about future events. These pre-computation strategies differ in how they update their choices following changes in a task. The successor representation's reliance on stored predictions about future states predicts a unique signature of insensitivity to changes in the task's sequence of events, but flexible adjustment following changes to rewards. We provide evidence for such differential sensitivity in two behavioural studies with humans. These results suggest that the successor representation is a computational substrate for semi-flexible choice in humans, introducing a subtler, more cognitive notion of habit.
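The differential sensitivity described in this abstract can be illustrated with a small sketch. This is not the authors' code or task: the six-state chain, the discount `GAMMA`, and all function names are hypothetical, and the successor matrix is computed directly from known transitions rather than learned from experience. The sketch assumes a learner that caches the matrix of discounted future state occupancies and combines it with current rewards at choice time.

```python
def successor_matrix(T, gamma, sweeps=10):
    """Discounted future state occupancy, found by iterating M = I + gamma * T * M."""
    n = len(T)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = identity
    for _ in range(sweeps):
        M = [[identity[i][j] + gamma * sum(T[i][k] * M[k][j] for k in range(n))
              for j in range(n)] for i in range(n)]
    return M


def state_values(M, rewards):
    """Value of each state: cached occupancies weighted by current rewards."""
    return [sum(m * r for m, r in zip(row, rewards)) for row in M]


def chain(edges, n=6):
    """Deterministic transition matrix from (source, destination) pairs."""
    T = [[0.0] * n for _ in range(n)]
    for src, dst in edges:
        T[src][dst] = 1.0
    return T


GAMMA = 0.5  # hypothetical discount factor

# Hypothetical task: start states 0 and 1 lead to 2 and 3, then to terminals 4 and 5.
T_old = chain([(0, 2), (1, 3), (2, 4), (3, 5)])
r_old = [0, 0, 0, 0, 10, 1]
M_cached = successor_matrix(T_old, GAMMA)
v_before = state_values(M_cached, r_old)

# Reward revaluation: terminal rewards swap. The cached matrix is still valid,
# so the preference between the two start states flips without relearning it.
r_new = [0, 0, 0, 0, 1, 10]
v_reward_reval = state_values(M_cached, r_new)

# Transition revaluation: the middle states swap their destinations.
# The stale cached matrix keeps the old preference (assumption: the start
# states are not re-experienced); recomputing from the new transitions,
# as a model-based learner would, corrects it.
T_new = chain([(0, 2), (1, 3), (2, 5), (3, 4)])
v_stale = state_values(M_cached, r_old)
v_replanned = state_values(successor_matrix(T_new, GAMMA), r_old)

print("before:               ", v_before[:2])
print("reward revaluation:   ", v_reward_reval[:2])
print("transition, stale SR: ", v_stale[:2])
print("transition, replanned:", v_replanned[:2])
```

In this toy setting the cached matrix handles the reward change correctly but carries the old transition structure forward, which is the kind of asymmetry the abstract attributes to the successor representation.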
Affiliation(s)
- I Momennejad
  - Department of Psychology, Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- E M Russek
  - Center for Neural Science, New York University, New York, NY, USA
- J H Cheong
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- M M Botvinick
  - DeepMind and Gatsby Computational Neuroscience Unit, University College London, London, UK
- N D Daw
  - Department of Psychology, Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- S J Gershman
  - Department of Psychology, Center for Brain Science, Harvard University, Cambridge, MA, USA