1
Eluchans M, Lancia GL, Maselli A, D’Alessandro M, Gordon JR, Pezzulo G. Adaptive planning depth in human problem-solving. Royal Society Open Science 2025; 12:241161. [PMID: 40206860 PMCID: PMC11978448 DOI: 10.1098/rsos.241161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2024] [Revised: 12/20/2024] [Accepted: 03/05/2025] [Indexed: 04/11/2025]
Abstract
We humans are capable of solving challenging planning problems, but the range of adaptive strategies that we use to address them is not yet fully characterized. Here, we designed a series of problem-solving tasks that require planning at different depths. After systematically comparing the performance of participants and planning models, we found that when facing problems that require planning to a certain number of subgoals (from 1 to 8), participants make adaptive use of their cognitive resources: namely, they tend to select an initial plan having the minimum required depth, rather than selecting the same depth for all problems. These results support the view of problem-solving as a bounded rational process, which adapts costly cognitive resources to task demands.
Affiliation(s)
- Mattia Eluchans
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Sapienza University of Rome, Roma, Lazio, Italy
- Gian Luca Lancia
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Sapienza University of Rome, Roma, Lazio, Italy
- Antonella Maselli
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Department of Biomedical and Dental Sciences and Morphofunctional Imaging, University of Messina, Messina, Italy
- Marco D’Alessandro
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Giovanni Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
2
Park J, Polidoro P, Fortunato C, Arnold J, Mensh B, Gallego JA, Dudman JT. Conjoint specification of action by neocortex and striatum. Neuron 2025; 113:620-636.e6. [PMID: 39837325 DOI: 10.1016/j.neuron.2024.12.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 09/09/2024] [Accepted: 12/19/2024] [Indexed: 01/23/2025]
Abstract
The interplay between two major forebrain structures, cortex and subcortical striatum, is critical for flexible, goal-directed action. Traditionally, it has been proposed that striatum is critical for selecting what type of action is initiated, while the primary motor cortex is involved in specifying the continuous parameters of an upcoming/ongoing movement. Recent data indicate that striatum may also be involved in specification. These alternatives have been difficult to reconcile because comparing very distinct actions, as is often done, makes essentially indistinguishable predictions. Here, we develop quantitative models to reveal a somewhat paradoxical insight: only comparing neural activity across similar actions makes strongly distinguishing predictions. We thus developed a novel reach-to-pull task in which mice reliably selected between two similar but distinct reach targets and pull forces. Simultaneous cortical and subcortical recordings were uniquely consistent with a model in which cortex and striatum jointly specify continuous parameters governing movement execution.
Affiliation(s)
- Junchol Park
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Peter Polidoro
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Catia Fortunato
- Department of Bioengineering, Imperial College London, London W12 0BZ, UK
- Jon Arnold
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Brett Mensh
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Juan A Gallego
- Department of Bioengineering, Imperial College London, London W12 0BZ, UK
- Joshua T Dudman
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
3
Nicholas J, Daw ND, Shohamy D. Proactive and reactive construction of memory-based preferences. Nat Commun 2025; 16:1618. [PMID: 39948096 PMCID: PMC11825774 DOI: 10.1038/s41467-025-56183-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 01/08/2025] [Indexed: 02/16/2025] Open
Abstract
We are often faced with decisions we have never encountered before, requiring us to infer possible outcomes before making a choice. Computational theories suggest that one way to make these types of decisions is by accessing and linking related experiences stored in memory. Past work has shown that such memory-based preference construction can occur at a number of different timepoints relative to the moment a decision is made. Some studies have found that memories are integrated at the time a decision is faced (reactively) while others found that memory integration happens earlier, when memories were initially encoded (proactively). Here we offer a resolution to this inconsistency, demonstrating that these two strategies trade off rationally as a function of the associative structure of memory. We use fMRI to decode patterns of brain responses unique to categories of images in memory and find that proactive memory access is more common and allows more efficient inference. However, we also find that participants use reactive access when choice options are linked to a larger number of memory associations. Together, these results indicate that the brain judiciously conducts proactive inference by accessing memories ahead of time when conditions make this strategy more favorable.
Affiliation(s)
- Jonathan Nicholas
- Department of Psychology, Columbia University, New York, NY, USA
- Mortimer B. Zuckerman Mind, Brain, Behavior Institute, Columbia University, New York, NY, USA
- Department of Psychology, New York University, New York, NY, USA
- Nathaniel D Daw
- Department of Psychology, Princeton University, Princeton, NJ, USA
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Daphna Shohamy
- Department of Psychology, Columbia University, New York, NY, USA
- Mortimer B. Zuckerman Mind, Brain, Behavior Institute, Columbia University, New York, NY, USA
- The Kavli Institute for Brain Science, Columbia University, New York, NY, USA
4
Butz MV, Mittenbühler M, Schwöbel S, Achimova A, Gumbsch C, Otte S, Kiebel S. Contextualizing predictive minds. Neurosci Biobehav Rev 2025; 168:105948. [PMID: 39580009 DOI: 10.1016/j.neubiorev.2024.105948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 09/13/2024] [Accepted: 11/16/2024] [Indexed: 11/25/2024]
Abstract
The structure of human memory seems to be optimized for efficient prediction, planning, and behavior. We propose that these capacities rely on a tripartite structure of memory that includes concepts, events, and contexts: three layers that constitute the mental world model. We suggest that the mechanism that critically increases adaptivity and flexibility is the tendency to contextualize. This tendency promotes local, context-encoding abstractions, which focus event- and concept-based planning and inference processes on the task and situation at hand. As a result, cognitive contextualization offers a solution to the frame problem: the need to select relevant features of the environment from the rich stream of sensorimotor signals. We draw evidence for our proposal from developmental psychology and neuroscience. Adopting a computational stance, we present evidence from cognitive modeling research which suggests that context sensitivity is a feature that is critical for maximizing the efficiency of cognitive processes. Finally, we turn to recent deep-learning architectures which independently demonstrate how context-sensitive memory can emerge in a self-organized learning system constrained by cognitively inspired inductive biases.
Affiliation(s)
- Martin V Butz
- Cognitive Modeling, Faculty of Science, University of Tübingen, Sand 14, Tübingen 72076, Germany
- Maximilian Mittenbühler
- Cognitive Modeling, Faculty of Science, University of Tübingen, Sand 14, Tübingen 72076, Germany
- Sarah Schwöbel
- Cognitive Computational Neuroscience, Faculty of Psychology, TU Dresden, School of Science, Dresden 01062, Germany
- Asya Achimova
- Cognitive Modeling, Faculty of Science, University of Tübingen, Sand 14, Tübingen 72076, Germany
- Christian Gumbsch
- Cognitive Modeling, Faculty of Science, University of Tübingen, Sand 14, Tübingen 72076, Germany; Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, TU Dresden, Dresden 01069, Germany
- Sebastian Otte
- Cognitive Modeling, Faculty of Science, University of Tübingen, Sand 14, Tübingen 72076, Germany; Adaptive AI Lab, Institute of Robotics and Cognitive Systems, University of Lübeck, Ratzeburger Allee 160, Lübeck 23562, Germany
- Stefan Kiebel
- Cognitive Computational Neuroscience, Faculty of Psychology, TU Dresden, School of Science, Dresden 01062, Germany
5
Nicholas J, Daw ND, Shohamy D. Proactive and reactive construction of memory-based preferences. bioRxiv: The Preprint Server for Biology 2024:2023.12.10.570977. [PMID: 38106137 PMCID: PMC10723393 DOI: 10.1101/2023.12.10.570977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
We are often faced with decisions we have never encountered before, requiring us to infer possible outcomes before making a choice. Computational theories suggest that one way to make these types of decisions is by accessing and linking related experiences stored in memory. Past work has shown that such memory-based preference construction can occur at a number of different timepoints relative to the moment a decision is made. Some studies have found that memories are integrated at the time a decision is faced (reactively) while others found that memory integration happens earlier, when memories were initially encoded (proactively). Here we offer a resolution to this inconsistency, demonstrating that these two strategies trade off rationally as a function of the associative structure of memory. We use fMRI to decode patterns of brain responses unique to categories of images in memory and find that proactive memory access is more common and allows more efficient inference. However, we also find that participants use reactive access when choice options are linked to a larger number of memory associations. Together, these results indicate that the brain judiciously conducts proactive inference by accessing memories ahead of time when conditions make this strategy more favorable.
Affiliation(s)
- Jonathan Nicholas
- Department of Psychology, Columbia University, New York, NY, USA
- Mortimer B. Zuckerman Mind, Brain, Behavior Institute, Columbia University, New York, NY, USA
- Department of Psychology, New York University, New York, NY, USA
- Nathaniel D. Daw
- Department of Psychology, Princeton University, Princeton, NJ, USA
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Daphna Shohamy
- Department of Psychology, Columbia University, New York, NY, USA
- Mortimer B. Zuckerman Mind, Brain, Behavior Institute, Columbia University, New York, NY, USA
- The Kavli Institute for Brain Science, Columbia University, New York, NY, USA
6
Millidge B, Song Y, Lak A, Walton ME, Bogacz R. Reward Bases: A simple mechanism for adaptive acquisition of multiple reward types. PLoS Comput Biol 2024; 20:e1012580. [PMID: 39561186 PMCID: PMC11614280 DOI: 10.1371/journal.pcbi.1012580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Revised: 12/03/2024] [Accepted: 10/22/2024] [Indexed: 11/21/2024] Open
Abstract
Animals can adapt their preferences for different types of reward according to physiological state, such as hunger or thirst. To explain this ability, we employ a simple multi-objective reinforcement learning model that learns multiple values according to different reward dimensions such as food or water. We show that by weighting these learned values according to the current needs, behaviour may be flexibly adapted to present preferences. This model predicts that individual dopamine neurons should encode the errors associated with some reward dimensions more than with others. To provide a preliminary test of this prediction, we reanalysed a small dataset obtained from a single primate in an experiment which to our knowledge is the only published study where the responses of dopamine neurons to stimuli predicting distinct types of rewards were recorded. We observed that in addition to subjective economic value, dopamine neurons encode a gradient of reward dimensions; some neurons respond most to stimuli predicting food rewards while the others respond more to stimuli predicting fluids. We also proposed a possible implementation of the model in the basal ganglia network, and demonstrated how the striatal system can learn values in multiple dimensions, even when dopamine neurons encode mixtures of prediction error from different dimensions. Additionally, the model reproduces the instant generalisation to new physiological states seen in dopamine responses and in behaviour. Our results demonstrate how a simple neural circuit can flexibly guide behaviour according to animals' needs.
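The weighting idea in this abstract (learn one value per reward dimension, then combine the values according to current needs) can be illustrated with a toy sketch. This is not the authors' model or code: the function names, the simple delta-rule update, the learning rate, and the need weights are all assumptions made for illustration.

```python
# Illustrative sketch only; every name and number here is hypothetical.

def delta_update(values, cue, reward_vec, alpha=0.1):
    """Move each per-dimension value of `cue` toward the observed reward."""
    for dim, reward in enumerate(reward_vec):
        values[cue][dim] += alpha * (reward - values[cue][dim])

def combined_value(values, cue, needs):
    """Weight each dimension's learned value by the current need for it."""
    return sum(w * v for w, v in zip(needs, values[cue]))

# Reward dimensions are ordered [food, water].
# Cue 0 always predicts food, cue 1 always predicts water.
values = {0: [0.0, 0.0], 1: [0.0, 0.0]}
for _ in range(200):
    delta_update(values, 0, [1.0, 0.0])
    delta_update(values, 1, [0.0, 1.0])

hungry = [1.0, 0.1]   # assumed need weights when hungry
thirsty = [0.1, 1.0]  # assumed need weights when thirsty

prefers_food_when_hungry = (
    combined_value(values, 0, hungry) > combined_value(values, 1, hungry)
)
prefers_water_when_thirsty = (
    combined_value(values, 1, thirsty) > combined_value(values, 0, thirsty)
)
```

Because the per-dimension values are learned separately, changing only the need weights switches the preferred cue without any relearning, which is the kind of instant generalisation to a new physiological state that the abstract refers to.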
Affiliation(s)
- Beren Millidge
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Yuhang Song
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Armin Lak
- Department of Physiology, Anatomy & Genetics, University of Oxford, Oxford, United Kingdom
- Mark E. Walton
- Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, United Kingdom
- Rafal Bogacz
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Theoretical Sciences Visiting Program (TSVP), Okinawa Institute of Science and Technology Graduate University, Onna, Japan
7
Tarder-Stoll H, Baldassano C, Aly M. Consolidation Enhances Sequential Multistep Anticipation but Diminishes Access to Perceptual Features. Psychol Sci 2024; 35:1178-1199. [PMID: 39110746 PMCID: PMC11532645 DOI: 10.1177/09567976241256617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 04/19/2024] [Indexed: 08/10/2024] Open
Abstract
Many experiences unfold predictably over time. Memory for these temporal regularities enables anticipation of events multiple steps into the future. Because temporally predictable events repeat over days, weeks, and years, we must maintain, and potentially transform, memories of temporal structure to support adaptive behavior. We explored how individuals build durable models of temporal regularities to guide multistep anticipation. Healthy young adults (Experiment 1: N = 99, age range = 18-40 years; Experiment 2: N = 204, age range = 19-40 years) learned sequences of scene images that were predictable at the category level and contained incidental perceptual details. Individuals then anticipated upcoming scene categories multiple steps into the future, immediately and at a delay. Consolidation increased the efficiency of anticipation, particularly for events further in the future, but diminished access to perceptual features. Further, maintaining a link-based model of the sequence after consolidation improved anticipation accuracy. Consolidation may therefore promote efficient and durable models of temporal structure, thus facilitating anticipation of future events.
Affiliation(s)
- Hannah Tarder-Stoll
- Department of Psychology, Columbia University
- Baycrest Health Sciences, Rotman Research Institute, Toronto, Canada
- Mariam Aly
- Department of Psychology, Columbia University
8
Giallanza T, Campbell D, Cohen JD. Toward the Emergence of Intelligent Control: Episodic Generalization and Optimization. Open Mind (Camb) 2024; 8:688-722. [PMID: 38828434 PMCID: PMC11142636 DOI: 10.1162/opmi_a_00143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 04/01/2024] [Indexed: 06/05/2024] Open
Abstract
Human cognition is unique in its ability to perform a wide range of tasks and to learn new tasks quickly. Both abilities have long been associated with the acquisition of knowledge that can generalize across tasks and the flexible use of that knowledge to execute goal-directed behavior. We investigate how this emerges in a neural network by describing and testing the Episodic Generalization and Optimization (EGO) framework. The framework consists of an episodic memory module, which rapidly learns relationships between stimuli; a semantic pathway, which more slowly learns how stimuli map to responses; and a recurrent context module, which maintains a representation of task-relevant context information, integrates this over time, and uses it both to recall context-relevant memories (in episodic memory) and to bias processing in favor of context-relevant features and responses (in the semantic pathway). We use the framework to address empirical phenomena across reinforcement learning, event segmentation, and category learning, showing in simulations that the same set of underlying mechanisms accounts for human performance in all three domains. The results demonstrate how the components of the EGO framework can efficiently learn knowledge that can be flexibly generalized across tasks, furthering our understanding of how humans can quickly learn how to perform a wide range of tasks, a capability that is fundamental to human intelligence.
Affiliation(s)
- Tyler Giallanza
- Department of Psychology, Princeton University, Princeton, NJ, USA
- Declan Campbell
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Jonathan D. Cohen
- Department of Psychology, Princeton University, Princeton, NJ, USA
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
9
Alejandro RJ, Holroyd CB. Hierarchical control over foraging behavior by anterior cingulate cortex. Neurosci Biobehav Rev 2024; 160:105623. [PMID: 38490499 DOI: 10.1016/j.neubiorev.2024.105623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 02/14/2024] [Accepted: 03/13/2024] [Indexed: 03/17/2024]
Abstract
Foraging is a natural behavior that involves making sequential decisions to maximize rewards while minimizing the costs incurred when doing so. The prevalence of foraging across species suggests that a common brain computation underlies its implementation. Although anterior cingulate cortex is believed to contribute to foraging behavior, its specific role has been contentious, with predominant theories arguing either that it encodes environmental value or choice difficulty. Additionally, recent attempts to characterize foraging have taken place within the reinforcement learning framework, with increasingly complex models scaling with task complexity. Here we review reinforcement learning foraging models, highlighting the hierarchical structure of many foraging problems. We extend this literature by proposing that ACC guides foraging according to principles of model-based hierarchical reinforcement learning. This idea holds that ACC function is organized hierarchically along a rostral-caudal gradient, with rostral structures monitoring the status and completion of high-level task goals (like finding food), and midcingulate structures overseeing the execution of task options (subgoals, like harvesting fruit) and lower-level actions (such as grabbing an apple).
Affiliation(s)
- Clay B Holroyd
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
10
Wilbrecht L, Davidow JY. Goal-directed learning in adolescence: neurocognitive development and contextual influences. Nat Rev Neurosci 2024; 25:176-194. [PMID: 38263216 DOI: 10.1038/s41583-023-00783-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/12/2023] [Indexed: 01/25/2024]
Abstract
Adolescence is a time during which we transition to independence, explore new activities and begin pursuit of major life goals. Goal-directed learning, in which we learn to perform actions that enable us to obtain desired outcomes, is central to many of these processes. Currently, our understanding of goal-directed learning in adolescence is itself in a state of transition, with the scientific community grappling with inconsistent results. When we examine metrics of goal-directed learning through the second decade of life, we find that many studies agree there are steady gains in performance in the teenage years, but others report that adolescent goal-directed learning is already adult-like, and some find adolescents can outperform adults. To explain the current variability in results, sophisticated experimental designs are being applied to test learning in different contexts. There is also increasing recognition that individuals of different ages and in different states will draw on different neurocognitive systems to support goal-directed learning. Through adoption of more nuanced approaches, we can be better prepared to recognize and harness adolescent strengths and to decipher the purpose (or goals) of adolescence itself.
Affiliation(s)
- Linda Wilbrecht
- Department of Psychology, University of California, Berkeley, CA, USA
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
- Juliet Y Davidow
- Department of Psychology, Northeastern University, Boston, MA, USA
11
Wientjes S, Holroyd CB. The successor representation subserves hierarchical abstraction for goal-directed behavior. PLoS Comput Biol 2024; 20:e1011312. [PMID: 38377074 PMCID: PMC10906840 DOI: 10.1371/journal.pcbi.1011312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 03/01/2024] [Accepted: 02/05/2024] [Indexed: 02/22/2024] Open
Abstract
Humans have the ability to craft abstract, temporally extended and hierarchically organized plans. For instance, when considering how to make spaghetti for dinner, we typically concern ourselves with useful "subgoals" in the task, such as cutting onions, boiling pasta, and cooking a sauce, rather than particulars such as how many cuts to make to the onion, or exactly which muscles to contract. A core question is how such decomposition of a more abstract task into logical subtasks happens in the first place. Previous research has shown that humans are sensitive to a form of higher-order statistical learning named "community structure". Community structure is a common feature of abstract tasks characterized by a logical ordering of subtasks. This structure can be captured by a model where humans learn predictions of upcoming events multiple steps into the future, discounting predictions of events further away in time. One such model is the "successor representation", which has been argued to be useful for hierarchical abstraction. As of yet, no study has convincingly shown that this hierarchical abstraction can be put to use for goal-directed behavior. Here, we investigate whether participants utilize learned community structure to craft hierarchically informed action plans for goal-directed behavior. Participants were asked to search for paintings in a virtual museum, where the paintings were grouped together in "wings" representing community structure in the museum. We find that participants' choices accord with the hierarchical structure of the museum and that their response times are best predicted by a successor representation. The degree to which the response times reflect the community structure of the museum correlates with several measures of performance, including the ability to craft temporally abstract action plans. These results suggest that successor representation learning subserves hierarchical abstractions relevant for goal-directed behavior.
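The notion of predicting upcoming events several steps ahead while discounting more distant ones is commonly formalized as the successor representation, M = I + γT + γ²T² + ..., where T is a state-transition matrix and γ a discount factor. The sketch below is a minimal illustration under assumed values: a three-state deterministic ring and γ = 0.5 are hypothetical choices, not the museum task or the fitting procedure from the paper.

```python
# Illustrative sketch only; the ring environment and gamma are assumptions.

def successor_representation(T, gamma, n_iter=200):
    """Approximate M = sum_t gamma^t T^t by iterating M <- I + gamma * T @ M."""
    n = len(T)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(n_iter):
        M = [
            [
                identity[i][j] + gamma * sum(T[i][k] * M[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return M

# Deterministic ring: state 0 -> 1 -> 2 -> 0.
T = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
M = successor_representation(T, gamma=0.5)

# From state 0, the state one step ahead is weighted more than the one two steps ahead.
one_step_ahead = M[0][1]   # gamma / (1 - gamma**3) = 4/7
two_steps_ahead = M[0][2]  # gamma**2 / (1 - gamma**3) = 2/7
```

Each row of M sums to 1 / (1 - γ), so the entries can be read as discounted expected future occupancies of each state.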
Affiliation(s)
- Sven Wientjes
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Clay B. Holroyd
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
12
Scholz V, Waltmann M, Herzog N, Reiter A, Horstmann A, Deserno L. Cortical Grey Matter Mediates Increases in Model-Based Control and Learning from Positive Feedback from Adolescence to Adulthood. J Neurosci 2023; 43:2178-2189. [PMID: 36823039 PMCID: PMC10039741 DOI: 10.1523/jneurosci.1418-22.2023] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/21/2022] [Revised: 12/20/2022] [Accepted: 01/13/2023] [Indexed: 02/25/2023] Open
Abstract
Cognition and brain structure undergo significant maturation from adolescence into adulthood. Model-based (MB) control is known to increase across development, which is mediated by cognitive abilities. Here, we asked two questions unaddressed in previous developmental studies. First, what are the brain structural correlates of age-related increases in MB control? Second, how are age-related increases in MB control from adolescence to adulthood influenced by motivational context? A human developmental sample (n = 103; age, 12-50, male/female, 55:48) completed structural MRI and an established task to capture MB control. The task was modified with respect to outcome valence by including (1) reward and punishment blocks to manipulate the motivational context and (2) an additional choice test to assess learning from positive versus negative feedback. After replicating that an age-dependent increase in MB control is mediated by cognitive abilities, we demonstrate first-time evidence that gray matter density (GMD) in the parietal cortex mediates the increase of MB control with age. Although motivational context did not relate to age-related changes in MB control, learning from positive feedback improved with age. Meanwhile, negative feedback learning showed no age effects. We present a first report that an age-related increase in positive feedback learning was mediated by reduced GMD in the parietal, medial, and dorsolateral prefrontal cortex. Our findings indicate that brain maturation, putatively reflected in lower GMD, in distinct and partially overlapping brain regions could lead to a more efficient brain organization and might thus be a key developmental step toward age-related increases in planning and value-based choice.
Significance Statement
Changes in model-based decision-making are paralleled by extensive maturation in cognition and brain structure across development. Still, to date the neuroanatomical underpinnings of these changes remain unclear. Here, we demonstrate for the first time that parietal GMD mediates age-dependent increases in model-based control. Age-related increases in positive feedback learning were mediated by reduced GMD in the parietal, medial, and dorsolateral prefrontal cortex. A manipulation of motivational context did not have an impact on age-related changes in model-based control. These findings highlight that brain maturation in distinct and overlapping cortical regions constitutes a key developmental step toward improved value-based choices.
Affiliation(s)
- Vanessa Scholz
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Centre of Mental Health, University of Würzburg, 97080 Würzburg, Germany
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 GD Nijmegen, The Netherlands
- Maria Waltmann
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Centre of Mental Health, University of Würzburg, 97080 Würzburg, Germany
- Max Planck Institute for Cognition and Neuroscience, D-04103 Leipzig, Germany
- Nadine Herzog
- Max Planck Institute for Cognition and Neuroscience, D-04103 Leipzig, Germany
- Integrated Research and Treatment Center AdiposityDiseases, Leipzig University Medical Center, 04103 Leipzig, Germany
- Andrea Reiter
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Centre of Mental Health, University of Würzburg, 97080 Würzburg, Germany
- Collaborative Research Center-940 Volition and Cognitive Control, Faculty of Psychology, Technical University Dresden, 01069 Dresden, Germany
- Annette Horstmann
- Max Planck Institute for Cognition and Neuroscience, D-04103 Leipzig, Germany
- Integrated Research and Treatment Center AdiposityDiseases, Leipzig University Medical Center, 04103 Leipzig, Germany
- Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Lorenz Deserno
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Centre of Mental Health, University of Würzburg, 97080 Würzburg, Germany
- Max Planck Institute for Cognition and Neuroscience, D-04103 Leipzig, Germany
- Integrated Research and Treatment Center AdiposityDiseases, Leipzig University Medical Center, 04103 Leipzig, Germany
- Department of Psychiatry and Psychotherapy, University Hospital Carl Gustav Carus, Technical University Dresden, 01069 Dresden, Germany
13
Young ME, Howatt BC. Resource limitations: A taxonomy. Behav Processes 2023; 206:104823. [PMID: 36682436 DOI: 10.1016/j.beproc.2023.104823] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/03/2022] [Revised: 01/02/2023] [Accepted: 01/17/2023] [Indexed: 01/21/2023]
Abstract
Decision making within the context of resource limitations requires balancing the short-term benefits of obtaining a resource and the long-term consequences of depleting those resources. The present manuscript focuses on four types of tasks that share this tradeoff to develop a taxonomy that will encourage a deeper understanding of the psychological processes at play. The four types considered are foraging, common pool traps, deterioration traps, and a novel designation referred to as resource cliffs. All four will be shown to include two opposite processes: depletion of the resource and its replenishment over time. By considering the unique and shared features of these tasks, a taxonomy of features emerges that can be combined not only to create novel tasks but also to shift the research focus to task features rather than specific tasks. The paper closes with a consideration of current theoretical frameworks previously applied to one or more of these resource-limitation tasks as well as the promise of reinforcement learning as a unifying theory.
|
14
|
Kurth-Nelson Z, Behrens T, Wayne G, Miller K, Luettgau L, Dolan R, Liu Y, Schwartenbeck P. Replay and compositional computation. Neuron 2023; 111:454-469. [PMID: 36640765 DOI: 10.1016/j.neuron.2022.12.028] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 04/29/2022] [Revised: 08/11/2022] [Accepted: 12/18/2022] [Indexed: 01/15/2023]
Abstract
Replay in the brain has been viewed as rehearsal or, more recently, as sampling from a transition model. Here, we propose a new hypothesis: that replay is able to implement a form of compositional computation where entities are assembled into relationally bound structures to derive qualitatively new knowledge. This idea builds on recent advances in neuroscience, which indicate that the hippocampus flexibly binds objects to generalizable roles and that replay strings these role-bound objects into compound statements. We suggest experiments to test our hypothesis, and we end by noting the implications for AI systems which lack the human ability to radically generalize past experience to solve new problems.
Affiliation(s)
- Zeb Kurth-Nelson
- DeepMind, London, UK; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, UK.
| | - Timothy Behrens
- Wellcome Centre for Human Neuroimaging, University College London, London, UK; Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
| | | | - Kevin Miller
- DeepMind, London, UK; Institute of Ophthalmology, University College London, London, UK
| | - Lennart Luettgau
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, UK
| | - Ray Dolan
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, UK; Wellcome Centre for Human Neuroimaging, University College London, London, UK
| | - Yunzhe Liu
- State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Chinese Institute for Brain Research, Beijing, China
| | - Philipp Schwartenbeck
- Max Planck Institute for Biological Cybernetics, Tubingen, Germany; University of Tubingen, Tubingen, Germany
| |
|
15
|
Lancia GL, Eluchans M, D’Alessandro M, Spiers HJ, Pezzulo G. Humans account for cognitive costs when finding shortcuts: An information-theoretic analysis of navigation. PLoS Comput Biol 2023; 19:e1010829. [PMID: 36608145 PMCID: PMC9851521 DOI: 10.1371/journal.pcbi.1010829] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/22/2022] [Revised: 01/19/2023] [Accepted: 12/19/2022] [Indexed: 01/09/2023] Open
Abstract
When faced with navigating back somewhere we have been before we might either retrace our steps or seek a shorter path. Both choices have costs. Here, we ask whether it is possible to characterize formally the choice of navigational plans as a bounded rational process that trades off the quality of the plan (e.g., its length) and the cognitive cost required to find and implement it. We analyze the navigation strategies of two groups of people that are firstly trained to follow a "default policy" taking a route in a virtual maze and then asked to navigate to various known goal destinations, either in the way they want ("Go To Goal") or by taking novel shortcuts ("Take Shortcut"). We address these wayfinding problems using InfoRL: an information-theoretic approach that formalizes the cognitive cost of devising a navigational plan, as the informational cost to deviate from a well-learned route (the "default policy"). In InfoRL, optimality refers to finding the best trade-off between route length and the amount of control information required to find it. We report five main findings. First, the navigational strategies automatically identified by InfoRL correspond closely to different routes (optimal or suboptimal) in the virtual reality map, which were annotated by hand in previous research. Second, people deliberate more in places where the value of investing cognitive resources (i.e., relevant goal information) is greater. Third, compared to the group of people who receive the "Go To Goal" instruction, those who receive the "Take Shortcut" instruction find shorter but less optimal solutions, reflecting the intrinsic difficulty of finding optimal shortcuts. Fourth, those who receive the "Go To Goal" instruction modulate flexibly their cognitive resources, depending on the benefits of finding the shortcut. Finally, we found a surprising amount of variability in the choice of navigational strategies and resource investment across participants. Taken together, these results illustrate the benefits of using InfoRL to address navigational planning problems from a bounded rational perspective.
Affiliation(s)
- Gian Luca Lancia
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- University of Rome “La Sapienza”, Rome, Italy
| | - Mattia Eluchans
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- University of Rome “La Sapienza”, Rome, Italy
| | - Marco D’Alessandro
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
| | - Hugo J. Spiers
- Institute of Behavioural Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, United Kingdom
| | - Giovanni Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
| |
|
16
|
Emanuel A, Eldar E. Emotions as computations. Neurosci Biobehav Rev 2023; 144:104977. [PMID: 36435390 PMCID: PMC9805532 DOI: 10.1016/j.neubiorev.2022.104977] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/14/2022] [Revised: 10/26/2022] [Accepted: 11/22/2022] [Indexed: 11/26/2022]
Abstract
Emotions ubiquitously impact action, learning, and perception, yet their essence and role remain widely debated. Computational accounts of emotion aspire to answer these questions with greater conceptual precision informed by normative principles and neurobiological data. We examine recent progress in this regard and find that emotions may implement three classes of computations, which serve to evaluate states, actions, and uncertain prospects. For each of these, we use the formalism of reinforcement learning to offer a new formulation that better accounts for existing evidence. We then consider how these distinct computations may map onto distinct emotions and moods. Integrating extensive research on the causes and consequences of different emotions suggests a parsimonious one-to-one mapping, according to which emotions are integral to how we evaluate outcomes (pleasure & pain), learn to predict them (happiness & sadness), use them to inform our (frustration & content) and others' (anger & gratitude) actions, and plan in order to realize (desire & hope) or avoid (fear & anxiety) uncertain outcomes.
Affiliation(s)
- Aviv Emanuel
- Department of Psychology, Hebrew University of Jerusalem, Jerusalem 9190501, Israel; Department of Cognitive and Brain Sciences, Hebrew University of Jerusalem, Jerusalem 9190501, Israel.
| | - Eran Eldar
- Department of Psychology, Hebrew University of Jerusalem, Jerusalem 9190501, Israel; Department of Cognitive and Brain Sciences, Hebrew University of Jerusalem, Jerusalem 9190501, Israel.
| |
|
17
|
Tiganj Z, Singh I, Esfahani ZG, Howard MW. Scanning a compressed ordered representation of the future. J Exp Psychol Gen 2022; 151:3082-3096. [PMID: 35913876 PMCID: PMC9670103 DOI: 10.1037/xge0001243] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
Several authors have suggested a deep symmetry between the psychological processes that underlie our ability to remember the past and make predictions about the future. The judgment of recency (JOR) task measures temporal order judgments for the past by presenting pairs of probe stimuli; participants choose the probe that was presented more recently. We performed a short-term relative JOR task and introduced a novel judgment of imminence (JOI) task to study temporal order judgments for the future. In the JOR task, participants were presented with a sequence of stimuli and asked to choose which of two probe stimuli was presented closer to the present. In the JOI task, participants were trained on a probabilistic sequence. After training, the sequence was interrupted with probe stimuli. Participants were asked to choose which of two probe stimuli was expected to be presented closer to the present. Replicating prior work on JOR, we found that RT results supported a backward self-terminating search model operating on a temporally organized representation of the past. We also showed that RT distributions are consistent with this model and that the temporally organized representation is compressed. Critically, results for the JOI task probing expectations of the future suggest a forward self-terminating search model operating on a temporally organized representation of the future. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
|
18
|
Prospective and retrospective values integrated in frontal cortex drive predictive choice. Proc Natl Acad Sci U S A 2022; 119:e2206067119. [PMID: 36417435 PMCID: PMC9889848 DOI: 10.1073/pnas.2206067119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
To make a deliberate action in a volatile environment, the brain must frequently reassess the value of each action (action-value). Choice can initially be made from trial-and-error experience, but once the dynamics of the environment are learned, the choice can be made from knowledge of the environment. The action-values constructed from experience (retrospective value) and those constructed from knowledge (prospective value) have been identified in various regions of the brain. However, how and which neural circuit integrates these values and executes the chosen action remains unknown. Combining reinforcement learning and two-photon calcium imaging, we found that the preparatory activity of neurons in a part of the frontal cortex, the anterior-lateral motor (ALM) area, initially encodes retrospective value, but after extensive training, jointly encodes the retrospective and prospective values. Optogenetic inhibition of ALM preparatory activity specifically abolished the expert mice's predictive choice behavior and returned them to a novice-like state. Thus, the integrated action-value encoded in the preparatory activity of ALM plays an important role in biasing action toward knowledge-dependent, predictive choice behavior.
|
19
|
Ho MK, Saxe R, Cushman F. Planning with Theory of Mind. Trends Cogn Sci 2022; 26:959-971. [PMID: 36089494 DOI: 10.1016/j.tics.2022.08.003] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 08/27/2021] [Revised: 08/08/2022] [Accepted: 08/09/2022] [Indexed: 01/12/2023]
Abstract
Understanding Theory of Mind should begin with an analysis of the problems it solves. The traditional answer is that Theory of Mind is used for predicting others' thoughts and actions. However, the same Theory of Mind is also used for planning to change others' thoughts and actions. Planning requires that Theory of Mind consists of abstract structured causal representations and supports efficient search and selection from innumerable possible actions. Theory of Mind contrasts with less cognitively demanding alternatives: statistical predictive models of other people's actions, or model-free reinforcement of actions by their effects on other people. Theory of Mind is likely used to plan novel interventions and predict their effects, for example, in pedagogy, emotion regulation, and impression management.
Affiliation(s)
- Mark K Ho
- Department of Computer Science, Princeton University, Princeton, NJ, USA; Department of Psychology, Princeton University, Princeton, NJ, USA.
| | - Rebecca Saxe
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
| | - Fiery Cushman
- Department of Psychology, Harvard University, Cambridge, MA, USA
| |
|
20
|
Polti I, Nau M, Kaplan R, van Wassenhove V, Doeller CF. Rapid encoding of task regularities in the human hippocampus guides sensorimotor timing. eLife 2022; 11:e79027. [PMID: 36317500 PMCID: PMC9625083 DOI: 10.7554/elife.79027] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/28/2022] [Accepted: 10/02/2022] [Indexed: 11/17/2022] Open
Abstract
The brain encodes the statistical regularities of the environment in a task-specific yet flexible and generalizable format. Here, we seek to understand this process by bridging two parallel lines of research, one centered on sensorimotor timing, and the other on cognitive mapping in the hippocampal system. By combining functional magnetic resonance imaging (fMRI) with a fast-paced time-to-contact (TTC) estimation task, we found that the hippocampus signaled behavioral feedback received in each trial as well as performance improvements across trials along with reward-processing regions. Critically, it signaled performance improvements independent from the tested intervals, and its activity accounted for the trial-wise regression-to-the-mean biases in TTC estimation. This is in line with the idea that the hippocampus supports the rapid encoding of temporal context even on short time scales in a behavior-dependent manner. Our results emphasize the central role of the hippocampus in statistical learning and position it at the core of a brain-wide network updating sensorimotor representations in real time for flexible behavior.
Affiliation(s)
- Ignacio Polti
- Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer’s Disease, Norwegian University of Science and Technology, Trondheim, Norway
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Matthias Nau
- Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer’s Disease, Norwegian University of Science and Technology, Trondheim, Norway
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Raphael Kaplan
- Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer’s Disease, Norwegian University of Science and Technology, Trondheim, Norway
- Department of Basic Psychology, Clinical Psychology, and Psychobiology, Universitat Jaume I, Castellón de la Plana, Spain
| | - Virginie van Wassenhove
- CEA DRF/Joliot, NeuroSpin; INSERM, Cognitive Neuroimaging Unit; CNRS, Université Paris-Saclay, Gif-Sur-Yvette, France
| | - Christian F Doeller
- Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer’s Disease, Norwegian University of Science and Technology, Trondheim, Norway
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Wilhelm Wundt Institute of Psychology, Leipzig University, Leipzig, Germany
| |
|
21
|
Farisco M, Pennartz C, Annen J, Cecconi B, Evers K. Indicators and criteria of consciousness: ethical implications for the care of behaviourally unresponsive patients. BMC Med Ethics 2022; 23:30. [PMID: 35313885 PMCID: PMC8935680 DOI: 10.1186/s12910-022-00770-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/06/2020] [Accepted: 03/13/2022] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Assessing consciousness in other subjects, particularly in non-verbal and behaviourally disabled subjects (e.g., patients with disorders of consciousness), is notoriously challenging but increasingly urgent. The high rate of misdiagnosis among disorders of consciousness raises the need for new perspectives in order to inspire new technical and clinical approaches. MAIN BODY We take as a starting point a recently introduced list of operational indicators of consciousness that facilitates its recognition in challenging cases like non-human animals and Artificial Intelligence to explore their relevance to disorders of consciousness and their potential ethical impact on the diagnosis and healthcare of relevant patients. Indicators of consciousness mean particular capacities that can be deduced from observing the behaviour or cognitive performance of the subject in question (or from neural correlates of such performance) and that do not define a hard threshold in deciding about the presence of consciousness, but can be used to infer a graded measure based on the consistency amongst the different indicators. The indicators of consciousness under consideration offer a potential useful strategy for identifying and assessing residual consciousness in patients with disorders of consciousness, setting the theoretical stage for an operationalization and quantification of relevant brain activity. CONCLUSIONS Our heuristic analysis supports the conclusion that the application of the identified indicators of consciousness to its disorders will likely inspire new strategies for assessing three very urgent issues: the misdiagnosis of disorders of consciousness; the need for a gold standard in detecting consciousness and diagnosing its disorders; and the need for a refined taxonomy of disorders of consciousness.
Affiliation(s)
- Michele Farisco
- Centre for Research Ethics and Bioethics, Uppsala University, Uppsala, Sweden.
- Science and Society Unit, Biogem, Biology and Molecular Genetics Research Institute, Ariano Irpino, AV, Italy.
| | - Cyriel Pennartz
- Department of Cognitive and Systems Neuroscience, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
- Research Priority Area, Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
| | - Jitka Annen
- Coma Science Group, GIGA-Consciousness, University of Liege, Liege, Belgium
- Centre du Cerveau, University Hospital of Liege, Liege, Belgium
| | - Benedetta Cecconi
- Coma Science Group, GIGA-Consciousness, University of Liege, Liege, Belgium
- Centre du Cerveau, University Hospital of Liege, Liege, Belgium
| | - Kathinka Evers
- Centre for Research Ethics and Bioethics, Uppsala University, Uppsala, Sweden
| |
|
22
|
Abstract
In human neuroscience, studies of cognition are rarely grounded in non-task-evoked, 'spontaneous' neural activity. Indeed, studies of spontaneous activity tend to focus predominantly on intrinsic neural patterns (for example, resting-state networks). Taking a 'representation-rich' approach bridges the gap between cognition and resting-state communities: this approach relies on decoding task-related representations from spontaneous neural activity, allowing quantification of the representational content and rich dynamics of such activity. For example, if we know the neural representation of an episodic memory, we can decode its subsequent replay during rest. We argue that such an approach advances cognitive research beyond a focus on immediate task demand and provides insight into the functional relevance of the intrinsic neural pattern (for example, the default mode network). This in turn enables a greater integration between human and animal neuroscience, facilitating experimental testing of theoretical accounts of intrinsic activity, and opening new avenues of research in psychiatry.
|
23
|
Sharp PB, Russek EM, Huys QJM, Dolan RJ, Eldar E. Humans perseverate on punishment avoidance goals in multigoal reinforcement learning. eLife 2022; 11:e74402. [PMID: 35199640 PMCID: PMC8912924 DOI: 10.7554/elife.74402] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/03/2021] [Accepted: 02/21/2022] [Indexed: 11/20/2022] Open
Abstract
Managing multiple goals is essential to adaptation, yet we are only beginning to understand computations by which we navigate the resource demands entailed in so doing. Here, we sought to elucidate how humans balance reward seeking and punishment avoidance goals, and relate this to variation in its expression within anxious individuals. To do so, we developed a novel multigoal pursuit task that includes trial-specific instructed goals to either pursue reward (without risk of punishment) or avoid punishment (without the opportunity for reward). We constructed a computational model of multigoal pursuit to quantify the degree to which participants could disengage from the pursuit goals when instructed to, as well as devote less model-based resources toward goals that were less abundant. In general, participants (n = 192) were less flexible in avoiding punishment than in pursuing reward. Thus, when instructed to pursue reward, participants often persisted in avoiding features that had previously been associated with punishment, even though at decision time these features were unambiguously benign. In a similar vein, participants showed no significant downregulation of avoidance when punishment avoidance goals were less abundant in the task. Importantly, we show preliminary evidence that individuals with chronic worry may have difficulty disengaging from punishment avoidance when instructed to seek reward. Taken together, the findings demonstrate that people avoid punishment less flexibly than they pursue reward. Future studies should test in larger samples whether a difficulty to disengage from punishment avoidance contributes to chronic worry.
Affiliation(s)
- Paul B Sharp
- The Hebrew University of Jerusalem, Jerusalem, Israel
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
| | - Evan M Russek
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
| | - Quentin JM Huys
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Division of Psychiatry, University College London, London, United Kingdom
| | - Raymond J Dolan
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
| | - Eran Eldar
- The Hebrew University of Jerusalem, Jerusalem, Israel
| |
|
24
|
Abstract
Recent breakthroughs in artificial intelligence (AI) have enabled machines to plan in tasks previously thought to be uniquely human. Meanwhile, the planning algorithms implemented by the brain itself remain largely unknown. Here, we review neural and behavioral data in sequential decision-making tasks that elucidate the ways in which the brain does (and does not) plan. To systematically review available biological data, we create a taxonomy of planning algorithms by summarizing the relevant design choices for such algorithms in AI. Across species, recording techniques, and task paradigms, we find converging evidence that the brain represents future states consistent with a class of planning algorithms within our taxonomy: focused, depth-limited, and serial. However, we argue that current data are insufficient for addressing more detailed algorithmic questions. We propose a new approach leveraging AI advances to drive experiments that can adjudicate between competing candidate algorithms.
|
25
|
Brunec IK, Momennejad I. Predictive Representations in Hippocampal and Prefrontal Hierarchies. J Neurosci 2022; 42:299-312. [PMID: 34799416 PMCID: PMC8802932 DOI: 10.1523/jneurosci.1327-21.2021] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 06/25/2021] [Revised: 10/19/2021] [Accepted: 10/22/2021] [Indexed: 11/21/2022] Open
Abstract
As we navigate the world, we use learned representations of relational structures to explore and to reach goals. Studies of how relational knowledge enables inference and planning are typically conducted in controlled small-scale settings. It remains unclear, however, how people use stored knowledge in continuously unfolding navigation (e.g., walking long distances in a city). We hypothesized that multiscale predictive representations guide naturalistic navigation in humans, and these scales are organized along posterior-anterior prefrontal and hippocampal hierarchies. We conducted model-based representational similarity analyses of neuroimaging data collected while male and female participants navigated realistically long paths in virtual reality. We tested the pattern similarity of each point, along each path, to a weighted sum of its successor points within predictive horizons of different scales. We found that anterior PFC showed the largest predictive horizons, posterior hippocampus the smallest, with the anterior hippocampus and orbitofrontal regions in between. Our findings offer novel insights into how cognitive maps support hierarchical planning at multiple scales. SIGNIFICANCE STATEMENT Whenever we navigate the world, we represent our journey at multiple horizons: from our immediate surroundings to our distal goal. How are such cognitive maps at different horizons simultaneously represented in the brain? Here, we applied a reinforcement learning-based analysis to neuroimaging data acquired while participants virtually navigated their hometown. We investigated neural patterns in the hippocampus and PFC, key cognitive map regions. We uncovered predictive representations with multiscale horizons in prefrontal and hippocampal gradients, with the longest predictive horizons in anterior PFC and the shortest in posterior hippocampus. These findings provide empirical support for the computational hypothesis that multiscale neural representations guide goal-directed navigation. This advances our understanding of hierarchical planning in everyday navigation of realistic distances.
Affiliation(s)
- Iva K Brunec
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania 19104
| | | |
|
26
|
Haynos AF, Widge AS, Anderson LM, Redish AD. Beyond Description and Deficits: How Computational Psychiatry Can Enhance an Understanding of Decision-Making in Anorexia Nervosa. Curr Psychiatry Rep 2022; 24:77-87. [PMID: 35076888 PMCID: PMC8934594 DOI: 10.1007/s11920-022-01320-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Accepted: 12/13/2021] [Indexed: 01/27/2023]
Abstract
PURPOSE OF REVIEW Despite decades of research, knowledge of the mechanisms maintaining anorexia nervosa (AN) remains incomplete and clearly effective treatments elusive. Novel theoretical frameworks are needed to advance mechanistic and treatment research for this disorder. Here, we argue the utility of engaging a novel lens that differs from existing perspectives in psychiatry. Specifically, we argue the necessity of expanding beyond two historically common perspectives: (1) the descriptive perspective: the tendency to define mechanisms on the basis of surface characteristics and (2) the deficit perspective: the tendency to search for mechanisms associated with under-functioning of decision-making abilities and related circuitry, rather than problems of over-functioning, in psychiatric disorders. RECENT FINDINGS Computational psychiatry can provide a novel framework for understanding AN because this approach emphasizes the role of computational misalignments (rather than absolute deficits or excesses) between decision-making strategies and environmental demands as the key factors promoting psychiatric illnesses. Informed by this approach, we argue that AN can be understood as a disorder of excess goal pursuit, maintained by over-engagement, rather than disengagement, of executive functioning strategies and circuits. Emerging evidence suggests that this same computational imbalance may constitute an under-investigated phenotype presenting transdiagnostically across psychiatric disorders. A variety of computational models can be used to further elucidate excess goal pursuit in AN. Most traditional psychiatric treatments do not target excess goal pursuit or associated neurocognitive mechanisms. Thus, targeting at the level of computational dysfunction may provide a new avenue for enhancing treatment for AN and related disorders.
Affiliation(s)
- Ann F. Haynos
- Department of Psychiatry and Behavioral Sciences, University of Minnesota, 2450 Riverside Ave, Minneapolis, MN F 253, USA
| | - Alik S. Widge
- Department of Psychiatry and Behavioral Sciences, University of Minnesota, 2450 Riverside Ave, Minneapolis, MN F 253, USA
| | - Lisa M. Anderson
- Department of Psychiatry and Behavioral Sciences, University of Minnesota, 2450 Riverside Ave, Minneapolis, MN F 253, USA
| | - A. David Redish
- Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall 321 Church St. SE, Minneapolis, MN 55455, USA
| |
|
27
|
Deserno L, Moran R, Michely J, Lee Y, Dayan P, Dolan RJ. Dopamine enhances model-free credit assignment through boosting of retrospective model-based inference. eLife 2021; 10:e67778. [PMID: 34882092 PMCID: PMC8758138 DOI: 10.7554/elife.67778] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/22/2021] [Accepted: 12/08/2021] [Indexed: 11/13/2022] Open
Abstract
Dopamine is implicated in representing model-free (MF) reward prediction errors as well as influencing model-based (MB) credit assignment and choice. Putative cooperative interactions between MB and MF systems include a guidance of MF credit assignment by MB inference. Here, we used a double-blind, placebo-controlled, within-subjects design to test a hypothesis that enhancing dopamine levels boosts the guidance of MF credit assignment by MB inference. In line with this, we found that levodopa enhanced guidance of MF credit assignment by MB inference, without impacting MF and MB influences directly. This drug effect correlated negatively with a dopamine-dependent change in purely MB credit assignment, possibly reflecting a trade-off between these two MB components of behavioural control. Our findings of a dopamine boost in MB inference guidance of MF learning highlight a novel DA influence on MB-MF cooperative interactions.
Affiliation(s)
- Lorenz Deserno
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
  - Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, University of Würzburg, Würzburg, Germany
  - Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany
- Rani Moran
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
- Jochen Michely
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
  - Department of Psychiatry and Psychotherapy, Charité Universitätsmedizin Berlin, Berlin, Germany
- Ying Lee
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
  - Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany
- Peter Dayan
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany
  - University of Tübingen, Tübingen, Germany
- Raymond J Dolan
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom

28
Piray P, Daw ND. Linear reinforcement learning in planning, grid fields, and cognitive control. Nat Commun 2021; 12:4942. [PMID: 34400622 PMCID: PMC8368103 DOI: 10.1038/s41467-021-25123-3] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/11/2020] [Accepted: 07/19/2021] [Indexed: 12/02/2022] Open
Abstract
It is thought that the brain’s judicious reuse of previous computation underlies our ability to plan flexibly, but also that inappropriate reuse gives rise to inflexibilities like habits and compulsion. Yet we lack a complete, realistic account of either. Building on control engineering, here we introduce a model for decision making in the brain that reuses a temporally abstracted map of future events to enable biologically-realistic, flexible choice at the expense of specific, quantifiable biases. It replaces the classic nonlinear, model-based optimization with a linear approximation that softly maximizes around (and is weakly biased toward) a default policy. This solution demonstrates connections between seemingly disparate phenomena across behavioral neuroscience, notably flexible replanning with biases and cognitive control. It also provides insight into how the brain can represent maps of long-distance contingencies stably and componentially, as in entorhinal response fields, and exploit them to guide choice even under changing goals. Models of decision making have so far been unable to account for how humans’ choices can be flexible yet efficient. Here the authors present a linear reinforcement learning model which explains both flexibility, and rare limitations such as habits, as arising from efficient approximate computation.
Affiliation(s)
- Payam Piray
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Nathaniel D Daw
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA

29
Diekhof EK, Geana A, Ohm F, Doll BB, Frank MJ. The Straw That Broke the Camel's Back: Natural Variations in 17β-Estradiol and COMT-Val158Met Genotype Interact in the Modulation of Model-Free and Model-Based Control. Front Behav Neurosci 2021; 15:658769. [PMID: 34305543 PMCID: PMC8297616 DOI: 10.3389/fnbeh.2021.658769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2021] [Accepted: 06/08/2021] [Indexed: 12/02/2022] Open
Abstract
The sex hormone estradiol has recently gained attention in human decision-making research. Animal studies have already shown that estradiol promotes dopaminergic transmission and thus supports reward-seeking behavior and aspects of addiction. In humans, natural variations of estradiol across the menstrual cycle modulate the ability to learn from direct performance feedback ("model-free" learning). However, it remains unclear whether estradiol also influences more complex "model-based" contributions to reinforcement learning. Here, 41 women were tested twice - in the low and high estradiol state of the follicular phase of their menstrual cycle - with a Two-Step decision task designed to separate model-free from model-based learning. The results showed that in the high estradiol state women relied more heavily on model-free learning, and accomplished reduced performance gains, particularly during the more volatile periods of the task that demanded increased learning effort. In contrast, model-based control remained unaltered by the influence of hormonal state across the group. Yet, when accounting for individual differences in the genetic proxy of the COMT-Val158Met polymorphism (rs4680), we observed that only the participants homozygote for the methionine allele (n = 12; with putatively higher prefrontal dopamine) experienced a decline in model-based control when facing volatile reward probabilities. This group also showed the increase in suboptimal model-free control, while the carriers of the valine allele remained unaffected by the rise in endogenous estradiol. Taken together, these preliminary findings suggest that endogenous estradiol may affect the balance between model-based and model-free control, and particularly so in women with a high prefrontal baseline dopamine capacity and in situations of increased environmental volatility.
Affiliation(s)
- Esther K. Diekhof
  - Neuroendocrinology and Human Biology Unit, Department of Biology, Faculty of Mathematics, Informatics and Natural Sciences, Institute of Zoology, Universität Hamburg, Hamburg, Germany
- Andra Geana
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, United States
  - Carney Institute for Brain Science, Brown University, Providence, RI, United States
- Frederike Ohm
  - Neuroendocrinology and Human Biology Unit, Department of Biology, Faculty of Mathematics, Informatics and Natural Sciences, Institute of Zoology, Universität Hamburg, Hamburg, Germany
- Bradley B. Doll
  - New York University, New York, NY, United States
  - Columbia University, New York, NY, United States
- Michael J. Frank
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, United States
  - Carney Institute for Brain Science, Brown University, Providence, RI, United States

30
Abstract
Experiments have implicated dopamine in model-based reinforcement learning (RL). These findings are unexpected as dopamine is thought to encode a reward prediction error (RPE), which is the key teaching signal in model-free RL. Here we examine two possible accounts for dopamine's involvement in model-based RL: the first that dopamine neurons carry a prediction error used to update a type of predictive state representation called a successor representation, the second that two well established aspects of dopaminergic activity, RPEs and surprise signals, can together explain dopamine's involvement in model-based RL.

31
Castegnetti G, Zurita M, De Martino B. How usefulness shapes neural representations during goal-directed behavior. SCIENCE ADVANCES 2021; 7:eabd5363. [PMID: 33827810 PMCID: PMC8026134 DOI: 10.1126/sciadv.abd5363] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 07/01/2020] [Accepted: 02/18/2021] [Indexed: 05/13/2023]
Abstract
Value is often associated with reward, emphasizing its hedonic aspects. However, when circumstances change, value must also change (a compass outvalues gold, if you are lost). How are value representations in the brain reshaped under different behavioral goals? To answer this question, we devised a new task that decouples usefulness from its hedonic attributes, allowing us to study flexible goal-dependent mapping. Here, we show that, unlike sensory cortices, regions in the prefrontal cortex (PFC), usually associated with value computation, remap their representation of perceptually identical items according to how useful the item has been to achieve a specific goal. Furthermore, we identify a coding scheme in the PFC that represents value regardless of the goal, thus supporting generalization across contexts. Our work questions the dominant view that equates value with reward, showing how a change in goals triggers a reorganization of the neural representation of value, enabling flexible behavior.
Affiliation(s)
- G Castegnetti
  - Institute of Cognitive Neuroscience, University College London, London, UK
- M Zurita
  - Institute of Cognitive Neuroscience, University College London, London, UK
- B De Martino
  - Institute of Cognitive Neuroscience, University College London, London, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London, UK

32
Reiter AMF, Atiya NAA, Berwian IM, Huys QJM. Neuro-cognitive processes as mediators of psychological treatment effects. Curr Opin Behav Sci 2021. [DOI: 10.1016/j.cobeha.2021.02.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/11/2022]

33
Tang W, Shin JD, Jadhav SP. Multiple time-scales of decision-making in the hippocampus and prefrontal cortex. eLife 2021; 10:e66227. [PMID: 33683201 PMCID: PMC7993991 DOI: 10.7554/elife.66227] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 01/04/2021] [Accepted: 03/05/2021] [Indexed: 02/07/2023] Open
Abstract
The prefrontal cortex and hippocampus are crucial for memory-guided decision-making. Neural activity in the hippocampus exhibits place-cell sequences at multiple timescales, including slow behavioral sequences (~seconds) and fast theta sequences (~100-200 ms) within theta oscillation cycles. How prefrontal ensembles interact with hippocampal sequences to support decision-making is unclear. Here, we examined simultaneous hippocampal and prefrontal ensemble activity in rats during learning of a spatial working-memory decision task. We found clear theta sequences in prefrontal cortex, nested within its behavioral sequences. In both regions, behavioral sequences maintained representations of current choices during navigation. In contrast, hippocampal theta sequences encoded alternatives for deliberation and were coordinated with prefrontal theta sequences that predicted upcoming choices. During error trials, these representations were preserved to guide ongoing behavior, whereas replay sequences during inter-trial periods were impaired prior to navigation. These results establish cooperative interaction between hippocampal and prefrontal sequences at multiple timescales for memory-guided decision-making.
Affiliation(s)
- Wenbo Tang
  - Graduate Program in Neuroscience, Brandeis University, Waltham, United States
- Justin D Shin
  - Graduate Program in Neuroscience, Brandeis University, Waltham, United States
- Shantanu P Jadhav
  - Graduate Program in Neuroscience, Brandeis University, Waltham, United States
  - Neuroscience Program, Department of Psychology, and Volen National Center for Complex Systems, Brandeis University, Waltham, United States

34
Herd S, Krueger K, Nair A, Mollick J, O'Reilly R. Neural Mechanisms of Human Decision-Making. COGNITIVE, AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2021; 21:35-57. [PMID: 33409958 DOI: 10.3758/s13415-020-00842-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/28/2020] [Indexed: 11/08/2022]
Abstract
We present a theory and neural network model of the neural mechanisms underlying human decision-making. We propose a detailed model of the interaction between brain regions, under a proposer-predictor-actor-critic framework. This theory is based on detailed animal data and theories of action-selection. Those theories are adapted to serial operation to bridge levels of analysis and explain human decision-making. Task-relevant areas of cortex propose a candidate plan using fast, model-free, parallel neural computations. Other areas of cortex and medial temporal lobe can then predict likely outcomes of that plan in this situation. This optional prediction- (or model-) based computation can produce better accuracy and generalization, at the expense of speed. Next, linked regions of basal ganglia act to accept or reject the proposed plan based on its reward history in similar contexts. If that plan is rejected, the process repeats to consider a new option. The reward-prediction system acts as a critic to determine the value of the outcome relative to expectations and produce dopamine as a training signal for cortex and basal ganglia. By operating sequentially and hierarchically, the same mechanisms previously proposed for animal action-selection could explain the most complex human plans and decisions. We discuss explanations of model-based decisions, habitization, and risky behavior based on the computational model.
Affiliation(s)
- Seth Herd
  - eCortex, Inc., Boulder, CO, USA
  - University of Colorado, Boulder, CO, USA
- Kai Krueger
  - eCortex, Inc., Boulder, CO, USA
  - University of Colorado, Boulder, CO, USA
- Ananta Nair
  - eCortex, Inc., Boulder, CO, USA
  - University of Colorado, Boulder, CO, USA
- Jessica Mollick
  - eCortex, Inc., Boulder, CO, USA
  - University of Colorado, Boulder, CO, USA
  - Yale University, New Haven, CT, USA
- Randall O'Reilly
  - eCortex, Inc., Boulder, CO, USA
  - University of Colorado, Boulder, CO, USA
  - University of California, Davis, Davis, CA, USA

35
Miletić S, Boag RJ, Trutti AC, Stevenson N, Forstmann BU, Heathcote A. A new model of decision processing in instrumental learning tasks. eLife 2021; 10:e63055. [PMID: 33501916 PMCID: PMC7880686 DOI: 10.7554/elife.63055] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 09/15/2020] [Accepted: 01/26/2021] [Indexed: 01/12/2023] Open
Abstract
Learning and decision-making are interactive processes, yet cognitive modeling of error-driven learning and decision-making have largely evolved separately. Recently, evidence accumulation models (EAMs) of decision-making and reinforcement learning (RL) models of error-driven learning have been combined into joint RL-EAMs that can in principle address these interactions. However, we show that the most commonly used combination, based on the diffusion decision model (DDM) for binary choice, consistently fails to capture crucial aspects of response times observed during reinforcement learning. We propose a new RL-EAM based on an advantage racing diffusion (ARD) framework for choices among two or more options that not only addresses this problem but captures stimulus difficulty, speed-accuracy trade-off, and stimulus-response-mapping reversal effects. The RL-ARD avoids fundamental limitations imposed by the DDM on addressing effects of absolute values of choices, as well as extensions beyond binary choice, and provides a computationally tractable basis for wider applications.
Affiliation(s)
- Steven Miletić
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
- Russell J Boag
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
- Anne C Trutti
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
  - Leiden University, Department of Psychology, Leiden, Netherlands
- Niek Stevenson
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
- Birte U Forstmann
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
- Andrew Heathcote
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
  - University of Newcastle, School of Psychology, Newcastle, Australia

36
Huys QJM, Browning M, Paulus MP, Frank MJ. Advances in the computational understanding of mental illness. Neuropsychopharmacology 2021; 46:3-19. [PMID: 32620005 PMCID: PMC7688938 DOI: 10.1038/s41386-020-0746-4] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Received: 05/04/2020] [Revised: 06/11/2020] [Accepted: 06/15/2020] [Indexed: 12/11/2022]
Abstract
Computational psychiatry is a rapidly growing field attempting to translate advances in computational neuroscience and machine learning into improved outcomes for patients suffering from mental illness. It encompasses both data-driven and theory-driven efforts. Here, recent advances in theory-driven work are reviewed. We argue that the brain is a computational organ. As such, an understanding of the illnesses arising from it will require a computational framework. The review divides work up into three theoretical approaches that have deep mathematical connections: dynamical systems, Bayesian inference and reinforcement learning. We discuss both general and specific challenges for the field, and suggest ways forward.
Affiliation(s)
- Quentin J M Huys
  - Division of Psychiatry and Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
  - Camden and Islington NHS Trust, London, UK
- Michael Browning
  - Computational Psychiatry Lab, Department of Psychiatry, University of Oxford, Oxford, UK
  - Oxford Health NHS Trust, Oxford, UK
- Martin P Paulus
  - Laureate Institute For Brain Research (LIBR), Tulsa, OK, USA
- Michael J Frank
  - Cognitive, Linguistic & Psychological Sciences, Neuroscience Graduate Program, Brown University, Providence, RI, USA
  - Carney Center for Computational Brain Science, Carney Institute for Brain Science Psychiatry and Human Behavior, Brown University, Providence, RI, USA

37
Fine JM, Zarr N, Brown JW. Computational Neural Mechanisms of Goal-Directed Planning and Problem Solving. Computational Brain & Behavior 2020. [DOI: 10.1007/s42113-020-00095-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/27/2023]

38
Mollick JA, Hazy TE, Krueger KA, Nair A, Mackie P, Herd SA, O'Reilly RC. A systems-neuroscience model of phasic dopamine. Psychol Rev 2020; 127:972-1021. [PMID: 32525345 PMCID: PMC8453660 DOI: 10.1037/rev0000199] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/01/2023]
Abstract
We describe a neurobiologically informed computational model of phasic dopamine signaling to account for a wide range of findings, including many considered inconsistent with the simple reward prediction error (RPE) formalism. The central feature of this PVLV framework is a distinction between a primary value (PV) system for anticipating primary rewards (Unconditioned Stimuli [USs]), and a learned value (LV) system for learning about stimuli associated with such rewards (CSs). The LV system represents the amygdala, which drives phasic bursting in midbrain dopamine areas, while the PV system represents the ventral striatum, which drives shunting inhibition of dopamine for expected USs (via direct inhibitory projections) and phasic pausing for expected USs (via the lateral habenula). Our model accounts for data supporting the separability of these systems, including individual differences in CS-based (sign-tracking) versus US-based learning (goal-tracking). Both systems use competing opponent-processing pathways representing evidence for and against specific USs, which can explain data dissociating the processes involved in acquisition versus extinction conditioning. Further, opponent processing proved critical in accounting for the full range of conditioned inhibition phenomena, and the closely related paradigm of second-order conditioning. Finally, we show how additional separable pathways representing aversive USs, largely mirroring those for appetitive USs, also have important differences from the positive valence case, allowing the model to account for several important phenomena in aversive conditioning. Overall, accounting for all of these phenomena strongly constrains the model, thus providing a well-validated framework for understanding phasic dopamine signaling. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
Affiliation(s)
- Jessica A Mollick
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Thomas E Hazy
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Kai A Krueger
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Ananta Nair
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Prescott Mackie
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Seth A Herd
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Randall C O'Reilly
  - Department of Psychology and Neuroscience, University of Colorado Boulder

39
Grogan JP, Sandhu TR, Hu MT, Manohar SG. Dopamine promotes instrumental motivation, but reduces reward-related vigour. eLife 2020; 9:e58321. [PMID: 33001026 PMCID: PMC7599069 DOI: 10.7554/elife.58321] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 04/27/2020] [Accepted: 09/30/2020] [Indexed: 01/07/2023] Open
Abstract
We can be motivated when reward depends on performance, or merely by the prospect of a guaranteed reward. Performance-dependent (contingent) reward is instrumental, relying on an internal action-outcome model, whereas motivation by guaranteed reward may minimise opportunity cost in reward-rich environments. Competing theories propose that each type of motivation should be dependent on dopaminergic activity. We contrasted these two types of motivation with a rewarded saccade task, in patients with Parkinson’s disease (PD). When PD patients were ON dopamine, they had greater response vigour (peak saccadic velocity residuals) for contingent rewards, whereas when PD patients were OFF medication, they had greater vigour for guaranteed rewards. These results support the view that reward expectation and contingency drive distinct motivational processes, and can be dissociated by manipulating dopaminergic activity. We posit that dopamine promotes goal-directed motivation, but dampens reward-driven vigour, contradictory to the prediction that increased tonic dopamine amplifies reward expectation.
Affiliation(s)
- John P Grogan
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom
- Timothy R Sandhu
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom
  - Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Michele T Hu
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom
  - Oxford Parkinson's Disease Centre, University of Oxford, Oxford, United Kingdom
- Sanjay G Manohar
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom
  - Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom

40
Steixner-Kumar S, Gläscher J. Strategies for navigating a dynamic world. Science 2020; 369:1056-1057. [DOI: 10.1126/science.abd7258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
A rich neural representation of its environment enables an adaptable organism to respond to changes
Affiliation(s)
- Saurabh Steixner-Kumar
  - Institute for Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Jan Gläscher
  - Institute for Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany

41
Dohmatob E, Dumas G, Bzdok D. Dark control: The default mode network as a reinforcement learning agent. Hum Brain Mapp 2020; 41:3318-3341. [PMID: 32500968 PMCID: PMC7375062 DOI: 10.1002/hbm.25019] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 02/05/2020] [Revised: 03/22/2020] [Accepted: 04/12/2020] [Indexed: 12/11/2022] Open
Abstract
The default mode network (DMN) is believed to subserve the baseline mental activity in humans. Its higher energy consumption compared to other brain networks and its intimate coupling with conscious awareness are both pointing to an unknown overarching function. Many research streams speak in favor of an evolutionarily adaptive role in envisioning experience to anticipate the future. In the present work, we propose a process model that tries to explain how the DMN may implement continuous evaluation and prediction of the environment to guide behavior. The main purpose of DMN activity, we argue, may be described by Markov decision processes that optimize action policies via value estimates through vicarious trial and error. Our formal perspective on DMN function naturally accommodates as special cases previous interpretations based on (a) predictive coding, (b) semantic associations, and (c) a sentinel role. Moreover, this process model for the neural optimization of complex behavior in the DMN offers parsimonious explanations for recent experimental findings in animals and humans.
Affiliation(s)
- Elvis Dohmatob
  - Criteo AI Lab, Paris, France
  - INRIA, Parietal Team, Saclay, France
  - Neurospin, CEA, Gif‐sur‐Yvette, France
- Guillaume Dumas
  - Institut Pasteur, Human Genetics and Cognitive Functions Unit, Paris, France
  - CNRS UMR 3571 Genes, Synapses and Cognition, Institut Pasteur, Paris, France
  - University Paris Diderot, Sorbonne Paris Cité, Paris, France
  - Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Paris, France
- Danilo Bzdok
  - Department of Biomedical Engineering, McConnell Brain Imaging Centre, Montreal Neurological Institute, Faculty of Medicine, School of Computer Science, McGill University, Montreal, Canada
  - Mila - Quebec Artificial Intelligence Institute, Montreal, Canada

42
What Are Memories For? The Hippocampus Bridges Past Experience with Future Decisions. Trends Cogn Sci 2020; 24:542-556. [DOI: 10.1016/j.tics.2020.04.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 03/11/2020] [Revised: 04/24/2020] [Accepted: 04/26/2020] [Indexed: 01/07/2023]

43
Combined model-free and model-sensitive reinforcement learning in non-human primates. PLoS Comput Biol 2020; 16:e1007944. [PMID: 32569311 PMCID: PMC7332075 DOI: 10.1371/journal.pcbi.1007944] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/22/2020] [Revised: 07/02/2020] [Accepted: 05/12/2020] [Indexed: 11/25/2022] Open
Abstract
Contemporary reinforcement learning (RL) theory suggests that potential choices can be evaluated by strategies that may or may not be sensitive to the computational structure of tasks. A paradigmatic model-free (MF) strategy simply repeats actions that have been rewarded in the past; by contrast, model-sensitive (MS) strategies exploit richer information associated with knowledge of task dynamics. MF and MS strategies should typically be combined, because they have complementary statistical and computational strengths; however, this tradeoff between MF/MS RL has mostly only been demonstrated in humans, often with only modest numbers of trials. We trained rhesus monkeys to perform a two-stage decision task designed to elicit and discriminate the use of MF and MS methods. A descriptive analysis of choice behaviour revealed directly that the structure of the task (of MS importance) and the reward history (of MF and MS importance) significantly influenced both choice and response vigour. A detailed, trial-by-trial computational analysis confirmed that choices were made according to a combination of strategies, with a dominant influence of a particular form of model sensitivity that persisted over weeks of testing. The residuals from this model necessitated development of a new combined RL model which incorporates a particular credit assignment weighting procedure. Finally, response vigor exhibited a subtly different collection of MF and MS influences. These results provide new illumination onto RL behavioural processes in non-human primates.
We routinely solve planning problems in which present decisions have consequences in the future. These pose complex computational and statistical problems and are addressed by multiple systems in the brain which use different solutions to these problems, and which may compete and cooperate. We trained two rhesus monkeys on a paradigmatic planning task that transparently reveals canonical aspects of different strategies. We performed a detailed behavioral analysis using methods of reinforcement learning on choice and reaction time to reveal conjoint influences and structural interactions of different sources of information. We show the strengths and limitations of these analyses, at the same time as we provide a novel perspective on how different learning systems interact for choice in non-human primates.

44
Rusch T, Steixner-Kumar S, Doshi P, Spezio M, Gläscher J. Theory of mind and decision science: Towards a typology of tasks and computational models. Neuropsychologia 2020; 146:107488. [PMID: 32407906 DOI: 10.1016/j.neuropsychologia.2020.107488] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 09/30/2019] [Revised: 04/27/2020] [Accepted: 05/04/2020] [Indexed: 01/27/2023]
Abstract
The ability to form a Theory of Mind (ToM), i.e., to theorize about others' mental states to explain and predict behavior in relation to attributed intentional states, constitutes a hallmark of human cognition. These abilities are multi-faceted and include a variety of different cognitive sub-functions. Here, we focus on decision processes in social contexts and review a number of experimental and computational modeling approaches in this field. We provide an overview of experimental accounts and formal computational models with respect to two dimensions: interactivity and uncertainty. Thereby, we aim at capturing the nuances of ToM functions in the context of social decision processes. We suggest there to be an increase in ToM engagement and multiplexing as social cognitive decision-making tasks become more interactive and uncertain. We propose that representing others as intentional and goal directed agents who perform consequential actions is elicited only at the edges of these two dimensions. Further, we argue that computational models of valuation and beliefs follow these dimensions to best allow researchers to effectively model sophisticated ToM-processes. Finally, we relate this typology to neuroimaging findings in neurotypical (NT) humans, studies of persons with autism spectrum (AS), and studies of nonhuman primates.
Affiliation(s)
- Tessa Rusch
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20251, Hamburg, Germany; Division of the Humanities and Social Sciences, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA, 91125, USA.
| | - Saurabh Steixner-Kumar
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20251, Hamburg, Germany
| | - Prashant Doshi
- Department of Computer Science, University of Georgia, 539 Boyd GSRC, Athens, GA, 30602, USA
| | - Michael Spezio
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20251, Hamburg, Germany; Psychology, Neuroscience, and Data Science, Scripps College, 1030 N Columbia Ave, Claremont, CA, 91711, USA.
| | - Jan Gläscher
- Institute of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20251, Hamburg, Germany
| |
|
45
|
Mouse tracking reveals structure knowledge in the absence of model-based choice. Nat Commun 2020; 11:1893. [PMID: 32312966 PMCID: PMC7170897 DOI: 10.1038/s41467-020-15696-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/09/2019] [Accepted: 03/24/2020] [Indexed: 11/28/2022]
Abstract
Converging evidence has demonstrated that humans exhibit two distinct strategies when learning in complex environments. One is model-free learning, i.e., simple reinforcement of rewarded actions, and the other is model-based learning, which considers the structure of the environment. Recent work has argued that people exhibit little model-based behavior unless it leads to higher rewards. Here we use mouse tracking to study model-based learning in stochastic and deterministic (pattern-based) environments of varying difficulty. In both tasks participants’ mouse movements reveal that they learned the structures of their environments, despite the fact that standard behavior-based estimates suggested no such learning in the stochastic task. Thus, we argue that mouse tracking can reveal whether subjects have structure knowledge, which is necessary but not sufficient for model-based choice. Mouse tracking can reveal people’s subjective beliefs and whether they understand the structure of a task. These data demonstrate that people often do not use this information to make good choices.
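The contrast between the two strategies named in this abstract can be made concrete with a minimal sketch. It assumes a hypothetical two-stage task with two first-stage actions and two second-stage states; the transition probabilities, learning rate, and function names are illustrative assumptions, not the task or model used in the cited study.

```python
from typing import List, Tuple

# Assumed transition structure: action 0 usually leads to state 0,
# action 1 usually leads to state 1 (probabilities are illustrative).
TRANSITIONS: List[List[float]] = [[0.7, 0.3], [0.3, 0.7]]


def delta_update(value: float, reward: float, alpha: float = 0.5) -> float:
    """Simple reinforcement: move the estimate toward the observed reward."""
    return value + alpha * (reward - value)


def model_based_values(transitions: List[List[float]],
                       state_values: List[float]) -> List[float]:
    """Value each action as the expected value of the states it leads to."""
    return [sum(p * v for p, v in zip(row, state_values)) for row in transitions]


def one_trial() -> Tuple[List[float], List[float]]:
    q_mf = [0.0, 0.0]   # model-free values of the first-stage actions
    v2 = [0.0, 0.0]     # learned values of the second-stage states
    # Action 0 makes a rare transition to state 1 and is rewarded there.
    action, state, reward = 0, 1, 1.0
    q_mf[action] = delta_update(q_mf[action], reward)  # credit the chosen action
    v2[state] = delta_update(v2[state], reward)        # credit the visited state
    q_mb = model_based_values(TRANSITIONS, v2)         # plan through the structure
    return q_mf, q_mb


if __name__ == "__main__":
    q_mf, q_mb = one_trial()
    print("model-free:", q_mf)
    print("model-based:", q_mb)
```

After a rewarded rare transition, the model-free values favour repeating action 0, whereas the model-based values favour action 1, because action 1 is the more likely route to the rewarded state. Behavioural analyses of two-stage tasks typically rely on this kind of divergence to tell the strategies apart.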
|
46
|
Abstract
The commentaries suggest many important improvements to the target article. They clearly distinguish two varieties of rationalization - the traditional "motivated reasoning" model, and the proposed representational exchange model - and show that they have distinct functions and consequences. They describe how representational exchange occurs not only by post hoc rationalization but also by ex ante rationalization and other more dynamic processes. They argue that the social benefits of representational exchange are at least as important as its direct personal benefits. Finally, they construe our search for meaning, purpose, and narrative - both individually and collectively - as a variety of representational exchange. The result is a theory of rationalization as representational exchange both wider in scope and better defined in mechanism.
|
47
|
Geramita MA, Yttri EA, Ahmari SE. The two‐step task, avoidance, and OCD. J Neurosci Res 2020; 98:1007-1019. [DOI: 10.1002/jnr.24594] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/31/2019] [Revised: 01/02/2020] [Accepted: 01/30/2020] [Indexed: 01/12/2023]
Affiliation(s)
- Matthew A. Geramita
- Department of Psychiatry University of Pittsburgh Pittsburgh PA USA
- Department of Biological Sciences Carnegie Mellon University Pittsburgh PA USA
- Center for Neural Basis of Cognition University of Pittsburgh Pittsburgh PA USA
| | - Eric A. Yttri
- Department of Biological Sciences Carnegie Mellon University Pittsburgh PA USA
- Center for Neural Basis of Cognition University of Pittsburgh Pittsburgh PA USA
| | - Susanne E. Ahmari
- Department of Psychiatry University of Pittsburgh Pittsburgh PA USA
- Center for Neural Basis of Cognition University of Pittsburgh Pittsburgh PA USA
| |
|
48
|
Momennejad I. Learning Structures: Predictive Representations, Replay, and Generalization. Curr Opin Behav Sci 2020; 32:155-166. [DOI: 10.1016/j.cobeha.2020.02.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/26/2022]
|
49
|
Coddington LT, Dudman JT. Learning from Action: Reconsidering Movement Signaling in Midbrain Dopamine Neuron Activity. Neuron 2019; 104:63-77. [PMID: 31600516 DOI: 10.1016/j.neuron.2019.08.036] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Received: 06/11/2019] [Revised: 08/10/2019] [Accepted: 08/22/2019] [Indexed: 01/07/2023]
Abstract
Animals infer when and where a reward is available from experience with informative sensory stimuli and their own actions. In vertebrates, this is thought to depend upon the release of dopamine from midbrain dopaminergic neurons. Studies of the role of dopamine have focused almost exclusively on their encoding of informative sensory stimuli; however, many dopaminergic neurons are active just prior to movement initiation, even in the absence of sensory stimuli. How should current frameworks for understanding the role of dopamine incorporate these observations? To address this question, we review recent anatomical and functional evidence for action-related dopamine signaling. We conclude by proposing a framework in which dopaminergic neurons encode subjective signals of action initiation to solve an internal credit assignment problem.
|
50
|
O’Reilly RC, Nair A, Russin JL, Herd SA. How Sequential Interactive Processing Within Frontostriatal Loops Supports a Continuum of Habitual to Controlled Processing. Front Psychol 2020; 11:380. [PMID: 32210892 PMCID: PMC7076192 DOI: 10.3389/fpsyg.2020.00380] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/26/2019] [Accepted: 02/18/2020] [Indexed: 11/13/2022]
Abstract
We address the distinction between habitual/automatic vs. goal-directed/controlled behavior, from the perspective of a computational model of the frontostriatal loops. The model exhibits a continuum of behavior between these poles, as a function of the interactive dynamics among different functionally-specialized brain areas, operating iteratively over multiple sequential steps, and having multiple nested loops of similar decision making circuits. This framework blurs the lines between these traditional distinctions in many ways. For example, although habitual actions have traditionally been considered purely automatic, the outer loop must first decide to allow such habitual actions to proceed. Furthermore, because the part of the brain that generates proposed action plans is common across habitual and controlled/goal-directed behavior, the key differences are instead in how many iterations of sequential decision-making are taken, and to what extent various forms of predictive (model-based) processes are engaged. At the core of every iterative step in our model, the basal ganglia provides a "model-free" dopamine-trained Go/NoGo evaluation of the entire distributed plan/goal/evaluation/prediction state. This evaluation serves as the fulcrum of serializing otherwise parallel neural processing. Goal-based inputs to the nominally model-free basal ganglia system are among several ways in which the popular model-based vs. model-free framework may not capture the most behaviorally and neurally relevant distinctions in this area.
Affiliation(s)
- Randall C. O’Reilly
- Computational Cognitive Neuroscience Lab, Department of Psychology, Computer Science, and Center for Neuroscience, University of California, Davis, Davis, CA, United States
- eCortex, Inc., Boulder, CO, United States
| | | | - Jacob L. Russin
- Computational Cognitive Neuroscience Lab, Department of Psychology, Computer Science, and Center for Neuroscience, University of California, Davis, Davis, CA, United States
| | - Seth A. Herd
- Computational Cognitive Neuroscience Lab, Department of Psychology, Computer Science, and Center for Neuroscience, University of California, Davis, Davis, CA, United States
- eCortex, Inc., Boulder, CO, United States
| |
|