1
Cowan RL, Davis T, Kundu B, Rahimpour S, Rolston JD, Smith EH. More widespread and rigid neuronal representation of reward expectation underlies impulsive choices. bioRxiv 2024:2024.04.11.588637. [PMID: 38645037 PMCID: PMC11030340 DOI: 10.1101/2024.04.11.588637] [Indexed: 04/23/2024]
Abstract
Impulsive choices prioritize smaller, more immediate rewards over larger, delayed, or potentially uncertain rewards. Impulsive choices are a critical aspect of substance use disorders and maladaptive decision-making across the lifespan. Here, we sought to understand the neuronal underpinnings of expected reward and risk estimation on a trial-by-trial basis during impulsive choices. To do so, we acquired electrical recordings from the human brain while participants carried out a risky decision-making task designed to measure choice impulsivity. Behaviorally, we found a reward-accuracy tradeoff, whereby more impulsive choosers were more accurate at the task, opting for a more immediate reward while compromising overall task performance. We then examined how neuronal populations across frontal, temporal, and limbic brain regions parametrically encoded reinforcement learning model variables, namely reward and risk expectation and surprise, across trials. We found more widespread representations of reward value expectation and prediction error in more impulsive choosers, whereas less impulsive choosers preferentially represented risk expectation. A regional analysis of reward and risk encoding highlighted the anterior cingulate cortex for value expectation, the anterior insula for risk expectation and surprise, and distinct regional encoding between impulsivity groups. Beyond describing trial-by-trial population neuronal representations of reward and risk variables, these results suggest impaired inhibitory control and model-free learning underpinnings of impulsive choice. These findings shed light on neural processes underlying reinforced learning and decision-making in uncertain environments and how these processes may function in psychiatric disorders.
Affiliation(s)
- Rhiannon L Cowan
  - Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA
- Tyler Davis
  - Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA
- Bornali Kundu
  - Department of Neurosurgery, University of Missouri, Columbia, MO 65212, USA
- Shervin Rahimpour
  - Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA
- John D Rolston
  - Department of Neurosurgery, Brigham & Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Elliot H Smith
  - Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA

2
Xia X, Klishin AA, Stiso J, Lynn CW, Kahn AE, Caciagli L, Bassett DS. Human learning of hierarchical graphs. Phys Rev E 2024; 109:044305. [PMID: 38755869 DOI: 10.1103/physreve.109.044305] [Received: 08/28/2023] [Accepted: 02/16/2024] [Indexed: 05/18/2024]
Abstract
Humans are exposed to sequences of events in the environment, and the interevent transition probabilities in these sequences can be modeled as a graph or network. Many real-world networks are organized hierarchically and while much is known about how humans learn basic transition graph topology, whether and to what degree humans can learn hierarchical structures in such graphs remains unknown. We probe the mental estimates of transition probabilities via the surprisal effect phenomenon: humans react more slowly to less expected transitions. Using mean-field predictions and numerical simulations, we show that surprisal effects are stronger for finer-level than coarser-level hierarchical transitions, and that surprisal effects at coarser levels are difficult to detect for limited learning times or in small samples. Using a serial response experiment with human participants (n=100), we replicate our predictions by detecting a surprisal effect at the finer level of the hierarchy but not at the coarser level of the hierarchy. We then evaluate the presence of a trade-off in learning, whereby humans who learned the finer level of the hierarchy better also tended to learn the coarser level worse, and vice versa. This study elucidates the processes by which humans learn sequential events in hierarchical contexts. More broadly, our work charts a road map for future investigation of the neural underpinnings and behavioral manifestations of graph learning.
Affiliation(s)
- Xiaohuan Xia
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Andrei A Klishin
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Jennifer Stiso
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Christopher W Lynn
  - Department of Physics, Quantitative Biology Institute, and Wu Tsai Institute, Yale University, New Haven, Connecticut 06520, USA
  - Joseph Henry Laboratories of Physics, Princeton University, Princeton, New Jersey 08544, USA
  - Initiative for the Theoretical Sciences, Graduate Center, City University of New York, New York, New York 10016, USA
- Ari E Kahn
  - Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey 08544, USA
- Lorenzo Caciagli
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Dani S Bassett
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
  - Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
  - Department of Electrical & Systems Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
  - Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
  - Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
  - Santa Fe Institute, Santa Fe, New Mexico 87501, USA

3
Lussange J, Vrizzi S, Palminteri S, Gutkin B. Mesoscale effects of trader learning behaviors in financial markets: A multi-agent reinforcement learning study. PLoS One 2024; 19:e0301141. [PMID: 38557590 PMCID: PMC10984546 DOI: 10.1371/journal.pone.0301141] [Received: 07/24/2023] [Accepted: 03/08/2024] [Indexed: 04/04/2024] Open
Abstract
Recent advances in the field of machine learning have yielded novel research perspectives in behavioural economics and financial markets microstructure studies. In this paper we study the impact of individual trader learning characteristics on markets using a stock market simulator designed with a multi-agent architecture. Each agent, representing an autonomous investor, trades stocks through reinforcement learning, using a centralized double-auction limit order book. This approach allows us to study the impact of individual trader traits on the whole stock market at the mesoscale in a bottom-up approach. We chose to test three trader trait aspects: agent learning rate increases, herding behaviour and random trading. As hypothesized, we find that larger learning rates significantly increase the number of crashes. We also find that herding behaviour undermines market stability, while random trading tends to preserve it.
Affiliation(s)
- Johann Lussange
  - Laboratoire des Neurosciences Cognitives, Département des Études Cognitives, INSERM U960, Paris, France
- Stefano Vrizzi
  - Laboratoire des Neurosciences Cognitives, Département des Études Cognitives, INSERM U960, Paris, France
- Stefano Palminteri
  - Laboratoire des Neurosciences Cognitives, Département des Études Cognitives, INSERM U960, Paris, France
  - Center for Cognition and Decision Making, Department of Psychology, NU University Higher School of Economics, Moscow, Russia
- Boris Gutkin
  - Laboratoire des Neurosciences Cognitives, Département des Études Cognitives, INSERM U960, Paris, France
  - Center for Cognition and Decision Making, Department of Psychology, NU University Higher School of Economics, Moscow, Russia

4
Sagiv Y, Akam T, Witten IB, Daw ND. Prioritizing replay when future goals are unknown. bioRxiv 2024:2024.02.29.582822. [PMID: 38496674 PMCID: PMC10942393 DOI: 10.1101/2024.02.29.582822] [Indexed: 03/19/2024]
Abstract
Although hippocampal place cells replay nonlocal trajectories, the computational function of these events remains controversial. One hypothesis, formalized in a prominent reinforcement learning account, holds that replay plans routes to current goals. However, recent puzzling data appear to contradict this perspective by showing that replayed destinations lag current goals. These results may support an alternative hypothesis that replay updates route information to build a "cognitive map." Yet no similar theory exists to formalize this view, and it is unclear how such a map is represented or what role replay plays in computing it. We address these gaps by introducing a theory of replay that learns a map of routes to candidate goals, before reward is available or when its location may change. Our work extends the planning account to capture a general map-building function for replay, reconciling it with data, and revealing an unexpected relationship between the seemingly distinct hypotheses.
Affiliation(s)
- Yotam Sagiv
  - Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA
- Thomas Akam
  - Department of Experimental Psychology, Oxford University, Oxford, UK
- Ilana B Witten
  - Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA
- Nathaniel D Daw
  - Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA

5
Colas JT, O’Doherty JP, Grafton ST. Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts. PLoS Comput Biol 2024; 20:e1011950. [PMID: 38552190 PMCID: PMC10980507 DOI: 10.1371/journal.pcbi.1011950] [Received: 09/13/2023] [Accepted: 02/26/2024] [Indexed: 04/01/2024] Open
Abstract
Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants, even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
Affiliation(s)
- Jaron T. Colas
  - Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
  - Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
  - Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- John P. O’Doherty
  - Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
  - Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- Scott T. Grafton
  - Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America

6
Heimer O, Hertz U. The spread of affective and semantic valence representations across states. Cognition 2024; 244:105714. [PMID: 38176154 DOI: 10.1016/j.cognition.2023.105714] [Received: 06/10/2023] [Revised: 12/22/2023] [Accepted: 12/24/2023] [Indexed: 01/06/2024]
Abstract
In many decision problems, outcomes are not reached after a single action but rather after a series of events or states. To optimize decisions over multiple states, representations of how good or bad the outcomes are, that is, the outcomes' valence, should spread across states. One mechanism for valence spreading is a temporal, state-independent process in which a single valence representation is updated when an outcome is experienced and fades away afterwards. Each state's valence is based on its temporal proximity to the experienced outcome. An alternative, state-dependent mechanism relies on the structure of transitions between states, updating a separate valence representation for each state according to its spatial distance from the outcomes. We examined how these mechanistic accounts shape the spread of two formats of valence representation, feelings (affective valence) and knowledge (semantic valence), between states. In two pre-registered experiments (N = 585), we used a novel task in which participants move in a four-state maze, one of which contains an outcome. The participants provide self-reports of affective and semantic valence throughout the maze and after finishing it. Results show that the affective representation of negative valence is more localized in state-space than the semantic representation. We also found evidence for the relative reliance of the affective valence on a temporal, state-independent mechanism and of the semantic valence on a structured, state-dependent mechanism. Our findings provide mechanistic accounts for the differences between affective and semantic valence representations and indicate how such representations may play a role in associative learning and decision-making.
Affiliation(s)
- Orit Heimer
  - Department of Psychology, University of Haifa, Haifa, Israel
- Uri Hertz
  - Department of Cognitive Sciences, University of Haifa, Haifa, Israel

7
Karagoz AB, Moran EK, Barch DM, Kool W, Reagh ZM. Evidence for shallow cognitive maps in schizophrenia. bioRxiv 2024:2024.02.26.582214. [PMID: 38464042 PMCID: PMC10925159 DOI: 10.1101/2024.02.26.582214] [Indexed: 03/12/2024]
Abstract
Individuals with schizophrenia can have marked deficits in goal-directed decision making. Prominent theories differ in whether schizophrenia (SZ) affects the ability to exert cognitive control, or the motivation to exert control. An alternative explanation is that schizophrenia negatively impacts the formation of cognitive maps, the internal representations of the way the world is structured, necessary for the formation of effective action plans. That is, deficits in decision-making could also arise when goal-directed control and motivation are intact, but used to plan over ill-formed maps. Here, we test the hypothesis that individuals with SZ are impaired in the construction of cognitive maps. We combine a behavioral representational similarity analysis technique with a sequential decision-making task. This enables us to examine how relationships between choice options change when individuals with SZ and healthy age-matched controls build a cognitive map of the task structure. Our results indicate that SZ affects how people represent the structure of the task, focusing more on simpler visual features and less on abstract, higher-order, planning-relevant features. At the same time, we find that individuals with SZ performed similarly to controls on this task, emphasizing the need for a distinction between cognitive map formation and changes in goal-directed control in understanding cognitive deficits in schizophrenia.
Affiliation(s)
- Ata B Karagoz
  - Department of Psychological & Brain Sciences, Washington University in St. Louis
- Erin K Moran
  - Department of Psychological & Brain Sciences, Washington University in St. Louis
- Deanna M Barch
  - Department of Psychological & Brain Sciences, Washington University in St. Louis
  - Department of Psychiatry, Washington University School of Medicine
- Wouter Kool
  - Department of Psychological & Brain Sciences, Washington University in St. Louis
- Zachariah M Reagh
  - Department of Psychological & Brain Sciences, Washington University in St. Louis

8
Wientjes S, Holroyd CB. The successor representation subserves hierarchical abstraction for goal-directed behavior. PLoS Comput Biol 2024; 20:e1011312. [PMID: 38377074 PMCID: PMC10906840 DOI: 10.1371/journal.pcbi.1011312] [Received: 06/29/2023] [Revised: 03/01/2024] [Accepted: 02/05/2024] [Indexed: 02/22/2024] Open
Abstract
Humans have the ability to craft abstract, temporally extended and hierarchically organized plans. For instance, when considering how to make spaghetti for dinner, we typically concern ourselves with useful "subgoals" in the task, such as cutting onions, boiling pasta, and cooking a sauce, rather than particulars such as how many cuts to make to the onion, or exactly which muscles to contract. A core question is how such decomposition of a more abstract task into logical subtasks happens in the first place. Previous research has shown that humans are sensitive to a form of higher-order statistical learning named "community structure". Community structure is a common feature of abstract tasks characterized by a logical ordering of subtasks. This structure can be captured by a model where humans learn predictions of upcoming events multiple steps into the future, discounting predictions of events further away in time. One such model is the "successor representation", which has been argued to be useful for hierarchical abstraction. As of yet, no study has convincingly shown that this hierarchical abstraction can be put to use for goal-directed behavior. Here, we investigate whether participants utilize learned community structure to craft hierarchically informed action plans for goal-directed behavior. Participants were asked to search for paintings in a virtual museum, where the paintings were grouped together in "wings" representing community structure in the museum. We find that participants' choices accord with the hierarchical structure of the museum and that their response times are best predicted by a successor representation. The degree to which the response times reflect the community structure of the museum correlates with several measures of performance, including the ability to craft temporally abstract action plans. These results suggest that successor representation learning subserves hierarchical abstractions relevant for goal-directed behavior.
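The abstract above describes a model that predicts upcoming events several steps ahead while discounting more distant ones. A minimal sketch of that idea, assuming a random walk on a hypothetical toy graph with two three-node communities (the graph, the discount value, and the function name are illustrative assumptions, not the authors' task or code), uses the standard closed form of the successor representation, M = (I - γT)^-1:

```python
import numpy as np

def successor_representation(T, gamma):
    """Closed-form SR: M = sum_t gamma^t T^t = (I - gamma * T)^-1."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

# Hypothetical toy graph: communities {0, 1, 2} and {3, 4, 5}, joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

T = A / A.sum(axis=1, keepdims=True)  # random-walk transition matrix (rows sum to 1)
M = successor_representation(T, gamma=0.8)

# Compare discounted expected future visits from node 0 to a node in its own
# community (node 1) and to a node in the other community (node 4).
print(M[0, 1] > M[0, 4])
```

Rows of M for nodes in the same community tend to resemble each other more than rows across communities, which is one way community structure could surface in such a model.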
Affiliation(s)
- Sven Wientjes
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Clay B. Holroyd
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium

9
Schlafly M, Prabhakar A, Popovic K, Schlafly G, Kim C, Murphey TD. Collaborative robots can augment human cognition in regret-sensitive tasks. PNAS Nexus 2024; 3:pgae016. [PMID: 38725525 PMCID: PMC11079486 DOI: 10.1093/pnasnexus/pgae016] [Received: 09/22/2023] [Accepted: 01/02/2024] [Indexed: 05/12/2024]
Abstract
Despite theoretical benefits of collaborative robots, disappointing outcomes are well documented by clinical studies, spanning rehabilitation, prostheses, and surgery. Cognitive load theory provides a possible explanation for why humans in the real world are not realizing the benefits of collaborative robots: high cognitive loads may be impeding human performance. Measuring cognitive availability using an electrocardiogram, we ask 25 participants to complete a virtual-reality task alongside an invisible agent that determines optimal performance by iteratively updating the Bellman equation. Three robots assist by providing environmental information relevant to task performance. By enabling the robots to act more autonomously (managing more of their own behavior with fewer instructions from the human), here we show that robots can augment participants' cognitive availability and decision-making. The way in which robots describe and achieve their objective can improve the human's cognitive ability to reason about the task and contribute to human-robot collaboration outcomes. Augmenting human cognition provides a path to improve the efficacy of collaborative robots. By demonstrating how robots can improve human cognition, this work paves the way for improving the cognitive capabilities of first responders, manufacturing workers, surgeons, and other future users of collaborative autonomy systems.
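The abstract mentions an agent that finds optimal performance by iteratively updating the Bellman equation. The study's actual task and agent are not described here, so the following is only a generic value-iteration sketch on a hypothetical five-state corridor with a terminal goal at the right end; the environment, the reward of 1 for entering the goal, and all names are assumptions made for illustration:

```python
import numpy as np

def value_iteration(n_states=5, gamma=0.9, tol=1e-8):
    """Repeat the Bellman optimality update V(s) = max_a [r + gamma * V(s')] until stable."""
    goal = n_states - 1          # terminal state; its value stays 0
    V = np.zeros(n_states)
    while True:
        V_new = V.copy()
        for s in range(goal):    # update every non-terminal state
            candidates = []
            for step in (-1, 1):  # actions: move left or move right
                s_next = min(max(s + step, 0), goal)
                reward = 1.0 if s_next == goal else 0.0
                candidates.append(reward + gamma * V[s_next])
            V_new[s] = max(candidates)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

V = value_iteration()
print(V)
```

In this toy corridor each state's value is the discounted reward for walking straight to the goal, so values grow as the state gets closer to the goal.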
Affiliation(s)
- Millicent Schlafly
  - Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Ahalya Prabhakar
  - Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Katarina Popovic
  - Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Geneva Schlafly
  - Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Christopher Kim
  - Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Todd D Murphey
  - Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA

10
Li S, Li Z, Liu Q, Ren P, Sun L, Cui Z, Liang X. Predictable navigation through spontaneous brain states with cognitive-map-like representations. Prog Neurobiol 2024; 233:102570. [PMID: 38232783 DOI: 10.1016/j.pneurobio.2024.102570] [Received: 06/25/2023] [Revised: 11/19/2023] [Accepted: 01/10/2024] [Indexed: 01/19/2024]
Abstract
Just as navigating a physical environment, navigating through the landscapes of spontaneous brain states may also require an internal cognitive map. Contemporary computation theories propose modeling a cognitive map from a reinforcement learning perspective and argue that the map would be predictive in nature, representing each state as its upcoming states. Here, we used resting-state fMRI to test the hypothesis that the spaces of spontaneously reoccurring brain states are cognitive map-like, and may exhibit future-oriented predictivity. We identified two discrete brain states of the navigation-related brain networks during rest. By combining pattern similarity and dimensional reduction analysis, we embedded the occurrences of each brain state in a two-dimensional space. Successor representation modeling analysis recognized that these brain state occurrences exhibit place cell-like representations, akin to those observed in a physical space. Moreover, we observed predictive transitions of reoccurring brain states, which strongly covaried with individual cognitive and emotional assessments. Our findings offer a novel perspective on the cognitive significance of spontaneous brain activity and support the theory of cognitive map as a unifying framework for mental navigation.
Affiliation(s)
- Siyang Li
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  - Laboratory for Space Environment and Physical Sciences, Harbin Institute of Technology, Harbin 150001, China
  - Research Center for Human-Machine Augmented Intelligence, Research Institute of Artificial Intelligence, Zhejiang Lab, Hangzhou, Zhejiang 311100, China
- Zhipeng Li
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  - Laboratory for Space Environment and Physical Sciences, Harbin Institute of Technology, Harbin 150001, China
- Qiuyi Liu
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  - Laboratory for Space Environment and Physical Sciences, Harbin Institute of Technology, Harbin 150001, China
- Peng Ren
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Lili Sun
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  - Laboratory for Space Environment and Physical Sciences, Harbin Institute of Technology, Harbin 150001, China
- Zaixu Cui
  - Chinese Institute for Brain Research, Beijing 102206, China
- Xia Liang
  - Laboratory for Space Environment and Physical Sciences, Harbin Institute of Technology, Harbin 150001, China
  - Frontiers Science Center for Matter Behave in Space Environment, Harbin Institute of Technology, Harbin 150001, China

11
Zheng XY, Hebart MN, Grill F, Dolan RJ, Doeller CF, Cools R, Garvert MM. Parallel cognitive maps for multiple knowledge structures in the hippocampal formation. Cereb Cortex 2024; 34:bhad485. [PMID: 38204296 PMCID: PMC10839836 DOI: 10.1093/cercor/bhad485] [Received: 07/28/2023] [Revised: 11/27/2023] [Accepted: 11/30/2023] [Indexed: 01/12/2024] Open
Abstract
The hippocampal-entorhinal system uses cognitive maps to represent spatial knowledge and other types of relational information. However, objects can often be characterized by different types of relations simultaneously. How does the hippocampal formation handle the embedding of stimuli in multiple relational structures that differ vastly in their mode and timescale of acquisition? Does the hippocampal formation integrate different stimulus dimensions into one conjunctive map or is each dimension represented in a parallel map? Here, we reanalyzed human functional magnetic resonance imaging data from Garvert et al. (2017) that had previously revealed a map in the hippocampal formation coding for a newly learnt transition structure. Using functional magnetic resonance imaging adaptation analysis, we found that the degree of representational similarity in the bilateral hippocampus also decreased as a function of the semantic distance between presented objects. Importantly, while both map-like structures localized to the hippocampal formation, the semantic map was located in more posterior regions of the hippocampal formation than the transition structure and thus anatomically distinct. This finding supports the idea that the hippocampal-entorhinal system forms parallel cognitive maps that reflect the embedding of objects in diverse relational structures.
Affiliation(s)
- Xiaochen Y Zheng
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN, Nijmegen, the Netherlands
- Martin N Hebart
  - Max-Planck-Institute for Human Cognitive and Brain Sciences, 04103, Leipzig, Germany
  - Department of Medicine, Justus Liebig University, 35390, Giessen, Germany
- Filip Grill
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN, Nijmegen, the Netherlands
  - Radboud University Medical Center, Department of Neurology, 6525 GA, Nijmegen, the Netherlands
- Raymond J Dolan
  - Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3AR, United Kingdom
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London WC1B 5EH, United Kingdom
- Christian F Doeller
  - Max-Planck-Institute for Human Cognitive and Brain Sciences, 04103, Leipzig, Germany
  - Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer's Disease, NTNU, 7491, Trondheim, Norway
  - Wilhelm Wundt Institute of Psychology, Leipzig University, 04109, Leipzig, Germany
- Roshan Cools
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN, Nijmegen, the Netherlands
  - Radboud University Medical Center, Department of Psychiatry, 6525 GA, Nijmegen, the Netherlands
- Mona M Garvert
  - Max-Planck-Institute for Human Cognitive and Brain Sciences, 04103, Leipzig, Germany
  - Max Planck Research Group NeuroCode, Max Planck Institute for Human Development, 14195, Berlin, Germany
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, Germany
  - Faculty of Human Sciences, Julius-Maximilians-Universität Würzburg, Würzburg, Germany

12
Yang L, Jin F, Yang L, Li J, Li Z, Li M, Shang Z. The Hippocampus in Pigeons Contributes to the Model-Based Valuation and the Relationship between Temporal Context States. Animals (Basel) 2024; 14:431. [PMID: 38338074 PMCID: PMC10854895 DOI: 10.3390/ani14030431] [Received: 12/30/2023] [Revised: 01/25/2024] [Accepted: 01/25/2024] [Indexed: 02/12/2024] Open
Abstract
Model-based decision-making guides organism behavior by the representation of the relationships between different states. Previous studies have shown that the mammalian hippocampus (Hp) plays a key role in learning the structure of relationships among experiences. However, the hippocampal neural mechanisms of birds for model-based learning have rarely been reported. Here, we trained six pigeons to perform a two-step task and explore whether their Hp contributes to model-based learning. Behavioral performance and hippocampal multi-channel local field potentials (LFPs) were recorded during the task. We estimated the subjective values using a reinforcement learning model dynamically fitted to the pigeon's choice of behavior. The results show that the model-based learner can capture the behavioral choices of pigeons well throughout the learning process. Neural analysis indicated that high-frequency (12-100 Hz) power in Hp represented the temporal context states. Moreover, dynamic correlation and decoding results provided further support for the high-frequency dependence of model-based valuations. In addition, we observed a significant increase in hippocampal neural similarity at the low-frequency band (1-12 Hz) for common temporal context states after learning. Overall, our findings suggest that pigeons use model-based inferences to learn multi-step tasks, and multiple LFP frequency bands collaboratively contribute to model-based learning. Specifically, the high-frequency (12-100 Hz) oscillations represent model-based valuations, while the low-frequency (1-12 Hz) neural similarity is influenced by the relationship between temporal context states. These results contribute to our understanding of the neural mechanisms underlying model-based learning and broaden the scope of hippocampal contributions to avian behavior.
Affiliation(s)
- Lifang Yang
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Fuli Jin
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Long Yang
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Jiajia Li
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Zhihui Li
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
  - Institute of Medical Engineering Technology and Data Mining, Zhengzhou University, Zhengzhou 450001, China
- Mengmeng Li
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Zhigang Shang
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
  - Institute of Medical Engineering Technology and Data Mining, Zhengzhou University, Zhengzhou 450001, China
13
Chan HK, Toyoizumi T. A multi-stage anticipated surprise model with dynamic expectation for economic decision-making. Sci Rep 2024; 14:657. [PMID: 38182692 PMCID: PMC10770108 DOI: 10.1038/s41598-023-50529-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 12/20/2023] [Indexed: 01/07/2024] Open
Abstract
There are many modeling works that aim to explain people's behaviors that violate classical economic theories. However, these models often do not take into full account the multi-stage nature of real-life problems and people's tendency in solving complicated problems sequentially. In this work, we propose a descriptive decision-making model for multi-stage problems with perceived post-decision information. In the model, decisions are chosen based on an entity which we call the 'anticipated surprise'. The reference point is determined by the expected value of the possible outcomes, which we assume to be dynamically changing during the mental simulation of a sequence of events. We illustrate how our formalism can help us understand prominent economic paradoxes and gambling behaviors that involve multi-stage or sequential planning. We also discuss how neuroscience findings, like prediction error signals and introspective neuronal replay, as well as psychological theories like affective forecasting, are related to the features in our model. This provides hints for future experiments to investigate the role of these entities in decision-making.
Affiliation(s)
- Ho Ka Chan
  - Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Wako, Japan.
- Taro Toyoizumi
  - Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Wako, Japan.
  - Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan.
14
Son JY, Bhandari A, FeldmanHall O. Abstract cognitive maps of social network structure aid adaptive inference. Proc Natl Acad Sci U S A 2023; 120:e2310801120. [PMID: 37963254 PMCID: PMC10666027 DOI: 10.1073/pnas.2310801120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 10/12/2023] [Indexed: 11/16/2023] Open
Abstract
Social navigation (such as anticipating where gossip may spread, or identifying which acquaintances can help land a job) relies on knowing how people are connected within their larger social communities. Problematically, for most social networks, the space of possible relationships is too vast to observe and memorize. Indeed, people's knowledge of these social relations is well known to be biased and error-prone. Here, we reveal that these biased representations reflect a fundamental computation that abstracts over individual relationships to enable principled inferences about unseen relationships. We propose a theory of network representation that explains how people learn inferential cognitive maps of social relations from direct observation, what kinds of knowledge structures emerge as a consequence, and why it can be beneficial to encode systematic biases into social cognitive maps. Leveraging simulations, laboratory experiments, and "field data" from a real-world network, we find that people abstract observations of direct relations (e.g., friends) into inferences of multistep relations (e.g., friends-of-friends). This multistep abstraction mechanism enables people to discover and represent complex social network structure, affording adaptive inferences across a variety of contexts, including friendship, trust, and advice-giving. Moreover, this multistep abstraction mechanism unifies a variety of otherwise puzzling empirical observations about social behavior. Our proposal generalizes the theory of cognitive maps to the fundamental computational problem of social inference, presenting a powerful framework for understanding the workings of a predictive mind operating within a complex social world.
Affiliation(s)
- Jae-Young Son
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912
- Apoorva Bhandari
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912
- Oriel FeldmanHall
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912
  - Carney Institute for Brain Sciences, Brown University, Providence, RI 02912
15
Eppinger B, Ruel A, Bolenz F. Diminished State Space Theory of Human Aging. Perspect Psychol Sci 2023:17456916231204811. [PMID: 37931229 DOI: 10.1177/17456916231204811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2023]
Abstract
Many new technologies, such as smartphones, computers, or public-access systems (like ticket-vending machines), are a challenge for older adults. One feature that these technologies have in common is that they involve underlying, partially observable, structures (state spaces) that determine the actions that are necessary to reach a certain goal (e.g., to move from one menu to another, to change a function, or to activate a new service). In this work we provide a theoretical, neurocomputational account to explain these behavioral difficulties in older adults. Based on recent findings from age-comparative computational- and cognitive-neuroscience studies, we propose that age-related impairments in complex goal-directed behavior result from an underlying deficit in the representation of state spaces of cognitive tasks. Furthermore, we suggest that these age-related deficits in adaptive decision-making are due to impoverished neural representations in the orbitofrontal cortex and hippocampus.
Affiliation(s)
- Ben Eppinger
  - Institute of Psychology, University of Greifswald
  - Department of Psychology, Concordia University
  - PERFORM Centre, Concordia University
  - Faculty of Psychology, Technische Universität Dresden
- Alexa Ruel
  - Department of Psychology, Concordia University
  - PERFORM Centre, Concordia University
  - Institute of Psychology, University of Hamburg
- Florian Bolenz
  - Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
  - Science of Intelligence/Cluster of Excellence, Technical University of Berlin
16
Aronowitz S. Representational structures only make their mark over time: A case from memory. Behav Brain Sci 2023; 46:e263. [PMID: 37766654 DOI: 10.1017/s0140525x23001905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/29/2023]
Abstract
Memory structures range across the dimensions that distinguish language-like thought. Recent work suggests agent- or situation-specific information is embedded in these structures. Understanding why this is, and pulling these structures apart, requires observing what happens under major changes. The evidence presented for the language-of-thought (LoT) does not look broadly enough across time to capture the function of representational structure.
Affiliation(s)
- Sara Aronowitz
- Department of Philosophy, University of Toronto, Toronto, ON, Canada ://www-personal.umich.edu/~skaron/
17
Mehrotra D, Dubé L. Accounting for multiscale processing in adaptive real-world decision-making via the hippocampus. Front Neurosci 2023; 17:1200842. [PMID: 37732307 PMCID: PMC10508350 DOI: 10.3389/fnins.2023.1200842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Accepted: 08/25/2023] [Indexed: 09/22/2023] Open
Abstract
For adaptive real-time behavior in real-world contexts, the brain needs to allow past information over multiple timescales to influence current processing for making choices that create the best outcome as a person goes about making choices in their everyday life. The neuroeconomics literature on value-based decision-making has formalized such choice through reinforcement learning models for two extreme strategies. These strategies are model-free (MF), which is an automatic, stimulus-response type of action, and model-based (MB), which bases choice on cognitive representations of the world and causal inference on environment-behavior structure. The emphasis of examining the neural substrates of value-based decision making has been on the striatum and prefrontal regions, especially with regards to the "here and now" decision-making. Yet, such a dichotomy does not embrace all the dynamic complexity involved. In addition, despite robust research on the role of the hippocampus in memory and spatial learning, its contribution to value-based decision making is just starting to be explored. This paper aims to better appreciate the role of the hippocampus in decision-making and advance the successor representation (SR) as a candidate mechanism for encoding state representations in the hippocampus, separate from reward representations. To this end, we review research that relates hippocampal sequences to SR models showing that the implementation of such sequences in reinforcement learning agents improves their performance. This also enables the agents to perform multiscale temporal processing in a biologically plausible manner. Altogether, we articulate a framework to advance current striatal and prefrontal-focused decision making to better account for multiscale mechanisms underlying various real-world time-related concepts such as the self that cumulates over a person's life course.
Affiliation(s)
- Dhruv Mehrotra
  - Integrated Program in Neuroscience, McGill University, Montréal, QC, Canada
  - Montréal Neurological Institute, McGill University, Montréal, QC, Canada
- Laurette Dubé
  - Desautels Faculty of Management, McGill University, Montréal, QC, Canada
  - McGill Center for the Convergence of Health and Economics, McGill University, Montréal, QC, Canada
18
Wise T, Charpentier CJ, Dayan P, Mobbs D. Interactive cognitive maps support flexible behavior under threat. Cell Rep 2023; 42:113008. [PMID: 37610871 PMCID: PMC10658881 DOI: 10.1016/j.celrep.2023.113008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 07/11/2023] [Accepted: 08/03/2023] [Indexed: 08/25/2023] Open
Abstract
In social environments, survival can depend upon inferring and adapting to other agents' goal-directed behavior. However, it remains unclear how humans achieve this, despite the fact that many decisions must account for complex, dynamic agents acting according to their own goals. Here, we use a predator-prey task (total n = 510) to demonstrate that humans exploit an interactive cognitive map of the social environment to infer other agents' preferences and simulate their future behavior, providing for flexible, generalizable responses. A model-based inverse reinforcement learning model explained participants' inferences about threatening agents' preferences, with participants using this inferred knowledge to enact generalizable, model-based behavioral responses. Using tree-search planning models, we then found that behavior was best explained by a planning algorithm that incorporated simulations of the threat's goal-directed behavior. Our results indicate that humans use a cognitive map to determine other agents' preferences, facilitating generalized predictions of their behavior and effective responses.
Affiliation(s)
- Toby Wise
  - Department of Neuroimaging, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK; Department of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA.
- Caroline J Charpentier
  - Department of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA; Department of Psychology, University of Maryland, College Park, MD, USA; Brain and Behavior Institute, University of Maryland, College Park, MD, USA
- Peter Dayan
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany; University of Tübingen, Tübingen, Germany
- Dean Mobbs
  - Department of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA; Computation and Neural Systems Program, California Institute of Technology, Pasadena, CA, USA
19
Tarder-Stoll H, Baldassano C, Aly M. The brain hierarchically represents the past and future during multistep anticipation. bioRxiv 2023:2023.07.24.550399. [PMID: 37546761 PMCID: PMC10402095 DOI: 10.1101/2023.07.24.550399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
Memory for temporal structure enables both planning of future events and retrospection of past events. We investigated how the brain flexibly represents extended temporal sequences into the past and future during anticipation. Participants learned sequences of environments in immersive virtual reality. Pairs of sequences had the same environments in a different order, enabling context-specific learning. During fMRI, participants anticipated upcoming environments multiple steps into the future in a given sequence. Temporal structure was represented in the hippocampus and across visual regions (1) bidirectionally, with graded representations into the past and future and (2) hierarchically, with further events into the past and future represented in successively more anterior brain regions. Further, context-specific predictions were prioritized in the forward but not backward direction. Together, this work sheds light on how we flexibly represent sequential structure to enable planning over multiple timescales.
20
Sato R, Shimomura K, Morita K. Opponent learning with different representations in the cortico-basal ganglia pathways can develop obsession-compulsion cycle. PLoS Comput Biol 2023; 19:e1011206. [PMID: 37319256 PMCID: PMC10306209 DOI: 10.1371/journal.pcbi.1011206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2023] [Accepted: 05/23/2023] [Indexed: 06/17/2023] Open
Abstract
Obsessive-compulsive disorder (OCD) has been suggested to be associated with impairment of model-based behavioral control. Meanwhile, recent work suggested shorter memory trace for negative than positive prediction errors (PEs) in OCD. We explored relations between these two suggestions through computational modeling. Based on the properties of cortico-basal ganglia pathways, we modeled human as an agent having a combination of successor representation (SR)-based system that enables model-based-like control and individual representation (IR)-based system that only hosts model-free control, with the two systems potentially learning from positive and negative PEs in different rates. We simulated the agent's behavior in the environmental model used in the recent work that describes potential development of obsession-compulsion cycle. We found that the dual-system agent could develop enhanced obsession-compulsion cycle, similarly to the agent having memory trace imbalance in the recent work, if the SR- and IR-based systems learned mainly from positive and negative PEs, respectively. We then simulated the behavior of such an opponent SR+IR agent in the two-stage decision task, in comparison with the agent having only SR-based control. Fitting of the agents' behavior by the model weighing model-based and model-free control developed in the original two-stage task study resulted in smaller weights of model-based control for the opponent SR+IR agent than for the SR-only agent. These results reconcile the previous suggestions about OCD, i.e., impaired model-based control and memory trace imbalance, raising a novel possibility that opponent learning in model(SR)-based and model-free controllers underlies obsession-compulsion. 
Our model cannot explain the behavior of OCD patients in punishment, rather than reward, contexts, but it could be resolved if opponent SR+IR learning operates also in the recently revealed non-canonical cortico-basal ganglia-dopamine circuit for threat/aversiveness, rather than reward, reinforcement learning, and the aversive SR + appetitive IR agent could actually develop obsession-compulsion if the environment is modeled differently.
Affiliation(s)
- Reo Sato
  - Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- Kanji Shimomura
  - Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- Kenji Morita
  - Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
  - International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo, Japan
21
Zhu SL, Lakshminarasimhan KJ, Angelaki DE. Computational cross-species views of the hippocampal formation. Hippocampus 2023; 33:586-599. [PMID: 37038890 PMCID: PMC10947336 DOI: 10.1002/hipo.23535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 03/17/2023] [Accepted: 03/21/2023] [Indexed: 04/12/2023]
Abstract
The discovery of place cells and head direction cells in the hippocampal formation of freely foraging rodents has led to an emphasis of its role in encoding allocentric spatial relationships. In contrast, studies in head-fixed primates have additionally found representations of spatial views. We review recent experiments in freely moving monkeys that expand upon these findings and show that postural variables such as eye/head movements strongly influence neural activity in the hippocampal formation, suggesting that the function of the hippocampus depends on where the animal looks. We interpret these results in the light of recent studies in humans performing challenging navigation tasks which suggest that depending on the context, eye/head movements serve one of two roles: gathering information about the structure of the environment (active sensing) or externalizing the contents of internal beliefs/deliberation (embodied cognition). These findings prompt future experimental investigations into the information carried by signals flowing between the hippocampal formation and the brain regions controlling postural variables, and constitute a basis for updating computational theories of the hippocampal system to accommodate the influence of eye/head movements.
Affiliation(s)
- Seren L Zhu
  - Center for Neural Science, New York University, New York, New York, USA
- Kaushik J Lakshminarasimhan
  - Center for Theoretical Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Dora E Angelaki
  - Center for Neural Science, New York University, New York, New York, USA
  - Mechanical and Aerospace Engineering, Tandon School of Engineering, New York University, New York, New York, USA
22
Kato A, Shimomura K, Ognibene D, Parvaz MA, Berner LA, Morita K, Fiore VG. Computational models of behavioral addictions: State of the art and future directions. Addict Behav 2023; 140:107595. [PMID: 36621045 DOI: 10.1016/j.addbeh.2022.107595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Revised: 11/23/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
Non-pharmacological behavioral addictions, such as pathological gambling, videogaming, social networking, or internet use, are becoming major public health concerns. It is not yet clear how behavioral addictions could share many major neurobiological and behavioral characteristics with substance use disorders, despite the absence of direct pharmacological influences. A deeper understanding of the neurocognitive mechanisms of addictive behavior is needed, and computational modeling could be one promising approach to explain intricately entwined cognitive and neural dynamics. This review describes computational models of addiction based on reinforcement learning algorithms, Bayesian inference, and biophysical neural simulations. We discuss whether computational frameworks originally conceived to explain maladaptive behavior in substance use disorders can be effectively extended to non-substance-related behavioral addictions. Moreover, we introduce recent studies on behavioral addictions that exemplify the possibility of such extension and propose future directions.
Affiliation(s)
- Ayaka Kato
  - RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902, Japan
- Kanji Shimomura
  - Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan
- Dimitri Ognibene
  - Department of Psychology, Università degli Studi Milano-Bicocca, Milan, Italy; School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
- Muhammad A Parvaz
  - Departments of Psychiatry and Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Laura A Berner
  - Center of Excellence in Eating and Weight Disorders, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Center for Computational Psychiatry, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Kenji Morita
  - Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan; International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo 113-0033, Japan
- Vincenzo G Fiore
  - Center for Computational Psychiatry, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
23
George TM, de Cothi W, Stachenfeld KL, Barry C. Rapid learning of predictive maps with STDP and theta phase precession. eLife 2023; 12:80663. [PMID: 36927826 PMCID: PMC10019887 DOI: 10.7554/elife.80663] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 05/30/2022] [Accepted: 02/26/2023] [Indexed: 03/18/2023] Open
Abstract
The predictive map hypothesis is a promising candidate principle for hippocampal function. A favoured formalisation of this hypothesis, called the successor representation, proposes that each place cell encodes the expected state occupancy of its target location in the near future. This predictive framework is supported by behavioural as well as electrophysiological evidence and has desirable consequences for both the generalisability and efficiency of reinforcement learning algorithms. However, it is unclear how the successor representation might be learnt in the brain. Error-driven temporal difference learning, commonly used to learn successor representations in artificial agents, is not known to be implemented in hippocampal networks. Instead, we demonstrate that spike-timing dependent plasticity (STDP), a form of Hebbian learning, acting on temporally compressed trajectories known as 'theta sweeps', is sufficient to rapidly learn a close approximation to the successor representation. The model is biologically plausible - it uses spiking neurons modulated by theta-band oscillations, diffuse and overlapping place cell-like state representations, and experimentally matched parameters. We show how this model maps onto known aspects of hippocampal circuitry and explains substantial variance in the temporal difference successor matrix, consequently giving rise to place cells that demonstrate experimentally observed successor representation-related phenomena including backwards expansion on a 1D track and elongation near walls in 2D. Finally, our model provides insight into the observed topographical ordering of place field sizes along the dorsal-ventral axis by showing this is necessary to prevent the detrimental mixing of larger place fields, which encode longer timescale successor representations, with more fine-grained predictions of spatial location.
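For orientation, the error-driven temporal difference learning that this abstract contrasts with STDP can be sketched in a few lines. This is only an illustrative tabular version, not the authors' spiking model; the function name `td_successor_matrix`, the learning rate, the discount factor, and the three-state ring environment are all assumptions made for the example.

```python
def td_successor_matrix(trajectory, n_states, gamma=0.5, alpha=0.1):
    """Learn M[s][j], the discounted expected future occupancy of j from s."""
    M = [[0.0] * n_states for _ in range(n_states)]
    for state, next_state in zip(trajectory, trajectory[1:]):
        for j in range(n_states):
            occupancy = 1.0 if j == state else 0.0
            # TD error: observed occupancy plus the discounted prediction from
            # the next state, minus the current prediction.
            td_error = occupancy + gamma * M[next_state][j] - M[state][j]
            M[state][j] += alpha * td_error
    return M


# Hypothetical toy environment: a 3-state ring visited as 0 -> 1 -> 2 -> 0 ...
trajectory = [0, 1, 2] * 2000
learned = td_successor_matrix(trajectory, n_states=3)
```

On this deterministic ring the learned rows should approach the analytic values (for example, occupancy of state 0 from state 0 tends to 1 / (1 - gamma^3)), and each row should sum to roughly 1 / (1 - gamma).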
Affiliation(s)
- Tom M George
  - Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, United Kingdom
- William de Cothi
  - Research Department of Cell and Developmental Biology, University College London, London, United Kingdom
- Caswell Barry
  - Research Department of Cell and Developmental Biology, University College London, London, United Kingdom
24
Fang C, Aronov D, Abbott LF, Mackevicius EL. Neural learning rules for generating flexible predictions and computing the successor representation. eLife 2023; 12:e80680. [PMID: 36928104 PMCID: PMC10019889 DOI: 10.7554/elife.80680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Accepted: 10/26/2022] [Indexed: 03/18/2023] Open
Abstract
The predictive nature of the hippocampus is thought to be useful for memory-guided cognitive behaviors. Inspired by the reinforcement learning literature, this notion has been formalized as a predictive map called the successor representation (SR). The SR captures a number of observations about hippocampal activity. However, the algorithm does not provide a neural mechanism for how such representations arise. Here, we show the dynamics of a recurrent neural network naturally calculate the SR when the synaptic weights match the transition probability matrix. Interestingly, the predictive horizon can be flexibly modulated simply by changing the network gain. We derive simple, biologically plausible learning rules to learn the SR in a recurrent network. We test our model with realistic inputs and match hippocampal data recorded during random foraging. Taken together, our results suggest that the SR is more accessible in neural circuits than previously thought and can support a broad range of cognitive functions.
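The central claim of this abstract, that linear recurrent dynamics settle on the SR when the weights equal the transition matrix, can be illustrated with a small numerical sketch. This is a simplified, rate-based toy and not the authors' model; it assumes the update x <- input + gain * (W x), a row-stochastic W, a gain below 1 so the iteration converges, and hypothetical helper names.

```python
def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def recurrent_steady_state(weights, gain, inputs, steps=200):
    """Iterate x <- inputs + gain * (weights @ x) toward its fixed point."""
    x = list(inputs)
    for _ in range(steps):
        x = [i + gain * r for i, r in zip(inputs, matvec(weights, x))]
    return x


def successor_column(transitions, gamma, target, horizon=200):
    """Truncated SR definition: sum over k of gamma^k * T^k applied to e_target."""
    term = [1.0 if s == target else 0.0 for s in range(len(transitions))]
    total = [0.0] * len(transitions)
    discount = 1.0
    for _ in range(horizon):
        total = [t + discount * v for t, v in zip(total, term)]
        term = matvec(transitions, term)
        discount *= gamma
    return total


# Hypothetical 3-state ring: 0 -> 1 -> 2 -> 0, with weights set to T.
T = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
cue = [1.0, 0.0, 0.0]  # input that marks state 0 as the target
sr_from_network = recurrent_steady_state(T, gain=0.5, inputs=cue)
sr_from_series = successor_column(T, gamma=0.5, target=0)
short_horizon = recurrent_steady_state(T, gain=0.25, inputs=cue)
```

Here the gain plays the role of the discount factor: the steady state with gain 0.5 matches the discounted-occupancy series, and lowering the gain to 0.25 shrinks the activity of states that are further from the target, which is one way to read the abstract's point about modulating the predictive horizon.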
Affiliation(s)
- Ching Fang
- Zuckerman Institute, Department of Neuroscience, Columbia University, New York, United States
- Dmitriy Aronov
- Zuckerman Institute, Department of Neuroscience, Columbia University, New York, United States
- LF Abbott
- Zuckerman Institute, Department of Neuroscience, Columbia University, New York, United States
- Emily L Mackevicius
- Zuckerman Institute, Department of Neuroscience, Columbia University, New York, United States
- Basis Research Institute, New York, United States
|
25
|
Bono J, Zannone S, Pedrosa V, Clopath C. Learning predictive cognitive maps with spiking neurons during behavior and replays. eLife 2023; 12:e80671. [PMID: 36927625 PMCID: PMC10019888 DOI: 10.7554/elife.80671] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/30/2022] [Accepted: 01/12/2023] [Indexed: 03/18/2023]
Abstract
The hippocampus has been proposed to encode environments using a representation that contains predictive information about likely future states, called the successor representation. However, it is not clear how such a representation could be learned in the hippocampal circuit. Here, we propose a plasticity rule that can learn this predictive map of the environment using a spiking neural network. We connect this biologically plausible plasticity rule to reinforcement learning, mathematically and numerically showing that it implements the TD-lambda algorithm. By spanning these different levels, we show how our framework naturally encompasses behavioral activity and replays, smoothly moving from rate to temporal coding, and allows learning over behavioral timescales with a plasticity rule acting on a timescale of milliseconds. We discuss how biological parameters such as dwelling times at states, neuronal firing rates and neuromodulation relate to the delay discounting parameter of the TD algorithm, and how they influence the learned representation. We also find that, in agreement with psychological studies and contrary to reinforcement learning theory, the discount factor decreases hyperbolically with time. Finally, our framework suggests a role for replays, in both aiding learning in novel environments and finding shortcut trajectories that were not experienced during behavior, in agreement with experimental data.
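For readers unfamiliar with the algorithm named in this abstract, the following is a minimal tabular sketch of TD(λ) applied to the successor representation. It is not the spiking-network plasticity rule the authors propose; the function name, parameter values, and the deterministic three-state walk are all assumptions chosen for illustration.

```python
import numpy as np

def td_lambda_sr(states, n_states, gamma=0.5, lam=0.5, alpha=0.1):
    """Learn a tabular SR from a state sequence with TD(lambda).

    Uses accumulating eligibility traces; every parameter value here is
    an illustrative assumption rather than a value from the paper.
    """
    M = np.zeros((n_states, n_states))
    trace = np.zeros(n_states)
    identity = np.eye(n_states)
    for s, s_next in zip(states[:-1], states[1:]):
        # Decay the old traces and mark the state just visited.
        trace = gamma * lam * trace + identity[s]
        # Vector-valued TD error for the SR row of state s.
        delta = identity[s] + gamma * M[s_next] - M[s]
        # Credit the error to all recently visited states.
        M += alpha * np.outer(trace, delta)
    return M

# Hypothetical deterministic walk around a 3-state ring: 0, 1, 2, 0, ...
walk = [t % 3 for t in range(20001)]
M_learned = td_lambda_sr(walk, n_states=3)

# Closed-form SR for the same ring, used only as a reference.
T = np.roll(np.eye(3), 1, axis=1)
M_exact = np.linalg.inv(np.eye(3) - 0.5 * T)
```

Setting `lam=0` recovers one-step TD learning of the SR; larger values spread each prediction error further back along the visited trajectory.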
Affiliation(s)
- Jacopo Bono
- Department of Bioengineering, Imperial College London, London, United Kingdom
- Sara Zannone
- Department of Bioengineering, Imperial College London, London, United Kingdom
- Victor Pedrosa
- Department of Bioengineering, Imperial College London, London, United Kingdom
- Claudia Clopath
- Department of Bioengineering, Imperial College London, London, United Kingdom
|
26
|
Ekman M, Kusch S, de Lange FP. Successor-like representation guides the prediction of future events in human visual cortex and hippocampus. eLife 2023; 12:e78904. [PMID: 36729024 PMCID: PMC9894584 DOI: 10.7554/elife.78904] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 03/23/2022] [Accepted: 01/13/2023] [Indexed: 02/03/2023]
Abstract
Human agents build models of their environment, which enable them to anticipate and plan upcoming events. However, little is known about the properties of such predictive models. Recently, it has been proposed that hippocampal representations take the form of a predictive map-like structure, the so-called successor representation (SR). Here, we used human functional magnetic resonance imaging to probe whether activity in the early visual cortex (V1) and hippocampus adhere to the postulated properties of the SR after visual sequence learning. Participants were exposed to an arbitrary spatiotemporal sequence consisting of four items (A-B-C-D). We found that after repeated exposure to the sequence, merely presenting single sequence items (e.g., - B - -) resulted in V1 activation at the successor locations of the full sequence (e.g., C-D), but not at the predecessor locations (e.g., A). This highlights that visual representations are skewed toward future states, in line with the SR. Similar results were also found in the hippocampus. Moreover, the hippocampus developed a coactivation profile that showed sensitivity to the temporal distance in sequence space, with fading representations for sequence events in the more distant past and future. V1, in contrast, showed a coactivation profile that was only sensitive to spatial distance in stimulus space. Taken together, these results provide empirical evidence for the proposition that both visual and hippocampal cortex represent a predictive map of the visual world akin to the SR.
Affiliation(s)
- Matthias Ekman
- Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands
- Sarah Kusch
- Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands
- Floris P de Lange
- Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands
|
27
|
Malekzadeh P, Hou M, Plataniotis KN. Uncertainty-aware transfer across tasks using hybrid model-based successor feature reinforcement learning. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
|
28
|
Garner KG, Dux PE. Knowledge generalization and the costs of multitasking. Nat Rev Neurosci 2023; 24:98-112. [PMID: 36347942 DOI: 10.1038/s41583-022-00653-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/12/2022] [Indexed: 11/10/2022]
Abstract
Humans are able to rapidly perform novel tasks, but show pervasive performance costs when attempting to do two things at once. Traditionally, empirical and theoretical investigations into the sources of such multitasking interference have largely focused on multitasking in isolation to other cognitive functions, characterizing the conditions that give rise to performance decrements. Here we instead ask whether multitasking costs are linked to the system's capacity for knowledge generalization, as is required to perform novel tasks. We show how interrogation of the neurophysiological circuitry underlying these two facets of cognition yields further insights for both. Specifically, we demonstrate how a system that rapidly generalizes knowledge may induce multitasking costs owing to sharing of task contingencies between contexts in neural representations encoded in frontoparietal and striatal brain regions. We discuss neurophysiological insights suggesting that prolonged learning segregates such representations by refining the brain's model of task-relevant contingencies, thereby reducing information sharing between contexts and improving multitasking performance while reducing flexibility and generalization. These proposed neural mechanisms explain why the brain shows rapid task understanding, multitasking limitations and practice effects. In short, multitasking limits are the price we pay for behavioural flexibility.
|
29
|
Linton P, Morgan MJ, Read JCA, Vishwanath D, Creem-Regehr SH, Domini F. New Approaches to 3D Vision. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210443. [PMID: 36511413 PMCID: PMC9745878 DOI: 10.1098/rstb.2021.0443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 10/25/2022] [Indexed: 12/15/2022]
Abstract
New approaches to 3D vision are enabling new advances in artificial intelligence and autonomous vehicles, a better understanding of how animals navigate the 3D world, and new insights into human perception in virtual and augmented reality. Whilst traditional approaches to 3D vision in computer vision (SLAM: simultaneous localization and mapping), animal navigation (cognitive maps), and human vision (optimal cue integration) start from the assumption that the aim of 3D vision is to provide an accurate 3D model of the world, the new approaches to 3D vision explored in this issue challenge this assumption. Instead, they investigate the possibility that computer vision, animal navigation, and human vision can rely on partial or distorted models or no model at all. This issue also highlights the implications for artificial intelligence, autonomous vehicles, human perception in virtual and augmented reality, and the treatment of visual disorders, all of which are explored by individual articles. This article is part of a discussion meeting issue 'New approaches to 3D vision'.
Affiliation(s)
- Paul Linton
- Presidential Scholars in Society and Neuroscience, Center for Science and Society, Columbia University, New York, NY 10027, USA
- Italian Academy for Advanced Studies in America, Columbia University, New York, NY 10027, USA
- Visual Inference Lab, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA
- Michael J. Morgan
- Department of Optometry and Visual Sciences, City, University of London, Northampton Square, London EC1V 0HB, UK
- Jenny C. A. Read
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, Tyne & Wear NE2 4HH, UK
- Dhanraj Vishwanath
- School of Psychology and Neuroscience, University of St Andrews, St Andrews, Fife KY16 9JP, UK
- Fulvio Domini
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912-9067, USA
|
30
|
Momennejad I. A rubric for human-like agents and NeuroAI. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210446. [PMID: 36511409 PMCID: PMC9745874 DOI: 10.1098/rstb.2021.0446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2022] [Accepted: 10/27/2022] [Indexed: 12/15/2022]
Abstract
Researchers across cognitive, neuro- and computer sciences increasingly reference 'human-like' artificial intelligence and 'neuroAI'. However, the scope and use of the terms are often inconsistent. Contributed research ranges widely from mimicking behaviour, to testing machine learning methods as neurally plausible hypotheses at the cellular or functional levels, or solving engineering problems. However, it cannot be assumed nor expected that progress on one of these three goals will automatically translate to progress in others. Here, a simple rubric is proposed to clarify the scope of individual contributions, grounded in their commitments to human-like behaviour, neural plausibility or benchmark/engineering/computer science goals. This is clarified using examples of weak and strong neuroAI and human-like agents, and discussing the generative, corroborate and corrective ways in which the three dimensions interact with one another. The author maintains that future progress in artificial intelligence will need strong interactions across the disciplines, with iterative feedback loops and meticulous validity tests, leading to both known and yet-unknown advances that may span decades to come. This article is part of a discussion meeting issue 'New approaches to 3D vision'.
Affiliation(s)
- Ida Momennejad
- Microsoft Research NYC, Reinforcement Learning Station, 300 Lafayette, New York, NY 10012, USA
|
31
|
Morita K, Shimomura K, Kawaguchi Y. Opponent Learning with Different Representations in the Cortico-Basal Ganglia Circuits. eNeuro 2023; 10:ENEURO.0422-22.2023. [PMID: 36653187 PMCID: PMC9884109 DOI: 10.1523/eneuro.0422-22.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Revised: 12/06/2022] [Accepted: 01/03/2023] [Indexed: 01/20/2023]
Abstract
The direct and indirect pathways of the basal ganglia (BG) have been suggested to learn mainly from positive and negative feedbacks, respectively. Since these pathways unevenly receive inputs from different cortical neuron types and/or regions, they may preferentially use different state/action representations. We explored whether such a combined use of different representations, coupled with different learning rates from positive and negative reward prediction errors (RPEs), has computational benefits. We modeled animal as an agent equipped with two learning systems, each of which adopted individual representation (IR) or successor representation (SR) of states. With varying the combination of IR or SR and also the learning rates from positive and negative RPEs in each system, we examined how the agent performed in a dynamic reward navigation task. We found that combination of SR-based system learning mainly from positive RPEs and IR-based system learning mainly from negative RPEs could achieve a good performance in the task, as compared with other combinations. In such a combination of appetitive SR-based and aversive IR-based systems, both systems show activities of comparable magnitudes with opposite signs, consistent with the suggested profiles of the two BG pathways. Moreover, the architecture of such a combination provides a novel coherent explanation for the functional significance and underlying mechanism of diverse findings about the cortico-BG circuits. These results suggest that particularly combining different representations with appetitive and aversive learning could be an effective learning strategy in certain dynamic environments, and it might actually be implemented in the cortico-BG circuits.
Affiliation(s)
- Kenji Morita
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan
- International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo 113-0033, Japan
- Kanji Shimomura
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan
- Department of Behavioral Medicine, National Institute of Mental Health, National Center of Neurology and Psychiatry, Kodaira 187-8551, Japan
- Yasuo Kawaguchi
- Brain Science Institute, Tamagawa University, Machida 194-8610, Japan
- National Institute for Physiological Sciences (NIPS), Okazaki 444-8787, Japan
|
32
|
Fan C, Yao L, Zhang J, Zhen Z, Wu X. Advanced Reinforcement Learning and Its Connections with Brain Neuroscience. Research (Wash D C) 2023; 6:0064. [PMID: 36939448 PMCID: PMC10017102 DOI: 10.34133/research.0064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2022] [Accepted: 01/10/2023] [Indexed: 01/22/2023]
Abstract
In recent years, brain science and neuroscience have greatly propelled the innovation of computer science. In particular, knowledge from the neurobiology and neuropsychology of the brain revolutionized the development of reinforcement learning (RL) by providing novel interpretable mechanisms of how the brain achieves intelligent and efficient decision making. Triggered by this, there has been a boom in research about advanced RL algorithms that are built upon the inspirations of brain neuroscience. In this work, to further strengthen the bidirectional link between the 2 communities and especially promote the research on modern RL technology, we provide a comprehensive survey of recent advances in the area of brain-inspired/related RL algorithms. We start with basis theories of RL, and present a concise introduction to brain neuroscience related to RL. Then, we classify these advanced RL methodologies into 3 categories according to different connections of the brain, i.e., micro-neural activity, macro-brain structure, and cognitive function. Each category is further surveyed by presenting several modern RL algorithms along with their mathematical models, correlations with the brain, and open issues. Finally, we introduce several important applications of RL algorithms, followed by the discussions of challenges and opportunities for future research.
Affiliation(s)
- Chaoqiong Fan
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
- Li Yao
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
- Jiacai Zhang
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
- Zonglei Zhen
- Faculty of Psychology, Beijing Normal University, Beijing, China
- Xia Wu
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
|
33
|
Yang Z, Diaz GJ, Fajen BR, Bailey R, Ororbia AG. A neural active inference model of perceptual-motor learning. Front Comput Neurosci 2023; 17:1099593. [PMID: 36890967 PMCID: PMC9986490 DOI: 10.3389/fncom.2023.1099593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 01/30/2023] [Indexed: 02/22/2023]
Abstract
The active inference framework (AIF) is a promising new computational framework grounded in contemporary neuroscience that can produce human-like behavior through reward-based learning. In this study, we test the ability for the AIF to capture the role of anticipation in the visual guidance of action in humans through the systematic investigation of a visual-motor task that has been well-explored: that of intercepting a target moving over a ground plane. Previous research demonstrated that humans performing this task resorted to anticipatory changes in speed intended to compensate for semi-predictable changes in target speed later in the approach. To capture this behavior, our proposed "neural" AIF agent uses artificial neural networks to select actions on the basis of a very short term prediction of the information about the task environment that these actions would reveal along with a long-term estimate of the resulting cumulative expected free energy. Systematic variation revealed that anticipatory behavior emerged only when required by limitations on the agent's movement capabilities, and only when the agent was able to estimate accumulated free energy over sufficiently long durations into the future. In addition, we present a novel formulation of the prior mapping function that maps a multi-dimensional world-state to a uni-dimensional distribution of free-energy/reward. Together, these results demonstrate the use of AIF as a plausible model of anticipatory visually guided behavior in humans.
Affiliation(s)
- Zhizhuo Yang
- Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester, NY, United States
- Gabriel J Diaz
- Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, United States
- Brett R Fajen
- Department of Cognitive Science, Rensselaer Polytechnic Institute, Troy, NY, United States
- Reynold Bailey
- Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester, NY, United States
- Alexander G Ororbia
- Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester, NY, United States
|
34
|
De Martino B, Cortese A. Goals, usefulness and abstraction in value-based choice. Trends Cogn Sci 2023; 27:65-80. [PMID: 36446707 DOI: 10.1016/j.tics.2022.11.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/23/2022] [Revised: 10/26/2022] [Accepted: 11/01/2022] [Indexed: 11/27/2022]
Abstract
Colombian drug lord Pablo Escobar, while on the run, purportedly burned two million dollars in banknotes to keep his daughter warm. A stark reminder that, in life, circumstances and goals can quickly change, forcing us to reassess and modify our values on-the-fly. Studies in decision-making and neuroeconomics have often implicitly equated value to reward, emphasising the hedonic and automatic aspect of the value computation, while overlooking its functional (concept-like) nature. Here we outline the computational and biological principles that enable the brain to compute the usefulness of an option or action by creating abstractions that flexibly adapt to changing goals. We present different algorithmic architectures, comparing ideas from artificial intelligence (AI) and cognitive neuroscience with psychological theories and, when possible, drawing parallels.
Affiliation(s)
- Benedetto De Martino
- Institute of Cognitive Neuroscience, University College London, London WC1N 3AZ, UK; Computational Neuroscience Laboratories, ATR Institute International, 619-0288 Kyoto, Japan.
- Aurelio Cortese
- Institute of Cognitive Neuroscience, University College London, London WC1N 3AZ, UK; Computational Neuroscience Laboratories, ATR Institute International, 619-0288 Kyoto, Japan.
|
35
|
Abstract
Several authors have suggested a deep symmetry between the psychological processes that underlie our ability to remember the past and make predictions about the future. The judgment of recency (JOR) task measures temporal order judgments for the past by presenting pairs of probe stimuli; participants choose the probe that was presented more recently. We performed a short-term relative JOR task and introduced a novel judgment of imminence (JOI) task to study temporal order judgments for the future. In the JOR task, participants were presented with a sequence of stimuli and asked to choose which of two probe stimuli was presented closer to the present. In the JOI task, participants were trained on a probabilistic sequence. After training, the sequence was interrupted with probe stimuli. Participants were asked to choose which of two probe stimuli was expected to be presented closer to the present. Replicating prior work on JOR, we found that RT results supported a backward self-terminating search model operating on a temporally organized representation of the past. We also showed that RT distributions are consistent with this model and that the temporally organized representation is compressed. Critically, results for the JOI task probing expectations of the future suggest a forward self-terminating search model operating on a temporally organized representation of the future. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
|
36
|
Gonzalez A, Giocomo LM. From Rats to Humans: how novel behavioral paradigms and reinforcement learning can bridge the gap in translation. Lab Anim (NY) 2022; 51:289-290. [PMID: 36258040 DOI: 10.1038/s41684-022-01077-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/20/2023]
Affiliation(s)
- Lisa M Giocomo
- Stanford University School of Medicine, Stanford, CA, USA.
|
37
|
Abstract
Understanding Theory of Mind should begin with an analysis of the problems it solves. The traditional answer is that Theory of Mind is used for predicting others' thoughts and actions. However, the same Theory of Mind is also used for planning to change others' thoughts and actions. Planning requires that Theory of Mind consists of abstract structured causal representations and supports efficient search and selection from innumerable possible actions. Theory of Mind contrasts with less cognitively demanding alternatives: statistical predictive models of other people's actions, or model-free reinforcement of actions by their effects on other people. Theory of Mind is likely used to plan novel interventions and predict their effects, for example, in pedagogy, emotion regulation, and impression management.
Affiliation(s)
- Mark K Ho
- Department of Computer Science, Princeton University, Princeton, NJ, USA; Department of Psychology, Princeton University, Princeton, NJ, USA.
- Rebecca Saxe
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- Fiery Cushman
- Department of Psychology, Harvard University, Cambridge, MA, USA
|
38
|
Colas JT, Dundon NM, Gerraty RT, Saragosa‐Harris NM, Szymula KP, Tanwisuth K, Tyszka JM, van Geen C, Ju H, Toga AW, Gold JI, Bassett DS, Hartley CA, Shohamy D, Grafton ST, O'Doherty JP. Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T. Hum Brain Mapp 2022; 43:4750-4790. [PMID: 35860954 PMCID: PMC9491297 DOI: 10.1002/hbm.25988] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2022] [Revised: 05/20/2022] [Accepted: 06/10/2022] [Indexed: 11/12/2022]
Abstract
The model-free algorithms of "reinforcement learning" (RL) have gained clout across disciplines, but so too have model-based alternatives. The present study emphasizes other dimensions of this model space in consideration of associative or discriminative generalization across states and actions. This "generalized reinforcement learning" (GRL) model, a frugal extension of RL, parsimoniously retains the single reward-prediction error (RPE), but the scope of learning goes beyond the experienced state and action. Instead, the generalized RPE is efficiently relayed for bidirectional counterfactual updating of value estimates for other representations. Aided by structural information but as an implicit rather than explicit cognitive map, GRL provided the most precise account of human behavior and individual differences in a reversal-learning task with hierarchical structure that encouraged inverse generalization across both states and actions. Reflecting inference that could be true, false (i.e., overgeneralization), or absent (i.e., undergeneralization), state generalization distinguished those who learned well more so than action generalization. With high-resolution high-field fMRI targeting the dopaminergic midbrain, the GRL model's RPE signals (alongside value and decision signals) were localized within not only the striatum but also the substantia nigra and the ventral tegmental area, including specific effects of generalization that also extend to the hippocampus. Factoring in generalization as a multidimensional process in value-based learning, these findings shed light on complexities that, while challenging classic RL, can still be resolved within the bounds of its core computations.
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
- Neil M. Dundon
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Department of Child and Adolescent Psychiatry, Psychotherapy, and Psychosomatics, University of Freiburg, Freiburg im Breisgau, Germany
- Raphael T. Gerraty
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Center for Science and Society, Columbia University, New York, New York, USA
- Natalie M. Saragosa‐Harris
- Department of Psychology, New York University, New York, New York, USA
- Department of Psychology, University of California, Los Angeles, California, USA
- Karol P. Szymula
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Koranis Tanwisuth
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Department of Psychology, University of California, Berkeley, California, USA
- J. Michael Tyszka
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Camilla van Geen
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Harang Ju
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Arthur W. Toga
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, California, USA
- Joshua I. Gold
- Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Dani S. Bassett
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Santa Fe Institute, Santa Fe, New Mexico, USA
- Catherine A. Hartley
- Department of Psychology, New York University, New York, New York, USA
- Center for Neural Science, New York University, New York, New York, USA
- Daphna Shohamy
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Kavli Institute for Brain Science, Columbia University, New York, New York, USA
- Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- John P. O'Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
|
39
|
Abstract
Learning and interpreting the structure of the environment is an innate feature of biological systems, and is integral to guiding flexible behaviors for evolutionary viability. The concept of a cognitive map has emerged as one of the leading metaphors for these capacities, and unraveling the learning and neural representation of such a map has become a central focus of neuroscience. In recent years, many models have been developed to explain cellular responses in the hippocampus and other brain areas. Because it can be difficult to see how these models differ, how they relate and what each model can contribute, this Review aims to organize these models into a clear ontology. This ontology reveals parallels between existing empirical results, and implies new approaches to understand hippocampal-cortical interactions and beyond.
|
40
|
Pudhiyidath A, Morton NW, Viveros Duran R, Schapiro AC, Momennejad I, Hinojosa-Rowland DM, Molitor RJ, Preston AR. Representations of Temporal Community Structure in Hippocampus and Precuneus Predict Inductive Reasoning Decisions. J Cogn Neurosci 2022; 34:1736-1760. [PMID: 35579986 PMCID: PMC10262802 DOI: 10.1162/jocn_a_01864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
Our understanding of the world is shaped by inferences about underlying structure. For example, at the gym, you might notice that the same people tend to arrive around the same time and infer that they are friends that work out together. Consistent with this idea, after participants are presented with a temporal sequence of objects that follows an underlying community structure, they are biased to infer that objects from the same community share the same properties. Here, we used fMRI to measure neural representations of objects after temporal community structure learning and examine how these representations support inference about object relationships. We found that community structure learning affected inferred object similarity: When asked to spatially group items based on their experience, participants tended to group together objects from the same community. Neural representations in perirhinal cortex predicted individual differences in object grouping, suggesting that high-level object representations are affected by temporal community learning. Furthermore, participants were biased to infer that objects from the same community would share the same properties. Using computational modeling of temporal learning and inference decisions, we found that inductive reasoning is influenced by both detailed knowledge of temporal statistics and abstract knowledge of the temporal communities. The fidelity of temporal community representations in hippocampus and precuneus predicted the degree to which temporal community membership biased reasoning decisions. Our results suggest that temporal knowledge is represented at multiple levels of abstraction, and that perirhinal cortex, hippocampus, and precuneus may support inference based on this knowledge.
|
41
|
Qian W, Lynn CW, Klishin AA, Stiso J, Christianson NH, Bassett DS. Optimizing the human learnability of abstract network representations. Proc Natl Acad Sci U S A 2022; 119:e2121338119. [PMID: 35994661 DOI: 10.1073/pnas.2121338119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
Information can often be viewed as a network of associations between concepts. Humans build mental models of information networks in the world around them, yet those models consistently contain some errors. Here, we present a computational framework for simulating the optimization of human network learning by intentionally emphasizing or exaggerating some network features over others. We demonstrate in a computational model of human learning that targeted emphasis and de-emphasis can substantially enhance a learner’s grasp of network structure. Further, we identify how optimal emphasis patterns vary with the topology of the target network structure to be learned, as well as the baseline accuracy of the human learner. Our findings illuminate the principles of design and the optimization of network learnability. Precisely how humans process relational patterns of information in knowledge, language, music, and society is not well understood. Prior work in the field of statistical learning has demonstrated that humans process such information by building internal models of the underlying network structure. However, these mental maps are often inaccurate due to limitations in human information processing. The existence of such limitations raises clear questions: Given a target network that one wishes for a human to learn, what network should one present to the human? Should one simply present the target network as-is, or should one emphasize certain parts of the network to proactively mitigate expected errors in learning? To investigate these questions, we study the optimization of network learnability in a computational model of human learning. Evaluating an array of synthetic and real-world networks, we find that learnability is enhanced by reinforcing connections within modules or clusters. In contrast, when networks contain significant core–periphery structure, we find that learnability is best optimized by reinforcing peripheral edges between low-degree nodes. 
Overall, our findings suggest that the accuracy of human network learning can be systematically enhanced by targeted emphasis and de-emphasis of prescribed sectors of information.
|
42
|
Matsuo Y, LeCun Y, Sahani M, Precup D, Silver D, Sugiyama M, Uchibe E, Morimoto J. Deep learning, reinforcement learning, and world models. Neural Netw 2022; 152:267-275. [DOI: 10.1016/j.neunet.2022.03.037] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/15/2021] [Revised: 02/19/2022] [Accepted: 03/28/2022] [Indexed: 12/01/2022]
|
43
|
de Cothi W, Nyberg N, Griesbauer EM, Ghanamé C, Zisch F, Lefort JM, Fletcher L, Newton C, Renaudineau S, Bendor D, Grieves R, Duvelle É, Barry C, Spiers HJ. Predictive maps in rats and humans for spatial navigation. Curr Biol 2022; 32:3676-3689.e5. [PMID: 35863351 DOI: 10.1016/j.cub.2022.06.090] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 03/15/2022] [Revised: 05/19/2022] [Accepted: 06/29/2022] [Indexed: 11/25/2022]
Abstract
Much of our understanding of navigation comes from the study of individual species, often with specific tasks tailored to those species. Here, we provide a novel experimental and analytic framework integrating across humans, rats, and simulated reinforcement learning (RL) agents to interrogate the dynamics of behavior during spatial navigation. We developed a novel open-field navigation task ("Tartarus maze") requiring dynamic adaptation (shortcuts and detours) to frequently changing obstructions on the path to a hidden goal. Humans and rats were remarkably similar in their trajectories. Both species showed the greatest similarity to RL agents utilizing a "successor representation," which creates a predictive map. Humans also displayed trajectory features similar to model-based RL agents, which implemented an optimal tree-search planning procedure. Our results help refine models seeking to explain mammalian navigation in dynamic environments and highlight the utility of modeling the behavior of different species to uncover the shared mechanisms that support behavior.
Affiliation(s)
- William de Cothi
- Department of Cell and Developmental Biology, University College London, London, UK; Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK.
| | - Nils Nyberg
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
| | - Eva-Maria Griesbauer
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
| | - Carole Ghanamé
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
| | - Fiona Zisch
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK; The Bartlett School of Architecture, University College London, London, UK
| | - Julie M Lefort
- Department of Cell and Developmental Biology, University College London, London, UK
| | - Lydia Fletcher
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
| | - Coco Newton
- Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
| | - Sophie Renaudineau
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
| | - Daniel Bendor
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
| | - Roddy Grieves
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Éléonore Duvelle
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Caswell Barry
- Department of Cell and Developmental Biology, University College London, London, UK
| | - Hugo J Spiers
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK.
| |
|
44
|
Puelma Touzel M, Cisek P, Lajoie G. Performance-gated deliberation: A context-adapted strategy in which urgency is opportunity cost. PLoS Comput Biol 2022; 18:e1010080. [PMID: 35617370 PMCID: PMC9176815 DOI: 10.1371/journal.pcbi.1010080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Revised: 06/08/2022] [Accepted: 04/05/2022] [Indexed: 11/18/2022]
Abstract
Finding the right amount of deliberation, between insufficient and excessive, is a hard decision making problem that depends on the value we place on our time. Average-reward, putatively encoded by tonic dopamine, serves in existing reinforcement learning theory as the opportunity cost of time, including deliberation time. Importantly, this cost can itself vary with the environmental context and is not trivial to estimate. Here, we propose how the opportunity cost of deliberation can be estimated adaptively on multiple timescales to account for non-stationary contextual factors. We use it in a simple decision-making heuristic based on average-reward reinforcement learning (AR-RL) that we call Performance-Gated Deliberation (PGD). We propose PGD as a strategy used by animals wherein deliberation cost is implemented directly as urgency, a previously characterized neural signal effectively controlling the speed of the decision-making process. We show PGD outperforms AR-RL solutions in explaining behaviour and urgency of non-human primates in a context-varying random walk prediction task and is consistent with relative performance and urgency in a context-varying random dot motion task. We make readily testable predictions for both neural activity and behaviour.
Affiliation(s)
- Maximilian Puelma Touzel
- Mila, Québec AI Institute, Montréal, Canada
- Department of Computer Science & Operations Research, Université de Montréal, Montréal, Canada
| | - Paul Cisek
- Department of Neuroscience, Université de Montréal, Montréal, Canada
| | - Guillaume Lajoie
- Mila, Québec AI Institute, Montréal, Canada
- Department of Mathematics & Statistics, Université de Montréal, Montréal, Canada
| |
|
45
|
Ho MK, Abel D, Correa CG, Littman ML, Cohen JD, Griffiths TL. People construct simplified mental representations to plan. Nature 2022; 606:129-136. [PMID: 35589843 DOI: 10.1038/s41586-022-04743-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/25/2021] [Accepted: 04/07/2022] [Indexed: 11/09/2022]
Abstract
One of the most striking features of human cognition is the ability to plan. Two aspects of human planning stand out: its efficiency and flexibility. Efficiency is especially impressive because plans must often be made in complex environments, and yet people successfully plan solutions to many everyday problems despite having limited cognitive resources [1-3]. Standard accounts in psychology, economics and artificial intelligence have suggested that human planning succeeds because people have a complete representation of a task and then use heuristics to plan future actions in that representation [4-11]. However, this approach generally assumes that task representations are fixed. Here we propose that task representations can be controlled and that such control provides opportunities to quickly simplify problems and more easily reason about them. We propose a computational account of this simplification process and, in a series of preregistered behavioural experiments, show that it is subject to online cognitive control [12-14] and that people optimally balance the complexity of a task representation and its utility for planning and acting. These results demonstrate how strategically perceiving and conceiving problems facilitates the effective use of limited cognitive resources.
Affiliation(s)
- Mark K Ho
- Department of Psychology, Princeton University, Princeton, NJ, USA; Department of Computer Science, Princeton University, Princeton, NJ, USA
| | - David Abel
- Department of Computer Science, Brown University, Providence, RI, USA; DeepMind, London, UK
| | - Carlos G Correa
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
| | - Michael L Littman
- Department of Computer Science, Brown University, Providence, RI, USA
| | - Jonathan D Cohen
- Department of Psychology, Princeton University, Princeton, NJ, USA; Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
| | - Thomas L Griffiths
- Department of Psychology, Princeton University, Princeton, NJ, USA; Department of Computer Science, Princeton University, Princeton, NJ, USA
| |
|
46
|
Zhu S, Lakshminarasimhan KJ, Arfaei N, Angelaki DE. Eye movements reveal spatiotemporal dynamics of visually-informed planning in navigation. eLife 2022; 11:73097. [PMID: 35503099 PMCID: PMC9135400 DOI: 10.7554/elife.73097] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/16/2021] [Accepted: 05/01/2022] [Indexed: 11/28/2022]
Abstract
Goal-oriented navigation is widely understood to depend upon internal maps. Although this may be the case in many settings, humans tend to rely on vision in complex, unfamiliar environments. To study the nature of gaze during visually-guided navigation, we tasked humans to navigate to transiently visible goals in virtual mazes of varying levels of difficulty, observing that they took near-optimal trajectories in all arenas. By analyzing participants’ eye movements, we gained insights into how they performed visually-informed planning. The spatial distribution of gaze revealed that environmental complexity mediated a striking trade-off in the extent to which attention was directed towards two complementary aspects of the world model: the reward location and task-relevant transitions. The temporal evolution of gaze revealed rapid, sequential prospection of the future path, evocative of neural replay. These findings suggest that the spatiotemporal characteristics of gaze during navigation are significantly shaped by the unique cognitive computations underlying real-world, sequential decision making.
Affiliation(s)
- Seren Zhu
- Center for Neural Science, New York University, New York, United States
| | - Nastaran Arfaei
- Department of Psychology, New York University, New York, United States
| | - Dora E Angelaki
- Center for Neural Science, New York University, New York, United States
| |
|
47
|
Dennison JB, Sazhin D, Smith DV. Decision neuroscience and neuroeconomics: Recent progress and ongoing challenges. Wiley Interdiscip Rev Cogn Sci 2022; 13:e1589. [PMID: 35137549 PMCID: PMC9124684 DOI: 10.1002/wcs.1589] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/05/2020] [Revised: 11/28/2021] [Accepted: 12/21/2021] [Indexed: 01/10/2023]
Abstract
In the past decade, decision neuroscience and neuroeconomics have developed many new insights in the study of decision making. This review provides an overarching update on how the field has advanced in this time period. Although our initial review a decade ago outlined several theoretical, conceptual, methodological, empirical, and practical challenges, there has only been limited progress in resolving these challenges. We summarize significant trends in decision neuroscience through the lens of the challenges outlined for the field and review examples where the field has had significant, direct, and applicable impacts across economics and psychology. First, we review progress on topics including reward learning, explore-exploit decisions, risk and ambiguity, intertemporal choice, and valuation. Next, we assess the impacts of emotion, social rewards, and social context on decision making. Then, we follow up with how individual differences impact choices and new exciting developments in the prediction and neuroforecasting of future decisions. Finally, we consider how trends in decision-neuroscience research reflect progress toward resolving past challenges, discuss new and exciting applications of recent research, and identify new challenges for the field. This article is categorized under: Psychology > Reasoning and Decision Making; Psychology > Emotion and Motivation.
Affiliation(s)
- Jeffrey B Dennison
- Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
| | - Daniel Sazhin
- Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
| | - David V Smith
- Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
| |
|
48
|
Stiso J, Lynn CW, Kahn AE, Rangarajan V, Szymula KP, Archer R, Revell A, Stein JM, Litt B, Davis KA, Lucas TH, Bassett DS. Neurophysiological Evidence for Cognitive Map Formation during Sequence Learning. eNeuro 2022; 9:ENEURO. [PMID: 35105662 DOI: 10.1523/ENEURO.0361-21.2022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/07/2021] [Revised: 12/03/2021] [Accepted: 01/03/2022] [Indexed: 12/29/2022]
Abstract
Humans deftly parse statistics from sequences. Some theories posit that humans learn these statistics by forming cognitive maps, or underlying representations of the latent space which links items in the sequence. Here, an item in the sequence is a node, and the probability of transitioning between two items is an edge. Sequences can then be generated from walks through the latent space, with different spaces giving rise to different sequence statistics. Individual or group differences in sequence learning can be modeled by changing the time scale over which estimates of transition probabilities are built, or in other words, by changing the amount of temporal discounting. Latent space models with temporal discounting bear a resemblance to models of navigation through Euclidean spaces. However, few explicit links have been made between predictions from Euclidean spatial navigation and neural activity during human sequence learning. Here, we use a combination of behavioral modeling and intracranial encephalography (iEEG) recordings to investigate how neural activity might support the formation of space-like cognitive maps through temporal discounting during sequence learning. Specifically, we acquire human reaction times from a sequential reaction time task, to which we fit a model that formulates the amount of temporal discounting as a single free parameter. From the parameter, we calculate each individual's estimate of the latent space. We find that neural activity reflects these estimates mostly in the temporal lobe, including areas involved in spatial navigation. Similar to spatial navigation, we find that low-dimensional representations of neural activity allow for easy separation of important features, such as modules, in the latent space. Lastly, we take advantage of the high temporal resolution of iEEG data to determine the time scale on which latent spaces are learned. 
We find that learning typically happens within the first 500 trials, and is modulated by the underlying latent space and the amount of temporal discounting characteristic of each participant. Ultimately, this work provides important links between behavioral models of sequence learning and neural activity during the same behavior, and contextualizes these results within a broader framework of domain general cognitive maps.
|
49
|
Sharp PB, Russek EM, Huys QJM, Dolan RJ, Eldar E. Humans perseverate on punishment avoidance goals in multigoal reinforcement learning. eLife 2022; 11:e74402. [PMID: 35199640 PMCID: PMC8912924 DOI: 10.7554/elife.74402] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/03/2021] [Accepted: 02/21/2022] [Indexed: 11/20/2022]
Abstract
Managing multiple goals is essential to adaptation, yet we are only beginning to understand computations by which we navigate the resource demands entailed in so doing. Here, we sought to elucidate how humans balance reward seeking and punishment avoidance goals, and relate this to variation in its expression within anxious individuals. To do so, we developed a novel multigoal pursuit task that includes trial-specific instructed goals to either pursue reward (without risk of punishment) or avoid punishment (without the opportunity for reward). We constructed a computational model of multigoal pursuit to quantify the degree to which participants could disengage from the pursuit goals when instructed to, as well as devote less model-based resources toward goals that were less abundant. In general, participants (n = 192) were less flexible in avoiding punishment than in pursuing reward. Thus, when instructed to pursue reward, participants often persisted in avoiding features that had previously been associated with punishment, even though at decision time these features were unambiguously benign. In a similar vein, participants showed no significant downregulation of avoidance when punishment avoidance goals were less abundant in the task. Importantly, we show preliminary evidence that individuals with chronic worry may have difficulty disengaging from punishment avoidance when instructed to seek reward. Taken together, the findings demonstrate that people avoid punishment less flexibly than they pursue reward. Future studies should test in larger samples whether a difficulty to disengage from punishment avoidance contributes to chronic worry.
Affiliation(s)
- Paul B Sharp
- The Hebrew University of Jerusalem, Jerusalem, Israel
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
| | - Evan M Russek
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
| | - Quentin JM Huys
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Division of Psychiatry, University College London, London, United Kingdom
| | - Raymond J Dolan
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
| | - Eran Eldar
- The Hebrew University of Jerusalem, Jerusalem, Israel
| |
|
50
|
Bashford L, Kobak D, Diedrichsen J, Mehring C. Motor skill learning decreases movement variability and increases planning horizon. J Neurophysiol 2022; 127:995-1006. [PMID: 35196180 DOI: 10.1152/jn.00631.2020] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/22/2022]
Abstract
We investigated motor skill learning using a path tracking task, where human subjects had to track various curved paths at a constant speed while maintaining the cursor within the path width. Subjects' accuracy increased with practice, even when tracking novel untrained paths. Using a "searchlight" paradigm, where only a short segment of the path ahead of the cursor was shown, we found that subjects with a higher tracking skill differed from the novice subjects in two respects. First, they had lower movement variability, in agreement with previous findings. Second, they took a longer section of the future path into account when performing the task, i.e. had a longer planning horizon. We estimate that between one third and one half of the performance increase in the expert group was due to the longer planning horizon. An optimal control model with a fixed horizon (receding horizon control) that increases with tracking skill quantitatively captured the subjects' movement behaviour. These findings demonstrate that human subjects not only increase their motor acuity but also their planning horizon when acquiring a motor skill.
Affiliation(s)
- Luke Bashford
- Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany; Imperial College London, London, United Kingdom; California Institute of Technology, Pasadena, CA, United States
| | - Dmitry Kobak
- Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany; Imperial College London, London, United Kingdom; Champalimaud Centre for the Unknown, Lisbon, Portugal; Institute for Ophthalmic Research, Tübingen University, Tübingen, Germany
| | - Jörn Diedrichsen
- Brain and Mind Institute & Department for Computer Science, University of Western Ontario, Ontario, Canada
| | - Carsten Mehring
- Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany; Imperial College London, London, United Kingdom
| |
|