1
Yang MA, Jung MW, Lee SW. Striatal arbitration between choice strategies guides few-shot adaptation. Nat Commun 2025; 16:1811. [PMID: 39979316 PMCID: PMC11842591 DOI: 10.1038/s41467-025-57049-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Accepted: 02/05/2025] [Indexed: 02/22/2025] [Open Access]
Abstract
Animals often exhibit rapid action changes in context-switching environments. This study hypothesized that, compared to the expected outcome, an unexpected outcome leads to distinctly different action-selection strategies to guide rapid adaptation. We designed behavioral measures differentiating between trial-by-trial dynamics after expected and unexpected events. In various reversal learning data with different rodent species and task complexities, conventional learning models failed to replicate the choice behavior following an unexpected outcome. This discrepancy was resolved by the proposed model with two different decision variables contingent on outcome expectation: the support-stay and conflict-shift bias. Electrophysiological data analyses revealed that striatal neurons encode our model's key variables. Furthermore, the inactivation of striatal direct and indirect pathways neutralizes the effect of past expected and unexpected outcomes, respectively, on the action-selection strategy following an unexpected outcome. Our study suggests unique roles of the striatum in arbitrating between different action selection strategies for few-shot adaptation.
Affiliation(s)
- Minsu Abel Yang
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
  - Program of Brain and Cognitive Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
- Min Whan Jung
  - Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon, Republic of Korea
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
- Sang Wan Lee
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
  - Program of Brain and Cognitive Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
  - Department of Brain & Cognitive Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
  - Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
  - Center for Neuroscience-inspired Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
  - Graduate School of Data Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
2
Zuo Z, Yang LZ, Wang H, Li H. Working Memory Guides Action Valuation in Model-based Decision-making Strategy. J Cogn Neurosci 2025; 37:86-96. [PMID: 39136553 DOI: 10.1162/jocn_a_02237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2024]
Abstract
Humans use both model-free (or habitual) and model-based (or goal-directed) strategies in sequential decision-making. Working memory (WM) is essential for the model-based strategy; however, its exact role in these processes remains elusive. This study investigates the influence of WM processes on decision-making and the underlying cognitive computing mechanisms. Specifically, we used experimental data from two-stage decision tasks and found that delay and load, two WM-specific variables, impact goal-revisiting behaviors. Then, we proposed possible computational mechanisms by which WM participates in information processing and integrated them into the model-based system. The proposed Hybrid-WM model reproduced the observed experimental effects and fit human behavior better than the classic hybrid reinforcement learning model. These results were verified with independent data sets. Furthermore, differences in model parameters explain the age-related difference in sequential decision-making. Overall, this study suggests that WM guides action valuation in model-based strategies, highlighting the contribution of higher cognitive functions to sequential decision-making.
Affiliation(s)
- Zhaoyu Zuo
  - Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences
  - University of Science and Technology of China
- Li-Zhuang Yang
  - Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences
  - University of Science and Technology of China
  - Hefei Cancer Hospital, Chinese Academy of Sciences
- Hongzhi Wang
  - Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences
  - University of Science and Technology of China
  - Hefei Cancer Hospital, Chinese Academy of Sciences
- Hai Li
  - Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences
  - University of Science and Technology of China
  - Hefei Cancer Hospital, Chinese Academy of Sciences
3
Rasanan AHH, Evans NJ, Fontanesi L, Manning C, Huang-Pollock C, Matzke D, Heathcote A, Rieskamp J, Speekenbrink M, Frank MJ, Palminteri S, Lucas CG, Busemeyer JR, Ratcliff R, Rad JA. Beyond discrete-choice options. Trends Cogn Sci 2024; 28:857-870. [PMID: 39138030 DOI: 10.1016/j.tics.2024.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 07/12/2024] [Accepted: 07/14/2024] [Indexed: 08/15/2024]
Abstract
While decision theories have evolved over the past five decades, their focus has largely been on choices among a limited number of discrete options, even though many real-world situations have a continuous-option space. Recently, theories have attempted to address decisions with continuous-option spaces, and several computational models have been proposed within the sequential sampling framework to explain how we make a decision in continuous-option space. This article aims to review the main attempts to understand decisions on continuous-option spaces, give an overview of applications of these types of decisions, and present puzzles to be addressed by future developments.
Affiliation(s)
- Nathan J Evans
  - School of Psychology, The University of Queensland, St Lucia, QLD 4072, Australia; Department of Psychology, Ludwig Maximilian University of Munich, Munich, Germany
- Laura Fontanesi
  - Department of Psychology, University of Basel, Missionsstrasse 62A, 4055, Basel, Switzerland
- Dora Matzke
  - Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Andrew Heathcote
  - Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands; School of Psychological Sciences, University of Newcastle, Newcastle, Australia
- Jörg Rieskamp
  - Department of Psychology, University of Basel, Missionsstrasse 62A, 4055, Basel, Switzerland
- Michael J Frank
  - Department of Cognitive, Linguistic, and Psychological Sciences and Carney Institute for Brain Science, Brown University, Providence, RI, USA
- Stefano Palminteri
  - Laboratoire de Neurosciences Cognitives Computationnelles, Institut National de la Santé et Recherche Médicale, Paris, France; Département d'Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, Paris, France
- Jerome R Busemeyer
  - Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
- Roger Ratcliff
  - The Ohio State University, 1835 Neil Avenue, Columbus, OH, 43210, USA
- Jamal Amani Rad
  - Choice Modelling Centre and Institute for Transport Studies, University of Leeds, Leeds LS2 9JT, UK
4
Gallo M, Hausladen CI, Hsu M, Jenkins AC, Ona V, Camerer CF. Perceived warmth and competence predict callback rates in meta-analyzed North American labor market experiments. PLoS One 2024; 19:e0304723. [PMID: 38985690 PMCID: PMC11236140 DOI: 10.1371/journal.pone.0304723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Accepted: 05/16/2024] [Indexed: 07/12/2024] [Open Access]
Abstract
Extensive literature probes labor market discrimination through correspondence studies in which researchers send pairs of resumes to employers, which are closely matched except for social signals such as gender or ethnicity. Upon perceiving these signals, individuals quickly activate associated stereotypes. The Stereotype Content Model (SCM; Fiske 2002) categorizes these stereotypes into two dimensions: warmth and competence. Our research integrates findings from correspondence studies with theories of social psychology, asking: Can discrimination between social groups, measured through employer callback disparities, be predicted by warmth and competence perceptions of social signals? We collect callback rates from 21 published correspondence studies, varying for 592 social signals. On those social signals, we collected warmth and competence perceptions from an independent group of online raters. We found that social perception predicts callback disparities for studies varying race and gender, which are indirectly signaled by names on these resumes. Yet, for studies adjusting other categories like sexuality and disability, the influence of social perception on callbacks is inconsistent. For instance, a more favorable perception of signals like parenthood does not consistently lead to increased callbacks, underscoring the necessity for further research. Our research offers pivotal strategies to address labor market discrimination in practice. Leveraging the warmth and competence framework allows for the predictive identification of bias against specific groups without extensive correspondence studies. By distilling hiring discrimination into these two dimensions, we not only facilitate the development of decision support systems for hiring managers but also equip computer scientists with a foundational framework for debiasing Large Language Models and other methods that are increasingly employed in hiring processes.
Affiliation(s)
- Marcos Gallo
  - Division of Humanities and Social Science, California Institute of Technology, Pasadena, CA, United States of America
- Carina I Hausladen
  - Division of Humanities and Social Science, California Institute of Technology, Pasadena, CA, United States of America
  - Computational Social Science, ETH Zurich, Zurich, Switzerland
- Ming Hsu
  - Haas School of Business, University of California, Berkeley, Berkeley, CA, United States of America
- Adrianna C Jenkins
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States of America
- Vaida Ona
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States of America
- Colin F Camerer
  - Division of Humanities and Social Science, California Institute of Technology, Pasadena, CA, United States of America
  - Computational and Neural Systems, California Institute of Technology, Pasadena, CA, United States of America
5
Chen F, Zheng J, Wang L, Krajbich I. Attribute latencies causally shape intertemporal decisions. Nat Commun 2024; 15:2948. [PMID: 38580626 PMCID: PMC10997753 DOI: 10.1038/s41467-024-46657-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2021] [Accepted: 03/05/2024] [Indexed: 04/07/2024] [Open Access]
Abstract
Intertemporal choices - decisions that play out over time - pervade our life. Thus, how people make intertemporal choices is a fundamental question. Here, we investigate the role of attribute latency (the time between when people start to process different attributes) in shaping intertemporal preferences using five experiments with choices between smaller-sooner and larger-later rewards. In the first experiment, we identify attribute latencies using mouse-trajectories and find that they predict individual differences in choices, response times, and changes across time constraints. In the other four experiments we test the causal link from attribute latencies to choice, staggering the display of the attributes. This changes attribute latencies and intertemporal preferences. Displaying the amount information first makes people more patient, while displaying time information first does the opposite. These findings highlight the importance of intra-choice dynamics in shaping intertemporal choices and suggest that manipulating attribute latency may be a useful technique for nudging.
Affiliation(s)
- Fadong Chen
  - School of Management, Zhejiang University, Hangzhou, 310058, China
  - Neuromanagement Laboratory, Zhejiang University, Hangzhou, 310058, China
  - The State Key Laboratory of Brain-Machine Intelligence, Zhejiang University, Hangzhou, 310058, China
- Jiehui Zheng
  - Alibaba Business School, Hangzhou Normal University, Hangzhou, 311121, China
- Lei Wang
  - School of Management, Zhejiang University, Hangzhou, 310058, China
  - Neuromanagement Laboratory, Zhejiang University, Hangzhou, 310058, China
  - The State Key Laboratory of Brain-Machine Intelligence, Zhejiang University, Hangzhou, 310058, China
- Ian Krajbich
  - Department of Psychology, University of California Los Angeles, Los Angeles, CA, USA
6
Donegan KR, Brown VM, Price RB, Gallagher E, Pringle A, Hanlon AK, Gillan CM. Using smartphones to optimise and scale-up the assessment of model-based planning. Communications Psychology 2023; 1:31. [PMID: 39242869 PMCID: PMC11332031 DOI: 10.1038/s44271-023-00031-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/06/2023] [Accepted: 10/05/2023] [Indexed: 09/09/2024]
Abstract
Model-based planning is thought to protect against over-reliance on habits. It is reduced in individuals high in compulsivity, but effect sizes are small and may depend on subtle features of the tasks used to assess it. We developed a diamond-shooting smartphone game that measures model-based planning in an at-home setting, and varied the game's structure within and across participants to assess how it affects measurement reliability and validity with respect to previously established correlates of model-based planning, with a focus on compulsivity. Increasing the number of trials used to estimate model-based planning did remarkably little to affect the association with compulsivity, because the greatest signal was in earlier trials. Associations with compulsivity were higher when transition ratios were less deterministic and depending on the reward drift utilised. These findings suggest that model-based planning can be measured at home via an app, can be estimated in relatively few trials using certain design features, and can be optimised for sensitivity to compulsive symptoms in the general population.
Affiliation(s)
- Kelly R Donegan
  - School of Psychology, Trinity College Dublin, Dublin, Ireland
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
- Vanessa M Brown
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, USA
- Rebecca B Price
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, USA
- Eoghan Gallagher
  - School of Psychology, Trinity College Dublin, Dublin, Ireland
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
- Andrew Pringle
  - School of Psychology, Trinity College Dublin, Dublin, Ireland
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
- Anna K Hanlon
  - School of Psychology, Trinity College Dublin, Dublin, Ireland
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
- Claire M Gillan
  - School of Psychology, Trinity College Dublin, Dublin, Ireland
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
  - Global Brain Health Institute, Trinity College Dublin, Dublin, Ireland
7
Sharp PB, Dolan RJ, Eldar E. Disrupted state transition learning as a computational marker of compulsivity. Psychol Med 2023; 53:2095-2105. [PMID: 37310326 PMCID: PMC10106291 DOI: 10.1017/s0033291721003846] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/17/2020] [Revised: 08/28/2021] [Accepted: 09/02/2021] [Indexed: 11/07/2022]
Abstract
BACKGROUND: Disorders involving compulsivity, fear, and anxiety are linked to beliefs that the world is less predictable. We lack a mechanistic explanation for how such beliefs arise. Here, we test a hypothesis that in people with compulsivity, fear, and anxiety, learning a probabilistic mapping between actions and environmental states is compromised. METHODS: In Study 1 (n = 174), we designed a novel online task that isolated state transition learning from other facets of learning and planning. To determine whether this impairment is due to learning that is too fast or too slow, we estimated state transition learning rates by fitting computational models to two independent datasets, which tested learning in environments in which state transitions were either stable (Study 2: n = 1413) or changing (Study 3: n = 192). RESULTS: Study 1 established that individuals with higher levels of compulsivity are more likely to demonstrate an impairment in state transition learning. Preliminary evidence here linked this impairment to a common factor comprising compulsivity and fear. Studies 2 and 3 showed that compulsivity is associated with learning that is too fast when it should be slow (i.e. when state transitions are stable) and too slow when it should be fast (i.e. when state transitions change). CONCLUSIONS: Together, these findings indicate that compulsivity is associated with a dysregulation of state transition learning, wherein the rate of learning is not well adapted to the task environment. Thus, dysregulated state transition learning might provide a key target for therapeutic intervention in compulsivity.
Affiliation(s)
- Paul B. Sharp
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London, UK
  - The Hebrew University of Jerusalem, Jerusalem, Israel
- Raymond J. Dolan
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London, UK
- Eran Eldar
  - The Hebrew University of Jerusalem, Jerusalem, Israel
8
Castro-Rodrigues P, Akam T, Snorasson I, Camacho M, Paixão V, Maia A, Barahona-Corrêa JB, Dayan P, Simpson HB, Costa RM, Oliveira-Maia AJ. Explicit knowledge of task structure is a primary determinant of human model-based action. Nat Hum Behav 2022; 6:1126-1141. [PMID: 35589826 DOI: 10.1038/s41562-022-01346-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/11/2020] [Revised: 03/19/2022] [Accepted: 03/31/2022] [Indexed: 11/09/2022]
Abstract
Explicit information obtained through instruction profoundly shapes human choice behaviour. However, this has been studied in computationally simple tasks, and it is unknown how model-based and model-free systems, respectively generating goal-directed and habitual actions, are affected by the absence or presence of instructions. We assessed behaviour in a variant of a computationally more complex decision-making task, before and after providing information about task structure, both in healthy volunteers and in individuals suffering from obsessive-compulsive or other disorders. Initial behaviour was model-free, with rewards directly reinforcing preceding actions. Model-based control, employing predictions of states resulting from each action, emerged with experience in a minority of participants, and less in those with obsessive-compulsive disorder. Providing task structure information strongly increased model-based control, similarly across all groups. Thus, in humans, explicit task structural knowledge is a primary determinant of model-based reinforcement learning and is most readily acquired from instruction rather than experience.
Affiliation(s)
- Pedro Castro-Rodrigues
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal; Centro Hospitalar Psiquiátrico de Lisboa, Lisbon, Portugal
- Thomas Akam
  - Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; Department of Experimental Psychology, University of Oxford, Oxford, UK
- Ivar Snorasson
  - Center for Obsessive-Compulsive & Related Disorders, New York State Psychiatric Institute, New York, NY, USA
- Marta Camacho
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; John Van Geest Center for Brain Repair, University of Cambridge, Cambridge, UK
- Vitor Paixão
  - Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal
- Ana Maia
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal; Department of Psychiatry and Mental Health, Centro Hospitalar de Lisboa Ocidental, Lisbon, Portugal
- J Bernardo Barahona-Corrêa
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal
- Peter Dayan
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany; The University of Tübingen, Tübingen, Germany
- H Blair Simpson
  - Center for Obsessive-Compulsive & Related Disorders, New York State Psychiatric Institute, New York, NY, USA; Department of Psychiatry, Columbia University, New York, NY, USA
- Rui M Costa
  - Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Albino J Oliveira-Maia
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal
9
Rmus M, Ritz H, Hunter LE, Bornstein AM, Shenhav A. Humans can navigate complex graph structures acquired during latent learning. Cognition 2022; 225:105103. [PMID: 35364400 PMCID: PMC9201735 DOI: 10.1016/j.cognition.2022.105103] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/08/2021] [Revised: 03/09/2022] [Accepted: 03/20/2022] [Indexed: 11/03/2022]
Abstract
Humans appear to represent many forms of knowledge in associative networks whose nodes are multiply connected, including sensory, spatial, and semantic. Recent work has shown that explicitly augmenting artificial agents with such graph-structured representations endows them with more human-like capabilities of compositionality and transfer learning. An open question is how humans acquire these representations. Previously, it has been shown that humans can learn to navigate graph-structured conceptual spaces on the basis of direct experience with trajectories that intentionally draw the network contours (Schapiro, Kustner, & Turk-Browne, 2012; Schapiro, Turk-Browne, Botvinick, & Norman, 2016), or through direct experience with rewards that covary with the underlying associative distance (Wu, Schulz, Speekenbrink, Nelson, & Meder, 2018). Here, we provide initial evidence that this capability is more general, extending to learning to reason about shortest-path distances across a graph structure acquired across disjoint experiences with randomized edges of the graph - a form of latent learning. In other words, we show that humans can infer graph structures, assembling them from disordered experiences. We further show that the degree to which individuals learn to reason correctly and with reference to the structure of the graph corresponds to their propensity, in a separate task, to use model-based reinforcement learning to achieve rewards. This connection suggests that the correct acquisition of graph-structured relationships is a central ability underlying forward planning and reasoning, and may be a core computation across the many domains in which graph-based reasoning is advantageous.
Affiliation(s)
- Milena Rmus
  - Department of Psychology, University of California, Berkeley, USA
- Harrison Ritz
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, USA
- Aaron M Bornstein
  - Department of Cognitive Sciences, University of California, Irvine, USA; Center for the Neurobiology of Learning and Memory, University of California, Irvine, USA
- Amitai Shenhav
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, USA; Carney Institute for Brain Science, Brown University, USA
10
Sharp PB, Russek EM, Huys QJM, Dolan RJ, Eldar E. Humans perseverate on punishment avoidance goals in multigoal reinforcement learning. eLife 2022; 11:e74402. [PMID: 35199640 PMCID: PMC8912924 DOI: 10.7554/elife.74402] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/03/2021] [Accepted: 02/21/2022] [Indexed: 11/20/2022] [Open Access]
Abstract
Managing multiple goals is essential to adaptation, yet we are only beginning to understand computations by which we navigate the resource demands entailed in so doing. Here, we sought to elucidate how humans balance reward seeking and punishment avoidance goals, and relate this to variation in its expression within anxious individuals. To do so, we developed a novel multigoal pursuit task that includes trial-specific instructed goals to either pursue reward (without risk of punishment) or avoid punishment (without the opportunity for reward). We constructed a computational model of multigoal pursuit to quantify the degree to which participants could disengage from the pursuit goals when instructed to, as well as devote less model-based resources toward goals that were less abundant. In general, participants (n = 192) were less flexible in avoiding punishment than in pursuing reward. Thus, when instructed to pursue reward, participants often persisted in avoiding features that had previously been associated with punishment, even though at decision time these features were unambiguously benign. In a similar vein, participants showed no significant downregulation of avoidance when punishment avoidance goals were less abundant in the task. Importantly, we show preliminary evidence that individuals with chronic worry may have difficulty disengaging from punishment avoidance when instructed to seek reward. Taken together, the findings demonstrate that people avoid punishment less flexibly than they pursue reward. Future studies should test in larger samples whether a difficulty to disengage from punishment avoidance contributes to chronic worry.
Affiliation(s)
- Paul B Sharp
  - The Hebrew University of Jerusalem, Jerusalem, Israel
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Evan M Russek
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Quentin JM Huys
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - Division of Psychiatry, University College London, London, United Kingdom
- Raymond J Dolan
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Eran Eldar
  - The Hebrew University of Jerusalem, Jerusalem, Israel
11
Konovalov A, Ruff CC. Enhancing models of social and strategic decision making with process tracing and neural data. Wiley Interdisciplinary Reviews: Cognitive Science 2021; 13:e1559. [PMID: 33880846 DOI: 10.1002/wcs.1559] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/06/2020] [Revised: 02/26/2021] [Accepted: 03/24/2021] [Indexed: 11/11/2022]
Abstract
Every decision we take is accompanied by a characteristic pattern of response delay, gaze position, pupil dilation, and neural activity. Nevertheless, many models of social decision making neglect the corresponding process tracing data and focus exclusively on the final choice outcome. Here, we argue that this is a mistake, as the use of process data can help to build better models of human behavior, create better experiments, and improve policy interventions. Specifically, such data allow us to unlock the "black box" of the decision process and evaluate the mechanisms underlying our social choices. Using these data, we can directly validate latent model variables, arbitrate between competing personal motives, and capture information processing strategies. These benefits are especially valuable in social science, where models must predict multi-faceted decisions that are taken in varying contexts and are based on many different types of information. This article is categorized under: Economics > Interactive Decision-Making; Neuroscience > Cognition; Psychology > Reasoning and Decision Making.
Affiliation(s)
- Arkady Konovalov
  - Department of Economics, Zurich Center for Neuroeconomics (ZNE), University of Zurich
- Christian C Ruff
  - Department of Economics, Zurich Center for Neuroeconomics (ZNE), University of Zurich
12
Akam T, Rodrigues-Vaz I, Marcelo I, Zhang X, Pereira M, Oliveira RF, Dayan P, Costa RM. The Anterior Cingulate Cortex Predicts Future States to Mediate Model-Based Action Selection. Neuron 2021; 109:149-163.e7. [PMID: 33152266 PMCID: PMC7837117 DOI: 10.1016/j.neuron.2020.10.013] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Received: 05/22/2020] [Revised: 09/01/2020] [Accepted: 10/09/2020] [Indexed: 01/19/2023]
Abstract
Behavioral control is not unitary. It comprises parallel systems, model based and model free, that respectively generate flexible and habitual behaviors. Model-based decisions use predictions of the specific consequences of actions, but how these are implemented in the brain is poorly understood. We used calcium imaging and optogenetics in a sequential decision task for mice to show that the anterior cingulate cortex (ACC) predicts the state that actions will lead to, not simply whether they are good or bad, and monitors whether outcomes match these predictions. ACC represents the complete state space of the task, with reward signals that depend strongly on the state where reward is obtained but minimally on the preceding choice. Accordingly, ACC is necessary only for updating model-based strategies, not for basic reward-driven action reinforcement. These results reveal that ACC is a critical node in model-based control, with a specific role in predicting future states given chosen actions.
Affiliation(s)
- Thomas Akam
- Champalimaud Neuroscience Program, Champalimaud Centre for the Unknown, Lisbon, Portugal; Department of Experimental Psychology, Oxford University, Oxford, UK.
| | - Ines Rodrigues-Vaz
- Champalimaud Neuroscience Program, Champalimaud Centre for the Unknown, Lisbon, Portugal; Department of Neuroscience and Neurology, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
| | - Ivo Marcelo
- Champalimaud Neuroscience Program, Champalimaud Centre for the Unknown, Lisbon, Portugal; Department of Psychiatry, Erasmus MC University Medical Center, 3015 GD Rotterdam, the Netherlands
| | - Xiangyu Zhang
- RIKEN-MIT Center for Neural Circuit Genetics at the Picower Institute for Learning and Memory, Department of Biology and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Michael Pereira
- Champalimaud Neuroscience Program, Champalimaud Centre for the Unknown, Lisbon, Portugal
| | - Peter Dayan
- Gatsby Computational Neuroscience Unit, University College London, London, UK; Max Planck Institute for Biological Cybernetics, Tübingen, Germany; University of Tübingen, Tübingen, Germany
| | - Rui M Costa
- Champalimaud Neuroscience Program, Champalimaud Centre for the Unknown, Lisbon, Portugal; Department of Neuroscience and Neurology, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
|
13
|
Using dynamic monitoring of choices to predict and understand risk preferences. Proc Natl Acad Sci U S A 2020; 117:31738-31747. [PMID: 33234567 DOI: 10.1073/pnas.2010056117] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Abstract
Navigating conflict is integral to decision-making, serving a central role both in the subjective experience of choice as well as contemporary theories of how we choose. However, the lack of a sensitive, accessible, and interpretable metric of conflict has led researchers to focus on choice itself rather than how individuals arrive at that choice. Using mouse tracking (continuously sampling computer mouse location as participants decide), we demonstrate the theoretical and practical uses of dynamic assessments of choice from decision onset through conclusion. Specifically, we use mouse tracking to index conflict, quantified by the relative directness to the chosen option, in a domain for which conflict is integral: decisions involving risk. In deciding whether to accept risk, decision makers must integrate gains, losses, status quos, and outcome probabilities, a process that inevitably involves conflict. Across three preregistered studies, we tracked participants' motor movements while they decided whether to accept or reject gambles. Our results show that 1) mouse-tracking metrics of conflict sensitively detect differences in the subjective value of risky versus certain options; 2) these metrics of conflict strongly predict participants' risk preferences (loss aversion and decreasing marginal utility), even on a single-trial level; 3) these mouse-tracking metrics outperform participants' reaction times in predicting risk preferences; and 4) manipulating risk preferences via a broad versus narrow bracketing manipulation influences conflict as indexed by mouse tracking. Together, these results highlight the importance of measuring conflict during risky choice and demonstrate the usefulness of mouse tracking as a tool to do so.
|
14
|
Collins AGE, Cockburn J. Beyond dichotomies in reinforcement learning. Nat Rev Neurosci 2020; 21:576-586. [PMID: 32873936 DOI: 10.1038/s41583-020-0355-6] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Accepted: 07/20/2020] [Indexed: 11/09/2022]
Abstract
Reinforcement learning (RL) is a framework of particular importance to psychology, neuroscience and machine learning. Interactions between these fields, as promoted through the common hub of RL, have facilitated paradigm shifts that relate multiple levels of analysis in a singular framework (for example, relating dopamine function to a computationally defined RL signal). Recently, more sophisticated RL algorithms have been proposed to better account for human learning, and in particular its oft-documented reliance on two separable systems: a model-based (MB) system and a model-free (MF) system. However, along with many benefits, this dichotomous lens can distort questions, and may contribute to an unnecessarily narrow perspective on learning and decision-making. Here, we outline some of the consequences that come from overconfidently mapping algorithms, such as MB versus MF RL, onto putative cognitive processes. We argue that the field is well positioned to move beyond simplistic dichotomies, and we propose a means of refocusing research questions towards the rich and complex components that comprise learning and decision-making.
Affiliation(s)
- Anne G E Collins
- Department of Psychology and the Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA.
| | - Jeffrey Cockburn
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
|