1
Shani-Narkiss H, Eitam B, Amsalem O. Using an algorithmic approach to shape human decision-making through attraction to patterns. Nat Commun 2025; 16:4110. [PMID: 40316528 PMCID: PMC12048589 DOI: 10.1038/s41467-025-59131-4]
Abstract
Evidence suggests that people are attracted to patterns and regularity. We hypothesized that decision-makers, intending to maximize profit, may be lured by the existence of regularity, even when it does not confer any additional value. An algorithm based on this premise outperformed all other contenders in an international challenge to bias individuals' preferences. To create the bias, the algorithm allocates rewards in an evolving, yet easily trackable, pattern to one option but not the other. This leads decision-makers to prefer the regular option over the other 2:1, even though this preference proves to be relatively disadvantageous. The results support the idea that humans assign value to regularity and, more generally, demonstrate the utility of qualitative approaches to human decision-making. They also suggest that models of decision-making based solely on reward learning may be incomplete.
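The allocation algorithm is described above only at a conceptual level. As a purely hypothetical toy (not the authors' actual schedule), the sketch below assigns rewards to a "regular" option on a slowly shifting but trackable pattern while an "irregular" option is rewarded at the same overall rate at random; the block length and drift rule are invented for illustration.

```python
import random

def patterned_schedule(n_trials, block=5):
    """Toy reward schedule: the 'regular' option pays on a simple evolving
    pattern (the rewarded slot shifts by one every block of trials), while
    the 'irregular' option pays at the same overall rate at random."""
    regular, irregular = [], []
    for t in range(n_trials):
        slot = t % block
        target_slot = (t // block) % block          # pattern slowly drifts
        regular.append(1 if slot == target_slot else 0)
        irregular.append(1 if random.random() < 1.0 / block else 0)
    return regular, irregular

reg, irr = patterned_schedule(50)
print("regular:  ", reg[:25])
print("irregular:", irr[:25])
```

Both options deliver the same expected reward (1/block per trial); only the regularity differs, which is the property such an algorithm would exploit.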
Affiliation(s)
- Haran Shani-Narkiss
- UCL Sainsbury Wellcome Centre for Neural Circuits and Behaviour, London, W1T 4JG, UK.
- Baruch Eitam
- School of Psychological Sciences, University of Haifa, Mount Carmel, Haifa, Israel.
- Oren Amsalem
- Division of Endocrinology, Diabetes and Metabolism, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, 02215, USA.
2
Jin F, Li M, Yang L, Yang L, Shang Z. Exploring value learning in pigeons: the role of dual pathways in the basal ganglia and synaptic plasticity. J Exp Biol 2025; 228:jeb249507. [PMID: 40241515 DOI: 10.1242/jeb.249507]
Abstract
Understanding value learning in animals is a key focus in cognitive neuroscience. Current models used in research are often simple, and while more complex models have been proposed, it remains unclear which assumptions align with actual value-learning strategies of animals. This study investigated the computational mechanisms behind value learning in pigeons using a free-choice task. Three models were constructed based on different assumptions about the role of the basal ganglia's dual pathways and synaptic plasticity in value computation, followed by model comparison and neural correlation analysis. Among the three models tested, the dual-pathway reinforcement learning model with Hebbian rules most closely matched the pigeons' behavior. Furthermore, the striatal gamma band connectivity showed the highest correlation with the values estimated by this model. Additionally, enhanced beta band connectivity in the nidopallium caudolaterale supported value learning. This study provides valuable insights into reinforcement learning mechanisms in non-human animals.
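For readers unfamiliar with the model class, below is a minimal sketch of a dual-pathway actor in which "direct" and "indirect" pathway weights are pushed in opposite directions by the prediction error through a reward-modulated, Hebbian-style rule. The equations, parameter values, and task statistics are assumptions for illustration, not the authors' pigeon model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, alpha = 2, 0.1
w_direct = np.zeros(n_actions)      # "Go" pathway weights
w_indirect = np.zeros(n_actions)    # "NoGo" pathway weights
p_reward = np.array([0.8, 0.2])     # latent reward probabilities

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for t in range(500):
    net = w_direct - w_indirect                 # net action propensity
    a = rng.choice(n_actions, p=softmax(net))
    r = float(rng.random() < p_reward[a])
    delta = r - net[a]                          # prediction error
    # reward-modulated Hebbian-style updates with opposite signs
    w_direct[a] += alpha * max(delta, 0.0)
    w_indirect[a] += alpha * max(-delta, 0.0)

print(w_direct - w_indirect)   # should favour the 80% option
```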
Affiliation(s)
- Fuli Jin
- Zhengzhou University, School of Electrical and Information Engineering, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Mengmeng Li
- Zhengzhou University, School of Electrical and Information Engineering, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Long Yang
- Zhengzhou University, School of Electrical and Information Engineering, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Lifang Yang
- Zhengzhou University, School of Electrical and Information Engineering, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Zhigang Shang
- Zhengzhou University, School of Electrical and Information Engineering, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
3
Scholten M, Sanborn A, He L, Read D. Delay preference in intertemporal choice: Sooner or later OR faster or slower? Cogn Psychol 2025; 158:101732. [PMID: 40334377 DOI: 10.1016/j.cogpsych.2025.101732]
Abstract
Intertemporal choices are conventionally conceived as decisions about whether to be better off sooner or later. As a reflection of this, most experimental research on the topic has been restricted to choices between single-dated outcomes: One sooner, the other later. Even these decisions, however, can be conceived in a different way: As choices between an option that accumulates faster to its total outcome, and an option that accumulates more slowly to its total outcome. To empirically distinguish between these two interpretations, the experimental design must include options with multiple-dated outcomes, that is, outcome sequences. We report an experiment that includes choices involving outcome sequences as well as choices between single-dated outcomes, where the outcomes are monetary losses, or payments. This design allows us to evaluate a sooner-or-later model and a faster-or-slower model on their ability to predict single-payment choices once calibrated on payment-sequence choices (model generalizability). Moreover, people differ considerably in their preferences for the timing of losses, which we turn to our advantage by evaluating the models on their ability to associate preferences for the timing of multiple payments, as inferred from payment-sequence choices, with preferences for the timing of a single payment, as observed in single-payment choices (parameter generalizability). For that purpose, we develop the classic criteria of convergent validity and discriminant validity in the assessment of construct validity as criteria in the assessment of model validity. The results of a fully Bayesian analysis strongly favored the faster-or-slower model over the sooner-or-later model.
Affiliation(s)
- Marc Scholten
- Faculdade de Design, Tecnologia e Comunicação, Universidade Europeia, Portugal; Unidade de Investigação em Design e Comunicação, Universidade Europeia, Portugal; Centro de Estudos em Gestão do Instituto Superior Técnico, Instituto Superior Técnico, Universidade de Lisboa, Portugal.
- Adam Sanborn
- Department of Psychology, University of Warwick, United Kingdom
- Lisheng He
- SILC Business School, Shanghai University, China
- Daniel Read
- Warwick Business School, University of Warwick, United Kingdom
4
Schmitt O. Relationships and representations of brain structures, connectivity, dynamics and functions. Prog Neuropsychopharmacol Biol Psychiatry 2025; 138:111332. [PMID: 40147809 DOI: 10.1016/j.pnpbp.2025.111332]
Abstract
The review explores the complex interplay between brain structures and their associated functions, presenting a diversity of hierarchical models that enhances our understanding of these relationships. Central to this approach are structure-function flow diagrams, which offer a visual representation of how specific neuroanatomical structures are linked to their functional roles. These diagrams are instrumental in mapping the intricate connections between different brain regions, providing a clearer understanding of how functions emerge from the underlying neural architecture. The study details innovative attempts to develop new functional hierarchies that integrate structural and functional data. These efforts leverage recent advancements in neuroimaging techniques such as fMRI, EEG, MEG, and PET, as well as computational models that simulate neural dynamics. By combining these approaches, the study seeks to create a more refined and dynamic hierarchy that can accommodate the brain's complexity, including its capacity for plasticity and adaptation. A significant focus is placed on the overlap of structures and functions within the brain. The manuscript acknowledges that many brain regions are multifunctional, contributing to different cognitive and behavioral processes depending on the context. This overlap highlights the need for a flexible, non-linear hierarchy that can capture the brain's intricate functional landscape. Moreover, the study examines the interdependence of these functions, emphasizing how the loss or impairment of one function can impact others. Another crucial aspect discussed is the brain's ability to compensate for functional deficits following neurological diseases or injuries. The investigation explores how the brain reorganizes itself, often through the recruitment of alternative neural pathways or the enhancement of existing ones, to maintain functionality despite structural damage. This compensatory mechanism underscores the brain's remarkable plasticity, demonstrating its ability to adapt and reconfigure itself in response to injury, thereby ensuring the continuation of essential functions. In conclusion, the study presents a system of brain functions that integrates structural, functional, and dynamic perspectives. It offers a robust framework for understanding how the brain's complex network of structures supports a wide range of cognitive and behavioral functions, with significant implications for both basic neuroscience and clinical applications.
Affiliation(s)
- Oliver Schmitt
- Medical School Hamburg - University of Applied Sciences and Medical University - Institute for Systems Medicine, Am Kaiserkai 1, Hamburg 20457, Germany; University of Rostock, Department of Anatomy, Gertrudenstr. 9, 18055 Rostock, Germany.
5
Vasta N, Xu S, Verguts T, Braem S. A shared temporal window of integration across cognitive control and reinforcement learning paradigms: A correlational study. Mem Cognit 2025; 53:1008-1021. [PMID: 39198341 DOI: 10.3758/s13421-024-01626-4]
Abstract
Cognitive control refers to the ability to override prepotent response tendencies to achieve goal-directed behavior. On the other hand, reinforcement learning refers to the learning of actions through feedback and reward. Although cognitive control and reinforcement learning are often viewed as opposing forces in driving behavior, recent theories have emphasized possible similarities in their underlying processes. With this study, we aimed to investigate whether a similar time window of integration could be observed during the learning of control on the one hand, and the learning rate in reinforcement learning paradigms on the other. To this end, we performed a correlational analysis on a large public dataset (n = 522) including data from two reinforcement learning tasks, i.e., a probabilistic selection task and a probabilistic Wisconsin Card Sorting Task (WCST), and data from a classic conflict task (i.e., the Stroop task). Results showed expected correlations between the time scale of control indices and learning rate in the probabilistic WCST. Moreover, the learning-rate parameters of the two reinforcement learning tasks did not correlate with each other. Together, these findings suggest a reliance on a shared learning mechanism between these two traditionally distinct domains, while at the same time emphasizing that value updating processes can still be very task-specific. We speculate that updating processes in the Stroop and WCST may be more related because both tasks require task-specific updating of stimulus features (e.g., color, word meaning, pattern, shape), as opposed to stimulus identity.
Affiliation(s)
- Nicola Vasta
- Department of Psychology and Cognitive Science, University of Trento, Corso Bettini, 31, 38068, Rovereto, TN, Italy.
- Shengjie Xu
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Tom Verguts
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Senne Braem
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
6
Pauli R, Brazil I, Kohls G, Hauser TU, Gistelinck L, Dikeos D, Dochnal R, Fairchild G, Fernández-Rivas A, Herpertz-Dahlmann B, Hervas A, Konrad K, Popma A, Stadler C, Freitag CM, De Brito SA, Lockwood PL. Conduct disorder is associated with heightened action initiation and reduced learning from punishment but not reward. Biol Psychiatry 2025:S0006-3223(25)01051-0. [PMID: 40090563 DOI: 10.1016/j.biopsych.2025.03.005]
Abstract
BACKGROUND: Theoretical and empirical accounts of conduct disorder (CD) suggest problems with reinforcement learning as well as heightened impulsivity. These two facets can manifest in similar behaviour, such as risk-taking. Computational models that can dissociate learning from impulsively initiating actions are essential for understanding the cognitive mechanisms underlying CD.
METHODS: A large, international sample of youths from 11 European countries (N = 1418, typically developing (TD) n = 742, CD n = 676) completed a learning task. We used computational modelling to disentangle reward and punishment learning from action initiation.
RESULTS: Punishment learning rates were significantly reduced in youths with CD compared to their TD peers, suggesting that they did not update their actions based on punishment outcomes as strongly. Intriguingly, those with CD also had a greater tendency to initiate actions regardless of outcomes, although their ability to learn from reward was comparable to their TD peers. We additionally observed that variability in action initiation correlated with self-reported impulsivity in youths with CD.
CONCLUSIONS: These findings provide empirical support for a reduced ability to learn from punishment in CD, while reward learning is typical. Our results also suggest that behaviours appearing superficially to reflect reward learning differences could reflect heightened impulsive action initiation instead. Such asymmetric learning from reward and punishment, with increased action initiation, could have important implications for tailoring learning-based interventions to help those with CD.
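A simplified sketch of the model family described above, with separate reward and punishment learning rates plus a constant bias toward initiating an action; the go/no-go framing, parameter names, and outcome probabilities are assumptions, not the authors' exact specification.

```python
import numpy as np

def simulate(alpha_reward, alpha_punish, action_bias, beta, n_trials=200, seed=1):
    """Learner chooses to act (go) or not; acting yields reward or punishment
    probabilistically. action_bias shifts the propensity to act regardless of value."""
    rng = np.random.default_rng(seed)
    q_go, q_nogo = 0.0, 0.0          # value of not acting is kept fixed at 0 here
    choices = []
    for _ in range(n_trials):
        p_go = 1.0 / (1.0 + np.exp(-beta * (q_go + action_bias - q_nogo)))
        go = rng.random() < p_go
        choices.append(int(go))
        if go:
            outcome = 1.0 if rng.random() < 0.5 else -1.0   # reward or punishment
            delta = outcome - q_go
            lr = alpha_reward if delta > 0 else alpha_punish
            q_go += lr * delta
    return np.mean(choices)          # proportion of initiated actions

print(simulate(alpha_reward=0.3, alpha_punish=0.05, action_bias=0.5, beta=3.0))
```

In this kind of model, a low alpha_punish together with a positive action_bias can mimic risk-taking behaviour even when alpha_reward is typical, which is the dissociation the abstract emphasizes.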
Affiliation(s)
- Ruth Pauli
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK.
- Inti Brazil
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Gregor Kohls
- Department of Child and Adolescent Psychiatry, Faculty of Medicine, TU Dresden, Germany
- Tobias U Hauser
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK; Wellcome Centre for Human Neuroimaging, University College London, London, UK; Department of Psychiatry and Psychotherapy, Medical School and University Hospital, Eberhard Karls University of Tübingen, Tübingen, Germany; German Center for Mental Health (DZPG), Tübingen, Germany
- Lisa Gistelinck
- Center for Developmental Psychiatry, Katholieke Universiteit Leuven, Leuven, Belgium
- Dimitris Dikeos
- Department of Psychiatry, Medical School, National and Kapodistrian University of Athens, Athens, Greece
- Roberta Dochnal
- Faculty of Medicine, Child and Adolescent Psychiatry, Department of the Child Health Center, Szeged University, Szeged, Hungary
- Graeme Fairchild
- Department of Psychology, University of Bath, Bath, United Kingdom
- Beate Herpertz-Dahlmann
- Child Neuropsychology Section, Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, RWTH Aachen University, Aachen, Germany
- Amaia Hervas
- University Hospital Mutua Terrassa, Barcelona, Spain
- Arne Popma
- Department of Child and Adolescent Psychiatry, VU University Medical Center, Amsterdam, Netherlands
- Christina Stadler
- Department of Child and Adolescent Psychiatry, Psychiatric University Hospital, University of Basel, Basel, Switzerland
- Christine M Freitag
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, University Hospital Frankfurt, Goethe University, Frankfurt am Main, Germany
- Stephane A De Brito
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK; Institute for Mental Health, School of Psychology, University of Birmingham, Birmingham, UK
- Patricia L Lockwood
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK; Institute for Mental Health, School of Psychology, University of Birmingham, Birmingham, UK; Department of Experimental Psychology, University of Oxford, Oxford, UK; Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK.
7
Bruckner R, Heekeren HR, Nassar MR. Understanding learning through uncertainty and bias. Commun Psychol 2025; 3:24. [PMID: 39948273 PMCID: PMC11825852 DOI: 10.1038/s44271-025-00203-y]
Abstract
Learning allows humans and other animals to make predictions about the environment that facilitate adaptive behavior. Casting learning as predictive inference can shed light on normative cognitive mechanisms that improve predictions under uncertainty. Drawing on normative learning models, we illustrate how learning should be adjusted to different sources of uncertainty, including perceptual uncertainty, risk, and uncertainty due to environmental changes. Such models explain many hallmarks of human learning in terms of specific statistical considerations that come into play when updating predictions under uncertainty. However, humans also display systematic learning biases that deviate from normative models, as studied in computational psychiatry. Some biases can be explained as normative inference conditioned on inaccurate prior assumptions about the environment, while others reflect approximations to Bayesian inference aimed at reducing cognitive demands. These biases offer insights into cognitive mechanisms underlying learning and how they might go awry in psychiatric illness.
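One normative scheme discussed in this literature is a delta rule whose learning rate increases with change-point probability and with relative uncertainty. The sketch below follows that general logic with invented hazard-rate and noise values and a deliberately crude uncertainty update; it is an illustration of the idea, not a specific model from the paper.

```python
import numpy as np
from scipy.stats import norm

hazard, sigma_noise = 0.1, 5.0        # assumed hazard rate and outcome noise
belief, run_var = 0.0, 100.0          # current prediction and its uncertainty
rng = np.random.default_rng(2)
outcomes = np.concatenate([rng.normal(10, sigma_noise, 60),
                           rng.normal(-20, sigma_noise, 60)])   # one change point

for x in outcomes:
    total_sd = np.sqrt(run_var + sigma_noise**2)
    p_stay = norm.pdf(x, belief, total_sd) * (1 - hazard)
    p_change = (1.0 / 100.0) * hazard          # flat density over an assumed +/-50 range
    cpp = p_change / (p_change + p_stay)       # change-point probability
    ru = run_var / (run_var + sigma_noise**2)  # relative uncertainty
    lr = cpp + (1 - cpp) * ru                  # uncertainty-dependent learning rate
    belief += lr * (x - belief)
    run_var = (1 - lr) * run_var + cpp * sigma_noise**2   # crude uncertainty update

print(round(belief, 1))   # tracks the new mean (about -20) after the change point
```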
Affiliation(s)
- Rasmus Bruckner
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany.
- Institute of Psychology, University of Hamburg, Hamburg, Germany.
- Hauke R Heekeren
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Executive University Board, University of Hamburg, Hamburg, Germany
- Matthew R Nassar
- Robert J. & Nancy D. Carney Institute for Brain Science, Brown University, Providence, RI, USA
- Department of Neuroscience, Brown University, Providence, RI, USA
8
Chase J, Li JJ, Lin WC, Tai LH, Castro F, Collins AGE, Wilbrecht L. Genetic changes linked to two different syndromic forms of autism enhance reinforcement learning in adolescent male but not female mice. bioRxiv [Preprint] 2025:2025.01.15.633099. [PMID: 39868311 PMCID: PMC11760717 DOI: 10.1101/2025.01.15.633099]
Abstract
Autism Spectrum Disorder (ASD) is characterized by restricted and repetitive behaviors and social differences, both of which may manifest, in part, from underlying differences in corticostriatal circuits and reinforcement learning. Here, we investigated reinforcement learning in mice with mutations in either Tsc2 or Shank3, both high-confidence ASD risk genes associated with major syndromic forms of ASD. Using an odor-based two-alternative forced choice (2AFC) task, we tested adolescent mice of both sexes and found male Tsc2 and Shank3B heterozygote (Het) mice showed enhanced learning performance compared to their wild type (WT) siblings. No gain of function was observed in females. Using a novel reinforcement learning (RL) based computational model to infer learning rate as well as policy-level task engagement and disengagement, we found that the gain of function in males was driven by an enhanced positive learning rate in both Tsc2 and Shank3B Het mice. The gain of function in Het males was absent when mice were trained with a probabilistic reward schedule. These findings in two ASD mouse models reveal a convergent learning phenotype that shows similar sensitivity to sex and environmental uncertainty. These data can inform our understanding of both strengths and challenges associated with autism, while providing further evidence that sex and experience of uncertainty modulate autism-related phenotypes.
Significance Statement: Reinforcement learning is a foundational form of learning that is widely used in behavioral interventions for autism. Here, we measured reinforcement learning in adolescent mice carrying genetic mutations linked to two different syndromic forms of autism. We found that males showed strengths in reinforcement learning compared to their wild type siblings, while females showed no differences. This gain of function in males was no longer observed when uncertainty was introduced into the reward schedule for correct choices. These findings support a model in which diverse genetic changes interact with sex to generate common phenotypes underlying autism. Our data further support the idea that autism risk genes may produce strengths as well as challenges in behavioral function.
Affiliation(s)
- Juliana Chase
- Department of Neuroscience, University of California, Berkeley, Berkeley, CA, 94720
- Jing-Jing Li
- Department of Neuroscience, University of California, Berkeley, Berkeley, CA, 94720
- Wan Chen Lin
- Department of Neuroscience, University of California, Berkeley, Berkeley, CA, 94720
- Lung-Hao Tai
- Department of Neuroscience, University of California, Berkeley, Berkeley, CA, 94720
- Fernanda Castro
- Current address: Cellular & Molecular Pharmacology, University of California, San Francisco, Mission Bay, CA 94143
- Department of Psychology, University of California, Berkeley, Berkeley, CA, 94720
- Anne GE Collins
- Department of Psychology, University of California, Berkeley, Berkeley, CA, 94720
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, 94720
- Linda Wilbrecht
- Department of Neuroscience, University of California, Berkeley, Berkeley, CA, 94720
- Department of Psychology, University of California, Berkeley, Berkeley, CA, 94720
9
Bussell JJ, Badman RP, Márton CD, Bromberg-Martin ES, Abbott L, Rajan K, Axel R. Representations of the intrinsic value of information in mouse orbitofrontal cortex. bioRxiv [Preprint] 2024:2023.10.13.562291. [PMID: 39416043 PMCID: PMC11482914 DOI: 10.1101/2023.10.13.562291]
Abstract
Animals are motivated to seek information that does not influence reward outcomes, suggesting that information has intrinsic value. We have developed an odor-based information seeking task that reveals that mice choose to receive information even though it does not alter the reward outcome. Moreover, mice are willing to pay for information by sacrificing water reward, suggesting that information is of intrinsic value to a mouse. We used a microendoscope to reveal neural activity in orbitofrontal cortex (OFC) while mice learned the information seeking task. We observed the emergence of distinct populations of neurons responsive to odors predictive of information and odors predictive of water reward. A latent variable model recapitulated these different representations in the low-dimensional dynamics of OFC neuronal population activity. These data suggest that mice have evolved separate pathways to represent the intrinsic value of information and the extrinsic value of water reward. Thus, the desire to acquire knowledge is observed in mice, and the value of this information is represented in the OFC. The mouse now provides a facile experimental system to study the representation of the value of information, a higher cognitive variable.
Affiliation(s)
- Jennifer J. Bussell
- Department of Neuroscience, Columbia University; New York, NY, 10027, USA
- Zuckerman Mind Brain and Behavior Institute, Columbia University; New York, NY, 10027, USA
- Ryan P. Badman
- Department of Neurobiology, Harvard Medical School; Boston, MA, 02115, USA
- Kempner Institute, Harvard University; Cambridge, MA, 02138, USA
- L.F. Abbott
- Department of Neuroscience, Columbia University; New York, NY, 10027, USA
- Zuckerman Mind Brain and Behavior Institute, Columbia University; New York, NY, 10027, USA
- Kanaka Rajan
- Department of Neurobiology, Harvard Medical School; Boston, MA, 02115, USA
- Kempner Institute, Harvard University; Cambridge, MA, 02138, USA
- Richard Axel
- Department of Neuroscience, Columbia University; New York, NY, 10027, USA
- Zuckerman Mind Brain and Behavior Institute, Columbia University; New York, NY, 10027, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, 20815, USA
10
Ohta H, Nozawa T, Nakano T, Morimoto Y, Ishizuka T. Nonlinear age-related differences in probabilistic learning in mice: A 5-armed bandit task study. Neurobiol Aging 2024; 142:8-16. [PMID: 39029360 DOI: 10.1016/j.neurobiolaging.2024.06.004]
Abstract
This study explores the impact of aging on reinforcement learning in mice, focusing on changes in learning rates and behavioral strategies. A 5-armed bandit task (5-ABT) and a computational Q-learning model were used to evaluate the positive and negative learning rates and the inverse temperature across three age groups (3, 12, and 18 months). Results showed a significant decline in the negative learning rate of 18-month-old mice, which was not observed for the positive learning rate. This suggests that older mice maintain the ability to learn from successful experiences while decreasing the ability to learn from negative outcomes. We also observed a significant age-dependent variation in inverse temperature, reflecting a shift in action selection policy. Middle-aged mice (12 months) exhibited higher inverse temperature, indicating a higher reliance on previous rewarding experiences and reduced exploratory behaviors, when compared to both younger and older mice. This study provides new insights into aging research by demonstrating that there are age-related differences in specific components of reinforcement learning, which exhibit a non-linear pattern.
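A minimal sketch of the kind of Q-learning agent described: separate learning rates for positive and negative prediction errors and a softmax with an inverse temperature. The reward probabilities and parameter values are placeholders rather than those of the 5-ABT.

```python
import numpy as np

def run_agent(alpha_pos, alpha_neg, beta, p_reward, n_trials=1000, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros(len(p_reward))
    total = 0.0
    for _ in range(n_trials):
        probs = np.exp(beta * q - np.max(beta * q))   # softmax with inverse temperature
        probs /= probs.sum()
        a = rng.choice(len(q), p=probs)
        r = float(rng.random() < p_reward[a])
        delta = r - q[a]
        q[a] += (alpha_pos if delta > 0 else alpha_neg) * delta   # asymmetric updating
        total += r
    return q, total

q, total = run_agent(alpha_pos=0.2, alpha_neg=0.05, beta=5.0,
                     p_reward=[0.8, 0.6, 0.4, 0.3, 0.1])
print(np.round(q, 2), total)
```

Lowering alpha_neg while keeping alpha_pos fixed mimics the pattern reported for the oldest group, i.e., preserved learning from successes with weaker updating after negative outcomes.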
Affiliation(s)
- Hiroyuki Ohta
- Department of Pharmacology, National Defense Medical College, 3-2 Namiki, Tokorozawa, Saitama 359-8513, Japan.
- Takashi Nozawa
- Mejiro University, 4-31-1 Naka-Ochiai, Shinjuku, Tokyo 161-8539, Japan
- Takashi Nakano
- Department of Computational Biology, School of Medicine, Fujita Health University, 1-98 Dengakugakubo, Kutsukake, Toyoake, Aichi 470-1192, Japan; International Center for Brain Science (ICBS), Fujita Health University, 1-98 Dengakugakubo, Kutsukake, Toyoake, Aichi 470-1192, Japan
- Yuji Morimoto
- Department of Physiology, National Defense Medical College, 3-2 Namiki, Tokorozawa, Saitama 359-8513, Japan
- Toshiaki Ishizuka
- Department of Pharmacology, National Defense Medical College, 3-2 Namiki, Tokorozawa, Saitama 359-8513, Japan
11
Ger Y, Shahar M, Shahar N. Using recurrent neural network to estimate irreducible stochasticity in human choice behavior. eLife 2024; 13:RP90082. [PMID: 39240757 PMCID: PMC11379453 DOI: 10.7554/elife.90082]
Abstract
Theoretical computational models are widely used to describe latent cognitive processes. However, these models do not equally explain data across participants, with some individuals showing a bigger predictive gap than others. In the current study, we examined the use of theory-independent models, specifically recurrent neural networks (RNNs), to classify the source of a predictive gap in the observed data of a single individual. This approach aims to identify whether the low predictability of behavioral data is mainly due to noisy decision-making or misspecification of the theoretical model. First, we used computer simulation in the context of reinforcement learning to demonstrate that RNNs can be used to identify model misspecification in simulated agents with varying degrees of behavioral noise. Specifically, both prediction performance and the number of RNN training epochs (i.e., the point of early stopping) can be used to estimate the amount of stochasticity in the data. Second, we applied our approach to an empirical dataset where the actions of low IQ participants, compared with high IQ participants, showed lower predictability by a well-known theoretical model (i.e., Daw's hybrid model for the two-step task). Both the predictive gap and the point of early stopping of the RNN suggested that model misspecification is similar across individuals. This led us to a provisional conclusion that low IQ subjects are mostly noisier compared to their high IQ peers, rather than being more misspecified by the theoretical model. We discuss the implications and limitations of this approach, considering the growing literature in both theoretical and data-driven computational modeling in decision-making science.
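A compact scaffold of the general approach (assuming PyTorch is available): train a small GRU to predict the next choice from the choice/reward history, and use held-out prediction accuracy together with the early-stopping epoch as indices of how predictable, versus irreducibly noisy, the data are. The data below are random placeholders, and the architecture and hyperparameters are assumptions rather than the authors' settings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# placeholder data: (participants, trials, features=[prev_choice, prev_reward]) -> next choice
x = torch.randint(0, 2, (200, 100, 2)).float()
y = torch.randint(0, 2, (200, 100))
x_tr, y_tr, x_va, y_va = x[:150], y[:150], x[150:], y[150:]

class ChoiceRNN(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        self.gru = nn.GRU(2, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 2)
    def forward(self, x):
        h, _ = self.gru(x)
        return self.out(h)

model = ChoiceRNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
best_acc, best_epoch, patience = 0.0, 0, 5

for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x_tr).reshape(-1, 2), y_tr.reshape(-1))
    loss.backward()
    opt.step()
    with torch.no_grad():
        acc = (model(x_va).argmax(-1) == y_va).float().mean().item()
    if acc > best_acc:
        best_acc, best_epoch = acc, epoch
    elif epoch - best_epoch >= patience:   # early stopping on held-out accuracy
        break

print(f"early-stopping epoch: {best_epoch}, validation accuracy: {best_acc:.2f}")
```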
Affiliation(s)
- Yoav Ger
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Moni Shahar
- TAD, Center of AI & Data Science, Tel Aviv University, Tel Aviv, Israel
- Nitzan Shahar
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
12
Schaaf JV, Weidinger L, Molleman L, van den Bos W. Test-retest reliability of reinforcement learning parameters. Behav Res Methods 2024; 56:4582-4599. [PMID: 37684495 PMCID: PMC11289054 DOI: 10.3758/s13428-023-02203-4]
Abstract
It has recently been suggested that parameter estimates of computational models can be used to understand individual differences at the process level. One area of research in which this approach, called computational phenotyping, has taken hold is computational psychiatry. One requirement for successful computational phenotyping is that behavior and parameters are stable over time. Surprisingly, the test-retest reliability of behavior and model parameters remains unknown for most experimental tasks and models. The present study seeks to close this gap by investigating the test-retest reliability of canonical reinforcement learning models in the context of two often-used learning paradigms: a two-armed bandit and a reversal learning task. We tested independent cohorts for the two tasks (N = 69 and N = 47) via an online testing platform with a between-test interval of five weeks. Whereas reliability was high for personality and cognitive measures (with ICCs ranging from .67 to .93), it was generally poor for the parameter estimates of the reinforcement learning models (with ICCs ranging from .02 to .52 for the bandit task and from .01 to .71 for the reversal learning task). Given that simulations indicated that our procedures could detect high test-retest reliability, this suggests that a significant proportion of the variability must be ascribed to the participants themselves. In support of that hypothesis, we show that mood (stress and happiness) can partly explain within-participant variability. Taken together, these results are critical for current practices in computational phenotyping and suggest that individual variability should be taken into account in the future development of the field.
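The reliability statistic at issue is the intraclass correlation. The sketch below computes a consistency-type ICC(3,1) from the standard two-way mean-squares decomposition for a parameter estimated in two sessions; the data are simulated, and the paper may use a different ICC variant.

```python
import numpy as np

def icc_3_1(scores):
    """scores: (n_subjects, k_sessions) array of parameter estimates."""
    n, k = scores.shape
    grand = scores.mean()
    ms_rows = k * ((scores.mean(axis=1) - grand) ** 2).sum() / (n - 1)   # subjects
    ms_cols = n * ((scores.mean(axis=0) - grand) ** 2).sum() / (k - 1)   # sessions
    ss_total = ((scores - grand) ** 2).sum()
    ss_err = ss_total - ms_rows * (n - 1) - ms_cols * (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

rng = np.random.default_rng(0)
true_param = rng.normal(0.4, 0.15, 60)      # stable individual trait
sessions = np.column_stack([true_param + rng.normal(0, 0.1, 60) for _ in range(2)])
print(round(icc_3_1(sessions), 2))          # high when estimates are stable across sessions
```

Increasing the within-subject noise relative to the between-subject spread drives the ICC toward the low values reported for the learning-rate parameters.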
Affiliation(s)
- Jessica V Schaaf
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands.
- Cognitive Neuroscience Department, Radboud University Medical Centre, Nijmegen, the Netherlands.
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, the Netherlands.
- Laura Weidinger
- DeepMind, London, United Kingdom
- Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
- Lucas Molleman
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands
- Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
- Wouter van den Bos
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands
- Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
13
Falck J, Zhang L, Raffington L, Mohn JJ, Triesch J, Heim C, Shing YL. Hippocampus and striatum show distinct contributions to longitudinal changes in value-based learning in middle childhood. eLife 2024; 12:RP89483. [PMID: 38953517 PMCID: PMC11219037 DOI: 10.7554/elife.89483]
Abstract
The hippocampal-dependent memory system and striatal-dependent memory system modulate reinforcement learning depending on feedback timing in adults, but their contributions during development remain unclear. In a 2-year longitudinal study, 6-to-7-year-old children performed a reinforcement learning task in which they received feedback immediately or with a short delay following their response. Children's learning was found to be sensitive to feedback timing modulations in their reaction time and inverse temperature parameter, which quantifies value-guided decision-making. They showed longitudinal improvements towards more optimal value-based learning, and their hippocampal volume showed protracted maturation. Better delayed model-derived learning covaried with larger hippocampal volume longitudinally, in line with the adult literature. In contrast, a larger striatal volume in children was associated with both better immediate and delayed model-derived learning longitudinally. These findings show, for the first time, an early hippocampal contribution to the dynamic development of reinforcement learning in middle childhood, with neurally less differentiated and more cooperative memory systems than in adults.
Affiliation(s)
- Johannes Falck
- Department of Psychology, Goethe University Frankfurt, Frankfurt, Germany
- Lei Zhang
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, United Kingdom
- Institute for Mental Health, School of Psychology, University of Birmingham, Birmingham, United Kingdom
- Centre for Developmental Science, School of Psychology, University of Birmingham, Birmingham, United Kingdom
- Social, Cognitive and Affective Neuroscience Unit, Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
- Laurel Raffington
- Max Planck Research Group Biosocial, Max Planck Institute for Human Development, Berlin, Germany
- Johannes Julius Mohn
- Charité – Universitätsmedizin Berlin, Institute of Medical Psychology, Berlin, Germany
- Max Planck School of Cognition, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Jochen Triesch
- Frankfurt Institute for Advanced Studies (FIAS), Frankfurt am Main, Germany
- Christine Heim
- Charité – Universitätsmedizin Berlin, Institute of Medical Psychology, Berlin, Germany
- Center for Safe & Healthy Children, The Pennsylvania State University, University Park, United States
- Yee Lee Shing
- Department of Psychology, Goethe University Frankfurt, Frankfurt, Germany
14
Augustat N, Endres D, Mueller EM. Uncertainty of treatment efficacy moderates placebo effects on reinforcement learning. Sci Rep 2024; 14:14421. [PMID: 38909105 PMCID: PMC11193823 DOI: 10.1038/s41598-024-64240-z]
Abstract
The placebo-reward hypothesis postulates that positive effects of treatment expectations on health (i.e., placebo effects) and reward processing share common neural underpinnings. Moreover, experiments in humans and animals indicate that reward uncertainty increases striatal dopamine, which is presumably involved in placebo responses and reward learning. Therefore, treatment uncertainty analogously to reward uncertainty may affect updating from rewards after placebo treatment. Here, we address whether different degrees of uncertainty regarding the efficacy of a sham treatment affect reward sensitivity. In an online between-subjects experiment with N = 141 participants, we systematically varied the provided efficacy instructions before participants first received a sham treatment that consisted of listening to binaural beats and then performed a probabilistic reinforcement learning task. We fitted a Q-learning model including two different learning rates for positive (gain) and negative (loss) reward prediction errors and an inverse gain parameter to behavioral decision data in the reinforcement learning task. Our results yielded an inverted-U-relationship between provided treatment efficacy probability and learning rates for gain, such that higher levels of treatment uncertainty, rather than of expected net efficacy, affect presumably dopamine-related reward learning. These findings support the placebo-reward hypothesis and suggest harnessing uncertainty in placebo treatment for recovering reward learning capabilities.
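A sketch of how such a Q-learning model is commonly fitted by maximum likelihood, shown here as a small parameter-recovery exercise with separate gain and loss learning rates and a softmax inverse temperature. The task structure, optimizer settings, and parameter names are assumptions, and the authors' "inverse gain" parameterization may differ from the softmax temperature used below.

```python
import numpy as np
from scipy.optimize import minimize

def simulate(alpha_gain, alpha_loss, beta, n_trials=400, seed=0):
    rng = np.random.default_rng(seed)
    p_reward = np.array([0.7, 0.3])            # placeholder two-option task
    q = np.zeros(2)
    choices, rewards = [], []
    for _ in range(n_trials):
        p = np.exp(beta * q - np.max(beta * q)); p /= p.sum()
        c = rng.choice(2, p=p)
        r = float(rng.random() < p_reward[c])
        delta = r - q[c]
        q[c] += (alpha_gain if delta > 0 else alpha_loss) * delta
        choices.append(c); rewards.append(r)
    return np.array(choices), np.array(rewards)

def neg_log_lik(params, choices, rewards):
    alpha_gain, alpha_loss, beta = params
    q = np.zeros(2); nll = 0.0
    for c, r in zip(choices, rewards):
        p = np.exp(beta * q - np.max(beta * q)); p /= p.sum()
        nll -= np.log(p[c] + 1e-12)
        delta = r - q[c]
        q[c] += (alpha_gain if delta > 0 else alpha_loss) * delta
    return nll

choices, rewards = simulate(alpha_gain=0.4, alpha_loss=0.1, beta=4.0)
fit = minimize(neg_log_lik, x0=[0.3, 0.3, 2.0], args=(choices, rewards),
               bounds=[(0.01, 1.0), (0.01, 1.0), (0.1, 20.0)], method="L-BFGS-B")
print(np.round(fit.x, 2))   # estimates should land near the generating values
```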
Affiliation(s)
- Nick Augustat
- Department of Psychology, University of Marburg, Marburg, Germany.
- Dominik Endres
- Department of Psychology, University of Marburg, Marburg, Germany
- Erik M Mueller
- Department of Psychology, University of Marburg, Marburg, Germany
15
Colas JT, O’Doherty JP, Grafton ST. Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts. PLoS Comput Biol 2024; 20:e1011950. [PMID: 38552190 PMCID: PMC10980507 DOI: 10.1371/journal.pcbi.1011950]
Abstract
Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants-even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
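A generic sketch of mixing an "expert" learned value signal with "nonexpert" influences: a constant per-action bias plus a decaying choice kernel whose weight sign yields repetition (positive) or alternation (negative) hysteresis. This is not the paper's hierarchical model; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, alpha, beta_q = 2, 0.2, 3.0
beta_bias = np.array([0.3, 0.0])       # static preference for action 0 (action bias)
beta_kernel, kernel_decay = -0.8, 0.5  # negative weight -> alternation hysteresis
q = np.zeros(n_actions)                # expert: learned action values
kernel = np.zeros(n_actions)           # nonexpert: recency-weighted action history
p_reward = np.array([0.6, 0.4])

for t in range(300):
    logits = beta_q * q + beta_bias + beta_kernel * kernel
    p = np.exp(logits - logits.max()); p /= p.sum()
    a = rng.choice(n_actions, p=p)
    r = float(rng.random() < p_reward[a])
    q[a] += alpha * (r - q[a])          # reinforcement learning update
    kernel *= kernel_decay              # hysteresis trace decays over trials
    kernel[a] += 1.0

print(np.round(q, 2))
```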
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- John P. O’Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
16
Wilbrecht L, Davidow JY. Goal-directed learning in adolescence: neurocognitive development and contextual influences. Nat Rev Neurosci 2024; 25:176-194. [PMID: 38263216 DOI: 10.1038/s41583-023-00783-w]
Abstract
Adolescence is a time during which we transition to independence, explore new activities and begin pursuit of major life goals. Goal-directed learning, in which we learn to perform actions that enable us to obtain desired outcomes, is central to many of these processes. Currently, our understanding of goal-directed learning in adolescence is itself in a state of transition, with the scientific community grappling with inconsistent results. When we examine metrics of goal-directed learning through the second decade of life, we find that many studies agree there are steady gains in performance in the teenage years, but others report that adolescent goal-directed learning is already adult-like, and some find adolescents can outperform adults. To explain the current variability in results, sophisticated experimental designs are being applied to test learning in different contexts. There is also increasing recognition that individuals of different ages and in different states will draw on different neurocognitive systems to support goal-directed learning. Through adoption of more nuanced approaches, we can be better prepared to recognize and harness adolescent strengths and to decipher the purpose (or goals) of adolescence itself.
Affiliation(s)
- Linda Wilbrecht
- Department of Psychology, University of California, Berkeley, CA, USA.
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA.
- Juliet Y Davidow
- Department of Psychology, Northeastern University, Boston, MA, USA.
17
Cheng Z, Moser AD, Jones M, Kaiser RH. Reinforcement learning and working memory in mood disorders: A computational analysis in a developmental transdiagnostic sample. J Affect Disord 2024; 344:423-431. [PMID: 37839471 DOI: 10.1016/j.jad.2023.10.084]
Abstract
BACKGROUND: Mood disorders commonly onset during adolescence and young adulthood and are conceptually and empirically related to reinforcement learning abnormalities. However, the nature of abnormalities associated with acute symptom severity versus lifetime diagnosis remains unclear, and prior research has often failed to disentangle working memory from reward processes.
METHODS: The present sample (N = 220) included adolescents and young adults with a lifetime history of unipolar disorders (n = 127), bipolar disorders (n = 28), or no history of psychopathology (n = 62), and varying severity of mood symptoms. Analyses fitted a reinforcement learning and working memory model to an instrumental learning task that varied working memory load, and tested associations between model parameters and diagnoses or current symptoms.
RESULTS: Current severity of manic or anhedonic symptoms negatively correlated with task performance. Participants reporting higher severity of current anhedonia, or with lifetime unipolar or bipolar disorders, showed lower reward learning rates. Participants reporting higher severity of current manic symptoms showed faster working memory decay and reduced use of working memory.
LIMITATIONS: Computational parameters should be interpreted in the task environment (a deterministic reward learning paradigm), and developmental population. Future work should test replication in other paradigms and populations.
CONCLUSIONS: Results indicate abnormalities in reinforcement learning processes that either scale with current symptom severity, or correspond with lifetime mood diagnoses. Findings may have implications for understanding reward processing anomalies related to state-like (current symptom) or trait-like (lifetime diagnosis) aspects of mood disorders.
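A simplified sketch of the reinforcement learning plus working memory (RLWM) model family referenced above, in which a capacity-limited, fast-decaying working-memory policy is mixed with a slower incremental RL policy; the parameterization below follows the commonly published form only loosely and is not the authors' implementation.

```python
import numpy as np

def rlwm_choice_probs(stimulus, q_rl, w_wm, set_size, capacity, rho, beta):
    """Mix a softmax RL policy with a fast but decaying WM policy."""
    p_rl = np.exp(beta * q_rl[stimulus]); p_rl /= p_rl.sum()
    p_wm = np.exp(beta * w_wm[stimulus]); p_wm /= p_wm.sum()
    mix = rho * min(1.0, capacity / set_size)   # WM reliance shrinks with load
    return mix * p_wm + (1.0 - mix) * p_rl

n_stim, n_actions = 6, 3
q_rl = np.ones((n_stim, n_actions)) / n_actions   # slow incremental RL values
w_wm = np.ones((n_stim, n_actions)) / n_actions   # fast, decaying WM weights
alpha, decay, rho, capacity, beta = 0.1, 0.1, 0.9, 3, 8.0
rng = np.random.default_rng(0)
correct = rng.integers(0, n_actions, n_stim)      # placeholder stimulus-action map

for t in range(600):
    s = rng.integers(0, n_stim)
    p = rlwm_choice_probs(s, q_rl, w_wm, n_stim, capacity, rho, beta)
    a = rng.choice(n_actions, p=p)
    r = float(a == correct[s])
    q_rl[s, a] += alpha * (r - q_rl[s, a])         # incremental learning
    w_wm += decay * (1.0 / n_actions - w_wm)       # WM decays toward uniform
    w_wm[s, a] = r                                 # WM stores the last outcome

print((q_rl.argmax(axis=1) == correct).mean())
```

In this framing, faster WM decay or a lower WM weight degrades performance in a way that can be mistaken for a reward-learning deficit, which is the confound the abstract highlights.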
Affiliation(s)
- Ziwei Cheng
- Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, CO, United States; Institute for Cognitive Science, University of Colorado Boulder, Boulder, CO, United States
- Amelia D Moser
- Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, CO, United States; Institute for Cognitive Science, University of Colorado Boulder, Boulder, CO, United States
- Matt Jones
- Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, CO, United States
- Roselinde H Kaiser
- Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, CO, United States; Institute for Cognitive Science, University of Colorado Boulder, Boulder, CO, United States.
18
Chase HW. A novel technique for delineating the effect of variation in the learning rate on the neural correlates of reward prediction errors in model-based fMRI. Front Psychol 2023; 14:1211528. [PMID: 38187436 PMCID: PMC10768009 DOI: 10.3389/fpsyg.2023.1211528]
Abstract
Introduction: Computational models play an increasingly important role in describing variation in neural activation in human neuroimaging experiments, including evaluating individual differences in the context of psychiatric neuroimaging. In particular, reinforcement learning (RL) techniques have been widely adopted to examine neural responses to reward prediction errors and stimulus or action values, and how these might vary as a function of clinical status. However, there is a lack of consensus around the importance of the precision of free parameter estimation for these methods, particularly with regard to the learning rate. In the present study, I introduce a novel technique which may be used within a general linear model (GLM) to model the effect of mis-estimation of the learning rate on reward prediction error (RPE)-related neural responses.
Methods: Simulations employed a simple RL algorithm, which was used to generate hypothetical neural activations that would be expected to be observed in functional magnetic resonance imaging (fMRI) studies of RL. Similar RL models were incorporated within a GLM-based analysis method including derivatives, with individual differences in the resulting GLM-derived beta parameters being evaluated with respect to the free parameters of the RL model or being submitted to other validation analyses.
Results: Initial simulations demonstrated that the conventional approach to fitting RL models to RPE responses is more likely to reflect individual differences in a reinforcement efficacy construct (lambda) rather than learning rate (alpha). The proposed method, adding a derivative regressor to the GLM, provides a second regressor which reflects the learning rate. Validation analyses were performed including examining another comparable method which yielded highly similar results, and a demonstration of sensitivity of the method in presence of fMRI-like noise.
Conclusion: Overall, the findings underscore the importance of the lambda parameter for interpreting individual differences in RPE-coupled neural activity, and validate a novel neural metric of the modulation of such activity by individual differences in the learning rate. The method is expected to find application in understanding aberrant reinforcement learning across different psychiatric patient groups including major depression and substance use disorder.
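A sketch of the core idea: build an RPE regressor at an assumed canonical learning rate, add its numerical derivative with respect to the learning rate as a second regressor, and fit the GLM so that learning-rate mis-estimation loads on the derivative term. Convolution with a haemodynamic response function is omitted, and all values are placeholders rather than the paper's simulation settings.

```python
import numpy as np

def rpe_series(rewards, alpha):
    """Delta-rule value tracking of a single option; returns the trial-wise RPEs."""
    v, rpes = 0.0, []
    for r in rewards:
        rpes.append(r - v)
        v += alpha * (r - v)
    return np.array(rpes)

rng = np.random.default_rng(0)
rewards = (rng.random(200) < 0.6).astype(float)

alpha0, eps = 0.3, 0.01                       # assumed canonical learning rate
rpe0 = rpe_series(rewards, alpha0)
drpe = (rpe_series(rewards, alpha0 + eps) - rpe_series(rewards, alpha0 - eps)) / (2 * eps)

# fake "neural" signal generated by a participant with a different learning rate
signal = 1.5 * rpe_series(rewards, 0.5) + rng.normal(0, 0.5, 200)

X = np.column_stack([np.ones_like(rpe0), rpe0, drpe])   # intercept, RPE, derivative
betas, *_ = np.linalg.lstsq(X, signal, rcond=None)
print(np.round(betas, 2))   # the derivative beta absorbs the learning-rate mismatch
```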
Affiliation(s)
- Henry W. Chase
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
19
Sato Y, Sakai Y, Hirata S. State-transition-free reinforcement learning in chimpanzees (Pan troglodytes). Learn Behav 2023; 51:413-427. [PMID: 37369920 DOI: 10.3758/s13420-023-00591-3]
Abstract
The outcome of an action often occurs after a delay. One solution for learning appropriate actions from delayed outcomes is to rely on a chain of state transitions. Another solution, which does not rest on state transitions, is to use an eligibility trace (ET) that directly bridges a current outcome and multiple past actions via transient memories. Previous studies revealed that humans (Homo sapiens) learned appropriate actions in a behavioral task in which solutions based on the ET were effective but transition-based solutions were ineffective. This suggests that ET may be used in human learning systems. However, no studies have examined nonhuman animals with an equivalent behavioral task. We designed a task for nonhuman animals following a previous human study. In each trial, participants chose one of two stimuli that were randomly selected from three stimulus types: a stimulus associated with a food reward delivered immediately, a stimulus associated with a reward delivered after a few trials, and a stimulus associated with no reward. The presented stimuli did not vary according to the participants' choices. To maximize the total reward, participants had to learn the value of the stimulus associated with a delayed reward. Five chimpanzees (Pan troglodytes) performed the task using a touchscreen. Two chimpanzees were able to learn successfully, indicating that learning mechanisms that do not depend on state transitions were involved in the learning processes. The current study extends previous ET research by proposing a behavioral task and providing empirical data from chimpanzees.
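A sketch of the state-transition-free, eligibility-trace-style credit assignment described above: each chosen stimulus leaves a decaying trace, and an arriving reward strengthens all recently chosen stimuli in proportion to their traces. The trace decay, delay length, and choice rule are invented for illustration and do not reproduce the actual task schedule.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli = 3            # 0: immediate reward, 1: reward delayed by a few trials, 2: never
values = np.zeros(n_stimuli)
traces = np.zeros(n_stimuli)
alpha, trace_decay, delay, beta = 0.1, 0.7, 3, 3.0
pending = []             # trial countdowns for rewards queued by stimulus 1

for t in range(2000):
    options = rng.choice(n_stimuli, size=2, replace=False)   # two randomly offered stimuli
    logits = beta * values[options]
    p = np.exp(logits - logits.max()); p /= p.sum()
    choice = options[rng.choice(2, p=p)]
    traces *= trace_decay                      # all traces decay
    traces[choice] += 1.0                      # chosen stimulus leaves a trace
    reward = 1.0 if choice == 0 else 0.0
    if choice == 1:
        pending.append(delay)
    pending = [d - 1 for d in pending]
    reward += float(sum(1 for d in pending if d <= 0))
    pending = [d for d in pending if d > 0]
    values += alpha * traces * (reward - values)   # trace-weighted delta rule

print(np.round(values, 2))   # stimulus 1 acquires value despite its delayed payoff
```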
Grants
- 16H06283 Ministry of Education, Culture, Sports, Science, Japan Society for the Promotion of Science
- 18H05524 Ministry of Education, Culture, Sports, Science, Japan Society for the Promotion of Science
- 19J22889 Ministry of Education, Culture, Sports, Science, Japan Society for the Promotion of Science
- 26245069 Ministry of Education, Culture, Sports, Science, Japan Society for the Promotion of Science
- U04 Program for Leading Graduate Schools
Affiliation(s)
- Yutaro Sato
- Wildlife Research Center, Kyoto University, Kyoto, Japan.
- University Administration Office, Headquarters for Management Strategy, Niigata University, Niigata, Japan.
- Yutaka Sakai
- Brain Science Institute, Tamagawa University, Tokyo, Japan
- Satoshi Hirata
- Wildlife Research Center, Kyoto University, Kyoto, Japan
20
Giron AP, Ciranka S, Schulz E, van den Bos W, Ruggeri A, Meder B, Wu CM. Developmental changes in exploration resemble stochastic optimization. Nat Hum Behav 2023; 7:1955-1967. [PMID: 37591981 PMCID: PMC10663152 DOI: 10.1038/s41562-023-01662-1]
Abstract
Human development is often described as a 'cooling off' process, analogous to stochastic optimization algorithms that implement a gradual reduction in randomness over time. Yet there is ambiguity in how to interpret this analogy, due to a lack of concrete empirical comparisons. Using data from n = 281 participants ages 5 to 55, we show that cooling off does not only apply to the single dimension of randomness. Rather, human development resembles an optimization process of multiple learning parameters, for example, reward generalization, uncertainty-directed exploration and random temperature. Rapid changes in parameters occur during childhood, but these changes plateau and converge to efficient values in adulthood. We show that while the developmental trajectory of human parameters is strikingly similar to several stochastic optimization algorithms, there are important differences in convergence. None of the optimization algorithms tested were able to discover reliably better regions of the strategy space than adult participants on this task.
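A toy illustration of the "cooling off" analogy: simulated annealing accepts worse candidate solutions with a probability governed by a temperature that is gradually reduced, implementing a scheduled decline in randomness. The objective function and schedule below are arbitrary and unrelated to the paper's task or fitted parameters.

```python
import math
import random

random.seed(0)

def objective(x):                      # arbitrary 1-D objective with a maximum at x = 2
    return -(x - 2.0) ** 2

x, temperature, cooling = -5.0, 5.0, 0.99
for step in range(2000):
    candidate = x + random.gauss(0, 1)
    delta = objective(candidate) - objective(x)
    # accept improvements always; accept worse moves with probability exp(delta / T)
    if delta > 0 or random.random() < math.exp(delta / temperature):
        x = candidate
    temperature *= cooling             # gradual reduction in randomness ("cooling off")

print(round(x, 2), round(temperature, 4))
```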
Affiliation(s)
- Anna P Giron
- Human and Machine Cognition Lab, University of Tübingen, Tübingen, Germany
- Attention and Affect Lab, University of Tübingen, Tübingen, Germany
- Simon Ciranka
- Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, Germany
- Eric Schulz
- MPRG Computational Principles of Intelligence, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Wouter van den Bos
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands
- Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, the Netherlands
- Azzurra Ruggeri
- MPRG iSearch, Max Planck Institute for Human Development, Berlin, Germany
- School of Social Sciences and Technology, Technical University Munich, Munich, Germany
- Central European University, Vienna, Austria
- Björn Meder
- MPRG iSearch, Max Planck Institute for Human Development, Berlin, Germany
- Institute for Mind, Brain and Behavior, Health and Medical University, Potsdam, Germany
- Charley M Wu
- Human and Machine Cognition Lab, University of Tübingen, Tübingen, Germany.
- Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany.
21
|
De Panfilis C, Lis S. Difficulties in updating social information in personality disorders: A commentary on the article by Rosenblau et al. Neurosci Biobehav Rev 2023; 153:105387. [PMID: 37683989 DOI: 10.1016/j.neubiorev.2023.105387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2023] [Revised: 08/22/2023] [Accepted: 09/05/2023] [Indexed: 09/10/2023]
Affiliation(s)
- Chiara De Panfilis
- Unit of Neuroscience, Department of Medicine and Surgery, University of Parma, Italy.
| | - Stefanie Lis
- Department of Clinical Psychology, Department of Psychiatric and Psychosomatic Medicine, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany.
| |
Collapse
|
22
|
Pauli R, Brazil IA, Kohls G, Klein-Flügge MC, Rogers JC, Dikeos D, Dochnal R, Fairchild G, Fernández-Rivas A, Herpertz-Dahlmann B, Hervas A, Konrad K, Popma A, Stadler C, Freitag CM, De Brito SA, Lockwood PL. Action initiation and punishment learning differ from childhood to adolescence while reward learning remains stable. Nat Commun 2023; 14:5689. [PMID: 37709750 PMCID: PMC10502052 DOI: 10.1038/s41467-023-41124-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2022] [Accepted: 08/24/2023] [Indexed: 09/16/2023] Open
Abstract
Theoretical and empirical accounts suggest that adolescence is associated with heightened reward learning and impulsivity. Experimental tasks and computational models that can dissociate reward learning from the tendency to initiate actions impulsively (action initiation bias) are thus critical to characterise the mechanisms that drive developmental differences. However, existing work has rarely quantified both learning ability and action initiation, or it has relied on small samples. Here, using computational modelling of a learning task collected from a large sample (N = 742, 9-18 years, 11 countries), we test differences in reward and punishment learning and action initiation from childhood to adolescence. Computational modelling reveals that whilst punishment learning rates increase with age, reward learning remains stable. In parallel, action initiation biases decrease with age. Results are similar when considering pubertal stage instead of chronological age. We conclude that heightened reward responsivity in adolescence can reflect differences in action initiation rather than enhanced reward learning.
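One way to dissociate learning from action initiation, in the spirit of the modeling approach described above (though not the authors' exact specification), is a go/no-go reinforcement learner with separate reward and punishment learning rates plus an additive "go" bias entering only the choice rule. The sketch below shows how such parameters separate; all parameter values are illustrative assumptions.

```python
import math
import random

# Sketch of a learner with separate reward/punishment learning rates and an
# action-initiation ("go") bias, in the spirit of the study's modelling approach
# but not its exact specification. Parameter values are illustrative assumptions.
ALPHA_REW, ALPHA_PUN, BETA, GO_BIAS = 0.30, 0.10, 4.0, 0.5

def simulate(n_trials=400, p_win_go=0.7, seed=0):
    rng = random.Random(seed)
    q_go, q_nogo = 0.0, 0.0
    go_count = 0
    for _ in range(n_trials):
        # the go bias is added to the "go" action value at choice time only
        p_go = 1.0 / (1.0 + math.exp(-BETA * (q_go + GO_BIAS - q_nogo)))
        go = rng.random() < p_go
        go_count += go

        p_win = p_win_go if go else 1.0 - p_win_go
        outcome = 1.0 if rng.random() < p_win else -1.0

        alpha = ALPHA_REW if outcome > 0 else ALPHA_PUN   # valence-dependent learning rate
        if go:
            q_go += alpha * (outcome - q_go)
        else:
            q_nogo += alpha * (outcome - q_nogo)
    return go_count / n_trials

print("proportion of 'go' responses:", round(simulate(), 2))
```

Because GO_BIAS and ALPHA_REW enter the model at different places (choice rule vs. value update), a fit to choice data can in principle attribute "reward responsivity" to either, which is the dissociation the study exploits.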
Collapse
Affiliation(s)
- Ruth Pauli
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK.
| | - Inti A Brazil
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
| | - Gregor Kohls
- Child Neuropsychology Section, Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, RWTH Aachen University, Aachen, Germany
- Department of Child and Adolescent Psychiatry, Faculty of Medicine, TU Dresden, Dresden, Germany
| | - Miriam C Klein-Flügge
- Department of Experimental Psychology, University of Oxford, Oxford, UK
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
| | - Jack C Rogers
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK
- Institute for Mental Health, School of Psychology, University of Birmingham, Birmingham, UK
| | - Dimitris Dikeos
- Department of Psychiatry, Medical School, National and Kapodistrian University of Athens, Athens, Greece
| | - Roberta Dochnal
- Faculty of Medicine, Child and Adolescent Psychiatry, Department of the Child Health Center, Szeged University, Szeged, Hungary
| | | | | | - Beate Herpertz-Dahlmann
- Child Neuropsychology Section, Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, RWTH Aachen University, Aachen, Germany
| | - Amaia Hervas
- University Hospital Mutua Terrassa, Barcelona, Spain
| | - Kerstin Konrad
- Child Neuropsychology Section, Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, RWTH Aachen University, Aachen, Germany
- JARA-Brain Institute II, Molecular Neuroscience and Neuroimaging, RWTH Aachen and Research Centre Jülich, Jülich, Germany
| | - Arne Popma
- Department of Child and Adolescent Psychiatry, VU University Medical Center, Amsterdam, Netherlands
| | - Christina Stadler
- Department of Child and Adolescent Psychiatry, Psychiatric University Hospital, University of Basel, Basel, Switzerland
| | - Christine M Freitag
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, University Hospital Frankfurt, Goethe University, Frankfurt am Main, Germany
| | - Stephane A De Brito
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK
- Institute for Mental Health, School of Psychology, University of Birmingham, Birmingham, UK
| | - Patricia L Lockwood
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK.
- Department of Experimental Psychology, University of Oxford, Oxford, UK.
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK.
- Institute for Mental Health, School of Psychology, University of Birmingham, Birmingham, UK.
| |
Collapse
|
23
|
Yip SW, Barch DM, Chase HW, Flagel S, Huys QJ, Konova AB, Montague R, Paulus M. From Computation to Clinic. BIOLOGICAL PSYCHIATRY GLOBAL OPEN SCIENCE 2023; 3:319-328. [PMID: 37519475 PMCID: PMC10382698 DOI: 10.1016/j.bpsgos.2022.03.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2021] [Revised: 02/25/2022] [Accepted: 03/22/2022] [Indexed: 12/12/2022] Open
Abstract
Theory-driven and data-driven computational approaches to psychiatry have enormous potential for elucidating mechanisms of disease and providing translational linkages between basic science findings and the clinic. These approaches have already demonstrated utility in providing clinically relevant understanding, primarily via back translation from clinic to computation, revealing how specific disorders or symptoms map onto specific computational processes. Nonetheless, forward translation, from computation to clinic, remains rare. In addition, consensus regarding specific barriers to forward translation, and on the best strategies to overcome these barriers, is limited. This perspective review brings together expert basic and computationally trained researchers and clinicians to 1) identify challenges specific to preclinical model systems and clinical translation of computational models of cognition and affect, and 2) discuss practical approaches to overcoming these challenges. In doing so, we highlight recent evidence for the ability of computational approaches to predict treatment responses in psychiatric disorders and discuss considerations for maximizing the clinical relevance of such models (e.g., via longitudinal testing) and the likelihood of stakeholder adoption (e.g., via cost-effectiveness analyses).
Collapse
Affiliation(s)
- Sarah W. Yip
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut
| | - Deanna M. Barch
- Departments of Psychological & Brain Sciences, Psychiatry, and Radiology, Washington University, St. Louis, Missouri
| | - Henry W. Chase
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Shelly Flagel
- Department of Psychiatry and Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan
| | - Quentin J.M. Huys
- Division of Psychiatry and Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Institute of Neurology, University College London, London, United Kingdom
- Camden and Islington NHS Foundation Trust, London, United Kingdom
| | - Anna B. Konova
- Department of Psychiatry and Brain Health Institute, Rutgers University, Piscataway, New Jersey
| | - Read Montague
- Fralin Biomedical Research Institute and Department of Physics, Virginia Tech, Blacksburg, Virginia
| | - Martin Paulus
- Laureate Institute for Brain Research, Tulsa, Oklahoma
| |
Collapse
|
24
|
Topel S, Ma I, Sleutels J, van Steenbergen H, de Bruijn ERA, van Duijvenvoorde ACK. Expecting the unexpected: a review of learning under uncertainty across development. COGNITIVE, AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2023:10.3758/s13415-023-01098-0. [PMID: 37237092 PMCID: PMC10390612 DOI: 10.3758/s13415-023-01098-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 03/28/2023] [Indexed: 05/28/2023]
Abstract
Many of our decisions take place under uncertainty. To successfully navigate the environment, individuals need to estimate the degree of uncertainty and adapt their behaviors accordingly by learning from experiences. However, uncertainty is a broad construct and distinct types of uncertainty may differentially influence our learning. We provide a semi-systematic review to illustrate cognitive and neurobiological processes involved in learning under two types of uncertainty: learning in environments with stochastic outcomes, and with volatile outcomes. We specifically reviewed studies (N = 26 studies) that included an adolescent population, because adolescence is a period in life characterized by heightened exploration and learning, as well as heightened uncertainty due to experiencing many new, often social, environments. Until now, reviews have not comprehensively compared learning under distinct types of uncertainties in this age range. Our main findings show that although the overall developmental patterns were mixed, most studies indicate that learning from stochastic outcomes, as indicated by increased accuracy in performance, improved with age. We also found that adolescents tended to have an advantage compared with adults and children when learning from volatile outcomes. We discuss potential mechanisms explaining these age-related differences and conclude by outlining future research directions.
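The review's distinction between stochastic and volatile outcomes can be made concrete with a toy generative environment: in the stochastic case, reward probabilities are noisy but fixed; in the volatile case, the better option reverses periodically. The sketch below only generates the two kinds of environments (it fits no learner), and the trial counts, reward probability, and reversal schedule are illustrative assumptions.

```python
import random

# Toy generators for the two kinds of uncertainty discussed in the review:
# stochastic outcomes (noisy but stable contingencies) vs volatile outcomes
# (contingencies that reverse over time). Values are illustrative assumptions.
def stochastic_env(n_trials=200, p_reward=0.75, seed=0):
    rng = random.Random(seed)
    # option 0 is better on every trial, but feedback is probabilistic
    return [(0, 1 if rng.random() < p_reward else 0) for _ in range(n_trials)]

def volatile_env(n_trials=200, p_reward=0.75, reversal_every=40, seed=0):
    rng = random.Random(seed)
    trials, best = [], 0
    for t in range(n_trials):
        if t > 0 and t % reversal_every == 0:
            best = 1 - best                      # the better option reverses
        trials.append((best, 1 if rng.random() < p_reward else 0))
    return trials

stoch, vol = stochastic_env(), volatile_env()
print("stochastic: option 0 is best on", sum(b == 0 for b, _ in stoch), "of", len(stoch), "trials")
print("volatile:   option 0 is best on", sum(b == 0 for b, _ in vol), "of", len(vol), "trials")
```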
Collapse
Affiliation(s)
- Selin Topel
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333 AK Leiden, The Netherlands.
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands.
| | - Ili Ma
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333 AK Leiden, The Netherlands
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands
| | - Jan Sleutels
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333 AK Leiden, The Netherlands
- Leiden University, Institute for Philosophy, Leiden, The Netherlands
| | - Henk van Steenbergen
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333 AK Leiden, The Netherlands
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands
| | - Ellen R A de Bruijn
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333 AK Leiden, The Netherlands
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands
| | - Anna C K van Duijvenvoorde
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333 AK Leiden, The Netherlands
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands
| |
Collapse
|
25
|
Towner E, Chierchia G, Blakemore SJ. Sensitivity and specificity in affective and social learning in adolescence. Trends Cogn Sci 2023:S1364-6613(23)00092-X. [PMID: 37198089 DOI: 10.1016/j.tics.2023.04.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2022] [Revised: 03/23/2023] [Accepted: 04/05/2023] [Indexed: 05/19/2023]
Abstract
Adolescence is a period of heightened affective and social sensitivity. In this review we address how this increased sensitivity influences associative learning. Based on recent evidence from human and rodent studies, as well as advances in computational biology, we suggest that, compared to other age groups, adolescents show features of heightened Pavlovian learning but tend to perform worse than adults at instrumental learning. Because Pavlovian learning does not involve decision-making, whereas instrumental learning does, we propose that these developmental differences might be due to heightened sensitivity to rewards and threats in adolescence, coupled with a lower specificity of responding. We discuss the implications of these findings for adolescent mental health and education.
Collapse
Affiliation(s)
- Emily Towner
- Department of Psychology, University of Cambridge, Downing Street, Cambridge, UK.
| | - Gabriele Chierchia
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy; Department of Psychology, University of Cambridge, Downing Street, Cambridge, UK
| | | |
Collapse
|
26
|
He Q, Beveridge EH, Vargas V, Salen A, Brown TI. Effects of Acute Stress on Rigid Learning, Flexible Learning, and Value-Based Decision-Making in Spatial Navigation. Psychol Sci 2023; 34:552-567. [PMID: 36944163 DOI: 10.1177/09567976231155870] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/23/2023] Open
Abstract
The current study investigated how stress affects value-based decision-making during spatial navigation and different types of learning underlying decisions. Eighty-two adult participants (42 females) first learned to find object locations in a virtual environment from a fixed starting location (rigid learning) and then to find the same objects from unpredictable starting locations (flexible learning). Participants then decided whether to reach goal objects from the fixed or unpredictable starting location. We found that stress impairs rigid learning in females, and it does not impair, and even improves, flexible learning when performance with rigid learning is controlled for. Critically, examining how earlier learning influences subsequent decision-making using computational models, we found that stress reduces memory integration, making participants more likely to focus on recent memory and less likely to integrate information from other sources. Collectively, our results show how stress impacts different memory systems and the communication between memory and decision-making.
Collapse
Affiliation(s)
- Qiliang He
- School of Psychology, Georgia Institute of Technology
| | | | - Vanesa Vargas
- School of Psychology, Georgia Institute of Technology
| | - Ashley Salen
- School of Psychology, Georgia Institute of Technology
| | | |
Collapse
|
27
|
Karvelis P, Paulus MP, Diaconescu AO. Individual differences in computational psychiatry: a review of current challenges. Neurosci Biobehav Rev 2023; 148:105137. [PMID: 36940888 DOI: 10.1016/j.neubiorev.2023.105137] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2023] [Revised: 03/04/2023] [Accepted: 03/14/2023] [Indexed: 03/23/2023]
Abstract
Bringing precision to the understanding and treatment of mental disorders requires instruments for studying clinically relevant individual differences. One promising approach is the development of computational assays: integrating computational models with cognitive tasks to infer latent patient-specific disease processes in brain computations. While recent years have seen many methodological advancements in computational modelling and many cross-sectional patient studies, much less attention has been paid to basic psychometric properties (reliability and construct validity) of the computational measures provided by the assays. In this review, we assess the extent of this issue by examining emerging empirical evidence. We find that many computational measures suffer from poor psychometric properties, which poses a risk of invalidating previous findings and undermining ongoing research efforts using computational assays to study individual (and even group) differences. We provide recommendations for how to address these problems and, crucially, embed them within a broader perspective on key developments that are needed for translating computational assays to clinical practice.
Collapse
Affiliation(s)
- Povilas Karvelis
- Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health (CAMH), Toronto, ON, Canada.
| | - Martin P Paulus
- Laureate Institute for Brain Research, Tulsa, OK, USA; Oxley College of Health Sciences, The University of Tulsa, Tulsa, OK, USA
| | - Andreea O Diaconescu
- Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health (CAMH), Toronto, ON, Canada; Department of Psychiatry, University of Toronto, Toronto, ON, Canada; Institute of Medical Sciences, University of Toronto, Toronto, ON, Canada; Department of Psychology, University of Toronto, Toronto, ON, Canada
| |
Collapse
|
28
|
Fan C, Yao L, Zhang J, Zhen Z, Wu X. Advanced Reinforcement Learning and Its Connections with Brain Neuroscience. RESEARCH (WASHINGTON, D.C.) 2023; 6:0064. [PMID: 36939448 PMCID: PMC10017102 DOI: 10.34133/research.0064] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/27/2022] [Accepted: 01/10/2023] [Indexed: 01/22/2023]
Abstract
In recent years, brain science and neuroscience have greatly propelled innovation in computer science. In particular, knowledge from the neurobiology and neuropsychology of the brain revolutionized the development of reinforcement learning (RL) by providing novel interpretable mechanisms of how the brain achieves intelligent and efficient decision making. Triggered by this, there has been a boom in research about advanced RL algorithms that are built upon inspiration from brain neuroscience. In this work, to further strengthen the bidirectional link between the 2 communities and especially promote the research on modern RL technology, we provide a comprehensive survey of recent advances in the area of brain-inspired/related RL algorithms. We start with basic theories of RL, and present a concise introduction to brain neuroscience related to RL. Then, we classify these advanced RL methodologies into 3 categories according to different connections of the brain, i.e., micro-neural activity, macro-brain structure, and cognitive function. Each category is further surveyed by presenting several modern RL algorithms along with their mathematical models, correlations with the brain, and open issues. Finally, we introduce several important applications of RL algorithms, followed by a discussion of challenges and opportunities for future research.
Collapse
Affiliation(s)
- Chaoqiong Fan
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
| | - Li Yao
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
| | - Jiacai Zhang
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
| | - Zonglei Zhen
- Faculty of Psychology, Beijing Normal University, Beijing, China
| | - Xia Wu
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
Collapse
|
29
|
Rutherford AV, McDougle SD, Joormann J. "Don't [ruminate], be happy": A cognitive perspective linking depression and anhedonia. Clin Psychol Rev 2023; 101:102255. [PMID: 36871425 DOI: 10.1016/j.cpr.2023.102255] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2022] [Revised: 12/19/2022] [Accepted: 02/16/2023] [Indexed: 02/22/2023]
Abstract
Anhedonia, a lack of pleasure in things an individual once enjoyed, and rumination, the process of perseverative and repetitive attention to specific thoughts, are hallmark features of depression. Though these both contribute to the same debilitating disorder, they have often been studied independently and through different theoretical lenses (e.g., biological vs. cognitive). Cognitive theories and research on rumination have largely focused on understanding negative affect in depression, with much less focus on the etiology and maintenance of anhedonia. In this paper, we argue that by examining the relation between cognitive constructs and deficits in positive affect, we may better understand anhedonia in depression, thereby improving prevention and intervention efforts. We review the extant literature on cognitive deficits in depression and discuss how these dysfunctions may not only lead to sustained negative affect but, importantly, interfere with an ability to attend to social and environmental cues that could restore positive affect. Specifically, we discuss how rumination is associated with deficits in working memory and propose that these deficits in working memory may contribute to anhedonia in depression. We further argue that analytical approaches such as computational modeling are needed to study these questions and, finally, discuss implications for treatment.
Collapse
Affiliation(s)
| | | | - Jutta Joormann
- Department of Psychology, Yale University, New Haven, CT, USA
| |
Collapse
|
30
|
Heald JB, Lengyel M, Wolpert DM. Contextual inference in learning and memory. Trends Cogn Sci 2023; 27:43-64. [PMID: 36435674 PMCID: PMC9789331 DOI: 10.1016/j.tics.2022.10.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2022] [Revised: 10/11/2022] [Accepted: 10/12/2022] [Indexed: 11/25/2022]
Abstract
Context is widely regarded as a major determinant of learning and memory across numerous domains, including classical and instrumental conditioning, episodic memory, economic decision-making, and motor learning. However, studies across these domains remain disconnected due to the lack of a unifying framework formalizing the concept of context and its role in learning. Here, we develop a unified vernacular allowing direct comparisons between different domains of contextual learning. This leads to a Bayesian model positing that context is unobserved and needs to be inferred. Contextual inference then controls the creation, expression, and updating of memories. This theoretical approach reveals two distinct components that underlie adaptation, proper and apparent learning, respectively referring to the creation and updating of memories versus time-varying adjustments in their expression. We review a number of extensions of the basic Bayesian model that allow it to account for increasingly complex forms of contextual learning.
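A minimal illustration of the core idea above, that context is unobserved and must be inferred: a Bayesian observer maintains a posterior over two candidate contexts, updates it from each observation, and expresses a memory (here, a predicted mean) as a posterior-weighted mixture. This is a deliberately stripped-down sketch under assumed Gaussian observation models and an assumed context-persistence probability, not the full model developed by the authors.

```python
import math
import random

# Stripped-down contextual-inference sketch: the context is latent, so the observer
# maintains a posterior over candidate contexts and expresses a prediction as a
# posterior-weighted mixture of context-specific memories. The Gaussian observation
# models, the persistence probability, and all numbers are illustrative assumptions.
CONTEXT_MEANS = [-2.0, 2.0]   # memory associated with each latent context
OBS_SD = 1.0
STAY_PROB = 0.95              # contexts tend to persist from trial to trial

def gauss_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def run(n_trials=60, switch_at=30, seed=0):
    rng = random.Random(seed)
    posterior = [0.5, 0.5]
    for t in range(n_trials):
        true_context = 0 if t < switch_at else 1
        obs = rng.gauss(CONTEXT_MEANS[true_context], OBS_SD)

        # predictive step: contexts persist with probability STAY_PROB
        prior = [STAY_PROB * posterior[0] + (1 - STAY_PROB) * posterior[1],
                 STAY_PROB * posterior[1] + (1 - STAY_PROB) * posterior[0]]
        # Bayesian update from the observation
        like = [gauss_pdf(obs, mu, OBS_SD) for mu in CONTEXT_MEANS]
        unnorm = [p * l for p, l in zip(prior, like)]
        z = sum(unnorm)
        posterior = [u / z for u in unnorm]

        # memory expression: posterior-weighted mixture of context-specific memories
        prediction = sum(p * mu for p, mu in zip(posterior, CONTEXT_MEANS))
        if t in (0, switch_at - 1, switch_at, n_trials - 1):
            print(f"trial {t:2d}  P(context 1) = {posterior[1]:.2f}  prediction = {prediction:+.2f}")

run()
```

The gradual shift of the posterior after the switch is what the framework calls apparent learning: expression of existing memories changes even though the memories themselves are untouched.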
Collapse
Affiliation(s)
- James B Heald
- Department of Neuroscience, Columbia University, New York, NY 10027, USA; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA.
| | - Máté Lengyel
- Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, UK; Center for Cognitive Computation, Department of Cognitive Science, Central European University, Budapest, Hungary.
| | - Daniel M Wolpert
- Department of Neuroscience, Columbia University, New York, NY 10027, USA; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA; Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, UK.
| |
Collapse
|
31
|
Abstract
In reinforcement learning (RL) experiments, participants learn to make rewarding choices in response to different stimuli; RL models use outcomes to estimate stimulus-response values that change incrementally. RL models consider any response type indiscriminately, ranging from more concretely defined motor choices (pressing a key with the index finger) to more general choices that can be executed in a number of ways (selecting dinner at the restaurant). However, does the learning process vary as a function of the choice type? In Experiment 1, we show that it does: Participants were slower and less accurate in learning correct choices of a general format compared with learning more concrete motor actions. Using computational modeling, we show that two mechanisms contribute to this. First, there was evidence of irrelevant credit assignment: The values of motor actions interfered with the values of other choice dimensions, resulting in more incorrect choices when the correct response was not defined by a single motor action; second, information integration for relevant general choices was slower. In Experiment 2, we replicated and further extended the findings from Experiment 1 by showing that slowed learning was attributable to weaker working memory use, rather than slowed RL. In both experiments, we ruled out the explanation that the difference in performance between the two condition types was driven by difficulty or differing levels of complexity. We conclude that defining a more abstract choice space used by multiple learning systems for credit assignment recruits executive resources, limiting how much such processes then contribute to fast learning.
Collapse
Affiliation(s)
| | - Amy Zou
- University of California, Berkeley
| | - Anne G E Collins
- University of California, Berkeley
- Helen Wills Neuroscience Institute, Berkeley, CA
| |
Collapse
|
32
|
Eckstein MK, Master SL, Xia L, Dahl RE, Wilbrecht L, Collins AGE. The interpretation of computational model parameters depends on the context. eLife 2022; 11:e75474. [PMID: 36331872 PMCID: PMC9635876 DOI: 10.7554/elife.75474] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2021] [Accepted: 09/09/2022] [Indexed: 11/06/2022] Open
Abstract
Reinforcement Learning (RL) models have revolutionized the cognitive and brain sciences, promising to explain behavior from simple conditioning to complex problem solving, to shed light on developmental and individual differences, and to anchor cognitive processes in specific brain mechanisms. However, the RL literature increasingly reveals contradictory results, which might cast doubt on these claims. We hypothesized that many contradictions arise from two commonly-held assumptions about computational model parameters that are actually often invalid: That parameters generalize between contexts (e.g. tasks, models) and that they capture interpretable (i.e. unique, distinctive) neurocognitive processes. To test this, we asked 291 participants aged 8-30 years to complete three learning tasks in one experimental session, and fitted RL models to each. We found that some parameters (exploration / decision noise) showed significant generalization: they followed similar developmental trajectories, and were reciprocally predictive between tasks. Still, generalization was significantly below the methodological ceiling. Furthermore, other parameters (learning rates, forgetting) did not show evidence of generalization, and sometimes even opposite developmental trajectories. Interpretability was low for all parameters. We conclude that the systematic study of context factors (e.g. reward stochasticity; task volatility) will be necessary to enhance the generalizability and interpretability of computational cognitive models.
Collapse
Affiliation(s)
| | - Sarah L Master
- Department of Psychology, University of California, Berkeley, Berkeley, United States
- Department of Psychology, New York University, New York, United States
| | - Liyu Xia
- Department of Psychology, University of California, Berkeley, Berkeley, United States
- Department of Mathematics, University of California, Berkeley, Berkeley, United States
| | - Ronald E Dahl
- Institute of Human Development, University of California, Berkeley, Berkeley, United States
| | - Linda Wilbrecht
- Department of Psychology, University of California, Berkeley, Berkeley, United States
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, United States
| | - Anne GE Collins
- Department of Psychology, University of California, Berkeley, Berkeley, United States
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, United States
| |
Collapse
|
33
|
Vinckier F, Jaffre C, Gauthier C, Smajda S, Abdel-Ahad P, Le Bouc R, Daunizeau J, Fefeu M, Borderies N, Plaze M, Gaillard R, Pessiglione M. Elevated Effort Cost Identified by Computational Modeling as a Distinctive Feature Explaining Multiple Behaviors in Patients With Depression. BIOLOGICAL PSYCHIATRY. COGNITIVE NEUROSCIENCE AND NEUROIMAGING 2022; 7:1158-1169. [PMID: 35952972 DOI: 10.1016/j.bpsc.2022.07.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/19/2021] [Revised: 07/14/2022] [Accepted: 07/25/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND Motivational deficit is a core clinical manifestation of depression and a strong predictor of treatment failure. However, the underlying mechanisms, which cannot be accessed through conventional questionnaire-based scoring, remain largely unknown. According to decision theory, apathy could result either from biased subjective estimates (of action costs or outcomes) or from dysfunctional processes (in making decisions or allocating resources). METHODS Here, we combined a series of behavioral tasks with computational modeling to elucidate the motivational deficits of 35 patients with unipolar or bipolar depression under various treatments compared with 35 matched healthy control subjects. RESULTS The most striking feature, which was observed independent of medication across preference tasks (likeability ratings and binary decisions), performance tasks (physical and mental effort exertion), and instrumental learning tasks (updating choices to maximize outcomes), was an elevated sensitivity to effort cost. By contrast, sensitivity to action outcomes (reward and punishment) and task-specific processes were relatively spared. CONCLUSIONS These results highlight effort cost as a critical dimension that might explain multiple behavioral changes in patients with depression. More generally, they validate a test battery for computational phenotyping of motivational states, which could orientate toward specific medication or rehabilitation therapy, and thereby help pave the way for more personalized medicine in psychiatry.
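The "elevated sensitivity to effort cost" result maps naturally onto a simple decision equation in which the subjective value of an action is its expected benefit minus a weighted effort cost, with choices made by softmax. The sketch below shows how increasing the effort-weight parameter alone lowers acceptance of effortful offers; it is an illustrative decision-theoretic caricature, not the authors' task battery or fitted model, and all offers and parameter values are assumptions.

```python
import math

# Caricature of an effort-based decision model: subjective value = reward benefit
# minus weighted effort cost, passed through a softmax acceptance rule. Increasing
# only the effort-weight parameter (k_effort) lowers acceptance of effortful offers.
# Offers and parameter values are illustrative assumptions.
BETA = 2.0  # choice stochasticity (inverse temperature)

def p_accept(reward, effort, k_reward=1.0, k_effort=1.0):
    subjective_value = k_reward * reward - k_effort * effort ** 2  # convex effort cost
    return 1.0 / (1.0 + math.exp(-BETA * subjective_value))

offers = [(1.0, 0.2), (1.0, 0.6), (1.0, 1.0)]  # (reward, effort level) pairs
for label, k_eff in [("control-like (k_effort = 1)", 1.0), ("elevated effort cost (k_effort = 3)", 3.0)]:
    rates = [round(p_accept(r, e, k_effort=k_eff), 2) for r, e in offers]
    print(f"{label}: acceptance by effort level {rates}")
```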
Collapse
Affiliation(s)
- Fabien Vinckier
- Motivation, Brain & Behavior lab Institut du Cerveau, Hôpital Pitié-Salpêtrière, Paris, France; Université Paris Cité, Paris, France; Department of Psychiatry, Service Hospitalo-Universitaire, GHU Paris Psychiatrie & Neurosciences, Paris, France.
| | - Claire Jaffre
- Motivation, Brain & Behavior lab Institut du Cerveau, Hôpital Pitié-Salpêtrière, Paris, France; Université Paris Cité, Paris, France; Department of Psychiatry, Service Hospitalo-Universitaire, GHU Paris Psychiatrie & Neurosciences, Paris, France
| | - Claire Gauthier
- Université Paris Cité, Paris, France; Department of Psychiatry, Service Hospitalo-Universitaire, GHU Paris Psychiatrie & Neurosciences, Paris, France
| | - Sarah Smajda
- Université Paris Cité, Paris, France; Department of Psychiatry, Service Hospitalo-Universitaire, GHU Paris Psychiatrie & Neurosciences, Paris, France
| | - Pierre Abdel-Ahad
- Université Paris Cité, Paris, France; Department of Psychiatry, Service Hospitalo-Universitaire, GHU Paris Psychiatrie & Neurosciences, Paris, France
| | - Raphaël Le Bouc
- Motivation, Brain & Behavior lab Institut du Cerveau, Hôpital Pitié-Salpêtrière, Paris, France; Urgences cérébro-vasculaires, Pitié-Salpêtrière Hospital, Sorbonne University, Assistance Publique Hôpitaux de Paris, Paris, France; Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
| | - Jean Daunizeau
- Motivation, Brain & Behavior lab Institut du Cerveau, Hôpital Pitié-Salpêtrière, Paris, France; Sorbonne Universités, Inserm, CNRS, Paris, France
| | - Mylène Fefeu
- Université Paris Cité, Paris, France; Department of Psychiatry, Service Hospitalo-Universitaire, GHU Paris Psychiatrie & Neurosciences, Paris, France
| | - Nicolas Borderies
- Motivation, Brain & Behavior lab Institut du Cerveau, Hôpital Pitié-Salpêtrière, Paris, France
| | - Marion Plaze
- Université Paris Cité, Paris, France; Department of Psychiatry, Service Hospitalo-Universitaire, GHU Paris Psychiatrie & Neurosciences, Paris, France
| | - Raphaël Gaillard
- Université Paris Cité, Paris, France; Department of Psychiatry, Service Hospitalo-Universitaire, GHU Paris Psychiatrie & Neurosciences, Paris, France; Institut Pasteur, experimental neuropathology unit, Paris, France
| | - Mathias Pessiglione
- Motivation, Brain & Behavior lab Institut du Cerveau, Hôpital Pitié-Salpêtrière, Paris, France; Sorbonne Universités, Inserm, CNRS, Paris, France
| |
Collapse
|
34
|
Lin WC, Liu C, Kosillo P, Tai LH, Galarce E, Bateup HS, Lammel S, Wilbrecht L. Transient food insecurity during the juvenile-adolescent period affects adult weight, cognitive flexibility, and dopamine neurobiology. Curr Biol 2022; 32:3690-3703.e5. [PMID: 35863352 PMCID: PMC10519557 DOI: 10.1016/j.cub.2022.06.089] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2021] [Revised: 04/01/2022] [Accepted: 06/29/2022] [Indexed: 10/17/2022]
Abstract
A major challenge for neuroscience, public health, and evolutionary biology is to understand the effects of scarcity and uncertainty on the developing brain. Currently, a significant fraction of children and adolescents worldwide experience insecure access to food. The goal of our work was to test in mice whether the transient experience of insecure versus secure access to food during the juvenile-adolescent period produced lasting differences in learning, decision-making, and the dopamine system in adulthood. We manipulated feeding schedules in mice from postnatal day (P)21 to P40 as food insecure or ad libitum and found that when tested in adulthood (after P60), males with different developmental feeding history showed significant differences in multiple metrics of cognitive flexibility in learning and decision-making. Adult females with different developmental feeding history showed no differences in cognitive flexibility but did show significant differences in adult weight. We next applied reinforcement learning models to these behavioral data. The best fit models suggested that in males, developmental feeding history altered how mice updated their behavior after negative outcomes. This effect was sensitive to task context and reward contingencies. Consistent with these results, in males, we found that the two feeding history groups showed significant differences in the AMPAR/NMDAR ratio of excitatory synapses on nucleus-accumbens-projecting midbrain dopamine neurons and evoked dopamine release in dorsal striatal targets. Together, these data show in a rodent model that transient differences in feeding history in the juvenile-adolescent period can have significant impacts on adult weight, learning, decision-making, and dopamine neurobiology.
Collapse
Affiliation(s)
- Wan Chen Lin
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA 94720, USA
| | - Christine Liu
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA 94720, USA
| | - Polina Kosillo
- Department of Molecular and Cellular Biology, University of California Berkeley, Berkeley, CA 94720, USA
| | - Lung-Hao Tai
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA 94720, USA
| | - Ezequiel Galarce
- Robert Wood Johnson Foundation Health and Society Scholar, University of California Berkeley, Berkeley, CA 94720, USA
| | - Helen S Bateup
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA 94720, USA; Department of Molecular and Cellular Biology, University of California Berkeley, Berkeley, CA 94720, USA; Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - Stephan Lammel
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA 94720, USA; Department of Molecular and Cellular Biology, University of California Berkeley, Berkeley, CA 94720, USA
| | - Linda Wilbrecht
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA 94720, USA; Department of Psychology, University of California Berkeley, Berkeley, CA 94720, USA.
| |
Collapse
|
35
|
Nussenbaum K, Velez JA, Washington BT, Hamling HE, Hartley CA. Flexibility in valenced reinforcement learning computations across development. Child Dev 2022; 93:1601-1615. [PMID: 35596654 PMCID: PMC9831067 DOI: 10.1111/cdev.13791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]
Abstract
Optimal integration of positive and negative outcomes during learning varies depending on an environment's reward statistics. The present study investigated the extent to which children, adolescents, and adults (N = 142, 8-25-year-olds, 55% female, 42% White, 31% Asian, 17% mixed race, and 8% Black; data collected in 2021) adapt their weighting of better-than-expected and worse-than-expected outcomes when learning from reinforcement. Participants made choices across two contexts: one in which weighting positive outcomes more heavily than negative outcomes led to better performance, and one in which the reverse was true. Reinforcement learning modeling revealed that across age, participants shifted their valence biases in accordance with environmental structure. Exploratory analyses revealed strengthening of context-dependent flexibility with increasing age.
Collapse
|
36
|
A comparison of reinforcement learning models of human spatial navigation. Sci Rep 2022; 12:13923. [PMID: 35978035 PMCID: PMC9385652 DOI: 10.1038/s41598-022-18245-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2022] [Accepted: 08/08/2022] [Indexed: 11/09/2022] Open
Abstract
Reinforcement learning (RL) models have been influential in characterizing human learning and decision making, but few studies apply them to human spatial navigation, and even fewer systematically compare RL models under different navigation requirements. Because RL can characterize a navigator's learning strategies quantitatively and continuously, as well as the consistency with which those strategies are used, it provides a novel and important perspective for understanding the marked individual differences in human navigation and for disentangling navigation strategies from navigation performance. One hundred and fourteen participants completed wayfinding tasks in a virtual environment in which different phases manipulated navigation requirements. We compared the performance of five RL models (three model-free, one model-based, and one "hybrid") at fitting navigation behaviors in the different phases. Supporting implications from the prior literature, the hybrid model provided the best fit regardless of navigation requirements, suggesting that most participants rely on a blend of model-free (route-following) and model-based (cognitive-mapping) learning in such navigation scenarios. Furthermore, consistent with a key prediction, the hybrid model showed a correlation between the weight on model-based learning (i.e., navigation strategy) and the navigator's exploration vs. exploitation tendency (i.e., consistency of using that strategy), which was modulated by navigation task requirements. Together, we not only show how computational findings from RL align with the spatial navigation literature, but also reveal how the relationship between navigation strategy and a person's consistency in using that strategy changes as navigation requirements change. A generic sketch of the hybrid weighting scheme is given below.
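The "hybrid" account combines model-free and model-based values with a weighting parameter. A generic way to write the decision stage of such a hybrid (not the specific models compared in this paper) is shown below; w is the weight on model-based value and beta the choice-consistency parameter, both treated here as given, and the numerical values are assumptions.

```python
import math

# Generic decision stage of a hybrid learner: choice values are a weighted blend of
# model-based (cognitive-map-like) and model-free (route-like) values, and the softmax
# inverse temperature captures how consistently the blended values drive choice.
# This is a schematic of the hybrid idea, not the specific models fit in the study.
def hybrid_choice_probs(q_mb, q_mf, w=0.6, beta=3.0):
    blended = [w * mb + (1.0 - w) * mf for mb, mf in zip(q_mb, q_mf)]
    logits = [beta * v for v in blended]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Illustrative values for three candidate routes/goals (assumed numbers):
q_model_based = [0.8, 0.4, 0.1]   # values from a learned cognitive map
q_model_free  = [0.2, 0.7, 0.1]   # values from cached route-following
for w in (0.0, 0.5, 1.0):
    probs = hybrid_choice_probs(q_model_based, q_model_free, w=w)
    print(f"w = {w:.1f} -> choice probabilities {[round(p, 2) for p in probs]}")
```

In this scheme, w captures the navigation strategy and beta the consistency of applying it, which is exactly the pair of quantities whose correlation the study examines.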
Collapse
|
37
|
Fengler A, Bera K, Pedersen ML, Frank MJ. Beyond Drift Diffusion Models: Fitting a Broad Class of Decision and Reinforcement Learning Models with HDDM. J Cogn Neurosci 2022; 34:1780-1805. [PMID: 35939629 DOI: 10.1162/jocn_a_01902] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Computational modeling has become a central aspect of research in the cognitive neurosciences. As the field matures, it is increasingly important to move beyond standard models to quantitatively assess models with richer dynamics that may better reflect underlying cognitive and neural processes. For example, sequential sampling models (SSMs) are a general class of models of decision-making intended to capture processes jointly giving rise to RT distributions and choice data in n-alternative choice paradigms. A number of model variations are of theoretical interest, but empirical data analysis has historically been tied to a small subset for which likelihood functions are analytically tractable. Advances in methods designed for likelihood-free inference have recently made it computationally feasible to consider a much larger spectrum of SSMs. In addition, recent work has motivated the combination of SSMs with reinforcement learning models, which had historically been considered in separate literatures. Here, we provide a significant addition to the widely used HDDM Python toolbox and include a tutorial for how users can easily fit and assess a (user-extensible) wide variety of SSMs and how they can be combined with reinforcement learning models. The extension comes with batteries included: model visualization tools, posterior predictive checks, and the ability to link trial-wise neural signals with model parameters via hierarchical Bayesian regression.
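Sequential sampling models of the kind HDDM fits can be illustrated with a few lines of simulation: noisy evidence accumulates from a starting point toward one of two bounds, jointly producing a choice and a response time. The sketch below is a plain Euler-Maruyama simulator of a basic drift-diffusion process for intuition only; it does not use or reproduce the HDDM toolbox API, and all parameter values are assumptions.

```python
import random

# Plain drift-diffusion simulator (Euler-Maruyama) for intuition about sequential
# sampling models: evidence drifts noisily between two bounds, and the bound that is
# hit first determines both the choice and the response time. This is illustrative
# only and does not use the HDDM toolbox API; all parameter values are assumptions.
def simulate_ddm(drift=0.8, bound=1.0, start=0.0, noise=1.0,
                 non_decision=0.3, dt=0.001, seed=None):
    rng = random.Random(seed)
    x, t = start, 0.0
    while abs(x) < bound:
        x += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    choice = 1 if x >= bound else 0          # upper vs lower bound
    return choice, t + non_decision          # RT includes non-decision time

trials = [simulate_ddm(seed=i) for i in range(500)]
upper = [rt for c, rt in trials if c == 1]
mean_rt = sum(rt for _, rt in trials) / len(trials)
print(f"P(upper bound) = {len(upper) / len(trials):.2f}, mean RT = {mean_rt:.2f} s")
```

The likelihood-free methods discussed in the paper matter precisely because, for richer variants of this generative process, the mapping from parameters to the resulting choice/RT distributions has no closed form.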
Collapse
|
38
|
Lan DCL, Browning M. What Can Reinforcement Learning Models of Dopamine and Serotonin Tell Us about the Action of Antidepressants? COMPUTATIONAL PSYCHIATRY (CAMBRIDGE, MASS.) 2022; 6:166-188. [PMID: 38774776 PMCID: PMC11104395 DOI: 10.5334/cpsy.83] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/30/2021] [Accepted: 06/29/2022] [Indexed: 11/20/2022]
Abstract
Although evidence suggests that antidepressants are effective at treating depression, the mechanisms behind antidepressant action remain unclear, especially at the cognitive/computational level. In recent years, reinforcement learning (RL) models have increasingly been used to characterise the roles of neurotransmitters and to probe the computations that might be altered in psychiatric disorders like depression. Hence, RL models might present an opportunity for us to better understand the computational mechanisms underlying antidepressant effects. Moreover, RL models may also help us shed light on how these computations may be implemented in the brain (e.g., in midbrain, striatal, and prefrontal regions) and how these neural mechanisms may be altered in depression and remediated by antidepressant treatments. In this paper, we evaluate the ability of RL models to help us understand the processes underlying antidepressant action. To do this, we review the preclinical literature on the roles of dopamine and serotonin in RL, draw links between these findings and clinical work investigating computations altered in depression, and appraise the evidence linking modification of RL processes to antidepressant function. Overall, while there is no shortage of promising ideas about the computational mechanisms underlying antidepressant effects, there is insufficient evidence directly implicating these mechanisms in the response of depressed patients to antidepressant treatment. Consequently, future studies should investigate these mechanisms in samples of depressed patients and assess whether modifications in RL processes mediate the clinical effect of antidepressant treatments.
Collapse
Affiliation(s)
- Denis C. L. Lan
- Department of Experimental Psychology, University of Oxford, Oxford, GB
| | | |
Collapse
|
39
|
Eckstein MK, Master SL, Dahl RE, Wilbrecht L, Collins AGE. Reinforcement learning and Bayesian inference provide complementary models for the unique advantage of adolescents in stochastic reversal. Dev Cogn Neurosci 2022; 55:101106. [PMID: 35537273 PMCID: PMC9108470 DOI: 10.1016/j.dcn.2022.101106] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2021] [Revised: 03/01/2022] [Accepted: 03/25/2022] [Indexed: 12/02/2022] Open
Abstract
During adolescence, youth venture out, explore the wider world, and are challenged to learn how to navigate novel and uncertain environments. We investigated how performance changes across adolescent development in a stochastic, volatile reversal-learning task that uniquely taxes the balance of persistence and flexibility. In a sample of 291 participants aged 8-30, we found that in the mid-teen years, adolescents outperformed both younger and older participants. We developed two independent cognitive models, based on Reinforcement learning (RL) and Bayesian inference (BI). The RL parameter for learning from negative outcomes and the BI parameters specifying participants' mental models were closest to optimal in mid-teen adolescents, suggesting a central role in adolescent cognitive processing. By contrast, persistence and noise parameters improved monotonically with age. We distilled the insights of RL and BI using principal component analysis and found that three shared components interacted to form the adolescent performance peak: adult-like behavioral quality, child-like time scales, and developmentally-unique processing of positive feedback. This research highlights adolescence as a neurodevelopmental window that can create performance advantages in volatile and uncertain environments. It also shows how detailed insights can be gleaned by using cognitive models in new ways.
Collapse
Affiliation(s)
| | | | - Ronald E Dahl
- Institute of Human Development, University of California, Berkeley, 2121 Berkeley Way West, Berkeley, CA, USA
| | - Linda Wilbrecht
- Department of Psychology, University of California, Berkeley, 2121 Berkeley Way West, Berkeley, CA, USA; Helen Wills Neuroscience Institute, University of California, Berkeley, 175 Li Ka Shing Center, Berkeley, CA 94720, USA
| | | |
Collapse
|
40
|
Palminteri S, Lebreton M. The computational roots of positivity and confirmation biases in reinforcement learning. Trends Cogn Sci 2022; 26:607-621. [PMID: 35662490 DOI: 10.1016/j.tics.2022.04.005] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2021] [Revised: 04/13/2022] [Accepted: 04/18/2022] [Indexed: 12/16/2022]
Abstract
Humans do not integrate new information objectively: outcomes carrying a positive affective value and evidence confirming one's own prior belief are overweighed. Until recently, theoretical and empirical accounts of the positivity and confirmation biases assumed them to be specific to 'high-level' belief updates. We present evidence against this account. Learning rates in reinforcement learning (RL) tasks, estimated across different contexts and species, generally present the same characteristic asymmetry, suggesting that belief and value updating processes share key computational principles and distortions. This bias generates over-optimistic expectations about the probability of making the right choices and, consequently, generates over-optimistic reward expectations. We discuss the normative and neurobiological roots of these RL biases and their position within the greater picture of behavioral decision-making theories.
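The asymmetry discussed here is usually formalized as valence- or confirmation-dependent learning rates: a larger learning rate for prediction errors that favor the chosen option (or disfavor the unchosen one) than for disconfirming errors. A generic two-armed-bandit sketch of a confirmation-biased update is given below; it is a textbook-style illustration, not the authors' analysis, and the full-feedback setup and parameter values are assumptions.

```python
import math
import random

# Textbook-style sketch of a confirmation-biased learner in a two-armed bandit:
# prediction errors that favour the chosen option (or disfavour the unchosen one)
# are learned from faster than disconfirming errors. Parameter values are assumptions.
ALPHA_CONF, ALPHA_DISC, BETA = 0.35, 0.10, 5.0

def run(n_trials=500, p_reward=(0.6, 0.4), seed=0):
    rng = random.Random(seed)
    q = [0.5, 0.5]
    for _ in range(n_trials):
        p1 = 1.0 / (1.0 + math.exp(-BETA * (q[1] - q[0])))
        choice = 1 if rng.random() < p1 else 0
        unchosen = 1 - choice

        # full-feedback version: outcomes observed for both options (an assumption)
        r_ch = 1.0 if rng.random() < p_reward[choice] else 0.0
        r_un = 1.0 if rng.random() < p_reward[unchosen] else 0.0

        delta_ch = r_ch - q[choice]
        delta_un = r_un - q[unchosen]
        # confirmatory: good news about the chosen, bad news about the unchosen
        q[choice] += (ALPHA_CONF if delta_ch > 0 else ALPHA_DISC) * delta_ch
        q[unchosen] += (ALPHA_DISC if delta_un > 0 else ALPHA_CONF) * delta_un
    return q

print([round(v, 2) for v in run()])
```

Running this, the learned value of the frequently chosen option overshoots its true reward rate, which is the over-optimistic expectation the review attributes to the bias.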
Collapse
Affiliation(s)
- Stefano Palminteri
- Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et Recherche Médicale, Paris, France; Département d'Études Cognitives, Ecole Normale Supérieure, Paris, France; Université de Recherche Paris Sciences et Lettres, Paris, France.
| | - Maël Lebreton
- Paris School of Economics, Paris, France; LabNIC, Department of Fundamental Neurosciences, University of Geneva, Geneva, Switzerland; Swiss Center for Affective Science, Geneva, Switzerland.
| |
Collapse
|
41
|
Pike AC, Robinson OJ. Reinforcement Learning in Patients With Mood and Anxiety Disorders vs Control Individuals: A Systematic Review and Meta-analysis. JAMA Psychiatry 2022; 79:313-322. [PMID: 35234834 PMCID: PMC8892374 DOI: 10.1001/jamapsychiatry.2022.0051] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
Abstract
IMPORTANCE Computational psychiatry studies have investigated how reinforcement learning may be different in individuals with mood and anxiety disorders compared with control individuals, but results are inconsistent. OBJECTIVE To assess whether there are consistent differences in reinforcement-learning parameters between patients with depression or anxiety and control individuals. DATA SOURCES Web of Knowledge, PubMed, Embase, and Google Scholar searches were performed between November 15, 2019, and December 6, 2019, and repeated on December 3, 2020, and February 23, 2021, with keywords (reinforcement learning) AND (computational OR model) AND (depression OR anxiety OR mood). STUDY SELECTION Studies were included if they fit reinforcement-learning models to human choice data from a cognitive task with rewards or punishments, had a case-control design including participants with mood and/or anxiety disorders and healthy control individuals, and included sufficient information about all parameters in the models. DATA EXTRACTION AND SYNTHESIS Articles were assessed for inclusion according to MOOSE guidelines. Participant-level parameters were extracted from included articles, and a conventional meta-analysis was performed using a random-effects model. Subsequently, these parameters were used to simulate choice performance for each participant on benchmarking tasks in a simulation meta-analysis. Models were fitted, parameters were extracted using bayesian model averaging, and differences between patients and control individuals were examined. Overall effect sizes across analytic strategies were inspected. MAIN OUTCOMES AND MEASURES The primary outcomes were estimated reinforcement-learning parameters (learning rate, inverse temperature, reward learning rate, and punishment learning rate). RESULTS A total of 27 articles were included (3085 participants, 1242 of whom had depression and/or anxiety). In the conventional meta-analysis, patients showed lower inverse temperature than control individuals (standardized mean difference [SMD], -0.215; 95% CI, -0.354 to -0.077), although no parameters were common across all studies, limiting the ability to infer differences. In the simulation meta-analysis, patients showed greater punishment learning rates (SMD, 0.107; 95% CI, 0.107 to 0.108) and slightly lower reward learning rates (SMD, -0.021; 95% CI, -0.022 to -0.020) relative to control individuals. The simulation meta-analysis showed no meaningful difference in inverse temperature between patients and control individuals (SMD, 0.003; 95% CI, 0.002 to 0.004). CONCLUSIONS AND RELEVANCE The simulation meta-analytic approach introduced in this article for inferring meta-group differences from heterogeneous computational psychiatry studies indicated elevated punishment learning rates in patients compared with control individuals. This difference may promote and uphold negative affective bias symptoms and hence constitute a potential mechanistic treatment target for mood and anxiety disorders.
Collapse
Affiliation(s)
- Alexandra C. Pike
- Anxiety Lab, Neuroscience and Mental Health Group, Institute of Cognitive Neuroscience, University College London, London, United Kingdom
| | - Oliver J. Robinson
- Anxiety Lab, Neuroscience and Mental Health Group, Institute of Cognitive Neuroscience, University College London, London, United Kingdom; Research Department of Clinical, Educational and Health Psychology, University College London, London, United Kingdom
| |
Collapse
|
42
|
Effective of Smart Mathematical Model by Machine Learning Classifier on Big Data in Healthcare Fast Response. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:6927170. [PMID: 35251298 PMCID: PMC8890881 DOI: 10.1155/2022/6927170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/11/2021] [Revised: 02/02/2022] [Accepted: 02/07/2022] [Indexed: 11/17/2022]
Abstract
In the past few years, big data related to healthcare has become more important, owing to the abundance of data, the increasing cost of healthcare, and concerns about healthcare privacy. The aim is to create, analyze, and process large and complex data that cannot be handled by traditional methods. The proposed method is based on classifying data into several classes using data weights derived from features extracted from the big data. Three important criteria were used to evaluate the study and to benchmark it against previous studies on a standard dataset.
Collapse
|
43
|
Kieslich K, Valton V, Roiser JP. Pleasure, Reward Value, Prediction Error and Anhedonia. Curr Top Behav Neurosci 2022; 58:281-304. [PMID: 35156187 DOI: 10.1007/7854_2021_295] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
In order to develop effective treatments for anhedonia we need to understand its underlying neurobiological mechanisms. Anhedonia is conceptually strongly linked to reward processing, which involves a variety of cognitive and neural operations. This chapter reviews the evidence for impairments in experiencing hedonic response (pleasure), reward valuation and reward learning based on outcomes (commonly conceptualised in terms of "reward prediction error"). Synthesising behavioural and neuroimaging findings, we examine case-control studies of patients with depression and schizophrenia, including those focusing specifically on anhedonia. Overall, there is reliable evidence that depression and schizophrenia are associated with disrupted reward processing. In contrast to the historical definition of anhedonia, there is surprisingly limited evidence for impairment in the ability to experience pleasure in depression and schizophrenia. There is some evidence that learning about reward and reward prediction error signals are impaired in depression and schizophrenia, but the literature is inconsistent. The strongest evidence is for impairments in the representation of reward value and how this is used to guide action. Future studies would benefit from focusing on impairments in reward processing specifically in anhedonic samples, including transdiagnostically, and from using designs separating different components of reward processing, formulating them in computational terms, and moving beyond cross-sectional designs to provide an assessment of causality.
Collapse
Affiliation(s)
- Karel Kieslich
- Institute of Cognitive Neuroscience, University College London, London, UK
| | - Vincent Valton
- Institute of Cognitive Neuroscience, University College London, London, UK
| | - Jonathan P Roiser
- Institute of Cognitive Neuroscience, University College London, London, UK.
| |
Collapse
|
44
|
Collins AGE, Shenhav A. Advances in modeling learning and decision-making in neuroscience. Neuropsychopharmacology 2022; 47:104-118. [PMID: 34453117 PMCID: PMC8617262 DOI: 10.1038/s41386-021-01126-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/03/2021] [Revised: 07/14/2021] [Accepted: 07/22/2021] [Indexed: 02/07/2023]
Abstract
An organism's survival depends on its ability to learn about its environment and to make adaptive decisions in the service of achieving the best possible outcomes in that environment. To study the neural circuits that support these functions, researchers have increasingly relied on models that formalize the computations required to carry them out. Here, we review the recent history of computational modeling of learning and decision-making, and how these models have been used to advance understanding of prefrontal cortex function. We discuss how such models have advanced from their origins in basic algorithms of updating and action selection to increasingly account for complexities in the cognitive processes required for learning and decision-making, and the representations over which they operate. We further discuss how a deeper understanding of the real-world complexities in these computations has shed light on the fundamental constraints on optimal behavior, and on the complex interactions between corticostriatal pathways to determine such behavior. The continuing and rapid development of these models holds great promise for understanding the mechanisms by which animals adapt to their environments, and what leads to maladaptive forms of learning and decision-making within clinical populations.
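The "basic algorithms of updating and action selection" that the review takes as its starting point are typically a delta-rule (Q-learning-style) value update paired with a softmax choice rule. The sketch below shows those two building blocks on a two-armed bandit; the task and parameter values are illustrative assumptions rather than anything specified in the review. The more elaborate models the review discusses layer additional structure, such as richer state representations and corticostriatal pathway interactions, on top of these two operations.

```python
# Q-learning with softmax action selection on a two-armed bandit:
# the canonical "updating + action selection" building blocks.
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 0.2, 5.0            # learning rate and inverse temperature (assumed)
p_reward = np.array([0.7, 0.3])   # true reward probabilities (assumed)
Q = np.zeros(2)                   # action values

for t in range(200):
    # Softmax action selection: higher-valued actions are chosen more often.
    p = np.exp(beta * Q) / np.exp(beta * Q).sum()
    a = rng.choice(2, p=p)
    r = float(rng.random() < p_reward[a])
    Q[a] += alpha * (r - Q[a])    # delta-rule update of the chosen action's value

print("learned action values:", np.round(Q, 2))
```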
Collapse
Affiliation(s)
- Anne G E Collins
- Department of Psychology and Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA.
| | - Amitai Shenhav
- Department of Cognitive, Linguistic, & Psychological Sciences and Carney Institute for Brain Science, Brown University, Providence, RI, USA.
| |
Collapse
|
45
|
Yoo AH, Collins AGE. How Working Memory and Reinforcement Learning Are Intertwined: A Cognitive, Neural, and Computational Perspective. J Cogn Neurosci 2021; 34:551-568. [PMID: 34942642 DOI: 10.1162/jocn_a_01808] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Reinforcement learning and working memory are two core processes of human cognition and are often considered cognitively, neuroscientifically, and algorithmically distinct. Here, we show that the brain networks that support them actually overlap significantly and that they are less distinct cognitive processes than often assumed. We review literature demonstrating the benefits of considering each process to explain properties of the other and highlight recent work investigating their more complex interactions. We discuss how future research in both computational and cognitive sciences can benefit from one another, suggesting that a key missing piece for artificial agents to learn to behave with more human-like efficiency is taking working memory's role in learning seriously. This review highlights the risks of neglecting the interplay between different processes when studying human behavior (in particular when considering individual differences). We emphasize the importance of investigating these dynamics to build a comprehensive understanding of human cognition.
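One common way this interplay is made concrete in computational work is to pair a slow, incremental reinforcement learner with a fast but decaying, capacity-limited working-memory store and to mix their choice policies. The sketch below follows that general recipe as a minimal illustration; the mixing rule, decay scheme, and parameter values are simplifying assumptions, not the model advocated in the paper.

```python
# Sketch of an RL + working-memory (WM) mixture: RL learns slowly from reward,
# WM stores the last rewarded response immediately but decays over trials.
import numpy as np

rng = np.random.default_rng(2)
n_stimuli, n_actions = 6, 3
alpha, beta = 0.1, 8.0        # RL learning rate, softmax inverse temperature (assumed)
decay, w_wm = 0.15, 0.7       # WM decay per trial and mixture weight (assumed)

Q = np.ones((n_stimuli, n_actions)) / n_actions   # incremental RL values
W = np.ones((n_stimuli, n_actions)) / n_actions   # fast WM associations
correct = rng.integers(n_actions, size=n_stimuli) # hidden stimulus-action mapping

def softmax(x):
    e = np.exp(beta * (x - x.max()))
    return e / e.sum()

for t in range(300):
    s = rng.integers(n_stimuli)
    policy = w_wm * softmax(W[s]) + (1 - w_wm) * softmax(Q[s])  # mixed choice policy
    a = rng.choice(n_actions, p=policy)
    r = float(a == correct[s])
    Q[s, a] += alpha * (r - Q[s, a])       # slow, incremental RL update
    if r:                                  # WM stores the rewarded response at once
        W[s] = np.eye(n_actions)[a]
    W += decay * (1.0 / n_actions - W)     # WM decays back toward uniform each trial

print("RL has learned the mapping for",
      int((Q.argmax(axis=1) == correct).sum()), "of", n_stimuli, "stimuli")
```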
Collapse
|
46
|
FeldmanHall O, Nassar MR. The computational challenge of social learning. Trends Cogn Sci 2021; 25:1045-1057. [PMID: 34583876 PMCID: PMC8585698 DOI: 10.1016/j.tics.2021.09.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2021] [Revised: 08/31/2021] [Accepted: 09/01/2021] [Indexed: 10/20/2022]
Abstract
The complex reward structure of the social world and the uncertainty endemic to social contexts pose a challenge for modeling. For example, during social interactions, the actions of one person influence the internal states of another. These social dependencies make it difficult to formalize social learning problems in a mathematically tractable way. While it is tempting to dispense with these complexities, they are a defining feature of social life. Because the structure of social interactions challenges the simplifying assumptions often made in models, social interactions make an ideal testbed for computational models of cognition. By adopting a framework that embeds existing social knowledge into the model, we can go beyond explaining behaviors in laboratory tasks to explaining those observed in the wild.
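One minimal way to make the "dependency" point concrete is an observer who maintains a belief about a partner's hidden preference and updates it after each observed choice. The Beta-Bernoulli sketch below is an illustrative toy under that assumption, not a model proposed in the article; it also deliberately omits the mutual dependency the authors emphasize, in which the observer's own actions would in turn shift the partner's internal state.

```python
# Minimal illustration of learning about another agent: an observer infers a
# partner's hidden preference theta (probability of choosing option A) from the
# partner's observed choices, using a Beta-Bernoulli update.
import numpy as np

rng = np.random.default_rng(3)
theta_true = 0.75            # partner's hidden preference for option A (assumed)
a_count, b_count = 1.0, 1.0  # Beta(1, 1) prior over theta

for t in range(30):
    choice_A = bool(rng.random() < theta_true)  # observe one choice by the partner
    a_count += choice_A
    b_count += not choice_A
    posterior_mean = a_count / (a_count + b_count)

print(f"true preference {theta_true:.2f}, inferred {posterior_mean:.2f}")
```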
Collapse
Affiliation(s)
- Oriel FeldmanHall
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912, USA; Carney Institute for Brain Sciences, Brown University, Providence, RI 02912, USA.
| | - Matthew R Nassar
- Carney Institute for Brain Sciences, Brown University, Providence, RI 02912, USA; Department of Neuroscience, Brown University, Providence, RI 02912, USA
| |
Collapse
|
47
|
Bradfield L, Balleine B. Editorial overview: Value-based decision making: control, value, and context in action. Curr Opin Behav Sci 2021. [DOI: 10.1016/j.cobeha.2021.09.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|