1
Bein O, Niv Y. Schemas, reinforcement learning and the medial prefrontal cortex. Nat Rev Neurosci 2025; 26:141-157. [PMID: 39775183 DOI: 10.1038/s41583-024-00893-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/03/2024] [Indexed: 01/11/2025]
Abstract
Schemas are rich and complex knowledge structures about the typical unfolding of events in a context; for example, a schema of a dinner at a restaurant. In this Perspective, we suggest that reinforcement learning (RL), a computational theory of learning the structure of the world and relevant goal-oriented behaviour, underlies schema learning. We synthesize literature about schemas and RL to offer that three RL principles might govern the learning of schemas: learning via prediction errors, constructing hierarchical knowledge using hierarchical RL, and dimensionality reduction through learning a simplified and abstract representation of the world. We then suggest that the orbitomedial prefrontal cortex is involved in both schemas and RL due to its involvement in dimensionality reduction and in guiding memory reactivation through interactions with posterior brain regions. Last, we hypothesize that the amount of dimensionality reduction might underlie gradients of involvement along the ventral-dorsal and posterior-anterior axes of the orbitomedial prefrontal cortex. More specific and detailed representations might engage the ventral and posterior parts, whereas abstraction might shift representations towards the dorsal and anterior parts of the medial prefrontal cortex.
Affiliation(s)
- Oded Bein
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
- Weill Cornell Institute of Geriatric Psychiatry, Department of Psychiatry, Weill Cornell Medicine, New York, NY, USA.
- Yael Niv
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Psychology Department, Princeton University, Princeton, NJ, USA
2
Dołęga K, Mentec I, Cleeremans A. How does the quality space come to be? Trends Cogn Sci 2025; 29:107-108. [PMID: 39550303 DOI: 10.1016/j.tics.2024.10.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2024] [Revised: 10/07/2024] [Accepted: 10/08/2024] [Indexed: 11/18/2024]
Affiliation(s)
- Krzysztof Dołęga
- Consciousness, Cognition & Computation Group, Center for Research in Cognition & Neurosciences, ULB Institute of Neuroscience, Université libre de Bruxelles, Brussels, Belgium
- Inès Mentec
- Consciousness, Cognition & Computation Group, Center for Research in Cognition & Neurosciences, ULB Institute of Neuroscience, Université libre de Bruxelles, Brussels, Belgium
- Axel Cleeremans
- Consciousness, Cognition & Computation Group, Center for Research in Cognition & Neurosciences, ULB Institute of Neuroscience, Université libre de Bruxelles, Brussels, Belgium.
3
Moreno-Rodriguez S, Béranger B, Volle E, Lopez-Persem A. The human reward system encodes the subjective value of ideas during creative thinking. Commun Biol 2025; 8:37. [PMID: 39794481 PMCID: PMC11723971 DOI: 10.1038/s42003-024-07427-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Accepted: 12/18/2024] [Indexed: 01/13/2025] Open
Abstract
Creative thinking involves the evaluation of one's ideas in order to select the best one, but the cognitive and neural mechanisms underlying this evaluation remain unclear. Using a combination of creativity and rating tasks, this study demonstrates that individuals attribute subjective values to their ideas, as a relative balance of their originality and adequacy. This relative balance depends on individual preferences and predicts individuals' creative abilities. Using functional Magnetic Resonance Imaging, we find that the Default Mode and the Executive Control Networks respectively encode the originality and adequacy of ideas, and that the human reward system encodes their subjective value. Interestingly, the relative functional connectivity of the Default Mode and Executive Control Networks with the human reward system correlates with the relative balance of adequacy and originality in individuals' preferences. These results add valuation to the incomplete behavioral and neural accounts of creativity, offering perspectives on the influence of individual preferences on creative abilities.
Affiliation(s)
- Sarah Moreno-Rodriguez
- FrontLab, Institut du Cerveau - Paris Brain Institute - ICM, INSERM, CNRS, Hôpital de la Pitié Salpêtrière, AP-HP, Sorbonne University, Paris, France.
- Benoît Béranger
- CENIR, Institut du Cerveau - Paris Brain Institute - ICM, INSERM, CNRS, Hôpital de la Pitié Salpêtrière, AP-HP, Sorbonne University, Paris, France
- Emmanuelle Volle
- FrontLab, Institut du Cerveau - Paris Brain Institute - ICM, INSERM, CNRS, Hôpital de la Pitié Salpêtrière, AP-HP, Sorbonne University, Paris, France
- Alizée Lopez-Persem
- FrontLab, Institut du Cerveau - Paris Brain Institute - ICM, INSERM, CNRS, Hôpital de la Pitié Salpêtrière, AP-HP, Sorbonne University, Paris, France.
4
Shenhav A. The affective gradient hypothesis: an affect-centered account of motivated behavior. Trends Cogn Sci 2024; 28:1089-1104. [PMID: 39322489 PMCID: PMC11620945 DOI: 10.1016/j.tics.2024.08.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 08/09/2024] [Accepted: 08/12/2024] [Indexed: 09/27/2024]
Abstract
Everyone agrees that feelings and actions are intertwined, but cannot agree how. According to dominant models, actions are directed by estimates of value and these values shape or are shaped by affect. I propose instead that affect is the only form of value that drives actions. Our mind constantly represents potential future states and how they would make us feel. These states collectively form a gradient reflecting feelings we could experience depending on actions we take. Motivated behavior reflects the process of traversing this affective gradient, towards desirable states and away from undesirable ones. This affective gradient hypothesis solves the puzzle of where values and goals come from, and offers a parsimonious account of apparent conflicts between emotion and cognition.
Affiliation(s)
- Amitai Shenhav
- Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA.
5
Moneta N, Grossman S, Schuck NW. Representational spaces in orbitofrontal and ventromedial prefrontal cortex: task states, values, and beyond. Trends Neurosci 2024; 47:1055-1069. [PMID: 39547861 DOI: 10.1016/j.tins.2024.10.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2024] [Revised: 10/16/2024] [Accepted: 10/17/2024] [Indexed: 11/17/2024]
Abstract
The orbitofrontal cortex (OFC) and ventromedial prefrontal cortex (vmPFC) play a key role in decision-making and encode task states in addition to expected value. We review evidence suggesting a connection between value and state representations and argue that OFC/vmPFC integrate stimulus, context, and outcome information. Comparable encoding principles emerge in late layers of deep reinforcement learning (RL) models, where single nodes exhibit similar forms of mixed-selectivity, which enables flexible readout of relevant variables by downstream neurons. Based on these lines of evidence, we suggest that outcome-maximization leads to complex representational spaces that are insufficiently characterized by linear value signals that have been the focus of most prior research on the topic. Major outstanding questions concern the role of OFC/vmPFC in learning across tasks, in encoding of task-irrelevant aspects, and the role of hippocampus-PFC interactions.
Affiliation(s)
- Nir Moneta
- Institute of Psychology, Universität Hamburg, 20146 Hamburg, Germany; Einstein Center for Neurosciences Berlin, Charité Universitätsmedizin Berlin, 10117, Berlin, Germany.
- Shany Grossman
- Institute of Psychology, Universität Hamburg, 20146 Hamburg, Germany.
- Nicolas W Schuck
- Institute of Psychology, Universität Hamburg, 20146 Hamburg, Germany; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, 14195 Berlin, Germany.
6
Prater Fahey M, Yee DM, Leng X, Tarlow M, Shenhav A. Motivational context determines the impact of aversive outcomes on mental effort allocation. Cognition 2024; 254:105973. [PMID: 39413448 DOI: 10.1016/j.cognition.2024.105973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 09/26/2024] [Accepted: 09/29/2024] [Indexed: 10/18/2024]
Abstract
It is well known that people will exert effort on a task if sufficiently motivated, but how they distribute these efforts across different strategies (e.g., efficiency vs. caution) remains uncertain. Past work has shown that people invest effort differently for potential positive outcomes (rewards) versus potential negative outcomes (penalties). However, this research failed to account for differences in the context in which negative outcomes motivate someone - either as punishment or reinforcement. It is therefore unclear whether effort profiles differ as a function of outcome valence, motivational context, or both. Using computational modeling and our novel Multi-Incentive Control Task, we show that the influence of aversive outcomes on one's effort profile is entirely determined by their motivational context. Participants (N = 91) favored increased caution in response to larger penalties for incorrect responses, and favored increased efficiency in response to larger reinforcement for correct responses, whether positively or negatively incentivized. STATEMENT OF RELEVANCE: People have to constantly decide how to allocate their mental effort, and in doing so can be motivated by both the positive outcomes that effort accrues and the negative outcomes that effort avoids. For example, someone might persist on a project for work in the hopes of being promoted or to avoid being reprimanded or even fired. Understanding how people weigh these different types of incentives is critical for understanding variability in human achievement as well as sources of motivational impairments (e.g., in major depression). We show that people not only consider both potential positive and negative outcomes when allocating mental effort, but that the profile of effort they engage under negative incentives differs depending on whether that outcome is contingent on sustaining good performance (negative reinforcement) or avoiding bad performance (punishment). Clarifying the motivational factors that determine effort exertion is an important step for understanding motivational impairments in psychopathology.
Affiliation(s)
- Mahalia Prater Fahey
- Department of Cognitive and Psychological Sciences, Carney Institute for Brain Science, Brown University, USA.
- D M Yee
- Department of Cognitive and Psychological Sciences, Carney Institute for Brain Science, Brown University, USA.
- Xiamin Leng
- Department of Cognitive and Psychological Sciences, Carney Institute for Brain Science, Brown University, USA; Department of Psychology, Helen Wills Neuroscience Institute, UC Berkeley, USA
- Maisy Tarlow
- Department of Cognitive and Psychological Sciences, Carney Institute for Brain Science, Brown University, USA
- Amitai Shenhav
- Department of Cognitive and Psychological Sciences, Carney Institute for Brain Science, Brown University, USA; Department of Psychology, Helen Wills Neuroscience Institute, UC Berkeley, USA
7
Wischnewski M, Hörberg MOY, Schutter DJLG. Electrophysiological correlates of (mis)judging social information. Psychophysiology 2024; 61:e14590. [PMID: 38632827 DOI: 10.1111/psyp.14590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 04/02/2024] [Accepted: 04/03/2024] [Indexed: 04/19/2024]
Abstract
Social information can be used to optimize decision-making. However, the simultaneous presentation of multiple sources of advice can lead to a distinction bias in judging the validity of the information. While the involvement of event-related potential (ERP) components in social information processing has been studied, how they are modulated by (mis)judging an advisor's information validity remains unknown. In two experiments participants performed a decision-making task with highly accurate or inaccurate cues. Each experiment consisted of an initial, learning, and test phase. During the learning phase, three advice cues were simultaneously presented and the validity of them had to be assessed. The effect of different cue constellations on ERPs was investigated. In the subsequent test phase, the willingness to follow or oppose an advice cue was tested. Results demonstrated the distinction bias with participants over or underestimating the accuracy of the most uncertain cues. The P2 amplitude was significantly increased during cue presentation when advisors were in disagreement as compared to when all were in agreement, regardless of cue validity. Further, a larger P3 amplitude during outcome presentation was found when advisors were in disagreement and increased with more informative cues. As such, the most uncertain cues were related to the smallest P3 amplitude. The findings hint at the possible role of P3 in judging and learning the predictability of social cues. This study provides novel insights into the role of P2 and P3 components during the judgment of social information validity.
Affiliation(s)
- Miles Wischnewski
- Department of Experimental Psychology, University of Groningen, Groningen, the Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands
- Department of Biomedical Engineering, University of Minnesota, Minneapolis, Minnesota, USA
- Michael O Y Hörberg
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands
- Dennis J L G Schutter
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands
- Department of Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht, the Netherlands
8
Hall AF, Browning M, Huys QJM. The computational structure of consummatory anhedonia. Trends Cogn Sci 2024; 28:541-553. [PMID: 38423829 DOI: 10.1016/j.tics.2024.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 01/18/2024] [Accepted: 01/18/2024] [Indexed: 03/02/2024]
Abstract
Anhedonia is a reduction in enjoyment, motivation, or interest. It is common across mental health disorders and a harbinger of poor treatment outcomes. The enjoyment aspect, termed 'consummatory anhedonia', in particular poses fundamental questions about how the brain constructs rewards: what processes determine how intensely a reward is experienced? Here, we outline limitations of existing computational conceptualisations of consummatory anhedonia. We then suggest a richer reinforcement learning (RL) account of consummatory anhedonia with a reconceptualisation of subjective hedonic experience in terms of goal progress. This accounts qualitatively for the impact of stress, dysfunctional cognitions, and maladaptive beliefs on hedonic experience. The model also offers new views on the treatments for anhedonia.
Affiliation(s)
- Anna F Hall
- Applied Computational Psychiatry Lab, Mental Health Neuroscience Department, Division of Psychiatry and Max Planck Centre for Computational Psychiatry and Ageing Research, Queen Square Institute of Neurology, University College London, London, UK
- Michael Browning
- Department of Psychiatry, University of Oxford, Oxford, UK; Oxford Health NHS Trust, Oxford, UK
- Quentin J M Huys
- Applied Computational Psychiatry Lab, Mental Health Neuroscience Department, Division of Psychiatry and Max Planck Centre for Computational Psychiatry and Ageing Research, Queen Square Institute of Neurology, University College London, London, UK.
9
Liu C, Wang K, Yu R. The neural representation of metacognition in preferential decision-making. Hum Brain Mapp 2024; 45:e26651. [PMID: 38646963 PMCID: PMC11033923 DOI: 10.1002/hbm.26651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 02/06/2024] [Accepted: 02/26/2024] [Indexed: 04/25/2024] Open
Abstract
Humans regularly assess the quality of their judgements, which helps them adjust their behaviours. Metacognition is the ability to accurately evaluate one's own judgements, and it is assessed by comparing objective task performance with subjective confidence reports in perceptual decisions. However, assessing metacognition in preference-based decisions is difficult because it depends on subjective goals rather than an objective criterion. Here, we develop a new index that integrates choice, reaction time, and confidence report to quantify trial-by-trial metacognitive sensitivity in preference judgements. We found that the dorsomedial prefrontal cortex (dmPFC) and the right anterior insula were more activated when participants made bad metacognitive evaluations. Our study suggests a crucial role of the dmPFC-insula network in representing online metacognitive sensitivity in preferential decisions.
Affiliation(s)
- Cuizhen Liu
- School of Psychology, Shaanxi Normal University, Xi'an, China
- Keqing Wang
- School of Psychology, Shaanxi Normal University, Xi'an, China
- Rongjun Yu
- Department of Management, Marketing, and Information Systems, Hong Kong Baptist University, Hong Kong, China
10
Bénon J, Lee D, Hopper W, Verdeil M, Pessiglione M, Vinckier F, Bouret S, Rouault M, Lebouc R, Pezzulo G, Schreiweis C, Burguière E, Daunizeau J. The online metacognitive control of decisions. Communications Psychology 2024; 2:23. [PMID: 39242926 PMCID: PMC11332065 DOI: 10.1038/s44271-024-00071-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Accepted: 02/28/2024] [Indexed: 09/09/2024]
Abstract
Difficult decisions typically involve mental effort, which scales with the deployment of cognitive (e.g., mnesic, attentional) resources engaged in processing decision-relevant information. But how does the brain regulate mental effort? A possibility is that the brain optimizes a resource allocation problem, whereby the amount of invested resources balances its expected cost (i.e. effort) and benefit. Our working assumption is that subjective decision confidence serves as the benefit term of the resource allocation problem, hence the "metacognitive" nature of decision control. Here, we present a computational model for the online metacognitive control of decisions or oMCD. Formally, oMCD is a Markov Decision Process that optimally solves the ensuing resource allocation problem under agnostic assumptions about the inner workings of the underlying decision system. We demonstrate how this makes oMCD a quasi-optimal control policy for a broad class of decision processes, including -but not limited to- progressive attribute integration. We disclose oMCD's main properties (in terms of choice, confidence and response time), and show that they reproduce most established empirical results in the field of value-based decision making. Finally, we discuss the possible connections between oMCD and most prominent neurocognitive theories about decision control and mental effort regulation.
Affiliation(s)
- Douglas Lee
- School of Electrical and Electronic Engineering, University College Dublin, Dublin, Ireland
11
Polanía R, Burdakov D, Hare TA. Rationality, preferences, and emotions with biological constraints: it all starts from our senses. Trends Cogn Sci 2024; 28:264-277. [PMID: 38341322 DOI: 10.1016/j.tics.2024.01.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/28/2022] [Revised: 01/10/2024] [Accepted: 01/11/2024] [Indexed: 02/12/2024]
Abstract
Is the role of our sensory systems to represent the physical world as accurately as possible? If so, are our preferences and emotions, often deemed irrational, decoupled from these 'ground-truth' sensory experiences? We show why the answer to both questions is 'no'. Brain function is metabolically costly, and the brain loses some fraction of the information that it encodes and transmits. Therefore, if brains maximize objective functions that increase the fitness of their species, they should adapt to the objective-maximizing rules of the environment at the earliest stages of sensory processing. Consequently, observed 'irrationalities', preferences, and emotions stem from the necessity for our early sensory systems to adapt and process information while considering the metabolic costs and internal states of the organism.
Affiliation(s)
- Rafael Polanía
- Decision Neuroscience Laboratory, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland.
- Denis Burdakov
- Neurobehavioral Dynamics Laboratory, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
- Todd A Hare
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
12
Brooks HR, Sokol-Hessner P. Multiple timescales of temporal context in risky choice: Behavioral identification and relationships to physiological arousal. PLoS One 2024; 19:e0296681. [PMID: 38241251 PMCID: PMC10798524 DOI: 10.1371/journal.pone.0296681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 12/15/2023] [Indexed: 01/21/2024] Open
Abstract
Context-dependence is fundamental to risky monetary decision-making. A growing body of evidence suggests that temporal context, or recent events, alters risk-taking at a minimum of three timescales: immediate (e.g. trial-by-trial), neighborhood (e.g. a group of consecutive trials), and global (e.g. task-level). To examine context effects, we created a novel monetary choice set with intentional temporal structure in which option values shifted between multiple levels of value magnitude ("contexts") several times over the course of the task. This structure allowed us to examine whether effects of each timescale were simultaneously present in risky choice behavior and the potential mechanistic role of arousal, an established correlate of risk-taking, in context-dependency. We found that risk-taking was sensitive to immediate, neighborhood, and global timescales: risk-taking decreased following large (vs. small) outcome amounts, increased following large positive (but not negative) shifts in context, and increased when cumulative earnings exceeded expectations. We quantified arousal with skin conductance responses, which were related to the global timescale, increasing with cumulative earnings, suggesting that physiological arousal captures a task-level assessment of performance. Our results both replicate and extend prior research by demonstrating that risky decision-making is consistently dynamic at multiple timescales and that the role of arousal in risk-taking extends to some, but not all timescales of context-dependence.
Affiliation(s)
- Hayley R. Brooks
- Department of Psychology, University of Denver, Denver, Colorado, United States of America
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, Rhode Island, United States of America
- Peter Sokol-Hessner
- Department of Psychology, University of Denver, Denver, Colorado, United States of America
13
Fahey MP, Yee DM, Leng X, Tarlow M, Shenhav A. Motivational context determines the impact of aversive outcomes on mental effort allocation. bioRxiv 2023:2023.10.27.564461. [PMID: 37961466 PMCID: PMC10634922 DOI: 10.1101/2023.10.27.564461] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/15/2023]
Abstract
It is well known that people will exert effort on a task if sufficiently motivated, but how they distribute these efforts across different strategies (e.g., efficiency vs. caution) remains uncertain. Past work has shown that people invest effort differently for potential positive outcomes (rewards) versus potential negative outcomes (penalties). However, this research failed to account for differences in the context in which negative outcomes motivate someone - either as punishment or reinforcement. It is therefore unclear whether effort profiles differ as a function of outcome valence, motivational context, or both. Using computational modeling and our novel Multi-Incentive Control Task, we show that the influence of aversive outcomes on one's effort profile is entirely determined by their motivational context. Participants (N = 91) favored increased caution in response to larger penalties for incorrect responses, and favored increased efficiency in response to larger reinforcement for correct responses, whether positively or negatively incentivized.
Affiliation(s)
- Mahalia Prater Fahey
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
- Debbie M Yee
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
- Xiamin Leng
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
- Maisy Tarlow
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
- Amitai Shenhav
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
14
Molinaro G, Collins AGE. A goal-centric outlook on learning. Trends Cogn Sci 2023; 27:1150-1164. [PMID: 37696690 DOI: 10.1016/j.tics.2023.08.011] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/13/2023] [Revised: 08/11/2023] [Accepted: 08/14/2023] [Indexed: 09/13/2023]
Abstract
Goals play a central role in human cognition. However, computational theories of learning and decision-making often take goals as given. Here, we review key empirical findings showing that goals shape the representations of inputs, responses, and outcomes, such that setting a goal crucially influences the central aspects of any learning process: states, actions, and rewards. We thus argue that studying goal selection is essential to advance our understanding of learning. By following existing literature in framing goal selection within a hierarchy of decision-making problems, we synthesize important findings on the principles underlying goal value attribution and exploration strategies. Ultimately, we propose that a goal-centric perspective will help develop more complete accounts of learning in both biological and artificial agents.
Affiliation(s)
- Gaia Molinaro
- Department of Psychology, University of California, Berkeley, Berkeley, CA, USA.
- Anne G E Collins
- Department of Psychology, University of California, Berkeley, Berkeley, CA, USA; Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
15
Mehrotra D, Dubé L. Accounting for multiscale processing in adaptive real-world decision-making via the hippocampus. Front Neurosci 2023; 17:1200842. [PMID: 37732307 PMCID: PMC10508350 DOI: 10.3389/fnins.2023.1200842] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/05/2023] [Accepted: 08/25/2023] [Indexed: 09/22/2023] Open
Abstract
For adaptive real-time behavior in real-world contexts, the brain needs to allow past information over multiple timescales to influence current processing for making choices that create the best outcome as a person goes about making choices in their everyday life. The neuroeconomics literature on value-based decision-making has formalized such choice through reinforcement learning models for two extreme strategies. These strategies are model-free (MF), which is an automatic, stimulus-response type of action, and model-based (MB), which bases choice on cognitive representations of the world and causal inference on environment-behavior structure. The emphasis of examining the neural substrates of value-based decision making has been on the striatum and prefrontal regions, especially with regards to the "here and now" decision-making. Yet, such a dichotomy does not embrace all the dynamic complexity involved. In addition, despite robust research on the role of the hippocampus in memory and spatial learning, its contribution to value-based decision making is just starting to be explored. This paper aims to better appreciate the role of the hippocampus in decision-making and advance the successor representation (SR) as a candidate mechanism for encoding state representations in the hippocampus, separate from reward representations. To this end, we review research that relates hippocampal sequences to SR models showing that the implementation of such sequences in reinforcement learning agents improves their performance. This also enables the agents to perform multiscale temporal processing in a biologically plausible manner. Altogether, we articulate a framework to advance current striatal and prefrontal-focused decision making to better account for multiscale mechanisms underlying various real-world time-related concepts such as the self that cumulates over a person's life course.
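As a rough illustration of the successor representation (SR) named in this abstract, the sketch below computes the SR matrix M = (I - gamma*T)^(-1) for a small three-state ring and reads out state values for two different reward vectors from the same M. This is the sense in which state structure is held separately from reward. The transition matrix, discount factor, reward vectors and function names are all hypothetical choices made for illustration; none of them is taken from the paper.

```python
def successor_representation(T, gamma=0.9, n_iter=500):
    """Return the SR matrix for transition matrix T.

    Iterates M <- I + gamma * T * M, whose fixed point is (I - gamma*T)^(-1)
    when 0 <= gamma < 1. M[i][j] is the expected discounted number of
    future visits to state j when starting from state i.
    """
    n = len(T)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(n_iter):
        M = [
            [
                identity[i][j] + gamma * sum(T[i][k] * M[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return M


def state_values(M, reward):
    """Combine the SR with a reward vector: V = M * r."""
    return [sum(m_ij * r_j for m_ij, r_j in zip(row, reward)) for row in M]


# Hypothetical deterministic ring of three states: 0 -> 1 -> 2 -> 0.
T = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
M = successor_representation(T, gamma=0.5)

# The same M serves two different (assumed) reward placements.
values_a = state_values(M, [0.0, 0.0, 1.0])  # reward only in state 2
values_b = state_values(M, [1.0, 0.0, 0.0])  # reward only in state 0

print(values_a)
print(values_b)
```

When the reward moves from state 2 to state 0, only the cheap readout `state_values` is recomputed; the learned matrix `M` is reused unchanged.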
Affiliation(s)
- Dhruv Mehrotra
- Integrated Program in Neuroscience, McGill University, Montréal, QC, Canada
- Montréal Neurological Institute, McGill University, Montréal, QC, Canada
- Laurette Dubé
- Desautels Faculty of Management, McGill University, Montréal, QC, Canada
- McGill Center for the Convergence of Health and Economics, McGill University, Montréal, QC, Canada
|
16
|
Dulberg Z, Dubey R, Berwian IM, Cohen JD. Having multiple selves helps learning agents explore and adapt in complex changing worlds. Proc Natl Acad Sci U S A 2023; 120:e2221180120. [PMID: 37399387 PMCID: PMC10334746 DOI: 10.1073/pnas.2221180120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Accepted: 05/09/2023] [Indexed: 07/05/2023]
Abstract
Satisfying a variety of conflicting needs in a changing environment is a fundamental challenge for any adaptive agent. Here, we show that designing an agent in a modular fashion as a collection of subagents, each dedicated to a separate need, powerfully enhanced the agent's capacity to satisfy its overall needs. We used the formalism of deep reinforcement learning to investigate a biologically relevant multiobjective task: continually maintaining homeostasis of a set of physiologic variables. We then conducted simulations in a variety of environments and compared how modular agents performed relative to standard monolithic agents (i.e., agents that aimed to satisfy all needs in an integrated manner using a single aggregate measure of success). Simulations revealed that modular agents a) exhibited a form of exploration that was intrinsic and emergent rather than extrinsically imposed; b) were robust to changes in nonstationary environments; and c) scaled gracefully in their ability to maintain homeostasis as the number of conflicting objectives increased. Supporting analysis suggested that the robustness to changing environments and to increasing numbers of needs was due to intrinsic exploration and efficiency of representation afforded by the modular architecture. These results suggest that the normative principles by which agents have adapted to complex changing environments may also explain why humans have long been described as consisting of "multiple selves."
Affiliation(s)
- Zack Dulberg
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544
- Rachit Dubey
- Department of Computer Science, Princeton University, Princeton, NJ 08544
- Isabel M. Berwian
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544
- Jonathan D. Cohen
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544
|
17
|
Li AY, Yuan JY, Pun C, Barense MD. The effect of memory load on object reconstruction: Insights from an online mouse-tracking task. Atten Percept Psychophys 2023; 85:1612-1630. [PMID: 36600154 DOI: 10.3758/s13414-022-02650-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 12/20/2022] [Indexed: 01/05/2023]
Abstract
Why can't we remember everything that we experience? Previous work in the domain of object memory has suggested that our ability to resolve interference between relevant and irrelevant object features may limit how much we can remember at any given moment. Here, we developed an online mouse-tracking task to study how memory load influences object reconstruction, testing participants synchronously over virtual conference calls. We first tested up to 18 participants concurrently, replicating memory findings from a condition where participants were tested individually. Next, we examined how memory load influenced mouse trajectories as participants reconstructed target objects. We found interference between the contents of working memory and what was perceived during object reconstruction, an effect that interacted with visual similarity and memory load. Furthermore, we found interference from previously studied but currently irrelevant objects, providing evidence of object-to-location binding errors. At the greatest memory load, participants were nearly three times more likely to move their mouse cursor over previously studied nontarget objects, an effect observed primarily during object reconstruction rather than in the period before the final response. As evidence of the dynamic interplay between working memory and perception, these results show that object reconstruction behavior may be altered by (i) interference between what is represented in mind and what is currently being viewed, and (ii) interference from previously studied but currently irrelevant information. Finally, we discuss how mouse tracking can provide a rich characterization of participant behavior at millisecond temporal resolution, enormously increasing power in cognitive psychology experiments.
Affiliation(s)
- Aedan Y Li
- Department of Psychology, University of Toronto, 100 St. George Street, Toronto, ON, M5S 3G3, Canada
- James Y Yuan
- Department of Psychology, University of Toronto, 100 St. George Street, Toronto, ON, M5S 3G3, Canada
- Carson Pun
- Department of Psychology, University of Toronto, 100 St. George Street, Toronto, ON, M5S 3G3, Canada
- Morgan D Barense
- Department of Psychology, University of Toronto, 100 St. George Street, Toronto, ON, M5S 3G3, Canada
|