1. D’Amato L, Lancia GL, Pezzulo G. The geometry of efficient codes: How rate-distortion trade-offs distort the latent representations of generative models. PLoS Comput Biol 2025; 21:e1012952. PMID: 40354307; PMCID: PMC12068621; DOI: 10.1371/journal.pcbi.1012952.
Abstract
Living organisms rely on internal models of the world to act adaptively. These models, because of resource limitations, cannot encode every detail and hence need to compress information. From a cognitive standpoint, information compression can manifest as a distortion of latent representations, resulting in the emergence of representations that may not accurately reflect the external world or its geometry. Rate-distortion theory formalizes the optimal way to compress information while minimizing such distortions, by considering factors such as capacity limitations, the frequency and the utility of stimuli. However, while this theory explains why the above factors distort latent representations, it does not specify which particular distortions they produce. To address this question, here we investigate how rate-distortion trade-offs shape the latent representations of images in generative models, specifically Beta Variational Autoencoders (β-VAEs), under varying constraints of model capacity, data distributions, and task objectives. By systematically exploring these factors, we identify three primary distortions in latent representations: prototypization, specialization, and orthogonalization. These distortions emerge as signatures of information compression, reflecting the model's adaptation to capacity limitations, data imbalances, and task demands. Additionally, our findings demonstrate that these distortions can coexist, giving rise to a rich landscape of latent spaces, whose geometry could differ significantly across generative models subject to different constraints. Our findings help to explain how the normative constraints of rate-distortion theory shape the geometry of latent representations of generative models of artificial systems and living organisms.
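For orientation, the capacity constraint studied here corresponds to the weighting term in the standard β-VAE objective. The expression below is the textbook β-weighted evidence lower bound that β-VAEs maximize (background, not an equation quoted from the paper), with the reconstruction term playing the role of distortion and the KL term the role of rate.

```latex
\mathcal{L}_{\beta}(\theta, \phi; x) \;=\;
\underbrace{\mathbb{E}_{q_{\phi}(z \mid x)}\big[\log p_{\theta}(x \mid z)\big]}_{\text{reconstruction (distortion)}}
\;-\;
\beta \, \underbrace{D_{\mathrm{KL}}\big(q_{\phi}(z \mid x) \,\|\, p(z)\big)}_{\text{code cost (rate)}}
```

Increasing β tightens the effective capacity of the latent code, which is the regime in which compression-induced distortions of the latent geometry are expected to appear.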
Affiliation(s)
- Leo D’Amato
- Department of Control and Computer Engineering, Polytechnic University of Turin, Turin, Italy
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Gian Luca Lancia
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Department of Psychology, Sapienza University of Rome, Rome, Italy
- Giovanni Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy

2. Blanc A, Laurent F, Barbier-Chebbah A, Van Assel H, Cocanougher BT, Jones BMW, Hague P, Zlatic M, Chikhi R, Vestergaard CL, Jovanic T, Masson JB, Barré C. Statistical signature of subtle behavioral changes in large-scale assays. PLoS Comput Biol 2025; 21:e1012990. PMID: 40258220; DOI: 10.1371/journal.pcbi.1012990.
Abstract
The central nervous system can generate various behaviors, including motor responses, which we can observe through video recordings. Recent advances in gene manipulation, automated behavioral acquisition at scale, and machine learning enable us to causally link behaviors to their underlying neural mechanisms. Moreover, in some animals, such as the Drosophila melanogaster larva, this mapping is possible at the unprecedented scale of single neurons, allowing us to identify the neural microcircuits generating particular behaviors. These high-throughput screening efforts, linking the activation or suppression of specific neurons to behavioral patterns in millions of animals, provide a rich dataset to explore the diversity of nervous system responses to the same stimuli. However, important challenges remain in identifying subtle behaviors, including immediate and delayed responses to neural activation or suppression, and understanding these behaviors on a large scale. Here we introduce several statistically robust methods for analyzing behavioral data that address these challenges: 1) A generative physical model that regularizes the inference of larval shapes across the entire dataset. 2) An unsupervised kernel-based method for statistical testing in learned behavioral spaces aimed at detecting subtle deviations in behavior. 3) A generative model for larval behavioral sequences, providing a benchmark for identifying higher-order behavioral changes. 4) A comprehensive analysis technique using suffix trees to categorize genetic lines into clusters based on common action sequences. We showcase these methodologies through a behavioral screen focused on responses to an air puff, analyzing data from 280,716 larvae across 569 genetic lines.
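The abstract does not name the specific kernel statistic used for testing in the learned behavioral space, so the Python sketch below is only a generic illustration of that family of methods: a maximum mean discrepancy (MMD) two-sample test with a permutation-based p-value. The function names, RBF bandwidth, and permutation count are placeholders, not the authors' pipeline.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    # Gram matrix of an RBF kernel between the rows of X and the rows of Y.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    # Biased estimate of the squared maximum mean discrepancy between X and Y.
    return (rbf_kernel(X, X, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean()
            - 2.0 * rbf_kernel(X, Y, sigma).mean())

def mmd_permutation_test(X, Y, sigma=1.0, n_perm=1000, seed=0):
    # p-value for the null hypothesis that X and Y come from the same distribution.
    rng = np.random.default_rng(seed)
    observed = mmd2(X, Y, sigma)
    Z, n = np.vstack([X, Y]), len(X)
    null = np.empty(n_perm)
    for i in range(n_perm):
        idx = rng.permutation(len(Z))
        null[i] = mmd2(Z[idx[:n]], Z[idx[n:]], sigma)
    return observed, (np.sum(null >= observed) + 1) / (n_perm + 1)
```

In a screening setting of this kind, X would hold the behavioral embeddings of larvae from one genetic line and Y those of a reference line, with the permutation p-value flagging lines whose response distribution deviates subtly from the reference.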
Affiliation(s)
- Alexandre Blanc
- Institut Pasteur, Université Paris Cité, CNRS UMR 3751, Decision and Bayesian Computation, Paris, France
- Epiméthée, INRIA, Paris, France
- François Laurent
- Institut Pasteur, Université Paris Cité, CNRS UMR 3751, Decision and Bayesian Computation, Paris, France
- Epiméthée, INRIA, Paris, France
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
- Alex Barbier-Chebbah
- Institut Pasteur, Université Paris Cité, CNRS UMR 3751, Decision and Bayesian Computation, Paris, France
- Epiméthée, INRIA, Paris, France
- Benjamin T Cocanougher
- University of Cambridge, Department of Zoology, Cambridge, United Kingdom
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, United Kingdom
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia, United States of America
- Benjamin M W Jones
- University of Cambridge, Department of Zoology, Cambridge, United Kingdom
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, United Kingdom
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia, United States of America
- Peter Hague
- University of Cambridge, Department of Zoology, Cambridge, United Kingdom
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, United Kingdom
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia, United States of America
- Marta Zlatic
- University of Cambridge, Department of Zoology, Cambridge, United Kingdom
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, United Kingdom
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia, United States of America
- Rayan Chikhi
- G5 Sequence Bioinformatics, Department of Computational Biology, Institut Pasteur, Paris, France
- Christian L Vestergaard
- Institut Pasteur, Université Paris Cité, CNRS UMR 3751, Decision and Bayesian Computation, Paris, France
- Epiméthée, INRIA, Paris, France
- Tihana Jovanic
- Institut des Neurosciences Paris-Saclay, Université Paris-Saclay, Centre National de la Recherche Scientifique, UMR 9197, Saclay, France
- Jean-Baptiste Masson
- Institut Pasteur, Université Paris Cité, CNRS UMR 3751, Decision and Bayesian Computation, Paris, France
- Epiméthée, INRIA, Paris, France
- Chloé Barré
- Institut Pasteur, Université Paris Cité, CNRS UMR 3751, Decision and Bayesian Computation, Paris, France
- Epiméthée, INRIA, Paris, France

3. Jones SD, Rauwolf P, Westermann G. Computational rationality and developmental neurodivergence. Trends Cogn Sci 2025; 29:314-317. PMID: 39924396; DOI: 10.1016/j.tics.2025.01.006.
Abstract
The role of behaviour - choices, actions, and habits - in shaping neurodivergent development remains unclear. In this forum article we introduce computational rationality as a framework for understanding dynamic feedback between brain and behavioural development, and neurodevelopmental variation.
Affiliation(s)
- Paul Rauwolf
- Department of Psychology, Bangor University, Bangor, UK
- Gert Westermann
- Department of Psychology, Lancaster University, Lancaster, UK

4. Gershman SJ, Lak A. Policy complexity suppresses dopamine responses. J Neurosci 2025; 45:e1756242024. PMID: 39788740; PMCID: PMC11866995; DOI: 10.1523/jneurosci.1756-24.2024.
Abstract
Limits on information processing capacity impose limits on task performance. We show that male and female mice achieve performance on a perceptual decision task that is near-optimal given their capacity limits, as measured by policy complexity (the mutual information between states and actions). This behavioral profile could be achieved by reinforcement learning with a penalty on high complexity policies, realized through modulation of dopaminergic learning signals. In support of this hypothesis, we find that policy complexity suppresses midbrain dopamine responses to reward outcomes. Furthermore, neural and behavioral reward sensitivity were positively correlated across sessions. Our results suggest that policy compression shapes basic mechanisms of reinforcement learning in the brain.
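Policy complexity as used here is the mutual information between task states and actions. The short Python sketch below (an illustration, not the authors' analysis code) shows how that quantity can be computed from a state distribution and a stochastic policy.

```python
import numpy as np

def policy_complexity(p_state, policy):
    """Mutual information I(S; A) in bits.

    p_state: shape (n_states,), the distribution over states (stimuli).
    policy:  shape (n_states, n_actions), each row is P(a | s).
    """
    p_joint = p_state[:, None] * policy      # P(s, a)
    p_action = p_joint.sum(axis=0)           # marginal P(a)
    denom = p_state[:, None] * p_action[None, :]
    ratio = np.where(p_joint > 0, p_joint / np.where(denom > 0, denom, 1.0), 1.0)
    return float(np.sum(p_joint * np.log2(ratio)))

# A policy that depends only weakly on the state carries little information about it.
print(policy_complexity(np.array([0.5, 0.5]),
                        np.array([[0.6, 0.4],
                                  [0.4, 0.6]])))  # ~0.03 bits
```

A policy that ignores the state has zero complexity, while a fully state-specific policy attains the maximum compatible with the state and action entropies; the study's claim is that larger values of this quantity go with dampened dopaminergic reward responses.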
Affiliation(s)
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138
- Armin Lak
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom

5. Li JJ, Collins AGE. An algorithmic account for how humans efficiently learn, transfer, and compose hierarchically structured decision policies. Cognition 2025; 254:105967. PMID: 39368350; PMCID: PMC12052257; DOI: 10.1016/j.cognition.2024.105967.
Abstract
Learning structures that effectively abstract decision policies is key to the flexibility of human intelligence. Previous work has shown that humans use hierarchically structured policies to efficiently navigate complex and dynamic environments. However, the computational processes that support the learning and construction of such policies remain insufficiently understood. To address this question, we tested 1026 human participants, who made over 1 million choices combined, in a decision-making task where they could learn, transfer, and recompose multiple sets of hierarchical policies. We propose a novel algorithmic account for the learning processes underlying observed human behavior. We show that humans rely on compressed policies over states in early learning, which gradually unfold into hierarchical representations via meta-learning and Bayesian inference. Our modeling evidence suggests that these hierarchical policies are structured in a temporally backward, rather than forward, fashion. Taken together, these algorithmic architectures characterize how the interplay between reinforcement learning, policy compression, meta-learning, and working memory supports structured decision-making and compositionality in a resource-rational way.
Affiliation(s)
- Jing-Jing Li
- Helen Wills Neuroscience Institute, University of California, Berkeley, United States of America.
- Anne G E Collins
- Helen Wills Neuroscience Institute, University of California, Berkeley, United States of America; Department of Psychology, University of California, Berkeley, United States of America.

6. Moskovitz T, Miller KJ, Sahani M, Botvinick MM. Understanding dual process cognition via the minimum description length principle. PLoS Comput Biol 2024; 20:e1012383. PMID: 39423224; PMCID: PMC11534269; DOI: 10.1371/journal.pcbi.1012383.
Abstract
Dual-process theories play a central role in both psychology and neuroscience, figuring prominently in domains ranging from executive control to reward-based learning to judgment and decision making. In each of these domains, two mechanisms appear to operate concurrently, one relatively high in computational complexity, the other relatively simple. Why is neural information processing organized in this way? We propose an answer to this question based on the notion of compression. The key insight is that dual-process structure can enhance adaptive behavior by allowing an agent to minimize the description length of its own behavior. We apply a single model based on this observation to findings from research on executive control, reward-based learning, and judgment and decision making, showing that seemingly diverse dual-process phenomena can be understood as domain-specific consequences of a single underlying set of computational principles.
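As background, the minimum description length principle referenced here is usually stated as a two-part code (the generic principle, not the paper's specific model):

```latex
\min_{M} \;\; \underbrace{L(M)}_{\text{bits to encode the model}} \;+\; \underbrace{L(D \mid M)}_{\text{bits to encode the data given the model}}
```

On the reading sketched in the abstract, a simple habitual controller keeps the first term small while a more elaborate controller shrinks the second, so an agent that minimizes the description length of its own behavior benefits from running both in parallel.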
Affiliation(s)
- Ted Moskovitz
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Google DeepMind, London, United Kingdom
- Kevin J. Miller
- Google DeepMind, London, United Kingdom
- Department of Ophthalmology, University College London, London, United Kingdom
- Maneesh Sahani
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Matthew M. Botvinick
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Google DeepMind, London, United Kingdom

7. Gershman SJ, Lak A. Policy complexity suppresses dopamine responses. bioRxiv [Preprint] 2024:2024.09.15.613150. PMID: 39345642; PMCID: PMC11429712; DOI: 10.1101/2024.09.15.613150.
Abstract
Limits on information processing capacity impose limits on task performance. We show that animals achieve performance on a perceptual decision task that is near-optimal given their capacity limits, as measured by policy complexity (the mutual information between states and actions). This behavioral profile could be achieved by reinforcement learning with a penalty on high complexity policies, realized through modulation of dopaminergic learning signals. In support of this hypothesis, we find that policy complexity suppresses midbrain dopamine responses to reward outcomes, thereby reducing behavioral sensitivity to these outcomes. Our results suggest that policy compression shapes basic mechanisms of reinforcement learning in the brain.
Affiliation(s)
- Samuel J. Gershman
- Department of Psychology and Center for Brain Science, Harvard University
- Armin Lak
- Department of Physiology, Anatomy and Genetics, University of Oxford

8. Zhou D, Bornstein AM. Expanding horizons in reinforcement learning for curious exploration and creative planning. Behav Brain Sci 2024; 47:e118. PMID: 38770877; DOI: 10.1017/s0140525x23003394.
Abstract
Curiosity and creativity are expressions of the trade-off between leveraging that with which we are familiar and seeking out novelty. Through the computational lens of reinforcement learning, we describe how formulating the value of information seeking and generation via their complementary effects on planning horizons formally captures a range of solutions to striking this balance.
Affiliation(s)
- Dale Zhou
- Neurobiology and Behavior, 519 Biological Sciences Quad, University of California, Irvine, CA, USA
- Center for the Neurobiology of Learning and Memory, Qureshey Research Laboratory, University of California, Irvine, CA, USA
- Aaron M Bornstein
- Center for the Neurobiology of Learning and Memory, Qureshey Research Laboratory, University of California, Irvine, CA, USA
- Department of Cognitive Sciences, 2318 Social & Behavioral Sciences Gateway, University of California, Irvine, CA, USA

9. Malloy T, Gonzalez C. Applying Generative Artificial Intelligence to cognitive models of decision making. Front Psychol 2024; 15:1387948. PMID: 38765837; PMCID: PMC11100990; DOI: 10.3389/fpsyg.2024.1387948.
Abstract
Introduction: Generative Artificial Intelligence has made significant impacts in many fields, including computational cognitive modeling of decision making, although these applications have not yet been theoretically related to each other. This work introduces a categorization of applications of Generative Artificial Intelligence to cognitive models of decision making.
Methods: This categorization is used to compare the existing literature and to provide insight into the design of an ablation study to evaluate our proposed model in three experimental paradigms. The experiments used for model comparison involve modeling human learning and decision making based on both visual information and natural language, in tasks that vary in realism and complexity. This comparison of applications takes as its basis Instance-Based Learning Theory, a theory of experiential decision making from which many models have emerged and been applied to a variety of domains and applications.
Results: The best performing model in our ablation used a generative model both to create memory representations and to predict participant actions. The results of this comparison demonstrate the importance of generative models in both forming memories and predicting actions in decision-modeling research.
Discussion: In this work, we present a model that integrates generative and cognitive models, using a variety of stimuli, applications, and training methods. These results can provide guidelines for cognitive modelers and decision making researchers interested in integrating Generative AI into their methods.
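For context, the blended-value rule at the core of Instance-Based Learning Theory is commonly written as follows (a background reminder in generic notation, not a formula reproduced from this article):

```latex
V_{j} \;=\; \sum_{i} p_{i,j}\, x_{i,j},
\qquad
p_{i,j} \;=\; \frac{\exp(A_{i,j}/\tau)}{\sum_{k} \exp(A_{k,j}/\tau)}
```

where x_{i,j} is the outcome stored in instance i for option j, A_{i,j} is that instance's memory activation, and τ is a noise parameter; in the models compared above, generative components are used to construct the memory representations over which such blending operates.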
Affiliation(s)
- Tyler Malloy
- Dynamic Decision Making Laboratory, Department of Social and Decision Sciences, Dietrich College, Carnegie Mellon University, Pittsburgh, PA, United States

10. Arumugam D, Ho MK, Goodman ND, Van Roy B. Bayesian reinforcement learning with limited cognitive load. Open Mind (Camb) 2024; 8:395-438. PMID: 38665544; PMCID: PMC11045037; DOI: 10.1162/opmi_a_00132.
Abstract
All biological and artificial agents must act given limits on their ability to acquire and process information. As such, a general theory of adaptive behavior should be able to account for the complex interactions between an agent's learning history, decisions, and capacity constraints. Recent work in computer science has begun to clarify the principles that shape these dynamics by bridging ideas from reinforcement learning, Bayesian decision-making, and rate-distortion theory. This body of work provides an account of capacity-limited Bayesian reinforcement learning, a unifying normative framework for modeling the effect of processing constraints on learning and action selection. Here, we provide an accessible review of recent algorithms and theoretical results in this setting, paying special attention to how these ideas can be applied to studying questions in the cognitive and behavioral sciences.
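The rate-distortion quantity underlying this framework is the classical one; as a reminder (standard notation, not the review's own):

```latex
R(D) \;=\; \min_{p(\hat{x} \mid x)\,:\;\mathbb{E}[d(x,\hat{x})] \le D} I(X; \hat{X})
```

that is, the fewest bits per observation a channel must carry for the expected distortion to stay below D. Capacity-limited Bayesian reinforcement learning applies the same logic to the information an agent extracts about its environment while learning and selecting actions.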
Affiliation(s)
- Mark K. Ho
- Center for Data Science, New York University
- Noah D. Goodman
- Department of Computer Science, Stanford University
- Department of Psychology, Stanford University
- Benjamin Van Roy
- Department of Electrical Engineering, Stanford University
- Department of Management Science & Engineering, Stanford University

11. Colas JT, O’Doherty JP, Grafton ST. Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts. PLoS Comput Biol 2024; 20:e1011950. PMID: 38552190; PMCID: PMC10980507; DOI: 10.1371/journal.pcbi.1011950.
Abstract
Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as those of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants, even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
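One compact way to see how bias and hysteresis can sit alongside learned values is a softmax choice rule with added action-specific and history-dependent terms; the expression below is an illustrative sketch in the spirit of the model family compared here, not the authors' exact parameterization:

```latex
P(a_t = a) \;\propto\; \exp\!\big(\beta\, Q_t(a) \;+\; b_{a} \;+\; \varphi\, \mathbb{1}[a = a_{t-1}]\big)
```

Here b_a captures a bias for action a per se, and φ captures hysteresis (positive values favor repetition, negative values favor alternation); richer variants let the hysteresis term extend over several past actions, as the findings summarized above require.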
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- John P. O’Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America

12. Wilbrecht L, Davidow JY. Goal-directed learning in adolescence: neurocognitive development and contextual influences. Nat Rev Neurosci 2024; 25:176-194. PMID: 38263216; DOI: 10.1038/s41583-023-00783-w.
Abstract
Adolescence is a time during which we transition to independence, explore new activities and begin pursuit of major life goals. Goal-directed learning, in which we learn to perform actions that enable us to obtain desired outcomes, is central to many of these processes. Currently, our understanding of goal-directed learning in adolescence is itself in a state of transition, with the scientific community grappling with inconsistent results. When we examine metrics of goal-directed learning through the second decade of life, we find that many studies agree there are steady gains in performance in the teenage years, but others report that adolescent goal-directed learning is already adult-like, and some find adolescents can outperform adults. To explain the current variability in results, sophisticated experimental designs are being applied to test learning in different contexts. There is also increasing recognition that individuals of different ages and in different states will draw on different neurocognitive systems to support goal-directed learning. Through adoption of more nuanced approaches, we can be better prepared to recognize and harness adolescent strengths and to decipher the purpose (or goals) of adolescence itself.
Affiliation(s)
- Linda Wilbrecht
- Department of Psychology, University of California, Berkeley, CA, USA.
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA.
- Juliet Y Davidow
- Department of Psychology, Northeastern University, Boston, MA, USA.

13. Wientjes S, Holroyd CB. The successor representation subserves hierarchical abstraction for goal-directed behavior. PLoS Comput Biol 2024; 20:e1011312. PMID: 38377074; PMCID: PMC10906840; DOI: 10.1371/journal.pcbi.1011312.
Abstract
Humans have the ability to craft abstract, temporally extended and hierarchically organized plans. For instance, when considering how to make spaghetti for dinner, we typically concern ourselves with useful "subgoals" in the task, such as cutting onions, boiling pasta, and cooking a sauce, rather than particulars such as how many cuts to make to the onion, or exactly which muscles to contract. A core question is how such decomposition of a more abstract task into logical subtasks happens in the first place. Previous research has shown that humans are sensitive to a form of higher-order statistical learning named "community structure". Community structure is a common feature of abstract tasks characterized by a logical ordering of subtasks. This structure can be captured by a model where humans learn predictions of upcoming events multiple steps into the future, discounting predictions of events further away in time. One such model is the "successor representation", which has been argued to be useful for hierarchical abstraction. As of yet, no study has convincingly shown that this hierarchical abstraction can be put to use for goal-directed behavior. Here, we investigate whether participants utilize learned community structure to craft hierarchically informed action plans for goal-directed behavior. Participants were asked to search for paintings in a virtual museum, where the paintings were grouped together in "wings" representing community structure in the museum. We find that participants' choices accord with the hierarchical structure of the museum and that their response times are best predicted by a successor representation. The degree to which the response times reflect the community structure of the museum correlates with several measures of performance, including the ability to craft temporally abstract action plans. These results suggest that successor representation learning subserves hierarchical abstractions relevant for goal-directed behavior.
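The successor representation invoked here has a standard incremental form; the Python sketch below (illustrative, with hypothetical parameter values) shows the temporal-difference update that learns a matrix of discounted expected future state occupancies from experienced transitions.

```python
import numpy as np

def update_sr(M, s, s_next, alpha=0.1, gamma=0.9):
    """One temporal-difference update of the successor representation M.

    M[s, s'] estimates the expected discounted future occupancy of s'
    from s, i.e. E[ sum_t gamma**t * 1(s_t = s') | s_0 = s ].
    """
    n_states = M.shape[0]
    one_hot = np.eye(n_states)[s]
    td_error = one_hot + gamma * M[s_next] - M[s]
    M[s] = M[s] + alpha * td_error
    return M

# Example: a tiny 4-state environment, updated after observing the transition 0 -> 1.
M = np.zeros((4, 4))
M = update_sr(M, s=0, s_next=1)
```

Row M[s] then predicts which states tend to follow s, with more distant states discounted; community structure shows up as blocks of mutually predictive states, the kind of hierarchical abstraction the study links to goal-directed planning.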
Affiliation(s)
- Sven Wientjes
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Clay B. Holroyd
- Department of Experimental Psychology, Ghent University, Ghent, Belgium

14. Futrell R. An information-theoretic account of availability effects in language production. Top Cogn Sci 2024; 16:38-53. PMID: 38145974; DOI: 10.1111/tops.12716.
Abstract
I present a computational-level model of language production in terms of a combination of information theory and control theory in which words are chosen incrementally in order to maximize communicative value subject to an information-theoretic capacity constraint. The theory generally predicts a tradeoff between ease of production and communicative accuracy. I apply the theory to two cases of apparent availability effects in language production, in which words are selected on the basis of their accessibility to a speaker who has not yet perfectly planned the rest of the utterance. Using corpus data on English relative clause complementizer dropping and experimental data on Mandarin noun classifier choice, I show that the theory reproduces the observed phenomena, providing an alternative account to Uniform Information Density and a promising general model of language production which is tightly linked to emerging theories in computational neuroscience.
Affiliation(s)
- Richard Futrell
- Department of Language Science, University of California, Irvine

15. Chen S, Futrell R, Mahowald K. An information-theoretic approach to the typology of spatial demonstratives. Cognition 2023; 240:105505. PMID: 37598582; DOI: 10.1016/j.cognition.2023.105505.
Abstract
We explore systems of spatial deictic words (such as 'here' and 'there') from the perspective of communicative efficiency using typological data from over 200 languages (Nintemann et al., 2020). We argue from an information-theoretic perspective that spatial deictic systems balance informativity and complexity in the sense of the Information Bottleneck (Zaslavsky et al., 2018). We find that under an appropriate choice of cost function and need probability over meanings, among all the 21,146 theoretically possible spatial deictic systems, those adopted by real languages lie near an efficient frontier of informativity and complexity. Moreover, we find that the conditions that the need probability and the cost function need to satisfy for this result are consistent with the cognitive science literature on spatial cognition, especially regarding the source-goal asymmetry. We further show that the typological data are better explained by introducing a notion of consistency into the Information Bottleneck framework, which is jointly optimized along with informativity and complexity.
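For reference, the Information Bottleneck objective on which the analysis rests has the standard form below (generic notation that may differ from the paper's):

```latex
\min_{q(w \mid m)} \;\; I(M; W) \;-\; \beta\, I(W; U)
```

where M ranges over speaker meanings (here, spatial configurations), W over the words of the deictic system, and U over the listener's reconstruction of the meaning; I(M; W) measures complexity, I(W; U) informativity, and sweeping β traces out the efficient frontier against which attested systems are compared.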
Affiliation(s)
- Sihan Chen
- Department of Brain and Cognitive Sciences, MIT, United States of America.
- Richard Futrell
- Department of Language Science, University of California, Irvine, United States of America
- Kyle Mahowald
- Department of Linguistics, The University of Texas at Austin, United States of America

16. Futrell R. Information-theoretic principles in incremental language production. Proc Natl Acad Sci U S A 2023; 120:e2220593120. PMID: 37725652; PMCID: PMC10523564; DOI: 10.1073/pnas.2220593120.
Abstract
I apply a recently emerging perspective on the complexity of action selection, the rate-distortion theory of control, to provide a computational-level model of errors and difficulties in human language production, which is grounded in information theory and control theory. Language production is cast as the sequential selection of actions to achieve a communicative goal subject to a capacity constraint on cognitive control. In a series of calculations, simulations, corpus analyses, and comparisons to experimental data, I show that the model directly predicts some of the major known qualitative and quantitative phenomena in language production, including semantic interference and predictability effects in word choice; accessibility-based ("easy-first") production preferences in word order alternations; and the existence and distribution of disfluencies including filled pauses, corrections, and false starts. I connect the rate-distortion view to existing models of human language production, to probabilistic models of semantics and pragmatics, and to proposals for controlled language generation in the machine learning and reinforcement learning literature.
Affiliation(s)
- Richard Futrell
- Department of Language Science, University of California, Irvine, CA 92617

17. Gong T, Gerstenberg T, Mayrhofer R, Bramley NR. Active causal structure learning in continuous time. Cogn Psychol 2023; 140:101542. PMID: 36586246; DOI: 10.1016/j.cogpsych.2022.101542.
Abstract
Research on causal cognition has largely focused on learning and reasoning about contingency data aggregated across discrete observations or experiments. However, this setting represents only the tip of the causal cognition iceberg. A more general problem lurking beneath is that of learning the latent causal structure that connects events and actions as they unfold in continuous time. In this paper, we examine how people actively learn about causal structure in a continuous-time setting, focusing on when and where they intervene and how this shapes their learning. Across two experiments, we find that participants' accuracy depends on both the informativeness and evidential complexity of the data they generate. Moreover, participants' intervention choices strike a balance between maximizing expected information and minimizing inferential complexity. People time and target their interventions to create simple yet informative causal dynamics. We discuss how the continuous-time setting challenges existing computational accounts of active causal learning, and argue that metacognitive awareness of one's inferential limitations plays a critical role for successful learning in the wild.
Affiliation(s)
- Tianwei Gong
- Department of Psychology, University of Edinburgh, United Kingdom.
- Ralf Mayrhofer
- Department of Psychology, University of Göttingen, Germany
- Neil R Bramley
- Department of Psychology, University of Edinburgh, United Kingdom

18. Bari BA, Gershman SJ. Undermatching is a consequence of policy compression. J Neurosci 2023; 43:447-457. PMID: 36639891; PMCID: PMC9864556; DOI: 10.1523/jneurosci.1003-22.2022.
Abstract
The matching law describes the tendency of agents to match the ratio of choices allocated to the ratio of rewards received when choosing among multiple options (Herrnstein, 1961). Perfect matching, however, is infrequently observed. Instead, agents tend to undermatch or bias choices toward the poorer option. Overmatching, or the tendency to bias choices toward the richer option, is rarely observed. Despite the ubiquity of undermatching, it has received an inadequate normative justification. Here, we assume agents not only seek to maximize reward, but also seek to minimize cognitive cost, which we formalize as policy complexity (the mutual information between actions and states of the environment). Policy complexity measures the extent to which the policy of an agent is state dependent. Our theory states that capacity-constrained agents (i.e., agents that must compress their policies to reduce complexity) can only undermatch or perfectly match, but not overmatch, consistent with the empirical evidence. Moreover, using mouse behavioral data (male), we validate a novel prediction about which task conditions exaggerate undermatching. Finally, in patients with Parkinson's disease (male and female), we argue that a reduction in undermatching with higher dopamine levels is consistent with an increased policy complexity.

SIGNIFICANCE STATEMENT: The matching law describes the tendency of agents to match the ratio of choices allocated to different options to the ratio of reward received. For example, if option a yields twice as much reward as option b, matching states that agents will choose option a twice as much. However, agents typically undermatch: they choose the poorer option more frequently than expected. Here, we assume that agents seek to simultaneously maximize reward and minimize the complexity of their action policies. We show that this theory explains when and why undermatching occurs. Neurally, we show that policy complexity, and by extension undermatching, is controlled by tonic dopamine, consistent with other evidence that dopamine plays an important role in cognitive resource allocation.
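The policy-compression account can be stated compactly; the trade-off and its well-known solution form are reproduced below as background (generic notation, not quoted from the paper):

```latex
\max_{\pi} \;\; \mathbb{E}_{\pi}\big[r(s, a)\big] \;-\; \lambda\, I(S; A),
\qquad
\pi^{*}(a \mid s) \;\propto\; P(a)\, \exp\!\big(Q(s, a)/\lambda\big)
```

where P(a) is the marginal probability of choosing a across states. Because the optimal compressed policy is pulled toward this state-independent marginal, choices are biased toward whatever is chosen often overall, which yields undermatching but never overmatching.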
Affiliation(s)
- Bilal A Bari
- Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts 02114
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138
- Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

19. Catenacci Volpi N, Greaves M, Trendafilov D, Salge C, Pezzulo G, Polani D. Skilled motor control of an inverted pendulum implies low entropy of states but high entropy of actions. PLoS Comput Biol 2023; 19:e1010810. PMID: 36608159; DOI: 10.1371/journal.pcbi.1010810.
Abstract
The mastery of skills, such as balancing an inverted pendulum, implies a very accurate control of movements to achieve the task goals. Traditional accounts of skilled action control that focus on either routinization or perceptual control make opposite predictions about the ways we achieve mastery. The notion of routinization emphasizes the decrease of the variance of our actions, whereas the notion of perceptual control emphasizes the decrease of the variance of the states we visit, but not of the actions we execute. Here, we studied how participants managed control tasks of varying levels of difficulty, which consisted of controlling inverted pendulums of different lengths. We used information-theoretic measures to compare the predictions of alternative accounts that focus on routinization and perceptual control, respectively. Our results indicate that the successful performance of the control task strongly correlates with the decrease of state variability and the increase of action variability. As postulated by perceptual control theory, the mastery of skilled pendulum control consists in achieving stable control of goals by flexible means.
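The two quantities contrasted in this study, the entropy of visited states and the entropy of executed actions, can be estimated directly from empirical data; the Python sketch below is a generic illustration (the binning choices, variable names, and placeholder data are not the authors' pipeline).

```python
import numpy as np

def empirical_entropy(samples, bins):
    """Shannon entropy (bits) of a one-dimensional variable, via histogram binning."""
    counts, _ = np.histogram(samples, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Placeholder data standing in for logged pendulum states and control actions.
rng = np.random.default_rng(0)
angles = rng.vonmises(0.0, 50.0, size=5000)   # 'state': pendulum angle
forces = rng.normal(0.0, 1.0, size=5000)      # 'action': applied force
print(empirical_entropy(angles, bins=30), empirical_entropy(forces, bins=30))
```

Skilled control in the sense argued for above corresponds to the first quantity (state entropy) decreasing with practice while the second (action entropy) stays high or increases.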
Affiliation(s)
- Nicola Catenacci Volpi
- Department of Computer Science, University of Hertfordshire, Hatfield, England, United Kingdom
- Martin Greaves
- Department of Computer Science, University of Hertfordshire, Hatfield, England, United Kingdom
- Dari Trendafilov
- Institute for Pervasive Computing, Johannes Kepler University, Linz, Austria
- Christoph Salge
- Department of Computer Science, University of Hertfordshire, Hatfield, England, United Kingdom
- Giovanni Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Daniel Polani
- Department of Computer Science, University of Hertfordshire, Hatfield, England, United Kingdom

20. Lancia GL, Eluchans M, D’Alessandro M, Spiers HJ, Pezzulo G. Humans account for cognitive costs when finding shortcuts: An information-theoretic analysis of navigation. PLoS Comput Biol 2023; 19:e1010829. PMID: 36608145; PMCID: PMC9851521; DOI: 10.1371/journal.pcbi.1010829.
Abstract
When faced with navigating back somewhere we have been before we might either retrace our steps or seek a shorter path. Both choices have costs. Here, we ask whether it is possible to characterize formally the choice of navigational plans as a bounded rational process that trades off the quality of the plan (e.g., its length) and the cognitive cost required to find and implement it. We analyze the navigation strategies of two groups of people that are firstly trained to follow a "default policy" taking a route in a virtual maze and then asked to navigate to various known goal destinations, either in the way they want ("Go To Goal") or by taking novel shortcuts ("Take Shortcut"). We address these wayfinding problems using InfoRL: an information-theoretic approach that formalizes the cognitive cost of devising a navigational plan, as the informational cost to deviate from a well-learned route (the "default policy"). In InfoRL, optimality refers to finding the best trade-off between route length and the amount of control information required to find it. We report five main findings. First, the navigational strategies automatically identified by InfoRL correspond closely to different routes (optimal or suboptimal) in the virtual reality map, which were annotated by hand in previous research. Second, people deliberate more in places where the value of investing cognitive resources (i.e., relevant goal information) is greater. Third, compared to the group of people who receive the "Go To Goal" instruction, those who receive the "Take Shortcut" instruction find shorter but less optimal solutions, reflecting the intrinsic difficulty of finding optimal shortcuts. Fourth, those who receive the "Go To Goal" instruction modulate flexibly their cognitive resources, depending on the benefits of finding the shortcut. Finally, we found a surprising amount of variability in the choice of navigational strategies and resource investment across participants. Taken together, these results illustrate the benefits of using InfoRL to address navigational planning problems from a bounded rational perspective.
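The control-information cost used in InfoRL is closely related to the standard KL-control trade-off between value and deviation from a default policy; the generic form and its solution are given below for orientation (not the paper's exact equations):

```latex
\max_{\pi} \;\; \mathbb{E}_{\pi}\!\Big[\textstyle\sum_{t} r(s_t, a_t)\Big]
\;-\; \frac{1}{\beta}\, \mathbb{E}_{\pi}\!\Big[\textstyle\sum_{t} D_{\mathrm{KL}}\big(\pi(\cdot \mid s_t) \,\|\, \pi_{0}(\cdot \mid s_t)\big)\Big],
\qquad
\pi^{*}(a \mid s) \;\propto\; \pi_{0}(a \mid s)\, e^{\beta Q(s, a)}
```

with π₀ the well-learned default route policy; small β corresponds to cheap, habit-like navigation that stays close to the default, and large β to the costlier deviations needed to discover shortcuts.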
Affiliation(s)
- Gian Luca Lancia
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- University of Rome “La Sapienza”, Rome, Italy
- Mattia Eluchans
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- University of Rome “La Sapienza”, Rome, Italy
- Marco D’Alessandro
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Hugo J. Spiers
- Institute of Behavioural Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, United Kingdom
- Giovanni Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy

21. Gershman SJ, Burke T. Mental control of uncertainty. Cogn Affect Behav Neurosci 2022. PMID: 36168079; DOI: 10.3758/s13415-022-01034-8.
Abstract
Can you reduce uncertainty by thinking? Intuition suggests that this happens through the elusive process of attention: if we expend mental effort, we can increase the reliability of our sensory data. Models based on "rational inattention" formalize this idea in terms of a trade-off between the costs and benefits of attention. This paper surveys the origin of these models in economics, their connection to rate-distortion theory, and some of their recent applications to psychology and neuroscience. We also report new data from a numerosity judgment task in which we manipulate performance incentives. Consistent with rational inattention, people are able to improve performance on this task when incentivized, in part by increasing the reliability of their sensory data.
Affiliation(s)
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA.
- Taylor Burke
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA

22. Beron CC, Neufeld SQ, Linderman SW, Sabatini BL. Mice exhibit stochastic and efficient action switching during probabilistic decision making. Proc Natl Acad Sci U S A 2022; 119:e2113961119. PMID: 35385355; PMCID: PMC9169659; DOI: 10.1073/pnas.2113961119.
Abstract
In probabilistic and nonstationary environments, individuals must use internal and external cues to flexibly make decisions that lead to desirable outcomes. To gain insight into the process by which animals choose between actions, we trained mice in a task with time-varying reward probabilities. In our implementation of such a two-armed bandit task, thirsty mice use information about recent action and action–outcome histories to choose between two ports that deliver water probabilistically. Here we comprehensively modeled choice behavior in this task, including the trial-to-trial changes in port selection, i.e., action switching behavior. We find that mouse behavior is, at times, deterministic and, at others, apparently stochastic. The behavior deviates from that of a theoretically optimal agent performing Bayesian inference in a hidden Markov model (HMM). We formulate a set of models based on logistic regression, reinforcement learning, and sticky Bayesian inference that we demonstrate are mathematically equivalent and that accurately describe mouse behavior. The switching behavior of mice in the task is captured in each model by a stochastic action policy, a history-dependent representation of action value, and a tendency to repeat actions despite incoming evidence. The models parsimoniously capture behavior across different environmental conditions by varying the stickiness parameter, and like the mice, they achieve nearly maximal reward rates. These results indicate that mouse behavior reaches near-maximal performance with reduced action switching and can be described by a set of equivalent models with a small number of relatively fixed parameters.
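The "theoretically optimal agent" referred to above performs Bayesian filtering in a two-state hidden Markov model of which port currently has the higher reward probability; the Python sketch below is a minimal illustration of such a belief update (the reward probabilities and hazard rate are placeholders, not the task's actual parameters).

```python
def update_belief(belief, choice, rewarded, p_high=0.8, p_low=0.2, hazard=0.02):
    """Posterior probability that the LEFT port is currently the high-reward port.

    belief:   prior P(left is high) before this trial's outcome.
    choice:   'left' or 'right'; rewarded: whether that choice paid out.
    """
    # Likelihood of the observed outcome under each hidden state.
    p_rew_if_left_high = p_high if choice == "left" else p_low
    p_rew_if_right_high = p_low if choice == "left" else p_high
    lik_left = p_rew_if_left_high if rewarded else 1.0 - p_rew_if_left_high
    lik_right = p_rew_if_right_high if rewarded else 1.0 - p_rew_if_right_high
    post = lik_left * belief / (lik_left * belief + lik_right * (1.0 - belief))
    # The hidden state may switch before the next trial (symmetric hazard rate).
    return (1.0 - hazard) * post + hazard * (1.0 - post)

belief = 0.5
belief = update_belief(belief, "left", rewarded=True)   # evidence that left is high
print(round(belief, 3))  # ~0.79 with the placeholder parameters
```

The point made in the abstract is that mice deviate systematically from this ideal observer, in ways captured equally well by logistic-regression, reinforcement-learning, and sticky-Bayesian formulations that include a tendency to repeat actions.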
Affiliation(s)
- Celia C. Beron
- Department of Neurobiology, Harvard Medical School, Boston, MA 02115
- HHMI, Harvard Medical School, Boston, MA 02115
- Shay Q. Neufeld
- Department of Neurobiology, Harvard Medical School, Boston, MA 02115
- HHMI, Harvard Medical School, Boston, MA 02115
- Scott W. Linderman
- Department of Statistics, Stanford University, Stanford, CA 94305
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305
- Bernardo L. Sabatini
- Department of Neurobiology, Harvard Medical School, Boston, MA 02115
- HHMI, Harvard Medical School, Boston, MA 02115

23. Sharp PB, Russek EM, Huys QJM, Dolan RJ, Eldar E. Humans perseverate on punishment avoidance goals in multigoal reinforcement learning. eLife 2022; 11:e74402. PMID: 35199640; PMCID: PMC8912924; DOI: 10.7554/elife.74402.
Abstract
Managing multiple goals is essential to adaptation, yet we are only beginning to understand computations by which we navigate the resource demands entailed in so doing. Here, we sought to elucidate how humans balance reward seeking and punishment avoidance goals, and relate this to variation in its expression within anxious individuals. To do so, we developed a novel multigoal pursuit task that includes trial-specific instructed goals to either pursue reward (without risk of punishment) or avoid punishment (without the opportunity for reward). We constructed a computational model of multigoal pursuit to quantify the degree to which participants could disengage from the pursuit goals when instructed to, as well as devote less model-based resources toward goals that were less abundant. In general, participants (n = 192) were less flexible in avoiding punishment than in pursuing reward. Thus, when instructed to pursue reward, participants often persisted in avoiding features that had previously been associated with punishment, even though at decision time these features were unambiguously benign. In a similar vein, participants showed no significant downregulation of avoidance when punishment avoidance goals were less abundant in the task. Importantly, we show preliminary evidence that individuals with chronic worry may have difficulty disengaging from punishment avoidance when instructed to seek reward. Taken together, the findings demonstrate that people avoid punishment less flexibly than they pursue reward. Future studies should test in larger samples whether a difficulty to disengage from punishment avoidance contributes to chronic worry.
Affiliation(s)
- Paul B Sharp
- The Hebrew University of Jerusalem, Jerusalem, Israel
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Evan M Russek
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Quentin JM Huys
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Division of Psychiatry, University College London, London, United Kingdom
- Raymond J Dolan
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Eran Eldar
- The Hebrew University of Jerusalem, Jerusalem, Israel

24. Fine JM, Hayden BY. The whole prefrontal cortex is premotor cortex. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200524. PMID: 34957853; PMCID: PMC8710885; DOI: 10.1098/rstb.2020.0524.
Abstract
We propose that the entirety of the prefrontal cortex (PFC) can be seen as fundamentally premotor in nature. By this, we mean that the PFC consists of an action abstraction hierarchy whose core function is the potentiation and depotentiation of possible action plans at different levels of granularity. We argue that the apex of the hierarchy should revolve around the process of goal-selection, which we posit is inherently a form of optimization over action abstraction. Anatomical and functional evidence supports the idea that this hierarchy originates on the orbital surface of the brain and extends dorsally to motor cortex. Accordingly, our viewpoint positions the orbitofrontal cortex in a key role in the optimization of goal-selection policies, and suggests that its other proposed roles are aspects of this more general function. Our proposed perspective will reframe outstanding questions, open up new areas of inquiry and align theories of prefrontal function with evolutionary principles. This article is part of the theme issue 'Systems neuroscience through the lens of evolutionary theory'.
Affiliation(s)
- Justin M. Fine
- Department of Neuroscience, Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN 55455, USA
- Department of Biomedical Engineering, University of Minnesota, Minneapolis, MN 55455, USA
- Benjamin Y. Hayden
- Department of Neuroscience, Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN 55455, USA
- Department of Biomedical Engineering, University of Minnesota, Minneapolis, MN 55455, USA