Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Botvinick M, Weinstein A. Model-based hierarchical reinforcement learning and human action control. Philos Trans R Soc Lond B Biol Sci 2014;369:20130480. [PMID: 25267822 PMCID: PMC4186233 DOI: 10.1098/rstb.2013.0480] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open


Cited by Other Article(s)

Yang Q, Zhu Z, Si R, Li Y, Zhang J, Yang T. A language model of problem solving in humans and macaque monkeys. Curr Biol 2025;35:11-20.e10. [PMID: 39631400 DOI: 10.1016/j.cub.2024.10.074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2024] [Revised: 09/30/2024] [Accepted: 10/29/2024] [Indexed: 12/07/2024]

Schiewer R, Subramoney A, Wiskott L. Exploring the limits of hierarchical world models in reinforcement learning. Sci Rep 2024;14:26856. [PMID: 39500969 PMCID: PMC11538428 DOI: 10.1038/s41598-024-76719-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2024] [Accepted: 10/16/2024] [Indexed: 11/08/2024] Open

Russin J, Pavlick E, Frank MJ. Curriculum effects and compositionality emerge with in-context learning in neural networks. ARXIV 2024:arXiv:2402.08674v3. [PMID: 38410645 PMCID: PMC10896373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Subscribe] [Scholar Register] [Indexed: 02/28/2024]

Wu CM, Dale R, Hawkins RD. Group Coordination Catalyzes Individual and Cultural Intelligence. Open Mind (Camb) 2024;8:1037-1057. [PMID: 39229610 PMCID: PMC11370978 DOI: 10.1162/opmi_a_00155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2023] [Accepted: 06/17/2024] [Indexed: 09/05/2024] Open

Zhu X. Temporally extended successor feature neural episodic control. Sci Rep 2024;14:15103. [PMID: 38956201 PMCID: PMC11219751 DOI: 10.1038/s41598-024-65687-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2023] [Accepted: 06/24/2024] [Indexed: 07/04/2024] Open

Alejandro RJ, Holroyd CB. Hierarchical control over foraging behavior by anterior cingulate cortex. Neurosci Biobehav Rev 2024;160:105623. [PMID: 38490499 DOI: 10.1016/j.neubiorev.2024.105623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2023] [Revised: 02/14/2024] [Accepted: 03/13/2024] [Indexed: 03/17/2024]

Wenzl P, Schultheis H. Action Selection in Everyday Activities: The Opportunistic Planning Model. Cogn Sci 2024;48:e13444. [PMID: 38659094 DOI: 10.1111/cogs.13444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2023] [Revised: 02/23/2024] [Accepted: 04/02/2024] [Indexed: 04/26/2024]

Wientjes S, Holroyd CB. The successor representation subserves hierarchical abstraction for goal-directed behavior. PLoS Comput Biol 2024;20:e1011312. [PMID: 38377074 PMCID: PMC10906840 DOI: 10.1371/journal.pcbi.1011312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2023] [Revised: 03/01/2024] [Accepted: 02/05/2024] [Indexed: 02/22/2024] Open

Abstract

Humans have the ability to craft abstract, temporally extended and hierarchically organized plans. For instance, when considering how to make spaghetti for dinner, we typically concern ourselves with useful "subgoals" in the task, such as cutting onions, boiling pasta, and cooking a sauce, rather than particulars such as how many cuts to make to the onion, or exactly which muscles to contract. A core question is how such decomposition of a more abstract task into logical subtasks happens in the first place. Previous research has shown that humans are sensitive to a form of higher-order statistical learning named "community structure". Community structure is a common feature of abstract tasks characterized by a logical ordering of subtasks. This structure can be captured by a model where humans learn predictions of upcoming events multiple steps into the future, discounting predictions of events further away in time. One such model is the "successor representation", which has been argued to be useful for hierarchical abstraction. As of yet, no study has convincingly shown that this hierarchical abstraction can be put to use for goal-directed behavior. Here, we investigate whether participants utilize learned community structure to craft hierarchically informed action plans for goal-directed behavior. Participants were asked to search for paintings in a virtual museum, where the paintings were grouped together in "wings" representing community structure in the museum. We find that participants' choices accord with the hierarchical structure of the museum and that their response times are best predicted by a successor representation. The degree to which the response times reflect the community structure of the museum correlates with several measures of performance, including the ability to craft temporally abstract action plans. These results suggest that successor representation learning subserves hierarchical abstractions relevant for goal-directed behavior.
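As a rough illustration of the successor representation mentioned in this abstract, the sketch below computes discounted expected future state visits, M = sum over t of gamma^t T^t, for a random walk on a small graph. The six-node graph (two three-node "wings" joined by one edge), the discount value, and all function names are hypothetical choices made for this example; they are not taken from the cited study.

```python
def random_walk_matrix(adjacency):
    """Row-normalise an adjacency matrix into random-walk transition probabilities."""
    return [[a / sum(row) for a in row] for row in adjacency]


def successor_representation(T, gamma=0.9, iterations=500):
    """Approximate M = sum_t gamma^t T^t by iterating M <- I + gamma * T M."""
    n = len(T)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(iterations):
        M = [
            [
                identity[i][j] + gamma * sum(T[i][k] * M[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return M


# Hypothetical toy graph: nodes 0-2 form one "wing", nodes 3-5 another,
# and a single bridge edge connects node 2 to node 3.
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

T = random_walk_matrix(adjacency)
M = successor_representation(T)

within_wing = M[0][1]   # expected discounted visits to a node in the same wing
across_wing = M[0][4]   # expected discounted visits to a node in the other wing
print(within_wing > across_wing)
```

Under these assumptions, entries of M linking states in the same wing come out larger than entries linking states in different wings, which is one way a multi-step predictive representation can reflect community structure.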


McCarthy WP, Kirsh D, Fan JE. Consistency and Variation in Reasoning About Physical Assembly. Cogn Sci 2023;47:e13397. [PMID: 38146204 DOI: 10.1111/cogs.13397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2023] [Revised: 10/27/2023] [Accepted: 12/06/2023] [Indexed: 12/27/2023]

Stolz C, Pickering AD, Mueller EM. Dissociable feedback valence effects on frontal midline theta during reward gain versus threat avoidance learning. Psychophysiology 2022;60:e14235. [PMID: 36529988 DOI: 10.1111/psyp.14235] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2022] [Revised: 10/17/2022] [Accepted: 11/17/2022] [Indexed: 12/23/2022]

Scleidorovich P, Fellous JM, Weitzenfeld A. Adapting hippocampus multi-scale place field distributions in cluttered environments optimizes spatial navigation and learning. Front Comput Neurosci 2022;16:1039822. [PMID: 36578316 PMCID: PMC9792172 DOI: 10.3389/fncom.2022.1039822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2022] [Accepted: 11/21/2022] [Indexed: 12/14/2022] Open

Janssen M, LeWarne C, Burk D, Averbeck BB. Hierarchical Reinforcement Learning, Sequential Behavior, and the Dorsal Frontostriatal System. J Cogn Neurosci 2022;34:1307-1325. [PMID: 35579977 PMCID: PMC9274316 DOI: 10.1162/jocn_a_01869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]

Li JJ, Xia L, Dong F, Collins AGE. Credit assignment in hierarchical option transfer. COGSCI ... ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY. COGNITIVE SCIENCE SOCIETY (U.S.). CONFERENCE 2022;44:948-954. [PMID: 36534042 PMCID: PMC9751259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/17/2023]

Li Y, McClelland JL. A weighted constraint satisfaction approach to human goal-directed decision making. PLoS Comput Biol 2022;18:e1009553. [PMID: 35709299 PMCID: PMC9255770 DOI: 10.1371/journal.pcbi.1009553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2021] [Revised: 07/05/2022] [Accepted: 05/19/2022] [Indexed: 11/29/2022] Open

Abstract

When we plan for long-range goals, proximal information cannot be exploited in a blindly myopic way, as relevant future information must also be considered. But when a subgoal must be resolved first, irrelevant future information should not interfere with the processing of more proximal, subgoal-relevant information. We explore the idea that decision making in both situations relies on the flexible modulation of the degree to which different pieces of information under consideration are weighted, rather than explicitly decomposing a problem into smaller parts and solving each part independently. We asked participants to find the shortest goal-reaching paths in mazes and modeled their initial path choices as a noisy, weighted information integration process. In a base task where choosing the optimal initial path required weighting starting-point and goal-proximal factors equally, participants did take both constraints into account, with participants who made more accurate choices tending to exhibit more balanced weighting. The base task was then embedded as an initial subtask in a larger maze, where the same two factors constrained the optimal path to a subgoal, and the final goal position was irrelevant to the initial path choice. In this more complex task, participants’ choices reflected predominant consideration of the subgoal-relevant constraints, but also some influence of the initially-irrelevant final goal. More accurate participants placed much less weight on the optimality-irrelevant goal and again tended to weight the two initially-relevant constraints more equally. These findings suggest that humans may rely on a graded, task-sensitive weighting of multiple constraints to generate approximately optimal decision outcomes in both hierarchical and non-hierarchical goal-directed tasks.

Different problems require the consideration of different information sources, including often useful long-range, future information that may impact our immediate decisions. However, when future information is irrelevant to a key subgoal, it can be desirable to focus on achieving the subgoal first. We suggest that humans rely on appropriately weighting relevant information over irrelevant information to generate decision outcomes in both types of situations. We conducted behavioral experiments and fitted models of decision processes to understand to what extent people considered various task factors in choosing the initial path in different mazes, both when a simple maze occurred alone or was embedded as an initial part in a larger maze. Our results show that people approximate the optimal decision outcomes in both tasks by modulating the weighting of different factors during planning, and that people who made more accurate initial path choices modulated these weightings more successfully than those who made less accurate choices.


Kok A. Cognitive control, motivation and fatigue: A cognitive neuroscience perspective. Brain Cogn 2022;160:105880. [PMID: 35617813 DOI: 10.1016/j.bandc.2022.105880] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2021] [Revised: 04/07/2022] [Accepted: 05/02/2022] [Indexed: 01/22/2023]

Yang Q, Lin Z, Zhang W, Li J, Chen X, Zhang J, Yang T. Monkey plays Pac-Man with compositional strategies and hierarchical decision-making. eLife 2022;11:74500. [PMID: 35286255 PMCID: PMC8963886 DOI: 10.7554/elife.74500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2021] [Accepted: 03/13/2022] [Indexed: 11/18/2022] Open

Yan Y, Zhuang N, Ni B, Zhang J, Xu M, Zhang Q, Zhang Z, Cheng S, Tian Q, Xu Y, Yang X, Zhang W. Fine-Grained Video Captioning via Graph-based Multi-Granularity Interaction Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022;44:666-683. [PMID: 31613750 DOI: 10.1109/tpami.2019.2946823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]

Abstract

Learning to generate continuous linguistic descriptions for multi-subject interactive videos in great detail has particular applications in team sports auto-narrative. In contrast to traditional video captioning, this task is more challenging as it requires simultaneous modeling of fine-grained individual actions, uncovering of spatio-temporal dependency structures of frequent group interactions, and then accurate mapping of these complex interaction details into long and detailed commentary. To explicitly address these challenges, we propose a novel framework, Graph-based Learning for Multi-Granularity Interaction Representation (GLMGIR), for the fine-grained team sports auto-narrative task. A multi-granular interaction modeling module is proposed to extract among-subjects' interactive actions in a progressive way for encoding both intra- and inter-team interactions. Based on the above multi-granular representations, a multi-granular attention module is developed to consider action/event descriptions of multiple spatio-temporal resolutions. Both modules are integrated seamlessly and work in a collaborative way to generate the final narrative. In the meantime, to facilitate reproducible research, we collect a new video dataset from YouTube.com called Sports Video Narrative dataset (SVN). It is a novel direction as it contains 6K team sports videos (i.e., NBA basketball games) with 10K ground-truth narratives (e.g., sentences). Furthermore, as previous metrics such as METEOR (i.e., used in the coarse-grained video caption task) do not cope well with the fine-grained sports narrative task, we develop a novel evaluation metric named Fine-grained Captioning Evaluation (FCE), which measures how accurately the generated linguistic description reflects fine-grained action details as well as the overall spatio-temporal interactional structure. Extensive experiments on our SVN dataset have demonstrated the effectiveness of the proposed framework for fine-grained team sports video auto-narrative.


Intelligent problem-solving as integrated hierarchical reinforcement learning. NAT MACH INTELL 2022. [DOI: 10.1038/s42256-021-00433-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Trinh TT, Kimura M. Cognitive prediction of obstacle's movement for reinforcement learning pedestrian interacting model. JOURNAL OF INTELLIGENT SYSTEMS 2022. [DOI: 10.1515/jisys-2022-0002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Stetter M, Lang EW. Learning Intuitive Physics and One-Shot Imitation Using State-Action-Prediction Self-Organizing Maps. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021;2021:5590445. [PMID: 34804145 PMCID: PMC8604601 DOI: 10.1155/2021/5590445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/18/2021] [Revised: 10/14/2021] [Accepted: 10/21/2021] [Indexed: 11/17/2022]

Chalita MA, Sedzielarz A. Beyond the frame problem: what (else) can Heidegger do for AI? AI & SOCIETY 2021. [DOI: 10.1007/s00146-021-01280-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Röder F, Özdemir O, Nguyen PDH, Wermter S, Eppe M. The Embodied Crossmodal Self Forms Language and Interaction: A Computational Cognitive Review. Front Psychol 2021;12:716671. [PMID: 34484079 PMCID: PMC8415221 DOI: 10.3389/fpsyg.2021.716671] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2021] [Accepted: 07/16/2021] [Indexed: 11/13/2022] Open

Stout D, Chaminade T, Apel J, Shafti A, Faisal AA. The measurement, evolution, and neural representation of action grammars of human behavior. Sci Rep 2021;11:13720. [PMID: 34215758 PMCID: PMC8253764 DOI: 10.1038/s41598-021-92992-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2021] [Accepted: 06/18/2021] [Indexed: 02/06/2023] Open

Eckstein MK, Collins AGE. How the Mind Creates Structure: Hierarchical Learning of Action Sequences. COGSCI ... ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY. COGNITIVE SCIENCE SOCIETY (U.S.). CONFERENCE 2021;43:618-624. [PMID: 34964045 PMCID: PMC8711273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]

Xia L, Collins AGE. Temporal and state abstractions for efficient learning, transfer, and composition in humans. Psychol Rev 2021;128:643-666. [PMID: 34014709 PMCID: PMC8485577 DOI: 10.1037/rev0000295] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Sullivan B, Ludwig CJH, Damen D, Mayol-Cuevas W, Gilchrist ID. Look-ahead fixations during visuomotor behavior: Evidence from assembling a camping tent. J Vis 2021;21:13. [PMID: 33688920 PMCID: PMC7961111 DOI: 10.1167/jov.21.3.13] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022] Open

Marković D, Goschke T, Kiebel SJ. Meta-control of the exploration-exploitation dilemma emerges from probabilistic inference over a hierarchy of time scales. COGNITIVE, AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2021;21:509-533. [PMID: 33372237 PMCID: PMC8208938 DOI: 10.3758/s13415-020-00837-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Accepted: 09/17/2020] [Indexed: 12/12/2022]

Gumbsch C, Butz MV, Martius G. Autonomous Identification and Goal-Directed Invocation of Event-Predictive Behavioral Primitives. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2019.2925890] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Banerjee A, Rikhye RV, Marblestone A. Reinforcement-guided learning in frontal neocortex: emerging computational concepts. Curr Opin Behav Sci 2021. [DOI: 10.1016/j.cobeha.2021.02.019] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]

De Dreu CKW, Pliskin R, Rojek-Giffin M, Méder Z, Gross J. Political games of attack and defence. Philos Trans R Soc Lond B Biol Sci 2021;376:20200135. [PMID: 33611990 PMCID: PMC7934902 DOI: 10.1098/rstb.2020.0135] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023] Open

The Best Laid Plans: Computational Principles of Anterior Cingulate Cortex. Trends Cogn Sci 2021;25:316-329. [PMID: 33593641 DOI: 10.1016/j.tics.2021.01.008] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2020] [Revised: 01/17/2021] [Accepted: 01/19/2021] [Indexed: 12/26/2022]

Herd S, Krueger K, Nair A, Mollick J, O'Reilly R. Neural Mechanisms of Human Decision-Making. COGNITIVE, AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2021;21:35-57. [PMID: 33409958 DOI: 10.3758/s13415-020-00842-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 09/28/2020] [Indexed: 11/08/2022]

Goekoop R, de Kleijn R. How higher goals are constructed and collapse under stress: A hierarchical Bayesian control systems perspective. Neurosci Biobehav Rev 2021;123:257-285. [PMID: 33497783 DOI: 10.1016/j.neubiorev.2020.12.021] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2020] [Revised: 11/19/2020] [Accepted: 12/19/2020] [Indexed: 01/26/2023]

Márton CD, Schultz SR, Averbeck BB. Learning to select actions shapes recurrent dynamics in the corticostriatal system. Neural Netw 2020;132:375-393. [PMID: 32992244 PMCID: PMC7685243 DOI: 10.1016/j.neunet.2020.09.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2019] [Revised: 09/03/2020] [Accepted: 09/11/2020] [Indexed: 01/03/2023]

Abstract

Learning to select appropriate actions based on their values is fundamental to adaptive behavior. This form of learning is supported by fronto-striatal systems. The dorsal-lateral prefrontal cortex (dlPFC) and the dorsal striatum (dSTR), which are strongly interconnected, are key nodes in this circuitry. Substantial experimental evidence, including neurophysiological recordings, has shown that neurons in these structures represent key aspects of learning. The computational mechanisms that shape the neurophysiological responses, however, are not clear. To examine this, we developed a recurrent neural network (RNN) model of the dlPFC-dSTR circuit and trained it on an oculomotor sequence learning task. We compared the activity generated by the model to activity recorded from monkey dlPFC and dSTR in the same task. This network consisted of a striatal component which encoded action values, and a prefrontal component which selected appropriate actions. After training, this system was able to autonomously represent and update action values and select actions, thus being able to closely approximate the representational structure in corticostriatal recordings. We found that learning to select the correct actions drove action-sequence representations further apart in activity space, both in the model and in the neural data. The model revealed that learning proceeds by increasing the distance between sequence-specific representations. This makes it more likely that the model will select the appropriate action sequence as learning develops. Our model thus supports the hypothesis that learning in networks drives the neural representations of actions further apart, increasing the probability that the network generates correct actions as learning proceeds. Altogether, this study advances our understanding of how neural circuit dynamics are involved in neural computation, revealing how dynamics in the corticostriatal system support task learning.


Mollick JA, Hazy TE, Krueger KA, Nair A, Mackie P, Herd SA, O'Reilly RC. A systems-neuroscience model of phasic dopamine. Psychol Rev 2020;127:972-1021. [PMID: 32525345 PMCID: PMC8453660 DOI: 10.1037/rev0000199] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]

Abstract

We describe a neurobiologically informed computational model of phasic dopamine signaling to account for a wide range of findings, including many considered inconsistent with the simple reward prediction error (RPE) formalism. The central feature of this PVLV framework is a distinction between a primary value (PV) system for anticipating primary rewards (Unconditioned Stimuli [USs]), and a learned value (LV) system for learning about stimuli associated with such rewards (CSs). The LV system represents the amygdala, which drives phasic bursting in midbrain dopamine areas, while the PV system represents the ventral striatum, which drives shunting inhibition of dopamine for expected USs (via direct inhibitory projections) and phasic pausing for expected USs (via the lateral habenula). Our model accounts for data supporting the separability of these systems, including individual differences in CS-based (sign-tracking) versus US-based learning (goal-tracking). Both systems use competing opponent-processing pathways representing evidence for and against specific USs, which can explain data dissociating the processes involved in acquisition versus extinction conditioning. Further, opponent processing proved critical in accounting for the full range of conditioned inhibition phenomena, and the closely related paradigm of second-order conditioning. Finally, we show how additional separable pathways representing aversive USs, largely mirroring those for appetitive USs, also have important differences from the positive valence case, allowing the model to account for several important phenomena in aversive conditioning. Overall, accounting for all of these phenomena strongly constrains the model, thus providing a well-validated framework for understanding phasic dopamine signaling.


Daryanavard S, Porr B. Closed-Loop Deep Learning: Generating Forward Models With Backpropagation. Neural Comput 2020;32:2122-2144. [PMID: 32946708 DOI: 10.1162/neco_a_01317] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]

Averbeck BB, Murray EA. Hypothalamic Interactions with Large-Scale Neural Circuits Underlying Reinforcement Learning and Motivated Behavior. Trends Neurosci 2020;43:681-694. [PMID: 32762959 PMCID: PMC7483858 DOI: 10.1016/j.tins.2020.06.006] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2020] [Revised: 06/02/2020] [Accepted: 06/19/2020] [Indexed: 02/02/2023]

Jara-Ettinger J, Schulz LE, Tenenbaum JB. The Naïve Utility Calculus as a unified, quantitative framework for action understanding. Cogn Psychol 2020;123:101334. [PMID: 32738590 DOI: 10.1016/j.cogpsych.2020.101334] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2019] [Revised: 05/22/2020] [Accepted: 07/17/2020] [Indexed: 11/24/2022]

Momennejad I. Learning Structures: Predictive Representations, Replay, and Generalization. Curr Opin Behav Sci 2020;32:155-166. [DOI: 10.1016/j.cobeha.2020.02.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]

Tschantz A, Seth AK, Buckley CL. Learning action-oriented models through active inference. PLoS Comput Biol 2020;16:e1007805. [PMID: 32324758 PMCID: PMC7200021 DOI: 10.1371/journal.pcbi.1007805] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2019] [Revised: 05/05/2020] [Accepted: 03/19/2020] [Indexed: 11/29/2022] Open

Abstract

Converging theories suggest that organisms learn and exploit probabilistic models of their environment. However, it remains unclear how such models can be learned in practice. The open-ended complexity of natural environments means that it is generally infeasible for organisms to model their environment comprehensively. Alternatively, action-oriented models attempt to encode a parsimonious representation of adaptive agent-environment interactions. One approach to learning action-oriented models is to learn online in the presence of goal-directed behaviours. This constrains an agent to behaviourally relevant trajectories, reducing the diversity of the data a model need account for. Unfortunately, this approach can cause models to prematurely converge to sub-optimal solutions, through a process we refer to as a bad-bootstrap. Here, we exploit the normative framework of active inference to show that efficient action-oriented models can be learned by balancing goal-oriented and epistemic (information-seeking) behaviours in a principled manner. We illustrate our approach using a simple agent-based model of bacterial chemotaxis. We first demonstrate that learning via goal-directed behaviour indeed constrains models to behaviorally relevant aspects of the environment, but that this approach is prone to sub-optimal convergence. We then demonstrate that epistemic behaviours facilitate the construction of accurate and comprehensive models, but that these models are not tailored to any specific behavioural niche and are therefore less efficient in their use of data. Finally, we show that active inference agents learn models that are parsimonious, tailored to action, and which avoid bad bootstraps and sub-optimal convergence. Critically, our results indicate that models learned through active inference can support adaptive behaviour in spite of, and indeed because of, their departure from veridical representations of the environment. Our approach provides a principled method for learning adaptive models from limited interactions with an environment, highlighting a route to sample efficient learning algorithms.

Tomov MS, Yagati S, Kumar A, Yang W, Gershman SJ. Discovery of hierarchical representations for efficient planning. PLoS Comput Biol 2020;16:e1007594. [PMID: 32251444 PMCID: PMC7162548 DOI: 10.1371/journal.pcbi.1007594] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2019] [Revised: 04/16/2020] [Accepted: 12/10/2019] [Indexed: 12/12/2022] Open

Kaplan R, Tauste Campo A, Bush D, King J, Principe A, Koster R, Ley Nacher M, Rocamora R, Friston KJ. Human hippocampal theta oscillations reflect sequential dependencies during spatial planning. Cogn Neurosci 2019;11:122-131. [DOI: 10.1080/17588928.2019.1676711] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

The value of abstraction. Curr Opin Behav Sci 2019. [DOI: 10.1016/j.cobeha.2019.05.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]

Pezzulo G, Donnarumma F, Maisto D, Stoianov I. Planning at decision time and in the background during spatial navigation. Curr Opin Behav Sci 2019. [DOI: 10.1016/j.cobeha.2019.04.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

Pitti A, Quoy M, Lavandier C, Boucenna S. Gated spiking neural network using Iterative Free-Energy Optimization and rank-order coding for structure learning in memory sequences (INFERNO GATE). Neural Netw 2019;121:242-258. [PMID: 31581065 DOI: 10.1016/j.neunet.2019.09.023] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2019] [Revised: 09/16/2019] [Accepted: 09/17/2019] [Indexed: 11/16/2022]

Nguyen ND, Nguyen T, Nahavandi S. Multi-agent behavioral control system using deep reinforcement learning. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.05.062] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]

Budaev S, Jørgensen C, Mangel M, Eliassen S, Giske J. Decision-Making From the Animal Perspective: Bridging Ecology and Subjective Cognition. Front Ecol Evol 2019. [DOI: 10.3389/fevo.2019.00164] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Reinforcement learning in artificial and biological systems. Nat Mach Intell 2019. [DOI: 10.1038/s42256-019-0025-4] [Citation(s) in RCA: 96] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Ramírez-Vizcaya S, Froese T. The Enactive Approach to Habits: New Concepts for the Cognitive Science of Bad Habits and Addiction. Front Psychol 2019;10:301. [PMID: 30863334 PMCID: PMC6399396 DOI: 10.3389/fpsyg.2019.00301] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2018] [Accepted: 01/30/2019] [Indexed: 11/13/2022] Open

Eckstein MK, Starr A, Bunge SA. How the inference of hierarchical rules unfolds over time. Cognition 2019;185:151-162. [PMID: 30711815 DOI: 10.1016/j.cognition.2019.01.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2017] [Revised: 01/08/2019] [Accepted: 01/09/2019] [Indexed: 01/20/2023]