1
Tomov MS, Tsividis PA, Pouncy T, Tenenbaum JB, Gershman SJ. The neural architecture of theory-based reinforcement learning. Neuron 2023; 111:1331-1344.e8. [PMID: 36898374] [PMCID: PMC10200004] [DOI: 10.1016/j.neuron.2023.01.023]
Abstract
Humans learn internal models of the world that support planning and generalization in complex environments. Yet it remains unclear how such internal models are represented and learned in the brain. We approach this question using theory-based reinforcement learning, a strong form of model-based reinforcement learning in which the model is a kind of intuitive theory. We analyzed fMRI data from human participants learning to play Atari-style games. We found evidence of theory representations in prefrontal cortex and of theory updating in prefrontal cortex, occipital cortex, and fusiform gyrus. Theory updates coincided with transient strengthening of theory representations. Effective connectivity during theory updating suggests that information flows from prefrontal theory-coding regions to posterior theory-updating regions. Together, our results are consistent with a neural architecture in which top-down theory representations originating in prefrontal regions shape sensory predictions in visual areas, where factored theory prediction errors are computed and trigger bottom-up updates of the theory.
Affiliation(s)
- Momchil S Tomov
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Motional AD, Inc., Boston, MA 02210, USA.
- Pedro A Tsividis
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Thomas Pouncy
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Joshua B Tenenbaum
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
2
Hu W. The Application of Artificial Intelligence and Big Data Technology in Basketball Sports Training. ICST Trans Scalable Inf Syst 2023. [DOI: 10.4108/eetsis.v10i3.3046]
Abstract
INTRODUCTION: Basketball involves a wide variety of complex human motions, so recognizing them precisely is essential for both training and competition. Player training relies heavily on the trainers' subjective perceptions and experience. Big data and Artificial Intelligence (AI) technology can be used to track athlete training, and sensing athletes' motions can help instructors make decisions that markedly improve athletic ability.
OBJECTIVES: This paper develops an action recognition technique for training basketball players, based on Big Data and capsule networks (CapsNet), called ARBIGNet.
METHODS: The technique uses a Whale Optimized Artificial Neural Network (WO-ANN), trained on large amounts of basketball-game data collected using capsules. To capture the spatiotemporal aspects of basketball sports training from videos, the study first employs a Convolution Random Forest (ConvRF) unit. Second, it introduces an Attention Random Forest (AttRF) unit, which combines the random forest with an attention mechanism: the unit scans each site at random, attending more to the region where the activity occurs. Big data analytics is used for fast data transmission. The network architecture is then built by enhancing the standard encoder-decoder paradigm: the spatiotemporal information in the video is encoded with an Enhanced Darknet network model, and at the decoding stage the standard random forest is replaced by the AttRF structure. Combining these components yields the ARBIGNet architecture.
RESULTS: Experiments on action recognition in basketball sports training show that the proposed approach achieves 95.5% mAP and 98.8% accuracy.
3
Pouncy T, Gershman SJ. Inductive biases in theory-based reinforcement learning. Cogn Psychol 2022; 138:101509. [PMID: 36152355] [DOI: 10.1016/j.cogpsych.2022.101509]
Abstract
Understanding the inductive biases that allow humans to learn in complex environments has been an important goal of cognitive science. Yet, while we have discovered much about human biases in specific learning domains, much of this research has focused on simple tasks that lack the complexity of the real world. In contrast, video games involving agents and objects embedded in richly structured systems provide an experimentally tractable proxy for real-world complexity. Recent work has suggested that key aspects of human learning in domains like video games can be captured by model-based reinforcement learning (RL) with object-oriented relational models, what we term theory-based RL. Restricting the model class in this way provides an inductive bias that dramatically increases learning efficiency, but in this paper we show that humans employ a stronger set of biases in addition to syntactic constraints on the structure of theories. In particular, we catalog a set of semantic biases that constrain the content of theories. Building these semantic biases into a theory-based RL system produces more human-like learning in video game environments.
Affiliation(s)
- Thomas Pouncy
- Department of Psychology and Center for Brain Science, Harvard University, United States of America.
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, United States of America; Center for Brains, Minds and Machines, MIT, United States of America
4
Abstract
Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing bots or solving sequential decision problems. The method relies on intelligent tree search that balances exploration and exploitation. MCTS performs random sampling in the form of simulations and stores statistics of actions to make more educated choices in each subsequent iteration. The method has become a state-of-the-art technique for combinatorial games. However, in more complex games (e.g. those with a high branching factor or real-time ones) as well as in various practical domains (e.g. transportation, scheduling or security) an efficient MCTS application often requires its problem-dependent modification or integration with other techniques. Such domain-specific modifications and hybrid approaches are the main focus of this survey. The last major MCTS survey was published in 2012. Contributions that appeared since its release are of particular interest for this review.
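The exploration/exploitation balance that the abstract describes is most commonly implemented with the UCT rule (UCB1 applied to trees): each child node is scored by its mean simulation value plus an exploration bonus that shrinks as the child is visited more. The sketch below is illustrative only, not code from the survey; the function names and the statistics in the usage example are our own.

```python
import math

def uct_score(child_value_sum, child_visits, parent_visits, c=1.414):
    """UCB1 score: mean value (exploitation) plus a visit-count bonus (exploration)."""
    if child_visits == 0:
        return float("inf")  # unvisited children are always tried first
    exploit = child_value_sum / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

def select_child(children_stats, parent_visits):
    """Return the index of the child with the highest UCT score.

    children_stats: list of (value_sum, visits) pairs, one per child.
    """
    return max(range(len(children_stats)),
               key=lambda i: uct_score(children_stats[i][0],
                                       children_stats[i][1],
                                       parent_visits))

# Usage: a never-visited child scores infinity and wins immediately;
# among visited children, a high mean can be beaten by a rarely tried one.
print(select_child([(3.0, 5), (1.0, 1), (0.0, 0)], parent_visits=6))  # 2
print(select_child([(3.0, 5), (2.0, 2)], parent_visits=7))            # 1
```

In a full MCTS iteration this selection step is followed by expansion, a random simulation (rollout), and backpropagation of the result into the `(value_sum, visits)` statistics along the visited path.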
5
Abstract
Evaluating AI is a challenging task, as it requires an operative definition of intelligence and metrics to quantify it, including, among other factors, economic drivers that depend on the specific domain. From the viewpoint of basic AI research, the ability to play a game against a human has historically been adopted as a criterion of evaluation, as competition can be characterized algorithmically. Starting from the end of the 1990s, the deployment of sophisticated hardware brought significant improvement in the ability of a machine to play and win popular games. In spite of the spectacular victory of IBM's Deep Blue over Garry Kasparov, many objections remain, because it is not clear how this result can be applied to solving real-world problems, simulating human abilities such as common sense, or exhibiting a form of general AI. An evaluation based solely on the capacity to play games, even when enriched by the capability of learning complex rules without human supervision, is bound to be unsatisfactory. As the internet has dramatically changed the cultural habits and social interactions of users, who continuously exchange information with intelligent agents, it is natural to consider cooperation as the next step in AI software evaluation. Although this concept has already been explored in the scientific literature in economics and mathematics, its consideration in AI is relatively recent and generally covers cooperation between agents. This paper focuses on more complex problems involving heterogeneity (specifically, cooperation between humans and software agents, or even robots), investigated by taking into account the ethical issues that occur during attempts to achieve a common goal shared by both parties, with a possible result of either conflict or stalemate.
The contribution of this research consists in identifying the factors (trust, autonomy, and cooperative learning) on which to base ethical guidelines in agent software programming, making cooperation a more suitable benchmark for AI applications.
6
Abstract
In ‘Computing Machinery and Intelligence’, Turing, sceptical of the question ‘Can machines think?’, quickly replaces it with an experimentally verifiable test: the imitation game. I suggest that for such a move to be successful the test needs to be relevant, expansive, solvable by exemplars, unpredictable, and lead to actionable research. The Imitation Game is only partially successful in this regard and its reliance on language, whilst insightful for partially solving the problem, has put AI progress on the wrong foot, prescribing a top-down approach for building thinking machines. I argue that to fix shortcomings with modern AI systems a nonverbal operationalisation is required. This is provided by the recent Animal-AI Testbed, which translates animal cognition tests for AI and provides a bottom-up research pathway for building thinking machines that create predictive models of their environment from sensory input.
7
HRLB⌃2: A Reinforcement Learning Based Framework for Believable Bots. Appl Sci (Basel) 2018. [DOI: 10.3390/app8122453]
Abstract
The creation of believable behaviors for Non-Player Characters (NPCs) is key to improve the players’ experience while playing a game. To achieve this objective, we need to design NPCs that appear to be controlled by a human player. In this paper, we propose a hierarchical reinforcement learning framework for believable bots (HRLB⌃2). This novel approach has been designed so it can overcome two main challenges currently faced in the creation of human-like NPCs. The first difficulty is exploring domains with high-dimensional state–action spaces, while satisfying constraints imposed by traits that characterize human-like behavior. The second problem is generating behavior diversity, by also adapting to the opponent’s playing style. We evaluated the effectiveness of our framework in the domain of the 2D fighting game named Street Fighter IV. The results of our tests demonstrate that our bot behaves in a human-like manner.
8
Zuckerman I, Wilson B, Nau DS. Avoiding game-tree pathology in 2-player adversarial search. Comput Intell 2018. [DOI: 10.1111/coin.12162]
Affiliation(s)
- Inon Zuckerman
- Department of Industrial Engineering and Management, Ariel University, Ariel, Israel
- Brandon Wilson
- Department of Computer Science, University of Maryland, College Park, MD, USA
- Dana S. Nau
- Department of Computer Science, University of Maryland, College Park, MD, USA