1. Hierarchical-Task Reservoir for Online Semantic Analysis From Continuous Speech. IEEE Trans Neural Netw Learn Syst 2022; 33:2654-2663. PMID: 34570710. DOI: 10.1109/tnnls.2021.3095140.
Abstract
In this article, we propose a novel architecture called the hierarchical-task reservoir (HTR), suitable for real-time applications in which different levels of abstraction are available. We apply it to semantic role labeling (SRL) based on continuous speech recognition. Taking inspiration from the brain, which exhibits hierarchies of representations from perceptive to integrative areas, we consider a hierarchy of four subtasks with increasing levels of abstraction (phone, word, part-of-speech (POS), and semantic role tags). These tasks are learned progressively by the layers of the HTR architecture. Interestingly, quantitative and qualitative results show that the hierarchical-task approach improves prediction. In particular, the qualitative results show that neither a shallow nor a hierarchical reservoir, considered as baselines, produces estimates as good as the HTR model does. Moreover, we show that the accuracy of the model can be further improved by designing skip connections and by considering word embeddings (WE) in the internal representations. Overall, the HTR outperformed the other state-of-the-art reservoir-based approaches and proved extremely efficient compared with the recurrent neural networks (RNNs) typical of deep learning (DL), e.g., long short-term memory (LSTM) networks. The HTR architecture is proposed as a step toward modeling the online, hierarchical processes at work in the brain during language comprehension.
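The stacked-reservoir scheme this abstract describes — each layer an echo state network whose linear readout is trained on one subtask, with the lower layer's predictions fed forward as the next layer's input — can be sketched as follows. This is a minimal illustration on toy data, with all sizes, tasks, and hyperparameters assumed for the example; it is not the authors' HTR implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_in, n_res, spectral_radius=0.9):
    """Random input and recurrent weights; the recurrent matrix is
    rescaled so its spectral radius stays below 1 (echo state property)."""
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, inputs):
    """Drive the reservoir with an input sequence; collect the states."""
    x = np.zeros(W.shape[0])
    states = []
    for u in inputs:
        x = np.tanh(W_in @ u + W @ x)
        states.append(x.copy())
    return np.array(states)

def ridge_readout(states, targets, alpha=1e-3):
    """Train a linear readout with ridge regression."""
    return np.linalg.solve(states.T @ states + alpha * np.eye(states.shape[1]),
                           states.T @ targets)

# Toy data: layer 1 is trained on a low-level stand-in task, and layer 2
# consumes layer 1's *predictions* -- the hierarchical-task idea.
T, n_in = 200, 4
inputs = rng.normal(size=(T, n_in))
low_targets = np.roll(inputs[:, :2], 1, axis=0)        # stand-in "phone" task
high_targets = low_targets.sum(axis=1, keepdims=True)  # stand-in "word" task

W_in1, W1 = make_reservoir(n_in, 50)
s1 = run_reservoir(W_in1, W1, inputs)
low_pred = s1 @ ridge_readout(s1, low_targets)

W_in2, W2 = make_reservoir(low_pred.shape[1], 50)
s2 = run_reservoir(W_in2, W2, low_pred)
high_pred = s2 @ ridge_readout(s2, high_targets)
```

In the full HTR the successive readouts would target phone, word, POS, and semantic role tags, and skip connections could be added by concatenating the raw input (or word embeddings) to each layer's input.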
2. The GummiArm Project: A Replicable and Variable-Stiffness Robot Arm for Experiments on Embodied AI. Front Neurorobot 2022; 16:836772. PMID: 35360828; PMCID: PMC8963345. DOI: 10.3389/fnbot.2022.836772.
Abstract
Robots used in research on Embodied AI often need to physically explore the world, to fail in the process, and to develop from such experiences. Most research robots are unfortunately too stiff to safely absorb impacts, too expensive to repair if broken repeatedly, and are never operated without the red kill-switch prominently displayed. The GummiArm Project was intended to be an open-source “soft” robot arm with human-inspired tendon actuation, sufficient dexterity for simple manipulation tasks, and with an eye on enabling easy replication of robotics experiments. The arm offers variable-stiffness and damped actuation, which lowers the potential for damage and enables new research opportunities in Embodied AI. The arm structure is printable on hobby-grade 3D printers for ease of manufacture, exploits stretchable composite tendons for robustness to impacts, and has a repair cycle of minutes when something does break. The material cost of the arm is less than $6000, while the full set of structural parts, the ones most likely to break, can be printed with less than $20 worth of plastic filament. All this promotes a concurrent approach to the design of “brain” and “body,” and can help increase productivity and reproducibility in Embodied AI research. In this work we describe the motivation for, and the development and application of, this six-year project.
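The variable-stiffness principle behind antagonistic tendon actuation can be shown with a toy model: if each tendon has a nonlinear (here quadratic) force-extension curve, co-contracting both tendons raises joint stiffness without moving the joint. This is a generic sketch with illustrative parameters, not GummiArm's actual actuator model:

```python
def joint_torque(theta, a1, a2, k=1.0, r=1.0):
    """Two antagonistic tendons with quadratic force-extension curves.
    theta: joint angle; a1, a2: motor positions; r: pulley radius;
    k: elasticity constant. Slack tendons (negative extension) pull nothing."""
    e1 = max(a1 - r * theta, 0.0)  # extension of tendon 1
    e2 = max(a2 + r * theta, 0.0)  # extension of tendon 2
    return r * (k * e1**2 - k * e2**2)

def joint_stiffness(theta, a1, a2, eps=1e-5):
    """Numerical joint stiffness: -d(torque)/d(theta)."""
    return -(joint_torque(theta + eps, a1, a2) -
             joint_torque(theta - eps, a1, a2)) / (2 * eps)

# Equal co-contraction keeps the equilibrium at theta = 0,
# but more co-contraction makes the joint stiffer.
low = joint_stiffness(0.0, 0.5, 0.5)
high = joint_stiffness(0.0, 1.5, 1.5)
print(low < high)  # True
```

With quadratic elasticity the stiffness at equilibrium works out to 4·a·k·r², i.e., proportional to the co-contraction level a, which is what lets a controller trade compliance for precision at run time.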
3. Grammatical structure detection by Instinct Plasticity based Echo State Networks with Genetic Algorithm. Neurocomputing 2022. DOI: 10.1016/j.neucom.2021.09.073.
4. An individual prediction model of the pre-loading motion for operator and backhoe pairs. Adv Robot 2021. DOI: 10.1080/01691864.2021.1980101.
5. Learning Actions From Natural Language Instructions Using an ON-World Embodied Cognitive Architecture. Front Neurorobot 2021; 15:626380. PMID: 34054452; PMCID: PMC8155541. DOI: 10.3389/fnbot.2021.626380.
Abstract
Endowing robots with the ability to view the world the way humans do, to understand natural language, and to learn novel semantic meanings when they are deployed in the physical world is a compelling problem. Another significant aspect is linking language to action in artificial agents, in particular for utterances involving abstract words. In this work, we propose a novel methodology, using a brain-inspired architecture, to model an appropriate mapping between language and the perceptual and internal motor representations of humanoid robots. This research presents the first robotic instantiation of a complex architecture based on Baddeley's Working Memory (WM) model. Our proposed method provides a scalable knowledge representation of verbal and non-verbal signals in the cognitive architecture, which supports incremental open-ended learning. Human spoken utterances about the workspace and the task are combined with the internal knowledge map of the robot to achieve task goals. We train the robot to understand instructions involving higher-order (abstract) linguistic concepts of developmental complexity, which cannot be directly anchored in the physical world and are not pre-defined in the robot's static self-representation. Our proposed interactive learning method grants flexible run-time acquisition of novel linguistic forms and real-world information, without retraining the cognitive model. Hence, the robot can adapt to new workspaces that include novel objects and task outcomes. We assess the potential of the proposed methodology in verification experiments with a humanoid robot. The results suggest robust capabilities of the model to link language bi-directionally with the physical environment and to solve a variety of manipulation tasks, starting with limited knowledge and gradually learning from run-time interaction with the tutor, beyond the pre-trained stage.
7. Learning to Use Narrative Function Words for the Organization and Communication of Experience. Front Psychol 2021; 12:591703. PMID: 33762991; PMCID: PMC7982915. DOI: 10.3389/fpsyg.2021.591703.
Abstract
How do people learn to talk about the causal and temporal relations between events, and the motivations behind why people do what they do? The narrative practice hypothesis of Hutto and Gallagher holds that children are exposed to narratives that provide training for understanding and expressing reasons for why people behave as they do. In this context, we have recently developed a model of narrative processing in which a structured model of the developing situation (the situation model) is built up from experienced events and enriched by sentences in a narrative that describe event meanings. The main interest is to develop a proof of concept for how narrative can be used to structure, organize, and describe experience. Narrative sentences describe events, and they also define temporal and causal relations between events. These relations are specified by a class of narrative function words, including “because,” “before,” “after,” “first,” and “finally.” The current research develops a proof of concept that, by observing how people describe social events, a developmental robotic system can begin to acquire early knowledge of how to explain the reasons for events. We collect data from naive subjects who use narrative function words to describe simple scenes of human-robot interaction, and we then employ algorithms for extracting the statistical structure of how narrative function words link events in the situation model. By using these statistical regularities, the robot can learn from human experience how to properly employ these words in question-answering dialogues with the human and in generating canonical narratives for new experiences. The behavior of the system is demonstrated over several behavioral interactions and associated narrative interaction sessions, while a more formal extended evaluation and user study will be the subject of future research. Clearly this is far removed from the power of the full-blown narrative practice capability, but it provides a first step in the development of an experimental infrastructure for the study of socially situated narrative practice in human-robot interaction.
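The statistical extraction the abstract describes — learning which narrative function word people use for a given relation between events — reduces, in its simplest form, to conditional frequency counts. A toy sketch (the relations, words, and data here are illustrative placeholders, not the study's corpus):

```python
from collections import Counter, defaultdict

# Observed descriptions: (relation between two events, function word used).
observations = [
    ("cause", "because"), ("cause", "because"), ("cause", "so"),
    ("sequence", "then"), ("sequence", "after"), ("sequence", "then"),
]

# usage[relation] counts how often each function word expresses it.
usage = defaultdict(Counter)
for relation, word in observations:
    usage[relation][word] += 1

def connective(relation):
    """Most probable narrative function word for a given event relation."""
    return usage[relation].most_common(1)[0][0]

print(connective("cause"))     # because
print(connective("sequence"))  # then
```

With such counts in hand, generating a canonical narrative for a new experience amounts to walking the situation model's event links and emitting the highest-probability connective for each relation.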
8. Teach Your Robot Your Language! Trainable Neural Parser for Modeling Human Sentence Processing: Examples for 15 Languages. IEEE Trans Cogn Dev Syst 2020. DOI: 10.1109/tcds.2019.2957006.
9. Modeling the Co-Emergence of Linguistic Constructions and Action Concepts: The Case of Action Verbs. IEEE Trans Cogn Dev Syst 2019. DOI: 10.1109/tcds.2019.2900418.
12. Cross-Situational Learning with Bayesian Generative Models for Multimodal Category and Word Learning in Robots. Front Neurorobot 2018; 11:66. PMID: 29311888; PMCID: PMC5742219. DOI: 10.3389/fnbot.2017.00066.
Abstract
In this paper, we propose a Bayesian generative model that can form multiple categories for each sensory channel and can associate words with any of four sensory channels (action, position, object, and color). The paper focuses on cross-situational learning that uses the co-occurrence between words and sensory-channel information in complex situations, rather than the conventional settings of cross-situational learning. We conducted a learning scenario using a simulator and a real humanoid iCub robot. In the scenario, a human tutor provided the robot with a sentence describing an object of visual attention and an accompanying action. The scenario was set as follows: the number of words per sensory channel was three or four, and the number of learning trials was 20 and 40 for the simulator and 25 and 40 for the real robot. The experimental results showed that the proposed method was able to estimate the multiple categorizations and to learn the relationships between the sensory channels and words accurately. In addition, we conducted an action generation task and an action description task based on the word meanings learned in the cross-situational learning scenario. The results showed that the robot could successfully use the word meanings learned with the proposed method.
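The core of cross-situational learning — accumulating co-occurrence evidence between words and sensory-channel values across individually ambiguous situations — can be sketched with plain counts. The paper's Bayesian generative model is considerably richer (latent categories per channel, probabilistic inference); the situations and vocabulary below are illustrative:

```python
from collections import defaultdict

# Each "situation" pairs an utterance with the channel values present.
situations = [
    (["grasp", "red", "ball"],  {"action": "grasp", "color": "red",  "object": "ball"}),
    (["push",  "red", "box"],   {"action": "push",  "color": "red",  "object": "box"}),
    (["grasp", "blue", "box"],  {"action": "grasp", "color": "blue", "object": "box"}),
    (["push",  "blue", "ball"], {"action": "push",  "color": "blue", "object": "ball"}),
]

# counts[word][(channel, value)] accumulates evidence across situations.
counts = defaultdict(lambda: defaultdict(int))
for words, channels in situations:
    for w in words:
        for ch, val in channels.items():
            counts[w][(ch, val)] += 1

def best_referent(word):
    """The (channel, value) pair that co-occurs most often with a word."""
    return max(counts[word], key=counts[word].get)

print(best_referent("grasp"))  # ('action', 'grasp')
print(best_referent("red"))    # ('color', 'red')
```

No single situation disambiguates any word, but across the four situations the correct channel-value wins for every word, which is exactly the regularity the Bayesian model exploits at scale.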
13. Representation Learning of Logic Words by an RNN: From Word Sequences to Robot Actions. Front Neurorobot 2017; 11:70. PMID: 29311891; PMCID: PMC5744442. DOI: 10.3389/fnbot.2017.00070.
Abstract
An important characteristic of human language is compositionality: we can efficiently express a wide variety of real-world situations, events, and behaviors by compositionally constructing the meaning of a complex expression from a finite number of elements. Previous studies have analyzed how machine-learning models, particularly neural networks, can learn from experience to represent compositional relationships between language and robot actions, with the aim of understanding the symbol grounding structure and achieving intelligent communicative agents. Such studies have mainly dealt with words (nouns, adjectives, and verbs) that refer directly to real-world matters. In addition to these words, the current study also deals with logic words, such as “not,” “and,” and “or.” These words do not refer directly to the real world; rather, they are logical operators that contribute to the construction of meaning in sentences, and they are likely to be used often in human-robot communication. The current study builds a recurrent neural network model with long short-term memory units and trains it to translate sentences including logic words into robot actions. We investigate what kind of compositional representations, mediating between sentences and robot actions, emerge as the network's internal states through the learning process. Analysis after learning shows that referential words are merged with visual information and the robot's own current state, while the logic words are represented by the model in accordance with their functions as logical operators. Words such as “true,” “false,” and “not” work as non-linear transformations that encode orthogonal phrases into the same area in a memory-cell state space. The word “and,” which required the robot to lift both its hands, worked as if it were a universal quantifier. The word “or,” which required apparently random action generation, was represented as an unstable region of the network's dynamical system.
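The *target* behavior the LSTM is trained to acquire — composing referential words with “not,” “and,” and “or” into action commands — can be written out as a direct symbolic interpreter. This sketch only illustrates the mapping the network must learn, not the network itself; the two-word “verb arm” grammar and the deterministic handling of “or” (the paper's robot chooses randomly) are simplifying assumptions:

```python
def interpret(sentence):
    """Map a sentence with logic words onto robot action flags.
    'raise left' -> {'left': True}; 'not' negates the verb;
    'and' combines both conjuncts; 'or' picks one disjunct
    (here always the first, for determinism)."""
    tokens = sentence.split()
    if "or" in tokens:
        i = tokens.index("or")
        return interpret(" ".join(tokens[:i]))  # arbitrary but fixed choice
    if "and" in tokens:
        i = tokens.index("and")
        out = interpret(" ".join(tokens[:i]))
        out.update(interpret(" ".join(tokens[i + 1:])))
        return out
    negate = tokens[0] == "not"
    if negate:
        tokens = tokens[1:]
    verb, arm = tokens  # e.g., "raise left" or "lower right"
    return {arm: (verb == "raise") != negate}

print(interpret("raise left and not raise right"))  # {'left': True, 'right': False}
```

What makes the paper's result interesting is that the LSTM realizes this compositional function not by symbol manipulation but through its internal dynamics, e.g., “not” acting as a non-linear transformation on the memory-cell state.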
14. Narrative Constructions for the Organization of Self Experience: Proof of Concept via Embodied Robotics. Front Psychol 2017; 8:1331. PMID: 28861011; PMCID: PMC5559541. DOI: 10.3389/fpsyg.2017.01331.
Abstract
It has been proposed that, starting from the meaning a child derives directly from shared experience with others, adult narrative enriches this meaning and its structure, providing causal links between unseen intentional states and actions. This would require a means of representing meaning from experience (a situation model) and a mechanism that allows information to be extracted from sentences and mapped onto the situation model derived from experience, thus enriching that representation. We present a hypothesis and theory concerning how the language-processing infrastructure for grammatical constructions can naturally be extended to narrative constructions, providing a mechanism for using language to enrich meaning derived from physical experience. Toward this aim, the grammatical construction models are augmented with additional structures for representing relations between events across sentences. Simulation results provide proof of concept for how the narrative construction model supports multiple successive levels of meaning creation, which allows the system to learn about the intentionality of mental states, and argument substitution, which allows extensions to metaphorical language and analogical problem solving. Cross-linguistic validity of the system is demonstrated in Japanese. The narrative construction model is then integrated into the cognitive system of a humanoid robot that provides the memory systems and world interaction required for representing meaning in a situation model. In this context, proof of concept is demonstrated for how the system enriches meaning in the situation model that has been directly derived from experience. In terms of links to empirical data, the model predicts strong usage-based effects: that is, the narrative constructions used by children will be highly correlated with those that they experience. It also relies on the notion of narrative or discourse function words. Both of these predictions are validated in the experimental literature.
15. The Role of Autobiographical Memory in the Development of a Robot Self. Front Neurorobot 2017; 11:27. PMID: 28676751; PMCID: PMC5476692. DOI: 10.3389/fnbot.2017.00027.
Abstract
This article briefly reviews research in cognitive development concerning the nature of the human self, and then reviews research in developmental robotics that has attempted to retrace parts of the developmental trajectory of the self. It should be of interest both to developmental psychologists and to researchers in developmental robotics. As a point of departure, one of the most characteristic aspects of human social interaction is cooperation: the process of entering into a joint enterprise to achieve a common goal. Fundamental to this ability to cooperate is the underlying ability to enter into, and engage in, a self-other relation. This suggests that if we intend for robots to cooperate with humans, then to some extent robots must engage in these self-other relations, and hence they must have some aspect of a self. Decades of research in human cognitive development indicate that the self is not fully present from the outset but is developed in a usage-based fashion, that is, through engaging with the world, including the physical world and the social world of animate intentional agents. In an effort to characterize the self, Ulric Neisser noted that the self is not unitary, and he proposed five types of self-knowledge that correspond to five distinct components of self: ecological, interpersonal, conceptual, temporally extended, and private. He emphasized the ecological nature of each of these levels: how they develop through the engagement of the developing child with the physical and interpersonal worlds. Crucially, development of the self has been shown to rely on the child's autobiographical memory. From the developmental robotics perspective, this suggests that in principle it would be possible to develop certain aspects of self in a robot cognitive system in which the robot is engaged in the physical and social world and equipped with an autobiographical memory system. We review a series of developmental robotics studies that make progress in this enterprise. We conclude with a summary of the properties required for the development of these different levels of self, and we identify topics for future research.
16. Dynamical Integration of Language and Behavior in a Recurrent Neural Network for Human-Robot Interaction. Front Neurorobot 2016; 10:5. PMID: 27471463; PMCID: PMC4946379. DOI: 10.3389/fnbot.2016.00005.
Abstract
To work cooperatively with humans by using language, robots must not only acquire a mapping between language and their behavior but also autonomously utilize that mapping in the appropriate contexts of interactive tasks online. To this end, we propose a novel learning method that links language to robot behavior by means of a recurrent neural network. In this method, the network learns from correct examples of the imposed task that are given not as explicitly separated sets of language and behavior but as sequential data constructed from the actual temporal flow of the task. In this way, the internal dynamics of the network model both language-behavior relationships and the temporal patterns of interaction. Here, "internal dynamics" refers to the time development of the system defined on the fixed-dimensional space of the internal states of the context layer. Thus, in the execution phase, by constantly representing where in the interaction context it is as its current state, the network autonomously switches between recognition and generation phases without any explicit signs and utilizes the acquired mapping in the appropriate contexts. To evaluate our method, we conducted an experiment in which a robot generates appropriate behavior in response to a human's linguistic instruction. After learning, the network formed an attractor structure representing both the language-behavior relationships and the task's temporal pattern in its internal dynamics. In these dynamics, the language-behavior mapping was achieved by a branching structure, repetition of the human's instruction and the robot's behavioral response was represented as a cyclic structure, and waiting for a subsequent instruction was represented as a fixed-point attractor. Thanks to this structure, the robot was able to interact online with a human on the given task by autonomously switching phases.
18. How? Why? What? Where? When? Who? Grounding Ontology in the Actions of a Situated Social Agent. Robotics 2015. DOI: 10.3390/robotics4020169.
19. Toward embodied artificial cognition: TIME is on my side. Front Neurorobot 2014; 8:25. PMID: 25538614; PMCID: PMC4259165. DOI: 10.3389/fnbot.2014.00025.