1
Sasaki K, Nishikawa J, Morita J. Evaluation of co-speech gestures grounded in word-distributed representation. Front Robot AI 2024; 11:1362463. [PMID: 38726067 PMCID: PMC11079185 DOI: 10.3389/frobt.2024.1362463] [Received: 12/28/2023] [Accepted: 03/25/2024] [Indexed: 05/12/2024] Open
Abstract
A condition for artificial agents to possess perceivable intentions can be taken to be that they have resolved a form of the symbol grounding problem. Here, symbol grounding is considered the achievement of a state in which the language used by the agent is endowed with some quantitative meaning extracted from the physical world. To achieve this type of symbol grounding, we adopt a method for characterizing robot gestures with quantitative meaning calculated from word-distributed representations constructed from a large corpus of text. In this method, a "size image" of a word is generated by defining an axis (index) that discriminates the "size" of the word in the word-distributed vector space. The generated size images are converted into gestures performed by a physical artificial agent (robot). The robot's gesture can be set to reflect the size of the word either in terms of the amount of movement or in terms of its posture. To examine the perception of communicative intention in a robot that performs gestures generated in this way, we examine human ratings of "naturalness" obtained through an online survey, yielding results that partially validate the proposed method. Based on the results, we argue for the possibility of developing advanced artifacts that achieve human-like symbolic grounding.
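The idea of a "size" axis in a word-vector space can be illustrated with a minimal sketch. The abstract does not say how the axis is constructed, so the difference-of-means construction, the seed word lists, and the three-dimensional toy vectors below are all assumptions made for illustration, not the authors' implementation.

```python
import math

# Hypothetical 3-d toy embeddings; real word vectors would come from a
# model trained on a large text corpus.
EMB = {
    "huge": [0.9, 0.1, 0.2],
    "giant": [0.8, 0.2, 0.1],
    "tiny": [-0.7, 0.3, 0.1],
    "small": [-0.8, 0.1, 0.2],
    "elephant": [0.6, 0.5, 0.3],
    "ant": [-0.5, 0.6, 0.2],
}


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mean_vec(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def size_axis(large_words, small_words, emb):
    """Assumed axis: mean of 'large' seed words minus mean of 'small' ones."""
    big = mean_vec([emb[w] for w in large_words])
    little = mean_vec([emb[w] for w in small_words])
    return unit([b - s for b, s in zip(big, little)])


def size_score(word, axis, emb):
    """Cosine between the word vector and the size axis (range -1 to 1)."""
    return sum(a * x for a, x in zip(axis, unit(emb[word])))


axis = size_axis(["huge", "giant"], ["tiny", "small"], EMB)
elephant_score = size_score("elephant", axis, EMB)
ant_score = size_score("ant", axis, EMB)
print(elephant_score, ant_score)
```

With these toy vectors, "elephant" receives a positive score and "ant" a negative one. A score of this kind could then be mapped to a gesture parameter such as movement amplitude, although that mapping is likewise not specified in the abstract.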
Affiliation(s)
- Kosuke Sasaki
- Department of Informatics, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Jumpei Nishikawa
- Department of Information Science and Technology, Graduate School of Science and Technology, Shizuoka University, Shizuoka, Japan
- Junya Morita
- Department of Informatics, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Department of Information Science and Technology, Graduate School of Science and Technology, Shizuoka University, Shizuoka, Japan
- Department of Behavior Informatics, Faculty of Informatics, Shizuoka University, Hamamatsu, Japan
2
Diveica V, Muraki EJ, Binney RJ, Pexman PM. Mapping semantic space: Exploring the higher-order structure of word meaning. Cognition 2024; 248:105794. [PMID: 38653181 DOI: 10.1016/j.cognition.2024.105794] [Received: 12/15/2023] [Revised: 03/27/2024] [Accepted: 04/15/2024] [Indexed: 04/25/2024]
Abstract
Multiple representation theories posit that concepts are represented via a combination of properties derived from sensorimotor, affective, and linguistic experiences. Recently, it has been proposed that information derived from social experience, or socialness, represents another key aspect of conceptual representation. How these various dimensions interact to form a coherent conceptual space has yet to be fully explored. To address this, we capitalized on openly available word property norms for 6339 words and conducted a large-scale investigation into the relationships between 18 dimensions. An exploratory factor analysis reduced the dimensions to six higher-order factors: sub-lexical, distributional, visuotactile, body action, affective and social interaction. All these factors explained unique variance in performance on lexical and semantic tasks, demonstrating that they make important contributions to the representation of word meaning. An important and novel finding was that the socialness dimension clustered with the auditory modality and with mouth and head actions. We suggest this reflects experiential learning from verbal interpersonal interactions. Moreover, formally modelling the network structure of semantic space revealed pairwise partial correlations between most dimensions and highlighted the centrality of the interoception dimension. Altogether, these findings provide new insights into the architecture of conceptual space, including the importance of inner and social experience, and highlight promising avenues for future research.
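The notion of a pairwise partial correlation mentioned above can be made concrete with a small worked example. The abstract does not describe the estimation procedure used in the study, so the sketch below only shows the textbook three-variable case (the correlation between x and y after controlling for a single variable z), and the input correlations are hypothetical values.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical zero-order correlations between three rated dimensions.
controlled = partial_corr(0.5, 0.6, 0.7)

# If z is unrelated to both x and y, controlling for it changes nothing.
unchanged = partial_corr(0.5, 0.0, 0.0)

print(round(controlled, 3), unchanged)
```

In this example a zero-order correlation of 0.5 shrinks substantially once the shared influence of z is removed, which is why a partial-correlation network can look much sparser than the raw correlation matrix.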
Affiliation(s)
- Veronica Diveica
- Cognitive Neuroscience Institute, Department of Psychology, Bangor University, Gwynedd LL57 2AS, UK; Montreal Neurological Institute, Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec H3A 2B4, Canada.
- Emiko J Muraki
- Department of Psychology and Hotchkiss Brain Institute, University of Calgary, Calgary, Alberta T2N 1N4, Canada.
- Richard J Binney
- Cognitive Neuroscience Institute, Department of Psychology, Bangor University, Gwynedd LL57 2AS, UK.
- Penny M Pexman
- Department of Psychology and Hotchkiss Brain Institute, University of Calgary, Calgary, Alberta T2N 1N4, Canada; Department of Psychology, Western University, London, Ontario N6A 5C2, Canada.
3
Motamedi Y, Murgiano M, Grzyb B, Gu Y, Kewenig V, Brieke R, Donnellan E, Marshall C, Wonnacott E, Perniss P, Vigliocco G. Language development beyond the here-and-now: Iconicity and displacement in child-directed communication. Child Dev 2024. [PMID: 38563146 DOI: 10.1111/cdev.14099] [Indexed: 04/04/2024]
Abstract
Most language use is displaced, referring to past, future, or hypothetical events, posing the challenge of how children learn what words refer to when the referent is not physically available. One possibility is that iconic cues that imagistically evoke properties of absent referents support learning when referents are displaced. In an audio-visual corpus of caregiver-child dyads, English-speaking caregivers interacted with their children (N = 71, 24-58 months) in contexts in which the objects talked about were either familiar or unfamiliar to the child, and either physically present or displaced. The analysis of the range of vocal, manual, and looking behaviors caregivers produced suggests that caregivers used iconic cues especially in displaced contexts and for unfamiliar objects, using other cues when objects were present.
Affiliation(s)
- Yasamin Motamedi
- Department of Experimental Psychology, University College London, London, UK
- Margherita Murgiano
- Department of Experimental Psychology, University College London, London, UK
- Beata Grzyb
- Department of Experimental Psychology, University College London, London, UK
- Yan Gu
- Department of Experimental Psychology, University College London, London, UK
- Department of Psychology, University of Essex, Colchester, UK
- Viktor Kewenig
- Department of Experimental Psychology, University College London, London, UK
- Ricarda Brieke
- Department of Experimental Psychology, University College London, London, UK
- Ed Donnellan
- Department of Experimental Psychology, University College London, London, UK
- Chloe Marshall
- Institute of Education, University College London, London, UK
- Elizabeth Wonnacott
- Department of Language and Cognition, University College London, London, UK
- Department of Education, University of Oxford, Oxford, UK
- Gabriella Vigliocco
- Department of Experimental Psychology, University College London, London, UK
4
Sidhu DM, Athanasopoulou A, Archer SL, Czarnecki N, Curtin S, Pexman PM. The maluma/takete effect is late: No longitudinal evidence for shape sound symbolism in the first year. PLoS One 2023; 18:e0287831. [PMID: 37943758 PMCID: PMC10635456 DOI: 10.1371/journal.pone.0287831] [Received: 12/09/2022] [Accepted: 06/14/2023] [Indexed: 11/12/2023] Open
Abstract
The maluma/takete effect refers to an association between certain language sounds (e.g., /m/ and /o/) and round shapes, and other language sounds (e.g., /t/ and /i/) and spiky shapes. This is an example of sound symbolism and stands in opposition to arbitrariness of language. It is still unknown when sensitivity to sound symbolism emerges. In the present series of studies, we first confirmed that the classic maluma/takete effect would be observed in adults using our novel 3-D object stimuli (Experiments 1a and 1b). We then conducted the first longitudinal test of the maluma/takete effect, testing infants at 4, 8, and 12 months of age (Experiment 2). Sensitivity to sound symbolism was measured with a looking time preference task, in which infants were shown images of a round and a spiky 3-D object while hearing either a round- or spiky-sounding nonword. We did not detect a significant difference in looking time based on nonword type. We also collected a series of individual difference measures including measures of vocabulary, movement ability and babbling. Analyses of these measures revealed that 12-month-olds who babbled more showed a greater sensitivity to sound symbolism. Finally, in Experiment 3, we had parents take home round or spiky 3-D printed objects, to present to 7- to 8-month-old infants paired with either congruent or incongruent nonwords. This language experience had no effect on subsequent measures of sound symbolism sensitivity. Taken together, these studies demonstrate that sound symbolism is elusive in the first year, and shed light on the mechanisms that may contribute to its eventual emergence.
Affiliation(s)
- David M. Sidhu
- Department of Psychology, Carleton University, Ottawa, Canada
- Angeliki Athanasopoulou
- School of Languages, Linguistics, Literatures, and Cultures, University of Calgary, Calgary, Canada
- Suzanne Curtin
- Department of Child and Youth Studies, Brock University, St. Catharines, Canada
- Penny M. Pexman
- Department of Psychology, University of Calgary, Calgary, Canada
5
Sidhu DM, Khachatoorian N, Vigliocco G. Effects of Iconicity in Recognition Memory. Cogn Sci 2023; 47:e13382. [PMID: 38010057 DOI: 10.1111/cogs.13382] [Received: 05/19/2023] [Revised: 10/01/2023] [Accepted: 11/07/2023] [Indexed: 11/29/2023]
Abstract
Iconicity refers to a resemblance between word form and meaning. Previous work has shown that iconic words are learned earlier and processed faster. Here, we examined whether iconic words are recognized better on a recognition memory task. We also manipulated the level at which items were encoded (with a focus on either their meaning or their form) in order to gain insight into the mechanism by which iconicity would affect memory. In comparison with non-iconic words, iconic words were associated with a higher false alarm rate, a lower d' score, and a lower response criterion in Experiment 1. We did not observe any interaction between iconicity and encoding condition. To test the generalizability of these findings, we examined effects of iconicity in a recognition memory megastudy across 3880 items. After controlling for a variety of lexical and semantic variables, iconicity was predictive of more hits and false alarms, and a lower response criterion in this dataset. In Experiment 2, we examined whether these effects were due to increased feelings of familiarity for iconic items by including a familiar versus recollect decision. This experiment replicated the overall results of Experiment 1 and found that participants were more likely to categorize words that they had seen before as familiar (vs. recollected) if they were iconic. Together, these results demonstrate that iconicity has an effect on memory. We discuss implications for theories of iconicity.
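The d' score and response criterion reported above are signal detection measures, and a short worked example may help show how they relate to hit and false alarm rates. The sketch uses the standard equal-variance formulas, d' = z(H) - z(F) and c = -(z(H) + z(F)) / 2. The abstract does not state which variant or correction the authors applied, and the rates below are hypothetical values chosen only to reproduce the qualitative pattern, not data from the study.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, fa_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    Rates must lie strictly between 0 and 1; inv_cdf raises an error at
    exactly 0 or 1, so real data usually need a correction first.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical rates: the "iconic" condition has more hits and more
# false alarms than the "non-iconic" condition.
d_non, c_non = sdt_measures(0.80, 0.20)
d_ico, c_ico = sdt_measures(0.82, 0.30)

print(d_non, c_non)
print(d_ico, c_ico)
```

With these illustrative rates, the second condition yields both a lower d' (poorer discrimination) and a lower criterion (a more liberal tendency to respond "old"), which is the direction of the pattern the abstract describes for iconic words in Experiment 1.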
Affiliation(s)
- David M Sidhu
- Department of Psychology, Carleton University
- Department of Experimental Psychology, University College London
6
Reggin LD, Gómez Franco LE, Horchak OV, Labrecque D, Lana N, Rio L, Vigliocco G. Consensus Paper: Situated and Embodied Language Acquisition. J Cogn 2023; 6:63. [PMID: 37841673 PMCID: PMC10573584 DOI: 10.5334/joc.308] [Received: 03/13/2022] [Accepted: 07/10/2023] [Indexed: 10/17/2023] Open
Abstract
Theories of embodied cognition postulate that perceptual, sensorimotor, and affective properties of concepts support language learning and processing. In this paper, we argue that language acquisition, as well as processing, is situated in addition to being embodied. In particular, first, it is the situated nature of initial language development that allows the developing system to become embodied. Second, the situated nature of language use changes across development and adulthood. We provide evidence from empirical studies for embodied effects of perception, action, and valence as they apply to both embodied cognition and situated cognition across developmental stages. Although the evidence is limited, we urge researchers to consider differentiating embodied cognition within situated context, in order to better understand how these separate mechanisms interact for learning to occur. This delineation also provides further clarity to the study of classroom-based applications and the role of embodied and situated cognition in the study of developmental disorders. We argue that theories of language acquisition need to account for the complex situated context of real-world learning by completing a "circular notion": observing experimental paradigms in real-world settings and taking these observations to later refine lab-based experiments.
Affiliation(s)
- Nadia Lana
- McMaster University, Hamilton, ON, Canada
- Laura Rio
- Università di Bologna, Bologna, Italy
7
Dove G. Language is a Source of Grounding and a Mode of Action. Top Cogn Sci 2023; 15:688-692. [PMID: 37212318 DOI: 10.1111/tops.12665] [Received: 01/23/2023] [Revised: 05/01/2023] [Accepted: 05/03/2023] [Indexed: 05/23/2023]
Abstract
Kemmerer argues that grounded cognition explains how language-specific semantic structures can influence nonlinguistic cognition. In this commentary, I argue that his proposal fails to fully consider the possibility that language itself can serve as a source of grounding. Our concepts are not merely shaped by a disembodied language system; they emerge in the context of linguistic experience and action. This inclusive approach to grounded cognition offers an expanded conception of the phenomena associated with linguistic relativity. I provide empirical and theoretical reasons to adopt this theoretical perspective.
Affiliation(s)
- Guy Dove
- Department of Philosophy, University of Louisville
8
Sidhu DM, Vigliocco G. I don't see what you're saying: The maluma/takete effect does not depend on the visual appearance of phonemes as they are articulated. Psychon Bull Rev 2023; 30:1521-1529. [PMID: 36520277 PMCID: PMC10482773 DOI: 10.3758/s13423-022-02224-8] [Accepted: 11/16/2022] [Indexed: 12/23/2022]
Abstract
In contrast to the principle of arbitrariness, recent work has shown that language can iconically depict referents being talked about. One such example is the maluma/takete effect: an association between certain phonemes (e.g., those in maluma) and round shapes, and other phonemes (e.g., those in takete) and spiky shapes. An open question has been whether this association is crossmodal (arising from phonemes' sound or kinesthetics) or unimodal (arising from phonemes' visual appearance). In the latter case, individuals may associate a person's rounded lips as they pronounce the /u/ in maluma with round shapes. We examined this hypothesis by having participants pair nonwords with shapes in either an audio-only condition (they only heard nonwords) or an audiovisual condition (they both heard nonwords and saw them articulated). We found no evidence that seeing nonwords articulated enhanced the maluma/takete effect. In fact, there was evidence that it decreased it in some cases. This was confirmed with a Bayesian analysis. These results eliminate a plausible explanation for the maluma/takete effect, as an instance of visual matching. We discuss the alternate possibility that it involves crossmodal associations.
Affiliation(s)
- David M. Sidhu
- Department of Psychology, University College London, London, UK
- Department of Psychology, Carleton University, Ottawa, Ontario Canada
9
Muraki EJ, Abdalla S, Brysbaert M, Pexman PM. Concreteness ratings for 62,000 English multiword expressions. Behav Res Methods 2023; 55:2522-2531. [PMID: 35867207 DOI: 10.3758/s13428-022-01912-6] [Accepted: 06/16/2022] [Indexed: 11/08/2022]
Abstract
Concreteness describes the degree to which a word's meaning is understood through perception and action. Many studies use the Brysbaert et al. (2014) concreteness ratings to investigate language processing and text analysis. However, these ratings are limited to English single words and a few two-word expressions. Increasingly, attention is focused on the importance of multiword expressions, given their centrality in everyday language use and language acquisition. We present concreteness ratings for 62,889 multiword expressions and examine their relationship to the existing concreteness ratings for single words and two-word expressions. These new ratings represent the first big dataset of multiword expressions, and will be useful for researchers interested in language acquisition and language processing, as well as natural language processing and text analysis.
Affiliation(s)
- Emiko J Muraki
- Department of Psychology, University of Calgary, 2500 University Drive, Calgary, AB, T2N 1N4, Canada.
- Summer Abdalla
- School of Languages, Linguistics, Literatures and Cultures, University of Calgary, Calgary, Canada
- Marc Brysbaert
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Penny M Pexman
- Department of Psychology, University of Calgary, 2500 University Drive, Calgary, AB, T2N 1N4, Canada
10
van der Klis A, Adriaans F, Kager R. Infants' behaviours elicit different verbal, nonverbal, and multimodal responses from caregivers during early play. Infant Behav Dev 2023; 71:101828. [PMID: 36827720 DOI: 10.1016/j.infbeh.2023.101828] [Received: 09/14/2022] [Revised: 02/03/2023] [Accepted: 02/16/2023] [Indexed: 02/25/2023]
Abstract
Caregivers use a range of verbal and nonverbal behaviours when responding to their infants. Previous studies have typically focused on the role of the caregiver in providing verbal responses, while communication is inherently multimodal (involving audio and visual information) and bidirectional (exchange of information between infant and caregiver). In this paper, we present a comprehensive study of caregivers' verbal, nonverbal, and multimodal responses to 10-month-old infants' vocalisations and gestures during free play. A new coding scheme was used to annotate 2036 infant vocalisations and gestures, of which 87.1% received a caregiver response. Most caregiver responses were verbal, but 39.7% of all responses were multimodal. We also examined whether different infant behaviours elicited different responses from caregivers. Infant bimodal (i.e., vocal-gestural combination) behaviours elicited high rates of verbal responses and high rates of multimodal responses, while infant gestures elicited high rates of nonverbal responses. We also found that the types of verbal and nonverbal responses differed as a function of infant behaviour. The results indicate that infants influence the rates and types of responses they receive from caregivers. When examining caregiver-child interactions, analysing caregivers' verbal responses alone undermines the multimodal richness and bidirectionality of early communication.
11
Abstract
There has been a lot of recent interest in the way that language might enhance embodied cognition. This interest is driven in large part by a growing body of evidence implicating the language system in various aspects of semantic memory, including, but not limited to, its apparent contribution to abstract concepts. In this essay, I develop and defend a novel account of the cognitive role played by language in our concepts. This account relies on the embodied nature of the language system itself, diverges in significant ways from traditional accounts, and is part of a flexible, multimodal and multilevel view of our conceptual system. This article is part of the theme issue 'Concepts in interaction: social engagement and inner experiences'.
Affiliation(s)
- Guy O. Dove
- Department of Philosophy, University of Louisville, 313 Humanities Building, Louisville, KY 40292, USA
12
Abstract
The view put forward here is that visual bodily signals play a core role in human communication and the coordination of minds. Critically, this role goes far beyond referential and propositional meaning. The human communication system that we consider to be the explanandum in the evolution of language thus is not spoken language. It is, instead, a deeply multimodal, multilayered, multifunctional system that developed, and survived, owing to the extraordinary flexibility and adaptability that it endows us with. Beyond their undisputed iconic power, visual bodily signals (manual and head gestures, facial expressions, gaze, torso movements) fundamentally contribute to key pragmatic processes in modern human communication. This contribution becomes particularly evident with a focus that includes non-iconic manual signals, non-manual signals and signal combinations. Such a focus also needs to consider meaning encoded not just via iconic mappings, since kinematic modulations and interaction-bound meaning are additional properties equipping the body with striking pragmatic capacities. Some of these capacities, or their precursors, may have already been present in the last common ancestor we share with the great apes and may qualify as early versions of the components constituting the hypothesized interaction engine. This article is part of the theme issue 'Revisiting the human 'interaction engine': comparative approaches to social action coordination'.
Affiliation(s)
- Judith Holler
- Max-Planck-Institut für Psycholinguistik, Nijmegen, The Netherlands
- Donders Centre for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
13
de Varda AG, Strapparava C. A Cross‐Modal and Cross‐lingual Study of Iconicity in Language: Insights From Deep Learning. Cogn Sci 2022; 46:e13147. [PMID: 35665953 PMCID: PMC9285447 DOI: 10.1111/cogs.13147] [Received: 11/25/2021] [Revised: 04/27/2022] [Accepted: 04/28/2022] [Indexed: 11/30/2022]
Abstract
The present paper addresses the study of non‐arbitrariness in language within a deep learning framework. We present a set of experiments aimed at assessing the pervasiveness of different forms of non‐arbitrary phonological patterns across a set of typologically distant languages. Different sequence‐processing neural networks are trained in a set of languages to associate the phonetic vectorization of a set of words to their sensory (Experiment 1), semantic (Experiment 2), and word‐class representations (Experiment 3). The models are then tested, without further training, in a set of novel instances in a language belonging to a different language family, and their performance is compared with a randomized baseline. We show that the three cross‐domain mappings can be successfully transferred across languages and language families, suggesting that the phonological structure of the lexicon is pervaded with language‐invariant cues about the words' meaning and their syntactic classes.