1. Ter Bekke M, Levinson SC, van Otterdijk L, Kühn M, Holler J. Visual bodily signals and conversational context benefit the anticipation of turn ends. Cognition 2024; 248:105806. [PMID: 38749291] [DOI: 10.1016/j.cognition.2024.105806]
Abstract
The typical pattern of alternating turns in conversation seems trivial at first sight. But a closer look quickly reveals the cognitive challenges involved, many of which result from the fast-paced nature of conversation. One core ingredient of turn coordination is the anticipation of upcoming turn ends, which allows listeners to ready themselves for providing the next contribution. Across two experiments, we investigated two variables inherent to face-to-face conversation, the presence of visual bodily signals and preceding discourse context, in terms of their contribution to turn end anticipation. In a reaction time paradigm, participants anticipated conversational turn ends better when seeing the speaker and their visual bodily signals than when they did not, especially so for longer turns. Likewise, participants were better able to anticipate turn ends when they had access to the preceding discourse context than when they did not, and especially so for longer turns. Critically, the two variables did not interact, showing that visual bodily signals retain their influence even in the context of preceding discourse. In a pre-registered follow-up experiment, we manipulated the visibility of the speaker's head, eyes and upper body (i.e. torso + arms). Participants were better able to anticipate turn ends when the speaker's upper body was visible, suggesting a role for manual gestures in turn end anticipation. Together, these findings show that seeing the speaker during conversation may critically facilitate turn coordination in interaction.
Affiliation(s)
- Marlijn Ter Bekke
- Donders Institute for Brain, Cognition & Behaviour, Radboud University, Nijmegen, the Netherlands; Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
- Lina van Otterdijk
- Donders Institute for Brain, Cognition & Behaviour, Radboud University, Nijmegen, the Netherlands
- Michelle Kühn
- Donders Institute for Brain, Cognition & Behaviour, Radboud University, Nijmegen, the Netherlands
- Judith Holler
- Donders Institute for Brain, Cognition & Behaviour, Radboud University, Nijmegen, the Netherlands; Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands.
2. Weissman B, Cohn N, Tanner D. The electrophysiology of lexical prediction of emoji and text. Neuropsychologia 2024; 198:108881. [PMID: 38579906] [DOI: 10.1016/j.neuropsychologia.2024.108881]
Abstract
As emoji often appear naturally alongside text in utterances, they provide a way to study how prediction unfolds in multimodal sentences in direct comparison to unimodal sentences. In this experiment, participants (N = 40) read sentences in which the sentence-final noun appeared in either word form or emoji form, a between-subjects manipulation. The experiment featured both high constraint sentences and low constraint sentences to examine how the lexical processing of emoji interacts with prediction processes in sentence comprehension. Two well-established ERP components linked to lexical processing and prediction - the N400 and the Late Frontal Positivity - are investigated for sentence-final words and emoji to assess whether, to what extent, and in what linguistic contexts emoji are processed like words. Results indicate that the expected effects, namely an N400 effect to an implausible lexical item compared to a plausible one and an LFP effect to an unexpected lexical item compared to an expected one, emerged for both words and emoji. This paper discusses the similarities and differences between the stimulus types and constraint conditions, contextualized within theories of linguistic prediction, ERP components, and a multimodal lexicon.
Affiliation(s)
- Benjamin Weissman
- Department of Cognitive Science, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, USA; Department of Linguistics, University of Illinois at Urbana-Champaign, 707 S Mathews Ave, Urbana, IL 61801, USA.
- Neil Cohn
- Department of Communication and Cognition, Tilburg University, PO Box 90153, 5000 LE Tilburg, the Netherlands
- Darren Tanner
- Department of Linguistics, University of Illinois at Urbana-Champaign, 707 S Mathews Ave, Urbana, IL 61801, USA; AI For Good Lab, Microsoft, 1 Microsoft Way, Redmond, WA, USA
3. Kosie JE, Lew-Williams C. Infant-directed communication: Examining the many dimensions of everyday caregiver-infant interactions. Dev Sci 2024:e13515. [PMID: 38618899] [DOI: 10.1111/desc.13515]
Abstract
Everyday caregiver-infant interactions are dynamic and multidimensional. However, existing research underestimates the dimensionality of infants' experiences, often focusing on one or two communicative signals (e.g., speech alone, or speech and gesture together). Here, we introduce "infant-directed communication" (IDC): the suite of communicative signals from caregivers to infants including speech, action, gesture, emotion, and touch. We recorded 10 min of at-home play between 44 caregivers and their 18- to 24-month-old infants from predominantly white, middle-class, English-speaking families in the United States. Interactions were coded for five dimensions of IDC as well as infants' gestures and vocalizations. Most caregivers used all five dimensions of IDC throughout the interaction, and these dimensions frequently overlapped. For example, over 60% of the speech that infants heard was accompanied by one or more non-verbal communicative cues. However, we saw marked variation across caregivers in their use of IDC, likely reflecting communication tailored to the behaviors and abilities of their infant. Moreover, caregivers systematically increased the dimensionality of IDC, using more overlapping cues in response to infant gestures and vocalizations, and more IDC with infants who had smaller vocabularies. Understanding how and when caregivers use all five signals, together and separately, in interactions with infants has the potential to redefine how developmental scientists conceive of infants' communicative environments, and to enhance our understanding of the relations between caregiver input and early learning.
RESEARCH HIGHLIGHTS:
- Infants' everyday interactions with caregivers are dynamic and multimodal, but existing research has underestimated the multidimensionality (i.e., the diversity of simultaneously occurring communicative cues) inherent in infant-directed communication.
- Over 60% of the speech that infants encounter during at-home, free play interactions overlaps with one or more of a variety of non-speech communicative cues.
- The multidimensionality of caregivers' communicative cues increases in response to infants' gestures and vocalizations, providing new information about how infants' own behaviors shape their input.
- These findings emphasize the importance of understanding how caregivers use a diverse set of communicative behaviors, both separately and together, during everyday interactions with infants.
Affiliation(s)
- Jessica E Kosie
- Department of Psychology, Princeton University, Princeton, New Jersey, USA
- School of Social and Behavioral Sciences, Arizona State University, Phoenix, Arizona, USA
- Casey Lew-Williams
- Department of Psychology, Princeton University, Princeton, New Jersey, USA
4. Whitley A, Naylor G, Hadley LV. Used to Be a Dime, Now It's a Dollar: Revised Speech Perception in Noise Key Word Predictability Revisited 40 Years On. J Speech Lang Hear Res 2024; 67:1229-1242. [PMID: 38563688] [PMCID: PMC11005954] [DOI: 10.1044/2024_jslhr-23-00615]
Abstract
PURPOSE: Almost 40 years after its development, we reexamine the relevance and validity of the ubiquitously used Revised Speech Perception in Noise (R-SPiN) sentence corpus. The R-SPiN corpus includes "high-context" and "low-context" sentences and has been widely used in the field of hearing research to examine the benefit derived from semantic context across English-speaking listeners, but research investigating age differences has yielded somewhat inconsistent findings. We assess the appropriateness of the corpus for use today in different English-language cultures (i.e., British and American) as well as for older and younger adults.
METHOD: Two hundred forty participants, including older (60-80 years) and younger (19-31 years) adult groups in the United Kingdom and the United States, completed a cloze task consisting of R-SPiN sentences with the final word removed. Cloze, as a measure of predictability, and entropy, as a measure of response uncertainty, were compared between culture and age groups.
RESULTS: Most critically, of the 200 "high-context" stimuli, only around half were assessed as highly predictable for older adults (United Kingdom: 109; United States: 107), and fewer still for younger adults (United Kingdom: 75; United States: 81). We also found that dominant responses to these "high-context" stimuli varied between cultures, with U.S. responses being more likely to match the original R-SPiN target.
CONCLUSIONS: Our findings highlight the issue of incomplete transferability of corpus items across English-language cultures as well as diminished equivalency for older and younger adults. By identifying relevant items for each population, this work could facilitate the interpretation of inconsistent findings in the literature, particularly relating to age effects.
Affiliation(s)
- Alexina Whitley
- Hearing Sciences – Scottish Section, University of Nottingham, United Kingdom
- Graham Naylor
- Hearing Sciences – Scottish Section, University of Nottingham, United Kingdom
- Lauren V. Hadley
- Hearing Sciences – Scottish Section, University of Nottingham, United Kingdom
5. Motamedi Y, Murgiano M, Grzyb B, Gu Y, Kewenig V, Brieke R, Donnellan E, Marshall C, Wonnacott E, Perniss P, Vigliocco G. Language development beyond the here-and-now: Iconicity and displacement in child-directed communication. Child Dev 2024. [PMID: 38563146] [DOI: 10.1111/cdev.14099]
Abstract
Most language use is displaced, referring to past, future, or hypothetical events, posing the challenge of how children learn what words refer to when the referent is not physically available. One possibility is that iconic cues that imagistically evoke properties of absent referents support learning when referents are displaced. In an audio-visual corpus of caregiver-child dyads, English-speaking caregivers interacted with their children (N = 71, 24-58 months) in contexts in which the objects talked about were either familiar or unfamiliar to the child, and either physically present or displaced. The analysis of the range of vocal, manual, and looking behaviors caregivers produced suggests that caregivers used iconic cues especially in displaced contexts and for unfamiliar objects, using other cues when objects were present.
Affiliation(s)
- Yasamin Motamedi
- Department of Experimental Psychology, University College London, London, UK
- Margherita Murgiano
- Department of Experimental Psychology, University College London, London, UK
- Beata Grzyb
- Department of Experimental Psychology, University College London, London, UK
- Yan Gu
- Department of Experimental Psychology, University College London, London, UK
- Department of Psychology, University of Essex, Colchester, UK
- Viktor Kewenig
- Department of Experimental Psychology, University College London, London, UK
- Ricarda Brieke
- Department of Experimental Psychology, University College London, London, UK
- Ed Donnellan
- Department of Experimental Psychology, University College London, London, UK
- Chloe Marshall
- Institute of Education, University College London, London, UK
- Elizabeth Wonnacott
- Department of Language and Cognition, University College London, London, UK
- Department of Education, University of Oxford, Oxford, UK
- Gabriella Vigliocco
- Department of Experimental Psychology, University College London, London, UK
6. Özyurt G, Öztürk Y, Turan S, Çıray RO, Tanıgör EK, Ermiş Ç, Tufan AE, Akay A. Are Communication Skills, Emotion Regulation and Theory of Mind Skills Impaired in Adolescents with Developmental Dyslexia? Dev Neuropsychol 2024; 49:99-110. [PMID: 38466040] [DOI: 10.1080/87565641.2024.2325338]
Abstract
This study investigates pragmatic language impairment, Theory of Mind (ToM), and emotion regulation in adolescents with Developmental Dyslexia (DD). Social Responsiveness Scale-2 (SRS) and Children's Communication Checklist-2 (CCC-2) scores were statistically significantly higher in the DD group than in healthy controls. The DD group performed worse on ToM tasks and had more difficulties with emotion regulation. We also found that CCC-2 and ToM scores were significantly correlated in adolescents with DD. These results may be important for understanding the difficulties in social functioning and interpersonal relationships experienced by adolescents with DD.
Affiliation(s)
- Gonca Özyurt
- Department of Child and Adolescent Psychiatry, Katip Çelebi University Ataturk Training and Research Hospital, İzmir, Turkey
- Yusuf Öztürk
- Department of Child and Adolescent Psychiatry, Bolu Abant İzzet Baysal University, School of Medicine, Bolu, Turkey
- Serkan Turan
- Department of Child and Adolescent Psychiatry, Bursa Uludağ University, School of Medicine, Bursa, Turkey
- Remzi Oğulcan Çıray
- Department of Child and Adolescent Psychiatry, Dokuz Eylul University, School of Medicine, Izmir, Turkey
- Ezgi Karagöz Tanıgör
- Department of Child and Adolescent Psychiatry, Katip Çelebi University Ataturk Training and Research Hospital, İzmir, Turkey
- Çağatay Ermiş
- Department of Child and Adolescent Psychiatry, Dokuz Eylul University, School of Medicine, Izmir, Turkey
- Ali Evren Tufan
- Department of Child and Adolescent Psychiatry, Bolu Abant İzzet Baysal University, School of Medicine, Bolu, Turkey
- Aynur Akay
- Department of Child and Adolescent Psychiatry, Dokuz Eylul University, School of Medicine, Izmir, Turkey
7. Hagoort P, Özyürek A. Extending the Architecture of Language From a Multimodal Perspective. Top Cogn Sci 2024. [PMID: 38493475] [DOI: 10.1111/tops.12728]
Abstract
Language is inherently multimodal. In spoken languages, combined spoken and visual signals (e.g., co-speech gestures) are an integral part of linguistic structure and language representation. This requires an extension of the parallel architecture, which needs to include the visual signals concomitant to speech. We present the evidence for the multimodality of language. In addition, we propose that distributional semantics might provide a format for integrating speech and co-speech gestures in a common semantic representation.
Affiliation(s)
- Peter Hagoort
- Max Planck Institute for Psycholinguistics, Nijmegen
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen
- Aslı Özyürek
- Max Planck Institute for Psycholinguistics, Nijmegen
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen
8. Elbadawi M, Li H, Basit AW, Gaisford S. The role of artificial intelligence in generating original scientific research. Int J Pharm 2024; 652:123741. [PMID: 38181989] [DOI: 10.1016/j.ijpharm.2023.123741]
Abstract
Artificial intelligence (AI) is a revolutionary technology that is finding wide application across numerous sectors. Large language models (LLMs) are an emerging subset of AI technology developed to communicate using human languages. At their core, LLMs are trained with vast amounts of information extracted from the internet, including text and images. Their ability to create human-like, expert text in almost any subject means they are increasingly being used as an aid to presentation, particularly in scientific writing. However, we wondered whether LLMs could go further, generating original scientific research and preparing the results for publication. We tasked GPT-4, an LLM, to write an original pharmaceutics manuscript on a topic that is itself novel. It was able to conceive a research hypothesis, define an experimental protocol, produce photo-realistic images of 3D printed tablets, generate believable analytical data from a range of instruments, and write a convincing publication-ready manuscript with evidence of critical interpretation. The model achieved all this in less than 1 h. Moreover, the generated data were multi-modal in nature, including thermal analyses, vibrational spectroscopy and dissolution testing, demonstrating multi-disciplinary expertise in the LLM. One area in which the model failed, however, was referencing the literature. Since the generated experimental results nevertheless appeared believable, we suggest that LLMs could certainly play a role in scientific research, but with human input, interpretation and data validation. We discuss the potential benefits and current bottlenecks for realising this ambition.
Affiliation(s)
- Moe Elbadawi
- UCL School of Pharmacy, University College London, 29-39 Brunswick Square, London WC1N 1AX, UK.
- Hanxiang Li
- UCL School of Pharmacy, University College London, 29-39 Brunswick Square, London WC1N 1AX, UK
- Abdul W Basit
- UCL School of Pharmacy, University College London, 29-39 Brunswick Square, London WC1N 1AX, UK
- Simon Gaisford
- UCL School of Pharmacy, University College London, 29-39 Brunswick Square, London WC1N 1AX, UK.
9. Saito H, Tiede M, Whalen DH, Ménard L. The effect of native language and bilingualism on multimodal perception in speech: A study of audio-aerotactile integration. J Acoust Soc Am 2024; 155:2209-2220. [PMID: 38526052] [PMCID: PMC10965246] [DOI: 10.1121/10.0025381]
Abstract
Previous studies of speech perception revealed that tactile sensation can be integrated into the perception of stop consonants. It remains uncertain whether such multisensory integration can be shaped by linguistic experience, such as the listener's native language(s). This study investigates audio-aerotactile integration in phoneme perception for English and French monolinguals as well as English-French bilingual listeners. Six-step voice onset time continua of alveolar (/da/-/ta/) and labial (/ba/-/pa/) stops constructed from both English and French end points were presented to listeners who performed a forced-choice identification task. Air puffs were synchronized to syllable onset and randomly applied to the back of the hand. Results show that stimuli with an air puff elicited more "voiceless" responses for the /da/-/ta/ continuum by both English and French listeners. This suggests that audio-aerotactile integration can occur even though the French listeners did not have an aspiration/non-aspiration contrast in their native language. Furthermore, bilingual speakers showed larger air puff effects compared to monolinguals in both languages, perhaps due to bilinguals' heightened receptiveness to multimodal information in speech.
Affiliation(s)
- Haruka Saito
- Département de Linguistique, Université du Québec à Montréal, Montréal, Québec H2L 2C5, Canada
- Mark Tiede
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut 06520, USA
- D H Whalen
- The Graduate Center, City University of New York (CUNY), New York, New York 10016, USA
- Yale Child Study Center, New Haven, Connecticut 06520, USA
- Lucie Ménard
- Département de Linguistique, Université du Québec à Montréal, Montréal, Québec H2L 2C5, Canada
10. ter Bekke M, Drijvers L, Holler J. Gestures speed up responses to questions. Lang Cogn Neurosci 2024; 39:423-430. [PMID: 38812611] [PMCID: PMC11132552] [DOI: 10.1080/23273798.2024.2314021]
Abstract
Most language use occurs in face-to-face conversation, which involves rapid turn-taking. Seeing communicative bodily signals in addition to hearing speech may facilitate such fast responding. We tested whether this holds for co-speech hand gestures by investigating whether these gestures speed up button press responses to questions. Sixty native speakers of Dutch viewed videos in which an actress asked yes/no-questions, either with or without a corresponding iconic hand gesture. Participants answered the questions as quickly and accurately as possible via button press. Gestures did not impact response accuracy, but crucially, gestures sped up responses, suggesting that response planning may be finished earlier when gestures are seen. How much gestures sped up responses was not related to their timing in the question or their timing with respect to the corresponding information in speech. Overall, these results are in line with the idea that multimodality may facilitate fast responding during face-to-face conversation.
Affiliation(s)
- Marlijn ter Bekke
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
- Linda Drijvers
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
- Judith Holler
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
11. Titus A, Dijkstra T, Willems RM, Peeters D. Beyond the tried and true: How virtual reality, dialog setups, and a focus on multimodality can take bilingual language production research forward. Neuropsychologia 2024; 193:108764. [PMID: 38141963] [DOI: 10.1016/j.neuropsychologia.2023.108764]
Abstract
Bilinguals possess the ability to express themselves in more than one language, and typically do so in contextually rich and dynamic settings. Theories and models have indeed long considered context factors to affect bilingual language production in many ways. However, most experimental studies in this domain have failed to fully incorporate linguistic, social, or physical context aspects, let alone combine them in the same study. Indeed, most experimental psycholinguistic research has taken place in isolated and constrained lab settings with carefully selected words or sentences, rather than under rich and naturalistic conditions. We argue that the most influential experimental paradigms in the psycholinguistic study of bilingual language production fall short of capturing the effects of context on language processing and control presupposed by prominent models. This paper therefore aims to enrich the methodological basis for investigating context aspects in current experimental paradigms and thereby move the field of bilingual language production research forward theoretically. After considering extensions of existing paradigms proposed to address context effects, we present three far-ranging innovative proposals, focusing on virtual reality, dialog situations, and multimodality in the context of bilingual language production.
Affiliation(s)
- Alex Titus
- Radboud University, Centre for Language Studies, Nijmegen, the Netherlands; Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands.
- Ton Dijkstra
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, the Netherlands
- Roel M Willems
- Radboud University, Centre for Language Studies, Nijmegen, the Netherlands
- David Peeters
- Tilburg University, Department of Communication and Cognition, TiCC, Tilburg, the Netherlands
12. Pishghadam R, Shayesteh S, Daneshvarfard F, Boustani N, Seyednozadi Z, Zabetipour M, Pishghadam M. Cognition-Emotion Interaction during L2 Sentence Comprehension: The Correlation of ERP and GSR Responses to Sense Combinations. J Psycholinguist Res 2024; 53:7. [PMID: 38281286] [DOI: 10.1007/s10936-024-10039-y]
Abstract
This study mainly examined the role of the combination of three senses (i.e., auditory, visual, and tactile) and five senses (i.e., auditory, visual, tactile, olfactory, and gustatory) in the correlation between electrophysiological and electrodermal responses underlying second language (L2) sentence comprehension. Forty subjects completed two acceptability judgment tasks encompassing congruent and semantically/pragmatically incongruent sentences. The event-related potential (ERP) and galvanic skin response (GSR) data for both the target and final words of the sentences were collected and analyzed. The results revealed an interaction between cognitive and emotional responses in both semantically and pragmatically incongruent sentences, yet the interaction unfolds over a longer time window in sentences with pragmatic incongruity due to their complexity. Based on the ERP and GSR correlation results, it was further found that the five-sense combination approach improves L2 sentence comprehension and interest in learning materials yet reduces the level of excitement or arousal. While this approach might be beneficial for some learners, it might be detrimental for those who prefer stimulating learning environments.
Affiliation(s)
- Reza Pishghadam
- Faculty of Letters and Humanities, Ferdowsi University of Mashhad, Azadi Square, Mashhad, Khorasan-e-Razavi, Iran
- Shaghayegh Shayesteh
- Faculty of Letters and Humanities, Ferdowsi University of Mashhad, Azadi Square, Mashhad, Khorasan-e-Razavi, Iran.
- Farveh Daneshvarfard
- Faculty of Letters and Humanities, Ferdowsi University of Mashhad, Azadi Square, Mashhad, Khorasan-e-Razavi, Iran
- Nasim Boustani
- Faculty of Letters and Humanities, Ferdowsi University of Mashhad, Azadi Square, Mashhad, Khorasan-e-Razavi, Iran
- Zahra Seyednozadi
- Faculty of Letters and Humanities, Ferdowsi University of Mashhad, Azadi Square, Mashhad, Khorasan-e-Razavi, Iran
- Mohammad Zabetipour
- Faculty of Letters and Humanities, Ferdowsi University of Mashhad, Azadi Square, Mashhad, Khorasan-e-Razavi, Iran
- Morteza Pishghadam
- Faculty of Letters and Humanities, Ferdowsi University of Mashhad, Azadi Square, Mashhad, Khorasan-e-Razavi, Iran
13. Trujillo JP, Holler J. Conversational facial signals combine into compositional meanings that change the interpretation of speaker intentions. Sci Rep 2024; 14:2286. [PMID: 38280963] [PMCID: PMC10821935] [DOI: 10.1038/s41598-024-52589-0]
Abstract
Human language is extremely versatile, combining a limited set of signals in an unlimited number of ways. However, it is unknown whether conversational visual signals feed into the composite utterances with which speakers communicate their intentions. We assessed whether different combinations of visual signals lead to different intent interpretations of the same spoken utterance. Participants viewed a virtual avatar uttering spoken questions while producing single visual signals (i.e., head turn, head tilt, eyebrow raise) or combinations of these signals. After each video, participants classified the communicative intention behind the question. We found that composite utterances combining several visual signals conveyed different meanings compared to utterances accompanied by the single visual signals. However, responses to combinations of signals were more similar to the responses to related, rather than unrelated, individual signals, indicating a consistent influence of the individual visual signals on the whole. This study therefore provides the first evidence for compositional, non-additive (i.e., Gestalt-like) perception of multimodal language.
Affiliation(s)
- James P Trujillo
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, The Netherlands.
- Judith Holler
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, The Netherlands
14. Ter Bekke M, Drijvers L, Holler J. Hand Gestures Have Predictive Potential During Conversation: An Investigation of the Timing of Gestures in Relation to Speech. Cogn Sci 2024; 48:e13407. [PMID: 38279899] [DOI: 10.1111/cogs.13407]
Abstract
During face-to-face conversation, transitions between speaker turns are incredibly fast. These fast turn exchanges seem to involve next speakers predicting upcoming semantic information, such that next turn planning can begin before a current turn is complete. Given that face-to-face conversation also involves the use of communicative bodily signals, an important question is how bodily signals such as co-speech hand gestures play into these processes of prediction and fast responding. In this corpus study, we found that hand gestures that depict or refer to semantic information started before the corresponding information in speech, which held both for the onset of the gesture as a whole, as well as the onset of the stroke (the most meaningful part of the gesture). This early timing potentially allows listeners to use the gestural information to predict the corresponding semantic information to be conveyed in speech. Moreover, we provided further evidence that questions with gestures got faster responses than questions without gestures. However, we found no evidence for the idea that how much a gesture precedes its lexical affiliate (i.e., its predictive potential) relates to how fast responses were given. The findings presented here highlight the importance of the temporal relation between speech and gesture and help to illuminate the potential mechanisms underpinning multimodal language processing during face-to-face conversation.
Affiliation(s)
- Marlijn Ter Bekke
- Donders Institute for Brain, Cognition and Behaviour, Radboud University
- Max Planck Institute for Psycholinguistics
- Linda Drijvers
- Donders Institute for Brain, Cognition and Behaviour, Radboud University
- Max Planck Institute for Psycholinguistics
- Judith Holler
- Donders Institute for Brain, Cognition and Behaviour, Radboud University
- Max Planck Institute for Psycholinguistics
15. Huizeling E, Alday PM, Peeters D, Hagoort P. Combining EEG and 3D-eye-tracking to study the prediction of upcoming speech in naturalistic virtual environments: A proof of principle. Neuropsychologia 2023; 191:108730. [PMID: 37939871] [DOI: 10.1016/j.neuropsychologia.2023.108730]
Abstract
EEG and eye-tracking provide complementary information when investigating language comprehension. Evidence that speech processing may be facilitated by speech prediction comes from the observation that a listener's eye gaze moves towards a referent before it is mentioned if the remainder of the spoken sentence is predictable. However, changes to the trajectory of anticipatory fixations could result from a change in prediction or an attention shift. Conversely, N400 amplitudes and concurrent spectral power provide information about the ease of word processing the moment the word is perceived. In a proof-of-principle investigation, we combined EEG and eye-tracking to study linguistic prediction in naturalistic, virtual environments. We observed increased processing, reflected in theta band power, either during verb processing - when the verb was predictive of the noun - or during noun processing - when the verb was not predictive of the noun. Alpha power was higher in response to the predictive verb and unpredictable nouns. We replicated typical effects of noun congruence but not predictability on the N400 in response to the noun. Thus, the rich visual context that accompanied speech in virtual reality influenced language processing compared to previous reports, where the visual context may have facilitated processing of unpredictable nouns. Finally, anticipatory fixations were predictive of spectral power during noun processing and the length of time fixating the target could be predicted by spectral power at verb onset, conditional on the object having been fixated. Overall, we show that combining EEG and eye-tracking provides a promising new method to answer novel research questions about the prediction of upcoming linguistic input, for example, regarding the role of extralinguistic cues in prediction during language comprehension.
Affiliation(s)
- Eleanor Huizeling
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands.
- David Peeters
- Department of Communication and Cognition, TiCC, Tilburg University, Tilburg, the Netherlands
- Peter Hagoort
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands; Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, the Netherlands
16. Olson HA, Chen EM, Lydic KO, Saxe RR. Left-Hemisphere Cortical Language Regions Respond Equally to Observed Dialogue and Monologue. Neurobiol Lang (Camb) 2023; 4:575-610. [PMID: 38144236] [PMCID: PMC10745132] [DOI: 10.1162/nol_a_00123]
Abstract
Much of the language we encounter in our everyday lives comes in the form of conversation, yet the majority of research on the neural basis of language comprehension has used input from only one speaker at a time. Twenty adults were scanned with functional magnetic resonance imaging while passively observing audiovisual conversations. In a block-design task, participants watched 20 s videos of puppets speaking either to another puppet (the dialogue condition) or directly to the viewer (the monologue condition), while the audio was either comprehensible (played forward) or incomprehensible (played backward). Individually functionally localized left-hemisphere language regions responded more to comprehensible than incomprehensible speech but did not respond differently to dialogue than monologue. In a second task, participants watched videos (1-3 min each) of two puppets conversing with each other, in which one puppet was comprehensible while the other's speech was reversed. All participants saw the same visual input but were randomly assigned which character's speech was comprehensible. In left-hemisphere cortical language regions, the time course of activity was correlated only among participants who heard the same character speaking comprehensibly, despite identical visual input across all participants. For comparison, some individually localized theory of mind regions and right-hemisphere homologues of language regions responded more to dialogue than monologue in the first task, and in the second task, activity in some regions was correlated across all participants regardless of which character was speaking comprehensibly. Together, these results suggest that canonical left-hemisphere cortical language regions are not sensitive to differences between observed dialogue and monologue.
17. Nota N, Trujillo JP, Jacobs V, Holler J. Facilitating question identification through natural intensity eyebrow movements in virtual avatars. Sci Rep 2023; 13:21295. [PMID: 38042876] [PMCID: PMC10693605] [DOI: 10.1038/s41598-023-48586-4]
Abstract
In conversation, recognizing social actions (similar to 'speech acts') early is important to quickly understand the speaker's intended message and to provide a fast response. Fast turns are typical for fundamental social actions like questions, since a long gap can indicate a dispreferred response. In multimodal face-to-face interaction, visual signals may contribute to this fast dynamic. The face is an important source of visual signalling, and previous research found that prevalent facial signals such as eyebrow movements facilitate the rapid recognition of questions. We aimed to investigate whether early eyebrow movements with natural movement intensities facilitate question identification, and whether specific intensities are more helpful in detecting questions. Participants were instructed to view videos of avatars where the presence of eyebrow movements (eyebrow frown or raise vs. no eyebrow movement) was manipulated, and to indicate whether the utterance in the video was a question or statement. Results showed higher accuracies for questions with eyebrow frowns, and faster response times for questions with eyebrow frowns and eyebrow raises. No additional effect was observed for the specific movement intensity. This suggests that eyebrow movements that are representative of naturalistic multimodal behaviour facilitate question recognition.
Affiliation(s)
- Naomi Nota
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, The Netherlands.
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
- James P Trujillo
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, The Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Vere Jacobs
- Faculty of Arts, Radboud University, Nijmegen, The Netherlands
- Judith Holler
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, The Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
18. Nota N, Trujillo JP, Holler J. Conversational Eyebrow Frowns Facilitate Question Identification: An Online Study Using Virtual Avatars. Cogn Sci 2023; 47:e13392. [PMID: 38058215] [DOI: 10.1111/cogs.13392]
Abstract
Conversation is a time-pressured environment. Recognizing a social action (the "speech act," such as a question requesting information) early is crucial in conversation to quickly understand the intended message and plan a timely response. Fast turns between interlocutors are especially relevant for responses to questions since a long gap may be meaningful by itself. Human language is multimodal, involving speech as well as visual signals from the body, including the face. But little is known about how conversational facial signals contribute to the communication of social actions. Some of the most prominent facial signals in conversation are eyebrow movements. Previous studies found links between eyebrow movements and questions, suggesting that these facial signals could contribute to the rapid recognition of questions. Therefore, we aimed to investigate whether early eyebrow movements (eyebrow frown or raise vs. no eyebrow movement) facilitate question identification. Participants were instructed to view videos of avatars where the presence of eyebrow movements accompanying questions was manipulated. Their task was to indicate whether the utterance was a question or a statement as accurately and quickly as possible. Data were collected using the online testing platform Gorilla. Results showed higher accuracies and faster response times for questions with eyebrow frowns, suggesting a facilitative role of eyebrow frowns for question identification. This means that facial signals can critically contribute to the communication of social actions in conversation by signaling social action-specific visual information and providing visual cues to speakers' intentions.
Affiliation(s)
- Naomi Nota
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen
- Max Planck Institute for Psycholinguistics, Nijmegen
- James P Trujillo
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen
- Max Planck Institute for Psycholinguistics, Nijmegen
- Judith Holler
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen
- Max Planck Institute for Psycholinguistics, Nijmegen
19. Zhang Y, Ding R, Frassinelli D, Tuomainen J, Klavinskis-Whiting S, Vigliocco G. The role of multimodal cues in second language comprehension. Sci Rep 2023; 13:20824. [PMID: 38012193] [PMCID: PMC10682458] [DOI: 10.1038/s41598-023-47643-2]
Abstract
In face-to-face communication, multimodal cues such as prosody, gestures, and mouth movements can play a crucial role in language processing. While several studies have addressed how these cues contribute to native (L1) language processing, their impact on non-native (L2) comprehension is largely unknown. Comprehension of naturalistic language by L2 comprehenders may be supported by the presence of (at least some) multimodal cues, as these provide correlated and convergent information that may aid linguistic processing. However, it is also the case that multimodal cues may be less used by L2 comprehenders because linguistic processing is more demanding than for L1 comprehenders, leaving more limited resources for the processing of multimodal cues. In this study, we investigated how L2 comprehenders use multimodal cues in naturalistic stimuli (while participants watched videos of a speaker), as measured by electrophysiological responses (N400) to words, and whether there are differences between L1 and L2 comprehenders. We found that prosody, gestures, and informative mouth movements each reduced the N400 in L2, indexing easier comprehension. Nevertheless, L2 participants showed weaker effects for each cue compared to L1 comprehenders, with the exception of meaningful gestures and informative mouth movements. These results show that L2 comprehenders focus on specific multimodal cues - meaningful gestures that support meaningful interpretation and mouth movements that enhance the acoustic signal - while using multimodal cues to a lesser extent than L1 comprehenders overall.
Affiliation(s)
- Ye Zhang
- Experimental Psychology, University College London, London, UK
- Rong Ding
- Language and Computation in Neural Systems, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Diego Frassinelli
- Department of Linguistics, University of Konstanz, Konstanz, Germany
- Jyrki Tuomainen
- Speech, Hearing and Phonetic Sciences, University College London, London, UK
20. Raghavan R, Raviv L, Peeters D. What's your point? Insights from virtual reality on the relation between intention and action in the production of pointing gestures. Cognition 2023; 240:105581. [PMID: 37573692] [DOI: 10.1016/j.cognition.2023.105581]
Abstract
Human communication involves the process of translating intentions into communicative actions. But how exactly do our intentions surface in the visible communicative behavior we display? Here we focus on pointing gestures, a fundamental building block of everyday communication, and investigate whether and how different types of underlying intent modulate the kinematics of the pointing hand and the brain activity preceding the gestural movement. In a dynamic virtual reality environment, participants pointed at a referent to either share attention with their addressee, inform their addressee, or get their addressee to perform an action. Behaviorally, it was observed that these different underlying intentions modulated how long participants kept their arm and finger still, both prior to starting the movement and when keeping their pointing hand in apex position. In early planning stages, a neurophysiological distinction was observed between a gesture that is used to share attitudes and knowledge with another person versus a gesture that mainly uses that person as a means to perform an action. Together, these findings suggest that our intentions influence our actions from the earliest neurophysiological planning stages to the kinematic endpoint of the movement itself.
Affiliation(s)
- Renuka Raghavan
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; Radboud University, Donders Institute for Brain, Cognition, and Behavior, Nijmegen, The Netherlands
- Limor Raviv
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; Centre for Social, Cognitive and Affective Neuroscience (cSCAN), University of Glasgow, United Kingdom
- David Peeters
- Tilburg University, Department of Communication and Cognition, TiCC, Tilburg, The Netherlands.
21. Dideriksen C, Christiansen MH, Dingemanse M, Højmark-Bertelsen M, Johansson C, Tylén K, Fusaroli R. Language-Specific Constraints on Conversation: Evidence from Danish and Norwegian. Cogn Sci 2023; 47:e13387. [PMID: 38009981] [DOI: 10.1111/cogs.13387]
Abstract
Establishing and maintaining mutual understanding in everyday conversations is crucial. To do so, people employ a variety of conversational devices, such as backchannels, repair, and linguistic entrainment. Here, we explore whether the use of conversational devices might be influenced by cross-linguistic differences in the speakers' native language, comparing two matched languages-Danish and Norwegian-differing primarily in their sound structure, with Danish being more opaque, that is, less acoustically distinguished. Across systematically manipulated conversational contexts, we find that processes supporting mutual understanding in conversations vary with external constraints: across different contexts and, crucially, across languages. In accord with our predictions, linguistic entrainment was overall higher in Danish than in Norwegian, while backchannels and repairs presented a more nuanced pattern. These findings are compatible with the hypothesis that native speakers of Danish may compensate for its opaque sound structure by adopting a top-down strategy of building more conversational redundancy through entrainment, which also might reduce the need for repairs. These results suggest that linguistic differences might be met by systematic changes in language processing and use. This paves the way for further cross-linguistic investigations and critical assessment of the interplay between cultural and linguistic factors on the one hand and conversational dynamics on the other.
Affiliation(s)
- Morten H Christiansen
- School of Communication and Culture, Aarhus University
- The Interacting Minds Center, Aarhus University
- Department of Psychology, Cornell University
- Christer Johansson
- Department of Linguistic, Literary and Aesthetic Studies, University of Bergen
- Kristian Tylén
- School of Communication and Culture, Aarhus University
- The Interacting Minds Center, Aarhus University
- Riccardo Fusaroli
- School of Communication and Culture, Aarhus University
- The Interacting Minds Center, Aarhus University
- Linguistic Data Consortium, University of Pennsylvania
22. Öztürk Y, Özyurt G, Turan S, Tufan AE, Akay AP. Emotion dysregulation and social communication problems but not ToM properties may predict obsessive-compulsive disorder symptom severity. Nord J Psychiatry 2023; 77:778-787. [PMID: 37665655] [DOI: 10.1080/08039488.2023.2251953]
Abstract
OBJECTIVE: Studies have shown that theory of mind, emotion regulation and pragmatic abilities are negatively affected in people with obsessive-compulsive disorder (OCD). We aimed to investigate theory of mind (ToM) abilities, social responsiveness, pragmatic language, and emotion regulation skills in children with OCD and to compare them to healthy controls.
METHODS: This study was designed as a single-center, cross-sectional, case-control study. ToM abilities were evaluated via the "Reading the Mind in the Eyes Test" (RMET), "Faces Test", "Faux-Pas Test", "Comprehension Test" and "Unexpected Outcomes Test". Social responsiveness, pragmatic language and emotion regulation were evaluated with the Social Responsiveness Scale (SRS), the Children's Communication Checklist-Second Edition (CCC-2), the Difficulties in Emotion Regulation Scale (DERS) and the Children's Yale-Brown Obsessive-Compulsive Scale (CY-BOCS). Within the study period, we enrolled 85 adolescents (42 with OCD and 43 controls).
RESULTS: The OCD group performed significantly worse than healthy controls on the Faux-Pas and Comprehension tests (p = 0.003 for both). We found statistically significant differences between groups in the goal, strategy and non-acceptance subscales of the DERS (p < 0.001, p = 0.006, p = 0.008, respectively) as well as in the total DERS score (p < 0.001). CY-BOCS total scores correlated significantly and negatively with the Comprehension, Faux-Pas and Unexpected Outcomes tests, and positively with CCC-2 total, SRS total and DERS total scores. In regression analysis, the DERS, SRS and CCC-2 scores emerged as significant predictors of the CY-BOCS total score.
CONCLUSION: Addressing ToM, pragmatic, and emotion regulation difficulties when planning the treatment of young people with OCD may contribute to positive outcomes.
Affiliation(s)
- Yusuf Öztürk
- Department of Child and Adolescent Psychiatry, Bolu Abant Izzet Baysal University Medical Faculty, Bolu, Turkey
- Gonca Özyurt
- Department of Child and Adolescent Psychiatry, Izmir Katip Çelebi University Medical Faculty, İzmir, Turkey
- Serkan Turan
- Department of Child and Adolescent Psychiatry, Uludağ University Medical Faculty, Bursa, Turkey
- Ali Evren Tufan
- Department of Child and Adolescent Psychiatry, Bolu Abant Izzet Baysal University Medical Faculty, Bolu, Turkey
- Aynur Pekcanlar Akay
- Department of Child and Adolescent Psychiatry, Dokuz Eylul University Medical Faculty, Izmir, Turkey
23. Scott-Phillips T, Heintz C. Great ape interaction: Ladyginian but not Gricean. Proc Natl Acad Sci U S A 2023; 120:e2300243120. [PMID: 37824522] [PMCID: PMC10589610] [DOI: 10.1073/pnas.2300243120]
Abstract
Nonhuman great apes inform one another in ways that can seem very humanlike. Especially in the gestural domain, their behavior exhibits many similarities with human communication, meeting widely used empirical criteria for intentionality. At the same time, there remain some manifest differences, most obviously the enormous range and scope of human expression. How to account for these similarities and differences in a unified way remains a major challenge. Here, we make a key distinction between the expression of intentions (Ladyginian) and the expression of specifically informative intentions (Gricean), and we situate this distinction within a "special case of" framework for classifying different modes of attention manipulation. We hence describe how the attested tendencies of great ape interaction-for instance, to be dyadic rather than triadic, to be about the here-and-now rather than "displaced," and to have a high degree of perceptual resemblance between form and meaning-are products of its Ladyginian but not Gricean character. We also reinterpret video footage of great ape gesture as Ladyginian but not Gricean, and we distinguish several varieties of meaning that are continuous with one another. We conclude that the evolutionary origins of linguistic meaning lie not in gradual changes in communication systems, but rather in gradual changes in social cognition, and specifically in what modes of attention manipulation are enabled by a species' cognitive phenotype: first Ladyginian and in turn Gricean. The second of these shifts rendered humans, and only humans, "language ready."
Collapse
Affiliation(s)
- Thom Scott-Phillips
- Institute for Logic, Cognition, Language and Information, 20018 Donostia-San Sebastian, Spain
| | - Christophe Heintz
- Department of Cognitive Science, Central European University, A-1100 Vienna, Austria
| |
Collapse
|
24
|
Zhang M, Zhang H, Tang E, Ding H, Zhang Y. Evaluating the Relative Perceptual Salience of Linguistic and Emotional Prosody in Quiet and Noisy Contexts. Behav Sci (Basel) 2023; 13:800. [PMID: 37887450 PMCID: PMC10603920 DOI: 10.3390/bs13100800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2023] [Revised: 09/22/2023] [Accepted: 09/25/2023] [Indexed: 10/28/2023] Open
Abstract
How people recognize linguistic and emotional prosody in different listening conditions is essential for understanding the complex interplay between social context, cognition, and communication. The perception of both lexical tones and emotional prosody depends on prosodic features including pitch, intensity, duration, and voice quality. However, it is unclear which aspect of prosody is perceptually more salient and resistant to noise. This study aimed to investigate the relative perceptual salience of emotional prosody and lexical tone recognition in quiet and in the presence of multi-talker babble noise. Forty young adults randomly sampled from a pool of native Mandarin Chinese speakers with normal hearing listened to monosyllables either with or without background babble noise and completed two identification tasks, one for emotion recognition and the other for lexical tone recognition. Accuracy and speed were recorded and analyzed using generalized linear mixed-effects models. Compared with emotional prosody, lexical tones were more perceptually salient in multi-talker babble noise: native Mandarin Chinese participants identified lexical tones more accurately and quickly than vocal emotions at the same signal-to-noise ratio. Acoustic and cognitive dissimilarities between linguistic prosody and emotional prosody may underlie this pattern, which calls for further exploration of the psychobiological and neurophysiological mechanisms involved.
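To make the analysis pipeline concrete, here is a minimal sketch of such a mixed-effects analysis in Python with statsmodels. The data are synthetic and the column names (subject, task, noise, log_rt) are hypothetical; the authors' actual random-effects structure and link functions may well differ.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic trial-level data standing in for the real dataset:
# 40 subjects x 2 tasks (tone vs. emotion) x 2 conditions (quiet vs. babble).
rng = np.random.default_rng(0)
rows = [
    {"subject": s, "task": task, "noise": noise,
     "log_rt": rng.normal(6.8 + 0.1 * (task == "emotion")
                          + 0.2 * (noise == "babble"), 0.2)}
    for s in range(40) for task in ("tone", "emotion")
    for noise in ("quiet", "babble") for _ in range(10)
]
df = pd.DataFrame(rows)

# Linear mixed-effects model of log reaction time with a random intercept
# per participant; accuracy would analogously be fit with a logistic
# mixed model (a GLMM with a binomial link).
model = smf.mixedlm("log_rt ~ task * noise", data=df, groups=df["subject"])
print(model.fit().summary())
```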
Collapse
Affiliation(s)
- Minyue Zhang
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai 200240, China; (M.Z.); (H.Z.); (E.T.)
| | - Hui Zhang
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai 200240, China; (M.Z.); (H.Z.); (E.T.)
| | - Enze Tang
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai 200240, China; (M.Z.); (H.Z.); (E.T.)
| | - Hongwei Ding
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai 200240, China; (M.Z.); (H.Z.); (E.T.)
| | - Yang Zhang
- Department of Speech-Language-Hearing Sciences and Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, MN 55455, USA
| |
Collapse
|
25
|
Arminen IAT, Heino ASM. Civil inattention-On the sources of relational segregation. FRONTIERS IN SOCIOLOGY 2023; 8:1212090. [PMID: 37731909 PMCID: PMC10508291 DOI: 10.3389/fsoc.2023.1212090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/25/2023] [Accepted: 08/09/2023] [Indexed: 09/22/2023]
Abstract
The article employs ethnomethodological conversation analysis (CA) and experimental video analysis to scrutinize the gaze behavior of urban passersby. We operationalize Goffman's concept of civil inattention to make it an empirical research object with defined boundaries. Video analysis enabled measurement of gaze lengths, establishing what counts as a "normal" gaze within civil inattention and allowing breaches to be accounted for. We also studied the dependence of gazing behavior on the recipient's social appearance by comparing an unmarked condition (the experimenter wearing casual, indistinctive clothes) to marked conditions (the experimenter wearing either a distinct sunhat or an abaya and niqab). Breaches of civil inattention toward marked gaze recipients were 10-fold more frequent than toward unmarked recipients. Furthermore, the analysis points out the commonality of hitherto unknown micro gazes and multiple gazes. Together the findings suggest the existence of subconscious monitoring beneath the public social order, which pre-structures the interaction order and indicates that stigmatization is a source of relational segregation.
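Operationally, deriving gaze lengths from frame-by-frame annotations amounts to run-length encoding. Below is a minimal sketch assuming boolean per-frame gaze labels at a known frame rate; the 1.2 s breach threshold is a placeholder for illustration, not a value reported by the study.

```python
from itertools import groupby

def gaze_lengths(frames, fps=25):
    """Convert per-frame gaze-at-target labels (True/False) into
    a list of gaze episode durations in seconds."""
    return [sum(1 for _ in run) / fps
            for label, run in groupby(frames) if label]

def breaches(frames, fps=25, normal_max_s=1.2):
    """Count gazes exceeding the 'normal' civil-inattention length."""
    return sum(1 for d in gaze_lengths(frames, fps) if d > normal_max_s)

# Example: 25 fps annotation of a passerby's gaze toward the experimenter.
frames = [False] * 10 + [True] * 40 + [False] * 5 + [True] * 3
print(gaze_lengths(frames))  # [1.6, 0.12]
print(breaches(frames))      # 1
```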
Collapse
|
26
|
Trujillo JP, Holler J. Interactionally Embedded Gestalt Principles of Multimodal Human Communication. PERSPECTIVES ON PSYCHOLOGICAL SCIENCE 2023; 18:1136-1159. [PMID: 36634318 PMCID: PMC10475215 DOI: 10.1177/17456916221141422] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/14/2023]
Abstract
Natural human interaction requires us to produce and process many different signals, including speech, hand and head gestures, and facial expressions. These communicative signals, which occur in a variety of temporal relations with each other (e.g., parallel or temporally misaligned), must be rapidly processed as a coherent message by the receiver. In this contribution, we introduce the notion of interactionally embedded, affordance-driven gestalt perception as a framework that can explain how this rapid processing of multimodal signals is achieved as efficiently as it is. We discuss empirical evidence showing how basic principles of gestalt perception can explain some aspects of unimodal phenomena such as verbal language processing and visual scene perception but require additional features to explain multimodal human communication. We propose a framework in which high-level gestalt predictions are continuously updated by incoming sensory input, such as unfolding speech and visual signals. We outline the constituent processes that shape high-level gestalt perception and their role in perceiving relevance and prägnanz. Finally, we provide testable predictions that arise from this multimodal interactionally embedded gestalt-perception framework. This review and framework therefore provide a theoretically motivated account of how we may understand the highly complex, multimodal behaviors inherent in natural social interaction.
Collapse
Affiliation(s)
- James P. Trujillo
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, the Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
| | - Judith Holler
- Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, the Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
| |
Collapse
|
27
|
Nicoras R, Gotowiec S, Hadley LV, Smeds K, Naylor G. Conversation success in one-to-one and group conversation: a group concept mapping study of adults with normal and impaired hearing. Int J Audiol 2023; 62:868-876. [PMID: 35875851 DOI: 10.1080/14992027.2022.2095538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2022] [Revised: 06/22/2022] [Accepted: 06/24/2022] [Indexed: 11/05/2022]
Abstract
OBJECTIVE The concept of conversation success is undefined, although prior work has variously related it to accurate exchange of information, alignment between interlocutors, and good management of misunderstandings. This study aimed (1) to identify factors of conversation success and (2) to explore the importance of these factors in one-to-one versus group conversations. DESIGN The group concept mapping method was applied. Participants responded to two brainstorming prompts ("What does 'successful conversation' look like?" and "Think about a successful conversation you have taken part in. What aspects of that conversation contributed to its success?"). The resulting statements were sorted into related clusters and rated for importance in one-to-one and group conversation. STUDY SAMPLE Thirty-five adults with normal and impaired hearing. RESULTS Seven clusters were identified: (1) Being able to listen easily; (2) Being spoken to in a helpful way; (3) Being engaged and accepted; (4) Sharing information as desired; (5) Perceiving flowing and balanced interaction; (6) Feeling positive emotions; (7) Not having to engage coping mechanisms. Three clusters (1, 2, and 4) were more important in group than in one-to-one conversation. There were no differences by hearing group. CONCLUSIONS These findings emphasise that conversation success is a multifaceted concept.
Collapse
Affiliation(s)
- Raluca Nicoras
- Hearing Sciences - Scottish Section, School of Medicine, University of Nottingham, Nottingham, UK
| | | | - Lauren V Hadley
- Hearing Sciences - Scottish Section, School of Medicine, University of Nottingham, Nottingham, UK
| | - Karolina Smeds
- Hearing Sciences - Scottish Section, School of Medicine, University of Nottingham, Nottingham, UK
- ORCA Europe, WS Audiology, Stockholm, Sweden
| | - Graham Naylor
- Hearing Sciences - Scottish Section, School of Medicine, University of Nottingham, Nottingham, UK
| |
Collapse
|
28
|
Holmlund TB, Chandler C, Foltz PW, Diaz-Asper C, Cohen AS, Rodriguez Z, Elvevåg B. Towards a temporospatial framework for measurements of disorganization in speech using semantic vectors. Schizophr Res 2023; 259:71-79. [PMID: 36372683 DOI: 10.1016/j.schres.2022.09.020] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Revised: 09/05/2022] [Accepted: 09/06/2022] [Indexed: 11/11/2022]
Abstract
Incoherent speech in schizophrenia has long been described as the mind making "leaps" of large distances between thoughts and ideas. Such a view seems intuitive, and for almost two decades, attempts to operationalize these conceptual "leaps" in spoken word meanings have used language-based embedding spaces. An embedding space represents meaning of words as numerical vectors where a greater proximity between word vectors represents more shared meaning. However, there are limitations with word vector-based operationalizations of coherence which can limit their appeal and utility in clinical practice. First, the use of esoteric word embeddings can be conceptually hard to grasp, and this is complicated by several different operationalizations of incoherent speech. This problem can be overcome by a better visualization of methods. Second, temporal information from the act of speaking has been largely neglected since models have been built using written text, yet speech is spoken in real time. This issue can be resolved by leveraging time stamped transcripts of speech. Third, contextual information - namely the situation of where something is spoken - has often only been inferred and never explicitly modeled. Addressing this situational issue opens up new possibilities for models with increased temporal resolution and contextual relevance. In this paper, direct visualizations of semantic distances are used to enable the inspection of examples of incoherent speech. Some common operationalizations of incoherence are illustrated, and suggestions are made for how temporal and spatial contextual information can be integrated in future implementations of measures of incoherence.
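The word-vector operationalization sketched above is straightforward to state in code. The following is a minimal illustration, assuming a hypothetical embed() function that returns a vector for a word and a time-stamped transcript; the windowing scheme is illustrative, not any published implementation.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def first_order_coherence(words, embed):
    """Mean similarity between consecutive word vectors:
    low values correspond to large semantic 'leaps'."""
    sims = [cosine(embed(w1), embed(w2)) for w1, w2 in zip(words, words[1:])]
    return sum(sims) / len(sims) if sims else float("nan")

def windowed_coherence(transcript, embed, window_s=2.0):
    """Time-based variant over (word, onset_seconds) pairs, so that
    pauses and speech rate can inform the measure."""
    scores = []
    for (w1, t1), (w2, t2) in zip(transcript, transcript[1:]):
        if t2 - t1 <= window_s:  # only compare words spoken close in time
            scores.append(cosine(embed(w1), embed(w2)))
    return sum(scores) / len(scores) if scores else float("nan")
```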
Collapse
Affiliation(s)
- Terje B Holmlund
- Department of Clinical Medicine, University of Tromsø - the Arctic University of Norway, Tromsø, Norway.
| | - Chelsea Chandler
- Institute of Cognitive Science, University of Colorado Boulder, United States of America
| | - Peter W Foltz
- Institute of Cognitive Science, University of Colorado Boulder, United States of America
| | | | - Alex S Cohen
- Department of Psychology, Louisiana State University, United States of America; Center for Computation and Technology, Louisiana State University, United States of America
| | - Zachary Rodriguez
- Department of Psychology, Louisiana State University, United States of America; Center for Computation and Technology, Louisiana State University, United States of America
| | - Brita Elvevåg
- Department of Clinical Medicine, University of Tromsø - the Arctic University of Norway, Tromsø, Norway; Norwegian Center for eHealth Research, University Hospital of North Norway, Tromsø, Norway
| |
Collapse
|
29
|
Yang H, Xie L, Pan H, Li C, Wang Z, Zhong J. Multimodal Attention Dynamic Fusion Network for Facial Micro-Expression Recognition. ENTROPY (BASEL, SWITZERLAND) 2023; 25:1246. [PMID: 37761545 PMCID: PMC10528512 DOI: 10.3390/e25091246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/29/2023] [Revised: 08/07/2023] [Accepted: 08/15/2023] [Indexed: 09/29/2023]
Abstract
The emotional changes in facial micro-expressions are combinations of action units. Research has shown that action units can serve as auxiliary data to improve facial micro-expression recognition. Most existing work attempts to fuse image features with action unit information, but ignores the impact of action units on the facial image feature extraction process itself. Therefore, this paper proposes a local detail feature enhancement model based on a multimodal attention dynamic fusion network (MADFN) for micro-expression recognition. The method uses a masked autoencoder based on learnable class tokens to remove local areas with low emotional expressiveness in micro-expression images. An action unit dynamic fusion module then fuses action unit representations to improve the representational capacity of the image features. The proposed model is evaluated on the SMIC, CASME II and SAMM datasets and on their combination, 3DB-Combined. It achieved competitive accuracy rates of 81.71%, 82.11%, and 77.21% on SMIC, CASME II, and SAMM, respectively, showing that MADFN can improve the discrimination of emotional features in facial images.
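The paper's exact architecture is not reproduced here, but the core idea of attention-based fusion of action unit information into image features can be sketched as follows. This is a toy PyTorch sketch; the dimensions, module layout, and residual design are illustrative assumptions, not the published MADFN model.

```python
import torch
import torch.nn as nn

class AUFusion(nn.Module):
    """Toy action-unit fusion block: image patch features attend to
    action unit embeddings, and the attended AU context is added back."""
    def __init__(self, dim=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, patch_feats, au_embeds):
        # patch_feats: (batch, n_patches, dim) from the image encoder
        # au_embeds:   (batch, n_aus, dim) encoding detected action units
        ctx, _ = self.attn(query=patch_feats, key=au_embeds, value=au_embeds)
        return self.norm(patch_feats + ctx)  # residual fusion

fused = AUFusion()(torch.randn(2, 49, 256), torch.randn(2, 17, 256))
print(fused.shape)  # torch.Size([2, 49, 256])
```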
Collapse
Affiliation(s)
- Hongling Yang
- Department of Computer Science, Changzhi University, Changzhi 046011, China;
| | - Lun Xie
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; (L.X.); (C.L.); (Z.W.)
| | - Hang Pan
- Department of Computer Science, Changzhi University, Changzhi 046011, China;
| | - Chiqin Li
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; (L.X.); (C.L.); (Z.W.)
| | - Zhiliang Wang
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; (L.X.); (C.L.); (Z.W.)
| | - Jialiang Zhong
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031, China;
| |
Collapse
|
30
|
Gabbatore I, Marchetti Guerrini A, Bosco F. The fuzzy boundaries of the social (pragmatic) communication disorder (SPCD): Why the picture is still so confusing? Heliyon 2023; 9:e19062. [PMID: 37664706 PMCID: PMC10468801 DOI: 10.1016/j.heliyon.2023.e19062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Revised: 08/03/2023] [Accepted: 08/09/2023] [Indexed: 09/05/2023] Open
Abstract
Introduction Since the introduction of Social (Pragmatic) Communication Disorder (SPCD) in the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5) in 2013, a debate has arisen in the scientific community about its usefulness in differential diagnosis with respect to other clinical categories such as Autism Spectrum Disorder (ASD) and Specific Language Impairment (SLI). Indeed, the SPCD criteria share with these diagnostic entities a common deficit in communication and pragmatic skills. Available assessment tools seem scarce and not sensitive enough to clarify the diagnostic criteria and clinical boundaries. This study aims to review the existing literature on diagnostic screening for SPCD to highlight confounding variables in the domains examined, overlap with other diagnostic entities, and the lack of specificity of available assessment tools in identifying the core deficits of the disorder. Methods The search strategy was defined by combining the following keywords: "social pragmatic communication disorder," "DSM-5," "differential diagnosis," and "child." The search was performed in three databases: Medline (PubMed), Scopus, and Web of Science. All studies published between 2013 and April 2023, written in English, and with a major focus on SPCD were included in the review. Results After screening for eligibility, 18 studies were included in the review. Most of these studies aimed to investigate the differential diagnosis between SPCD and other diagnostic categories (e.g., specific language impairment and autism spectrum disorder). Of these, only 6 were ad hoc experimental studies, while the others were based on previously collected databases. Conclusions SPCD seems to have its own peculiarities and characteristics, indicating its clinical relevance, as emphasized by the DSM-5. However, the lack of specific instruments and a number of confounding variables make it difficult to identify and differentiate SPCD from other diagnostic entities. Further research is needed to overcome the scarcity of specific clinical instruments and of empirical studies.
Collapse
Affiliation(s)
- I. Gabbatore
- Department of Psychology, GIPSI Research Group, University of Turin, Italy
| | - A. Marchetti Guerrini
- Department of Psychology, GIPSI Research Group, University of Turin, Italy
- Associazione La Nostra Famiglia – IRCCS Eugenio Medea, Bosisio Parini, Italy
| | - F.M. Bosco
- Department of Psychology, GIPSI Research Group, University of Turin, Italy
- Centro Interdipartimentale di Studi Avanzati di Neuroscienze – NIT, University of Turin, Turin, Italy
| |
Collapse
|
31
|
Pan H, Yang H, Xie L, Wang Z. Multi-scale fusion visual attention network for facial micro-expression recognition. Front Neurosci 2023; 17:1216181. [PMID: 37575295 PMCID: PMC10412924 DOI: 10.3389/fnins.2023.1216181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2023] [Accepted: 06/26/2023] [Indexed: 08/15/2023] Open
Abstract
Introduction Micro-expressions are subtle facial muscle movements that reflect concealed genuine emotions. In response to the challenge posed by the low intensity of micro-expressions, recent studies have attempted to locate localized areas of facial muscle movement. However, this ignores the feature redundancy caused by inaccurate localization of the regions of interest. Methods This paper proposes a novel multi-scale fusion visual attention network (MFVAN) that learns multi-scale local attention weights to mask regions of redundant features. Specifically, the model extracts multi-scale features of the apex frame in micro-expression video clips with convolutional neural networks. The attention mechanism weights local region features in the multi-scale feature maps. We then mask redundant regions in the multi-scale features and fuse the local features with high attention weights for micro-expression recognition. Self-supervision and transfer learning reduce the influence of individual identity attributes and increase the robustness of the multi-scale feature maps. Finally, the multi-scale classification loss, the mask loss, and an identity-attribute-removal loss are jointly optimized. Results The proposed MFVAN method is evaluated on the SMIC, CASME II, SAMM, and 3DB-Combined datasets, where it achieves state-of-the-art performance. The experimental results show that focusing on local regions at multiple scales contributes to micro-expression recognition. Discussion The proposed MFVAN model is the first to combine image generation with visual attention mechanisms to address the combined challenge of individual identity attribute interference and low-intensity facial muscle movements. The model also reveals the impact of individual attributes on the localization of local regions of interest.
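The attention-weighted masking idea can likewise be illustrated with a toy sketch (PyTorch). The 1x1 attention head, the keep ratio, and the pooling scheme below are assumptions for illustration, not the published MFVAN design.

```python
import torch
import torch.nn as nn

class ScaleAttentionMask(nn.Module):
    """Toy attention-weighted masking at one scale: locations with
    low attention are zeroed out before pooling."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # 1x1 attention head

    def forward(self, feats, keep_ratio=0.5):
        b, c, h, w = feats.shape
        attn = torch.sigmoid(self.score(feats))             # (b, 1, h, w)
        k = int(h * w * keep_ratio)
        thresh = attn.flatten(2).topk(k, dim=2).values[..., -1, None, None]
        mask = (attn >= thresh).float()                     # keep top-k locations
        return (feats * mask * attn).mean(dim=(2, 3))       # pooled local features

# Fuse two scales of a hypothetical backbone's feature maps.
f1, f2 = torch.randn(2, 64, 28, 28), torch.randn(2, 64, 14, 14)
block = ScaleAttentionMask(64)
fused = torch.cat([block(f1), block(f2)], dim=1)  # (2, 128) for classification
print(fused.shape)
```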
Collapse
Affiliation(s)
- Hang Pan
- Department of Computer Science, Changzhi University, Changzhi, China
| | - Hongling Yang
- Department of Computer Science, Changzhi University, Changzhi, China
| | - Lun Xie
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
| | - Zhiliang Wang
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
| |
Collapse
|
32
|
Winter B, Marghetis T. Multimodality matters in numerical communication. Front Psychol 2023; 14:1130777. [PMID: 37564312 PMCID: PMC10411739 DOI: 10.3389/fpsyg.2023.1130777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2022] [Accepted: 05/10/2023] [Indexed: 08/12/2023] Open
Abstract
Modern society depends on numerical information, which must be communicated accurately and effectively. Numerical communication is accomplished in different modalities - speech, writing, sign, gesture, graphs - and in naturally occurring settings it almost always involves more than one modality at once. Yet the modalities of numerical communication are often studied in isolation. Here we argue that, to understand and improve numerical communication, we must take seriously this multimodality. We first discuss each modality on its own terms, identifying their commonalities and differences. We then argue that numerical communication is shaped critically by interactions among modalities. We boil down these interactions to four types: one modality can amplify the message of another; it can direct attention to content from another modality (e.g., using a gesture to guide attention to a relevant aspect of a graph); it can explain another modality (e.g., verbally explaining the meaning of an axis in a graph); and it can reinterpret a modality (e.g., framing an upwards-oriented trend as a bad outcome). We conclude by discussing how a focus on multimodality raises entirely new research questions about numerical communication.
Collapse
Affiliation(s)
- Bodo Winter
- Department of English Language and Linguistics, University of Birmingham, Birmingham, United Kingdom
| | - Tyler Marghetis
- Cognitive and Information Sciences, University of California, Merced, Merced, CA, United States
| |
Collapse
|
33
|
Nota N, Trujillo JP, Holler J. Specific facial signals associate with categories of social actions conveyed through questions. PLoS One 2023; 18:e0288104. [PMID: 37467253 DOI: 10.1371/journal.pone.0288104] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2022] [Accepted: 06/20/2023] [Indexed: 07/21/2023] Open
Abstract
The early recognition of fundamental social actions, like questions, is crucial for understanding the speaker's intended message and planning a timely response in conversation. Questions themselves may express more than one social action category (e.g., an information request "What time is it?", an invitation "Will you come to my party?" or a criticism "Are you crazy?"). Although human language use occurs predominantly in a multimodal context, prior research on social actions has mainly focused on the verbal modality. This study breaks new ground by investigating how conversational facial signals may map onto the expression of different types of social actions conveyed through questions. The distribution, timing, and temporal organization of facial signals across social actions was analysed in a rich corpus of naturalistic, dyadic face-to-face Dutch conversations. These social actions were: Information Requests, Understanding Checks, Self-Directed questions, Stance or Sentiment questions, Other-Initiated Repairs, Active Participation questions, questions for Structuring, Initiating or Maintaining Conversation, and Plans and Actions questions. This is the first study to reveal differences in distribution and timing of facial signals across different types of social actions. The findings raise the possibility that facial signals may facilitate social action recognition during language processing in multimodal face-to-face interaction.
Collapse
Affiliation(s)
- Naomi Nota
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, The Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| | - James P Trujillo
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, The Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| | - Judith Holler
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, The Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| |
Collapse
|
34
|
Mehler A, Lücking A, Dong T. Editorial: Multimodal communication and multimodal computing. Front Artif Intell 2023; 6:1234920. [PMID: 37441006 PMCID: PMC10335352 DOI: 10.3389/frai.2023.1234920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2023] [Accepted: 06/09/2023] [Indexed: 07/15/2023] Open
Affiliation(s)
- Alexander Mehler
- Text Technology Lab, Goethe-University Frankfurt, Frankfurt, Germany
| | - Andy Lücking
- Text Technology Lab, Goethe-University Frankfurt, Frankfurt, Germany
- Laboratoire de Linguistique Formelle (LLF), Université Paris Cité, Paris, France
| | - Tiansi Dong
- Neurosymbolic Representation Learning Group, Fraunhofer IAIS, Sankt Augustin, Germany
| |
Collapse
|
35
|
Zhao W. TMS reveals a two-stage priming circuit of gesture-speech integration. Front Psychol 2023; 14:1156087. [PMID: 37228338 PMCID: PMC10203497 DOI: 10.3389/fpsyg.2023.1156087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2023] [Accepted: 04/19/2023] [Indexed: 05/27/2023] Open
Abstract
Introduction Naturalistically, multisensory information from gesture and speech is intrinsically integrated to enable coherent comprehension. Such cross-modal semantic integration is temporally misaligned, with the onset of gesture preceding the relevant speech segment. It has been proposed that gestures prime subsequent speech. However, there are unresolved questions regarding the roles and time courses that the two sources of information play in integration. Methods In two between-subject experiments with healthy college students, we segmented the gesture-speech integration period into 40-ms time windows (TWs) based on two separate division criteria, while interrupting the activity of the integration node of the left posterior middle temporal gyrus (pMTG) and the left inferior frontal gyrus (IFG) with double-pulse transcranial magnetic stimulation (TMS). In Experiment 1, we created fixed time-advances of gesture over speech and divided the TWs from the onset of speech. In Experiment 2, we differentiated the processing stages of gesture and speech and segmented the TWs relative to the speech lexical identification point (IP), while speech onset occurred at the gesture semantic discrimination point (DP). Results The results showed a TW-selective interruption of the pMTG and IFG only in Experiment 2, with the pMTG involved in TW1 (-120 ~ -80 ms relative to the speech IP), TW2 (-80 ~ -40 ms), TW6 (80 ~ 120 ms) and TW7 (120 ~ 160 ms), and the IFG involved in TW3 (-40 ~ 0 ms) and TW6. Meanwhile, no significant disruption of gesture-speech integration was observed in Experiment 1. Discussion We determined that after the representation of gesture has been established, gesture-speech integration occurs such that speech is first primed in a phonological processing stage before gestures are unified with speech to form a coherent meaning. Our findings provide new insights into multisensory speech and co-speech gesture integration by tracking the causal contributions of the two sources of information.
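The window segmentation itself is simple arithmetic. A minimal sketch of how a time point relative to the speech identification point maps onto the seven 40-ms windows of Experiment 2 (the labels follow the abstract; the edge handling is an assumption):

```python
def time_window(t_ms, ip_ms=0, width=40, start=-120, n_windows=7):
    """Map a time point (ms, relative to the speech lexical identification
    point) to a TW index: TW1 covers -120..-80 ms, TW7 covers 120..160 ms.
    Returns None outside the covered range."""
    idx = (t_ms - ip_ms - start) // width
    return int(idx) + 1 if 0 <= idx < n_windows else None

print(time_window(-100))  # 1 -> TW1 (-120..-80 ms)
print(time_window(-10))   # 3 -> TW3 (-40..0 ms)
print(time_window(100))   # 6 -> TW6 (80..120 ms)
```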
Collapse
|
36
|
Kelly SD, Ngo Tran QA. Exploring the Emotional Functions of Co-Speech Hand Gesture in Language and Communication. Top Cogn Sci 2023. [PMID: 37115518 DOI: 10.1111/tops.12657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2021] [Revised: 04/05/2023] [Accepted: 04/06/2023] [Indexed: 04/29/2023]
Abstract
Research over the past four decades has built a convincing case that co-speech hand gestures play a powerful role in human cognition. However, this recent focus on the cognitive function of gesture has, to a large extent, overlooked its emotional role - a role that was once central to research on bodily expression. In the present review, we first give a brief summary of the wealth of research demonstrating the cognitive function of co-speech gestures in language acquisition, learning, and thinking. Building on this foundation, we revisit the emotional function of gesture across a wide range of communicative contexts, from clinical to artistic to educational, and spanning diverse fields, from cognitive neuroscience to linguistics to affective science. Bridging the cognitive and emotional functions of gesture highlights promising avenues of research that have varied practical and theoretical implications for human-machine interactions, therapeutic interventions, language evolution, embodied cognition, and more.
Collapse
Affiliation(s)
- Spencer D Kelly
- Department of Psychological and Brain Sciences, Center for Language and Brain, Colgate University, 13 Oak Dr., Hamilton, NY, 13346, United States
| | - Quang-Anh Ngo Tran
- Department of Psychological and Brain Sciences, Indiana University, 1101 E. 10th St., Bloomington, IN, 47405, United States
| |
Collapse
|
37
|
Hamilton AFDC, Holler J. Face2face: advancing the science of social interaction. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210470. [PMID: 36871590 PMCID: PMC9985963 DOI: 10.1098/rstb.2021.0470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Accepted: 02/07/2023] [Indexed: 03/07/2023] Open
Abstract
Face-to-face interaction is core to human sociality and its evolution, and provides the environment in which most of human communication occurs. Research into the full complexities that define face-to-face interaction requires a multi-disciplinary, multi-level approach, illuminating from different perspectives how we and other species interact. This theme issue showcases a wide range of approaches, bringing together detailed studies of naturalistic social-interactional behaviour with larger scale analyses for generalization, and investigations of socially contextualized cognitive and neural processes that underpin the behaviour we observe. We suggest that this integrative approach will allow us to propel forward the science of face-to-face interaction by leading us to new paradigms and novel, more ecologically grounded and comprehensive insights into how we interact with one another and with artificial agents, how differences in psychological profiles might affect interaction, and how the capacity to socially interact develops and has evolved in the human and other species. This theme issue takes a first step in this direction, aiming to break down disciplinary boundaries and to emphasize the value of illuminating the many facets of face-to-face interaction. This article is part of a discussion meeting issue 'Face2face: advancing the science of social interaction'.
Collapse
Affiliation(s)
| | - Judith Holler
- Donders Institute for Brain, Cognition & Behaviour, Radboud University, 6525 GD Nijmegen, The Netherlands
- Max Planck Institute for Psycholinguistics, 6525XD Nijmegen, The Netherlands
| |
Collapse
|
38
|
Levinson SC. Gesture, spatial cognition and the evolution of language. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210481. [PMID: 36871589 PMCID: PMC9985965 DOI: 10.1098/rstb.2021.0481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2022] [Accepted: 08/03/2022] [Indexed: 03/07/2023] Open
Abstract
Human communication displays a striking contrast between the diversity of languages and the universality of the principles underlying their use in conversation. Despite the importance of this interactional base, it is not obvious that it heavily imprints the structure of languages. However, a deep-time perspective suggests that early hominin communication was gestural, in line with all the other Hominidae. This gestural phase of early language development seems to have left its traces in the way in which spatial concepts, implemented in the hippocampus, provide organizing principles at the heart of grammar. This article is part of a discussion meeting issue 'Face2face: advancing the science of social interaction'.
Collapse
Affiliation(s)
- Stephen C. Levinson
- Max Planck Institute for Psycholinguistics, Nijmegen, 6525XD, The Netherlands
| |
Collapse
|
39
|
Kuhlen AK, Abdel Rahman R. Beyond speaking: neurocognitive perspectives on language production in social interaction. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210483. [PMID: 36871592 PMCID: PMC9985974 DOI: 10.1098/rstb.2021.0483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2022] [Accepted: 12/16/2022] [Indexed: 03/07/2023] Open
Abstract
The human faculty to speak has evolved, so it has been argued, for communicating with others and for engaging in social interactions. Hence the human cognitive system should be equipped to address the demands that social interaction places on the language production system. These demands include the need to coordinate speaking with listening, the need to integrate one's own (verbal) actions with the interlocutor's actions, and the need to adapt language flexibly to the interlocutor and the social context. In order to meet these demands, core processes of language production are supported by cognitive processes that enable interpersonal coordination and social cognition. To fully understand the cognitive architecture and its neural implementation enabling humans to speak in social interaction, our understanding of how humans produce language needs to be connected to our understanding of how humans gain insights into other people's mental states and coordinate in social interaction. This article reviews theories and neurocognitive experiments that make this connection and can contribute to advancing our understanding of speaking in social interaction. This article is part of a discussion meeting issue 'Face2face: advancing the science of social interaction'.
Collapse
Affiliation(s)
- Anna K. Kuhlen
- Department of Psychology, Humboldt-Universität zu Berlin, 12489 Berlin, Germany
| | - Rasha Abdel Rahman
- Department of Psychology, Humboldt-Universität zu Berlin, 12489 Berlin, Germany
| |
Collapse
|
40
|
Kendrick KH, Holler J, Levinson SC. Turn-taking in human face-to-face interaction is multimodal: gaze direction and manual gestures aid the coordination of turn transitions. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210473. [PMID: 36871587 PMCID: PMC9985971 DOI: 10.1098/rstb.2021.0473] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2022] [Accepted: 01/27/2023] [Indexed: 03/07/2023] Open
Abstract
Human communicative interaction is characterized by rapid and precise turn-taking. This is achieved by an intricate system that has been elucidated in the field of conversation analysis, based largely on the study of the auditory signal. This model suggests that transitions occur at points of possible completion identified in terms of linguistic units. Despite this, considerable evidence exists that visible bodily actions including gaze and gestures also play a role. To reconcile disparate models and observations in the literature, we combine qualitative and quantitative methods to analyse turn-taking in a corpus of multimodal interaction using eye-trackers and multiple cameras. We show that transitions seem to be inhibited when a speaker averts their gaze at a point of possible turn completion, or when a speaker produces gestures which are beginning or unfinished at such points. We further show that while the direction of a speaker's gaze does not affect the speed of transitions, the production of manual gestures does: turns with gestures have faster transitions. Our findings suggest that the coordination of transitions involves not only linguistic resources but also visual gestural ones and that the transition-relevance places in turns are multimodal in nature. This article is part of a discussion meeting issue 'Face2face: advancing the science of social interaction'.
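The central quantitative measure here, the floor-transfer offset between consecutive turns, is easy to compute from turn annotations. Below is a minimal sketch with hypothetical timing data; the per-turn gesture flag and the pairing of each turn with its following transition are simplifications of the study's analysis.

```python
from statistics import median

def transition_offsets(turns):
    """Floor-transfer offsets between consecutive turns (ms):
    negative values indicate overlap, positive values a gap."""
    return [nxt["start"] - cur["end"] for cur, nxt in zip(turns, turns[1:])]

# Hypothetical turn annotations with a per-turn gesture flag.
turns = [
    {"start": 0,    "end": 1800, "gesture": True},
    {"start": 1950, "end": 3100, "gesture": False},
    {"start": 3450, "end": 5000, "gesture": True},
    {"start": 5120, "end": 6000, "gesture": False},
]
offsets = transition_offsets(turns)
with_g = [o for t, o in zip(turns, offsets) if t["gesture"]]
without_g = [o for t, o in zip(turns, offsets) if not t["gesture"]]
print(median(with_g), median(without_g))  # 135.0 vs. 350.0 in this toy data
```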
Collapse
Affiliation(s)
- Kobin H. Kendrick
- Department of Language and Linguistic Science, University of York, York YO10 5DD, UK
| | - Judith Holler
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, Gelderland, The Netherlands
| | - Stephen C. Levinson
- Max Planck Institute for Psycholinguistics, Nijmegen, Gelderland, The Netherlands
| |
Collapse
|
41
|
Meyer AS. Timing in Conversation. J Cogn 2023; 6:20. [PMID: 37033404 PMCID: PMC10077995 DOI: 10.5334/joc.268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2022] [Accepted: 03/17/2023] [Indexed: 04/08/2023] Open
Abstract
Turn-taking in everyday conversation is fast, with median latencies in corpora of conversational speech often reported to be under 300 ms. This seems like magic, given that experimental research on speech planning has shown that speakers need much more time to plan and produce even the shortest of utterances. This paper reviews how language scientists have combined linguistic analyses of conversations and experimental work to understand the skill of swift turn-taking and proposes a tentative solution to the riddle of fast turn-taking.
Collapse
Affiliation(s)
- Antje S. Meyer
- Max Planck Institute for Psycholinguistics and Radboud University, Nijmegen, The Netherlands
| |
Collapse
|
42
|
Fusaroli R, Weed E, Rocca R, Fein D, Naigles L. Caregiver linguistic alignment to autistic and typically developing children: A natural language processing approach illuminates the interactive components of language development. Cognition 2023; 236:105422. [PMID: 36871399 DOI: 10.1016/j.cognition.2023.105422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2021] [Revised: 12/13/2022] [Accepted: 02/23/2023] [Indexed: 03/06/2023]
Abstract
BACKGROUND Language development is a highly interactive activity. However, most research on the linguistic environment has focused on the quantity and complexity of linguistic input to children, with current models showing that complexity facilitates language in both typically developing (TD) and autistic children. AIMS After reviewing existing work on caregiver engagement with children's utterances, we aim to operationalize such engagement with automated measures of linguistic alignment, thereby providing scalable tools to assess caregivers' active reuse of their children's language. By assessing the presence of alignment, its sensitivity to the child's individual differences, and how well it predicts language development beyond current models across the two groups, we showcase the usefulness of the approach and provide initial empirical foundations for further conceptual and empirical investigations. METHODS We measure lexical, syntactic and semantic types of caregiver alignment in a longitudinal corpus involving 32 adult-autistic child and 35 adult-TD child dyads, with children between 2 and 5 years of age. We assess the extent to which caregivers repeat their children's words, syntax, and semantics, and whether these repetitions predict language development beyond more standard predictors. RESULTS Caregivers tend to reuse their child's language in a way that is related to the child's individual, primarily linguistic, differences. Caregivers' alignment provides unique information that improves our ability to predict future language development in both typical and autistic children. CONCLUSIONS We provide evidence that language development also relies on interactive conversational processes that have previously been understudied. We share carefully detailed methods and open-source scripts so that our approach can be systematically extended to new contexts and languages.
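As a concrete illustration of the alignment measures, the sketch below computes a simple lexical alignment score: the proportion of the child's word types that the caregiver reuses in the immediately following utterance. The actual study also models syntactic and semantic alignment and controls for baseline reuse rates.

```python
def lexical_alignment(child_utts, caregiver_utts):
    """Mean proportion of the child's word types reused by the caregiver
    in the following turn. Utterances are pre-tokenized word lists."""
    scores = []
    for child, caregiver in zip(child_utts, caregiver_utts):
        child_types = set(child)
        if child_types:
            reused = child_types & set(caregiver)
            scores.append(len(reused) / len(child_types))
    return sum(scores) / len(scores) if scores else float("nan")

child = [["doggie", "run"], ["want", "ball"]]
caregiver = [["yes", "the", "doggie", "is", "running"],
             ["you", "want", "the", "ball"]]
print(lexical_alignment(child, caregiver))  # (1/2 + 2/2) / 2 = 0.75
```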
Collapse
Affiliation(s)
- Riccardo Fusaroli
- Department of Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Jens Chr Skous vej 2, 8000 Aarhus, Denmark; Interacting Minds Center, School of Culture and Society, Aarhus University, Jens Chr Skous vej 4, 8000 Aarhus, Denmark; Linguistic Data Consortium, University of Pennsylvania, 3600 Market St, Suite 810, Philadelphia, PA 19104-2653, USA.
| | - Ethan Weed
- Department of Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Jens Chr Skous vej 2, 8000 Aarhus, Denmark; Interacting Minds Center, School of Culture and Society, Aarhus University, Jens Chr Skous vej 4, 8000 Aarhus, Denmark
| | - Roberta Rocca
- Department of Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Jens Chr Skous vej 2, 8000 Aarhus, Denmark; Interacting Minds Center, School of Culture and Society, Aarhus University, Jens Chr Skous vej 4, 8000 Aarhus, Denmark
| | - Deborah Fein
- Psychological Sciences, University of Connecticut, 406 Babbidge Road, Unit 1020, Storrs, CT 06269-1020, USA
| | - Letitia Naigles
- Psychological Sciences, University of Connecticut, 406 Babbidge Road, Unit 1020, Storrs, CT 06269-1020, USA
| |
Collapse
|
43
|
Establishing conversational engagement and being effective: The role of body movement in mediated communication. Acta Psychol (Amst) 2023; 233:103840. [PMID: 36681014 DOI: 10.1016/j.actpsy.2023.103840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2022] [Revised: 10/28/2022] [Accepted: 01/16/2023] [Indexed: 01/21/2023] Open
Abstract
A model for investigating the effects of body movement on conversational effectiveness in computer-mediated communication (CMC) is developed based on theories of motor cognition and embodiment. Movement is relevant to a wide range of CMC settings, including remote interviews, court testimonials, instructing, medical consultation, and socializing. The present work allows for a consideration of different forms of motoric activation, including gesturing and full-body motion, in mediated conversational settings and the derivation of a range of testable hypotheses. Motor cognition and embodiment provide an account of how speaker and listener become subject to the consequences of the muscular activation patterns that come with body movement. While movement supports internal elaboration, thus helping the speaker in formulating messages, it also has direct effects on the listener through behavioral synchrony and motor contagion. The effects of movement in CMC environments depend on two general characteristics: the level of visibility of movement and the extent to which the technology facilitates or inhibits movement. Available channels, set-up of technology, and further customization therefore determine whether movement can fulfil its internal functions (relevant to cognitive-affective elaboration of what is being said by the speaker) and its external functions (relevant to what is being perceived by and activated within the listener). Several indicators of conversational effectiveness are identified that serve as outcome variables. This MCEE model is intended to help users, developers, and service providers make CMC more engaging and more meaningful.
Collapse
|
44
|
Electrophysiological evidence for the enhancement of gesture-speech integration by linguistic predictability during multimodal discourse comprehension. COGNITIVE, AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2023; 23:340-353. [PMID: 36823247 PMCID: PMC9949912 DOI: 10.3758/s13415-023-01074-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Accepted: 01/30/2023] [Indexed: 02/25/2023]
Abstract
In face-to-face discourse, listeners exploit cues in the input to generate predictions about upcoming words. Moreover, in addition to speech, speakers produce a multitude of visual signals, such as iconic gestures, which listeners readily integrate with incoming words. Previous studies have shown that processing of target words is facilitated when these are embedded in predictable compared to non-predictable discourses and when accompanied by iconic compared to meaningless gestures. In the present study, we investigated the interaction of both factors. We recorded the electroencephalogram (EEG) from 60 Dutch adults while they watched videos of an actress producing short discourses. The stimuli consisted of an introductory and a target sentence; the latter contained a target noun. Depending on the preceding discourse, the target noun was either predictable or not. Each target noun was paired with an iconic gesture and a gesture that did not convey meaning. In both conditions, gesture presentation in the video was timed such that the gesture stroke preceded the onset of the spoken target by 130 ms. Our ERP analyses revealed independent facilitatory effects for predictable discourses and iconic gestures. However, the interactive effect of both factors demonstrated that target processing (i.e., gesture-speech integration) was facilitated most when targets were part of predictable discourses and accompanied by an iconic gesture. Our results thus suggest a strong intertwinement of linguistic predictability and non-verbal gesture processing, whereby listeners exploit predictive discourse cues to pre-activate verbal and non-verbal representations of upcoming target words.
Collapse
|
45
|
De Felice S, Hamilton AFDC, Ponari M, Vigliocco G. Learning from others is good, with others is better: the role of social interaction in human acquisition of new knowledge. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210357. [PMID: 36571126 PMCID: PMC9791495 DOI: 10.1098/rstb.2021.0357] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open
Abstract
Learning in humans is highly embedded in social interaction: since the very early stages of our lives, we form memories and acquire knowledge about the world from and with others. Yet, within cognitive science and neuroscience, human learning is mainly studied in isolation. The focus of past research in learning has been either exclusively on the learner or (less often) on the teacher, with the primary aim of determining developmental trajectories and/or effective teaching techniques. In fact, social interaction has rarely been explicitly taken as a variable of interest, despite being the medium through which learning occurs, especially in development, but also in adulthood. Here, we review behavioural and neuroimaging research on social human learning, specifically focusing on cognitive models of how we acquire semantic knowledge from and with others, and include both developmental as well as adult work. We then identify potential cognitive mechanisms that support social learning, and their neural correlates. The aim is to outline key new directions for experiments investigating how knowledge is acquired in its ecological niche, i.e. socially, within the framework of the two-person neuroscience approach. This article is part of the theme issue 'Concepts in interaction: social engagement and inner experiences'.
Collapse
Affiliation(s)
- Sara De Felice
- Institute of Cognitive Neuroscience, University College London (UCL), 17–19 Alexandra House Queen Square, London WC1N 3AZ, UK
| | - Antonia F. de C. Hamilton
- Institute of Cognitive Neuroscience, University College London (UCL), 17–19 Alexandra House Queen Square, London WC1N 3AZ, UK
| | - Marta Ponari
- School of Psychology, University of Kent, Canterbury CT2 7NP, UK
| | | |
Collapse
|
46
|
Benetti S, Ferrari A, Pavani F. Multimodal processing in face-to-face interactions: A bridging link between psycholinguistics and sensory neuroscience. Front Hum Neurosci 2023; 17:1108354. [PMID: 36816496 PMCID: PMC9932987 DOI: 10.3389/fnhum.2023.1108354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2022] [Accepted: 01/11/2023] [Indexed: 02/05/2023] Open
Abstract
In face-to-face communication, humans are faced with multiple layers of discontinuous multimodal signals, such as head, face, hand gestures, speech and non-speech sounds, which need to be interpreted as coherent and unified communicative actions. This implies a fundamental computational challenge: optimally binding only signals belonging to the same communicative action while segregating signals that are not connected by the communicative content. How do we achieve such an extraordinary feat, reliably, and efficiently? To address this question, we need to further move the study of human communication beyond speech-centred perspectives and promote a multimodal approach combined with interdisciplinary cooperation. Accordingly, we seek to reconcile two explanatory frameworks recently proposed in psycholinguistics and sensory neuroscience into a neurocognitive model of multimodal face-to-face communication. First, we introduce a psycholinguistic framework that characterises face-to-face communication at three parallel processing levels: multiplex signals, multimodal gestalts and multilevel predictions. Second, we consider the recent proposal of a lateral neural visual pathway specifically dedicated to the dynamic aspects of social perception and reconceive it from a multimodal perspective ("lateral processing pathway"). Third, we reconcile the two frameworks into a neurocognitive model that proposes how multiplex signals, multimodal gestalts, and multilevel predictions may be implemented along the lateral processing pathway. Finally, we advocate a multimodal and multidisciplinary research approach, combining state-of-the-art imaging techniques, computational modelling and artificial intelligence for future empirical testing of our model.
Collapse
Affiliation(s)
- Stefania Benetti
- Centre for Mind/Brain Sciences, University of Trento, Trento, Italy; Interuniversity Research Centre “Cognition, Language, and Deafness”, CIRCLeS, Catania, Italy
| | - Ambra Ferrari
- Max Planck Institute for Psycholinguistics, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, Netherlands
| | - Francesco Pavani
- Centre for Mind/Brain Sciences, University of Trento, Trento, Italy; Interuniversity Research Centre “Cognition, Language, and Deafness”, CIRCLeS, Catania, Italy
| |
Collapse
|
47
|
Colombani A, Saksida A, Pavani F, Orzan E. Symbolic and deictic gestures as a tool to promote parent-child communication in the context of hearing loss: A systematic review. Int J Pediatr Otorhinolaryngol 2023; 165:111421. [PMID: 36669271 DOI: 10.1016/j.ijporl.2022.111421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/17/2022] [Revised: 12/13/2022] [Accepted: 12/17/2022] [Indexed: 12/29/2022]
Abstract
BACKGROUND Language and communication outcomes in children with congenital sensorineural hearing loss (cSNHL) are highly variable, and some of this variance can be attributed to the quantity and quality of language input. In this paper, we build on the evidence that human language is inherently multimodal and that Parent Centered Early Interventions (PCEI) can positively scaffold children's linguistic, cognitive, and social-relational development, to suggest that the use of gestures in these interventions could be a beneficial, yet scarcely explored, approach. AIMS AND METHODS This systematic review aimed to examine the literature on PCEI focused on gestures (symbolic and deictic) used to enhance the caregiver-child relationship and the infant's language development, in both typically and atypically developing populations. The review was conducted following the PRISMA guidelines for systematic reviews and meta-analyses. Of 246 identified studies, 8 met the PICO inclusion criteria and were eligible for inclusion. Two reviewers screened papers before completing data extraction and risk-of-bias assessment using the RoB2 Cochrane scale. RESULTS The included studies measured the effect of implementing symbolic or deictic gestures in daily communication on relational aspects of mother/parent-child interaction or on infants' language skills. The studies indicate that gesture-oriented PCEI may benefit deprived populations such as atypically developing children, children from low-income families, and children who, for individual reasons, lag behind their peers in communication. CONCLUSIONS Although gesture-oriented PCEI appear to be beneficial in early intervention for atypically developing populations, this approach has so far scarcely been explored directly in the context of hearing loss. Yet, given that symbolic gestures are a natural part of early vocabulary acquisition and emerge spontaneously regardless of hearing status, this approach could represent a promising line of intervention for infants with cSNHL, especially those with a worse head start.
Collapse
Affiliation(s)
- Arianna Colombani
- Institute for Maternal and Child Health - IRCCS "Burlo Garofolo" - Trieste, Italy
| | - Amanda Saksida
- Institute for Maternal and Child Health - IRCCS "Burlo Garofolo" - Trieste, Italy.
| | - Francesco Pavani
- Center for Mind/Brain Sciences - CIMeC, University of Trento, Trento, Italy; Centro Interateneo di Ricerca Cognizione, Linguaggio e Sordità (CIRCLeS), University of Trento, Trento, Italy
| | - Eva Orzan
- Institute for Maternal and Child Health - IRCCS "Burlo Garofolo" - Trieste, Italy
| |
Collapse
|
48
|
Gestures and pauses to help thought: hands, voice, and silence in the tourist guide's speech. Cogn Process 2023; 24:25-41. [PMID: 36495353 DOI: 10.1007/s10339-022-01116-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2022] [Accepted: 11/23/2022] [Indexed: 12/14/2022]
Abstract
In the body of research on the relationship between gesture and speech, some models propose that the two form an integrated system, while others attribute to gestures a compensatory role in communication. This study addresses the gesture-speech relationship by taking disfluency phenomena as a case study. Since this study is part of a project aimed at designing virtual agents to be employed in museums, the analysis was performed on the communicative behavior of tourist guides. Results reveal that gesturing is more frequent during speech than during pauses. Moreover, when comparing the types of gestures with the types of pauses they co-occur with, non-communicative gestures (idles and manipulators) turn out to be more frequent during pauses than communicatively meaningful gestures, which instead co-occur more often with speech. We discuss these findings as relevant for a theoretical model that views speech and gesture as an integrated system.
Collapse
|
49
|
Żygis M, Fuchs S. Communicative constraints affect oro-facial gestures and acoustics: Whispered vs normal speech. J Acoust Soc Am 2023; 153:613. [PMID: 36732243 DOI: 10.1121/10.0015251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/22/2021] [Accepted: 11/04/2022] [Indexed: 06/18/2023]
Abstract
The present paper investigates the relationship between the acoustic signal and oro-facial expressions (gestures) when speakers (i) speak normally or whisper, (ii) do or do not see each other, and (iii) produce questions as opposed to statements. To this end, we conducted a motion capture experiment with 17 native speakers of German. The results provide partial support for the hypothesis that the most intensified oro-facial expressions occur when speakers whisper, do not see each other, and produce questions. The results are interpreted in terms of two hypotheses, the "hand-in-hand" and the "trade-off" hypothesis. The relationship between acoustic properties and gestures does not provide straightforward support for one hypothesis over the other. Depending on the condition, speakers used more pronounced gestures and longer durations to compensate for the lack of fundamental frequency (supporting the trade-off hypothesis); but since gestures were also enhanced when the listener was not visible, we conclude that they are not produced solely for the needs of the listener (supporting the hand-in-hand hypothesis) but rather seem to help the speaker achieve an overarching communicative goal.
Collapse
Affiliation(s)
- Marzena Żygis
- Leibniz-Zentrum Allgemeine Sprachwissenschaft, 10117 Berlin, Germany
| | - Susanne Fuchs
- Leibniz-Zentrum Allgemeine Sprachwissenschaft, 10117 Berlin, Germany
| |
Collapse
|
50
|
Tomasello R. Linguistic signs in action: The neuropragmatics of speech acts. Brain Lang 2023; 236:105203. [PMID: 36470125 PMCID: PMC9856589 DOI: 10.1016/j.bandl.2022.105203] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/25/2022] [Revised: 09/07/2022] [Accepted: 11/07/2022] [Indexed: 06/05/2023]
Abstract
What makes human communication exceptional is the ability to grasp a speaker's intentions beyond what is said verbally. How the brain processes communicative functions is one of the central concerns of the neurobiology of language and pragmatics. Linguistic-pragmatic theories define these functions as speech acts, and various pragmatic traits characterise them at the levels of propositional content, action sequence structure, related commitments, and social aspects. Here I discuss recent neurocognitive studies, which have shown that the use of identical linguistic signs to convey different communicative functions elicits distinct and ultra-rapid neural responses. Interestingly, cortical areas show differential involvement underlying the various pragmatic features related to theory-of-mind, emotion, and action for specific speech acts expressed with the same utterances. Drawing on a neurocognitive model, I posit that understanding speech acts involves the expectation of typical partner follow-up actions and that this predictive knowledge is immediately reflected in mind and brain.
Collapse
Affiliation(s)
- Rosario Tomasello
- Brain Language Laboratory, Department of Philosophy and Humanities, Freie Universität Berlin, 14195 Berlin, Germany; Cluster of Excellence 'Matters of Activity. Image Space Material', Humboldt Universität zu Berlin, 10099 Berlin, Germany.
| |
Collapse
|