1. Friedrich S, Brodkin ES, Derntl B, Habel U, Hüpen P. Assessing the association between menstrual cycle phase and voice-gender categorization: no robust evidence for an association. Front Psychol 2025; 16:1531021. PMID: 40290539; PMCID: PMC12031663; DOI: 10.3389/fpsyg.2025.1531021.
Abstract
Introduction: Hormone fluctuations during the menstrual cycle are known to influence a wide variety of cognitive-emotional processes and behaviors. Mate choice and changes in attractiveness ratings for faces and voices are often investigated in this context, but research on changes in voice-gender perception independent of attractiveness ratings is rare, even though the voice is an essential element of social interaction. We therefore investigated the influence of cycle phase and of estrogen and progesterone levels on performance in a voice-gender categorization task. We expected a more pronounced other-sex effect, i.e., faster and more accurate responses to masculine voices, in the follicular (fertile) phase than in the luteal phase.
Methods: We measured 65 healthy, naturally cycling women, half of them in the follicular phase and the other half in the luteal phase. In addition to reaction times and the percentage of correct responses, we analyzed signal detection theory (SDT) measures. The study was preregistered after the first 33 participants had been measured and prior to any data analyses (https://osf.io/dteyn).
Results: Cycle phase and hormone levels showed no significant effect on reaction times or SDT measures, in both frequentist and Bayesian analyses. Reaction time was influenced by voice gender, with faster responses to feminine than to masculine voices in both cycle phases.
Discussion: Taken together, our results add to the growing number of studies that find no interaction between menstrual cycle phase and responses to gendered stimuli.
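As a concrete illustration of the SDT measures mentioned in the Methods, the following Python sketch computes sensitivity (d′) and response bias (criterion) from hit and false-alarm counts. This is a hypothetical, minimal example rather than the authors' analysis code; the variable names, the log-linear correction, and the choice of masculine voices as the "signal" category are assumptions made for illustration.

```python
# Minimal sketch (not the authors' code): d-prime and criterion for a
# two-choice voice-gender categorization task, with masculine voices
# arbitrarily treated as the "signal" category. A log-linear correction
# keeps rates of 0 or 1 from producing infinite z-scores.
from scipy.stats import norm

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = norm.ppf(hit_rate) - norm.ppf(fa_rate)
    criterion = -0.5 * (norm.ppf(hit_rate) + norm.ppf(fa_rate))
    return d_prime, criterion

# One hypothetical participant: 50 masculine and 50 feminine voice trials.
d, c = sdt_measures(hits=42, misses=8, false_alarms=6, correct_rejections=44)
print(f"d' = {d:.2f}, criterion = {c:.2f}")
```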
Affiliation(s)
- Sarah Friedrich
- Department of Psychiatry, Psychotherapy and Psychosomatics, RWTH Aachen University Hospital, Aachen, Germany
- Edward S. Brodkin
- Department of Psychiatry, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, United States
- Birgit Derntl
- Department of Psychiatry and Psychotherapy, Tübingen Center for Mental Health (TüCMH), University of Tübingen, Tübingen, Germany
- LEAD Graduate School and Research Network, University of Tübingen, Tübingen, Germany
- Ute Habel
- Department of Psychiatry, Psychotherapy and Psychosomatics, RWTH Aachen University Hospital, Aachen, Germany
- Institute of Neuroscience and Medicine, JARA-Institute Brain Structure Function Relationship (INM 10), Research Center Jülich, Jülich, Germany
- Philippa Hüpen
- Department of Psychiatry, Psychotherapy and Psychosomatics, RWTH Aachen University Hospital, Aachen, Germany
- Institute of Neuroscience and Medicine, JARA-Institute Brain Structure Function Relationship (INM 10), Research Center Jülich, Jülich, Germany
2. Tucker AE, Crow K, Wark M, Eichorn N, van Mersbergen M. How Does Our Voice Reflect Who We Are? Connecting the Voice and the Self Using Implicit Association Tests. J Voice 2024:S0892-1997(24)00361-8. PMID: 39516054; DOI: 10.1016/j.jvoice.2024.10.018.
Abstract
PURPOSE: This study examined the contribution of voice to the self via implicit associations.
METHOD: An implicit association test (IAT) of the voice and the self was created and presented to vocal performers and community controls. One hundred eleven participants completed this voice-self IAT, the Vocal Congruence Scale (VCS), and the Voice Handicap Index (VHI) via an in-person, monitored, and timed Qualtrics survey. Student's t tests comparing timing differences between congruent and incongruent conditions tested for the presence of an implicit relationship.
RESULTS: The findings demonstrated an implicit relationship between the voice and the self as measured by the IAT. The strength of the implicit voice-self relationship was significantly greater for community controls than for vocal performers. Additionally, the IAT showed divergent validity with the VCS and the VHI using Spearman's correlations.
CONCLUSION: Even in the absence of overt declarations, individuals with an implicit voice-self relationship rely on their voice to contribute to their sense of self. This implicit relationship is greater for community members than for vocal performers.
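To illustrate how timing differences between congruent and incongruent IAT blocks can be summarized, the sketch below computes a simplified D-like score (latency difference divided by the pooled standard deviation). This is an assumed, simplified scoring scheme for illustration only; it omits the error penalties and trial trimming of the standard improved algorithm and is not the study's procedure.

```python
# Minimal sketch (assumed, simplified IAT scoring; not the study's analysis):
# D-like effect = (mean incongruent RT - mean congruent RT) / pooled SD.
import numpy as np

def iat_d_score(congruent_rts_ms, incongruent_rts_ms):
    congruent = np.asarray(congruent_rts_ms, dtype=float)
    incongruent = np.asarray(incongruent_rts_ms, dtype=float)
    pooled_sd = np.std(np.concatenate([congruent, incongruent]), ddof=1)
    return (incongruent.mean() - congruent.mean()) / pooled_sd

# Hypothetical latencies (ms) for one participant
congruent = [650, 700, 620, 680, 710, 640]
incongruent = [820, 790, 860, 800, 840, 810]
# A positive score indicates faster responding when voice and self share a key.
print(f"D = {iat_d_score(congruent, incongruent):.2f}")
```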
Affiliation(s)
- Audrey Elizabeth Tucker
- School of Communication Sciences and Disorders, University of Memphis, Memphis, Tennessee; Iowa ENT Center, PLLC, West Des Moines, Iowa
- Karen Crow
- Louisville Center for Voice Care, University of Louisville Physicians, Louisville, Kentucky
- Marilyn Wark
- School of Communication Sciences and Disorders, University of Memphis, Memphis, Tennessee
- Naomi Eichorn
- School of Communication Sciences and Disorders, University of Memphis, Memphis, Tennessee
- Miriam van Mersbergen
- School of Communication Sciences and Disorders, University of Memphis, Memphis, Tennessee
3. Hirano Y, Nakamura I, Tamura S. Abnormal connectivity and activation during audiovisual speech perception in schizophrenia. Eur J Neurosci 2024; 59:1918-1932. PMID: 37990611; DOI: 10.1111/ejn.16183.
Abstract
The unconscious integration of vocal and facial cues during speech perception facilitates face-to-face communication. Recent studies have provided substantial behavioural evidence concerning impairments in audiovisual (AV) speech perception in schizophrenia. However, the specific neurophysiological mechanism underlying these deficits remains unknown. Here, we investigated activities and connectivities centered on the auditory cortex during AV speech perception in schizophrenia. Using magnetoencephalography, we recorded and analysed event-related fields in response to auditory (A: voice), visual (V: face) and AV (voice-face) stimuli in 23 schizophrenia patients (13 males) and 22 healthy controls (13 males). The functional connectivity associated with the subadditive response to AV stimulus (i.e., [AV] < [A] + [V]) was also compared between the two groups. Within the healthy control group, [AV] activity was smaller than the sum of [A] and [V] at latencies of approximately 100 ms in the posterior ramus of the lateral sulcus in only the left hemisphere, demonstrating a subadditive N1m effect. Conversely, the schizophrenia group did not show such a subadditive response. Furthermore, weaker functional connectivity from the posterior ramus of the lateral sulcus of the left hemisphere to the fusiform gyrus of the right hemisphere was observed in schizophrenia. Notably, this weakened connectivity was associated with the severity of negative symptoms. These results demonstrate abnormalities in connectivity between speech- and face-related cortical areas in schizophrenia. This aberrant subadditive response and connectivity deficits for integrating speech and facial information may be the neural basis of social communication dysfunctions in schizophrenia.
Affiliation(s)
- Yoji Hirano
- Department of Psychiatry, Division of Clinical Neuroscience, Faculty of Medicine, University of Miyazaki, Miyazaki, Japan
- Department of Neuropsychiatry, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Institute of Industrial Science, The University of Tokyo, Tokyo, Japan
- Itta Nakamura
- Department of Neuropsychiatry, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Shunsuke Tamura
- Department of Psychiatry, Division of Clinical Neuroscience, Faculty of Medicine, University of Miyazaki, Miyazaki, Japan
- Department of Neuropsychiatry, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
4. Guldner S, Lavan N, Lally C, Wittmann L, Nees F, Flor H, McGettigan C. Human talkers change their voices to elicit specific trait percepts. Psychon Bull Rev 2024; 31:209-222. PMID: 37507647; PMCID: PMC10866754; DOI: 10.3758/s13423-023-02333-y.
Abstract
The voice is a variable and dynamic social tool with functional relevance for self-presentation, for example, during a job interview or courtship. Talkers adjust their voices flexibly to their situational or social environment. Here, we investigated how effectively intentional voice modulations can evoke trait impressions in listeners (Experiment 1), whether these trait impressions are recognizable (Experiment 2), and whether they meaningfully influence social interactions (Experiment 3). We recorded 40 healthy adult speakers whilst they spoke neutrally and whilst they produced vocal expressions of six social traits (e.g., likeability, confidence). Multivariate ratings from 40 listeners showed that vocal modulations amplified specific trait percepts (Experiments 1 and 2), which could be explained by two principal components relating to perceived affiliation and competence. Moreover, vocal modulations increased the likelihood of listeners choosing the voice as suitable for corresponding social goals (i.e., a confident rather than likeable voice to negotiate a promotion, Experiment 3). These results indicate that talkers modulate their voice along a common trait space for social navigation. Moreover, beyond reactive voice changes, vocal behaviour can be used strategically by talkers to communicate subtle information about themselves to listeners. These findings advance our understanding of non-verbal vocal behaviour for social communication.
Affiliation(s)
- Stella Guldner
- Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Nadine Lavan
- Department of Psychology, Queen Mary University of London, London, UK
- Clare Lally
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Lisa Wittmann
- Institute of Psychology, University of Regensburg, Regensburg, Germany
- Frauke Nees
- Institute of Medical Psychology and Medical Sociology, University Medical Centre Schleswig Holstein, Kiel University, Kiel, Germany
- Herta Flor
- Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
5. Lavan N, McGettigan C. A model for person perception from familiar and unfamiliar voices. Commun Psychol 2023; 1:1. PMID: 38665246; PMCID: PMC11041786; DOI: 10.1038/s44271-023-00001-4.
Abstract
When hearing a voice, listeners can form a detailed impression of the person behind the voice. Existing models of voice processing focus primarily on one aspect of person perception - identity recognition from familiar voices - but do not account for the perception of other person characteristics (e.g., sex, age, personality traits). Here, we present a broader perspective, proposing that listeners have a common perceptual goal of perceiving who they are hearing, whether the voice is familiar or unfamiliar. We outline and discuss a model - the Person Perception from Voices (PPV) model - that achieves this goal via a common mechanism of recognising a familiar person, persona, or set of speaker characteristics. Our PPV model aims to provide a more comprehensive account of how listeners perceive the person they are listening to, using an approach that incorporates and builds on aspects of the hierarchical frameworks and prototype-based mechanisms proposed within existing models of voice identity recognition.
Affiliation(s)
- Nadine Lavan
- Department of Experimental and Biological Psychology, Queen Mary University of London, London, UK
- Carolyn McGettigan
- Department of Speech, Hearing, and Phonetic Sciences, University College London, London, UK
6. Pinheiro AP, Sarzedas J, Roberto MS, Kotz SA. Attention and emotion shape self-voice prioritization in speech processing. Cortex 2023; 158:83-95. PMID: 36473276; DOI: 10.1016/j.cortex.2022.10.006.
Abstract
Both the self-voice and emotional speech are salient signals that are prioritized in perception. Surprisingly, self-voice perception has been investigated to a lesser extent than the self-face. It therefore remains to be clarified whether self-voice prioritization is boosted by emotion, and whether self-relevance and emotion interact differently when attention is focused on who is speaking vs. what is being said. Thirty participants listened to 210 prerecorded words, spoken in their own or an unfamiliar voice and differing in emotional valence, in two tasks that manipulated the focus of attention on either speaker identity or speech emotion. Event-related potentials (ERPs) of the electroencephalogram (EEG) informed on the temporal dynamics of self-relevance, emotion, and attention effects. Words spoken in one's own voice elicited a larger N1 and Late Positive Potential (LPP), but a smaller N400. Identity and emotion interactively modulated the P2 (self-positivity bias) and LPP (self-negativity bias). Attention to speaker identity more strongly modulated ERP responses within 600 ms post-word onset (N1, P2, N400), whereas attention to speech emotion altered the late component (LPP). However, attention did not modulate the interaction of self-relevance and emotion. These findings suggest that the self-voice is prioritized for neural processing at early sensory stages, and that both emotion and attention shape self-voice prioritization in speech processing. They also confirm involuntary processing of salient signals (self-relevance and emotion) even in situations in which attention is deliberately directed away from those cues. These findings have important implications for a better understanding of symptoms thought to arise from aberrant self-voice monitoring, such as auditory verbal hallucinations.
Affiliation(s)
- Ana P Pinheiro
- CICPSI, Faculdade de Psicologia, Universidade de Lisboa, Lisboa, Portugal; Basic and Applied NeuroDynamics Lab, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands
- João Sarzedas
- CICPSI, Faculdade de Psicologia, Universidade de Lisboa, Lisboa, Portugal
- Magda S Roberto
- CICPSI, Faculdade de Psicologia, Universidade de Lisboa, Lisboa, Portugal
- Sonja A Kotz
- Basic and Applied NeuroDynamics Lab, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands
7. Costello J, Smith M. The BCH message banking process™, voice banking, and double-dipping™. Augment Altern Commun 2022; 37:241-250. PMID: 35000518; DOI: 10.1080/07434618.2021.2021554.
Abstract
Significant advances have been made in interventions to maintain communication and personhood for individuals with neurodegenerative conditions. One innovation is Message Banking, a clinical approach first developed at Boston Children's Hospital (BCH). This paper outlines the Message Banking process as implemented at BCH, which includes the option of "Double Dipping," where banked messages are mined to develop personalized synthesized voices. More than a decade of experience has led to the evolution of six core principles underpinning the BCH process, resulting in a structured introduction of the associated concepts and practices with people with amyotrophic lateral sclerosis (ALS) and their families. These principles highlight the importance of assigning ownership and control of the process to individuals with ALS and their families, ensuring that as a tool it is empowering and offers hope. Changes have been driven by feedback from individuals who have participated in the BCH process over many years. The success of the process has recently been extended through partnerships that allow the recorded messages to be used to develop individual personalized synthetic voices to complement banked messages. While the process of banking messages is technically relatively simple, the full value of the process should be underpinned by the values and principles outlined in this tutorial.
Affiliation(s)
- John Costello
- Augmentative Communication Program and Jay S. Fishman ALS Augmentative Communication Program, Boston Children's Hospital, Adjunct Faculty Boston University, Boston, MA, USA
- Martine Smith
- Department of Clinical Speech and Language Studies, Trinity College Dublin, Dublin, Ireland
8. Guldner S, Nees F, McGettigan C. Vocomotor and Social Brain Networks Work Together to Express Social Traits in Voices. Cereb Cortex 2020; 30:6004-6020. PMID: 32577719; DOI: 10.1093/cercor/bhaa175.
Abstract
Voice modulation is important when navigating social interactions-tone of voice in a business negotiation is very different from that used to comfort an upset child. While voluntary vocal behavior relies on a cortical vocomotor network, social voice modulation may require additional social cognitive processing. Using functional magnetic resonance imaging, we investigated the neural basis for social vocal control and whether it involves an interplay of vocal control and social processing networks. Twenty-four healthy adult participants modulated their voice to express social traits along the dimensions of the social trait space (affiliation and competence) or to express body size (control for vocal flexibility). Naïve listener ratings showed that vocal modulations were effective in evoking social trait ratings along the two primary dimensions of the social trait space. Whereas basic vocal modulation engaged the vocomotor network, social voice modulation specifically engaged social processing regions including the medial prefrontal cortex, superior temporal sulcus, and precuneus. Moreover, these regions showed task-relevant modulations in functional connectivity to the left inferior frontal gyrus, a core vocomotor control network area. These findings highlight the impact of the integration of vocal motor control and social information processing for socially meaningful voice modulation.
Affiliation(s)
- Stella Guldner
- Department of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim 68159, Germany; Graduate School of Economic and Social Sciences, University of Mannheim, Mannheim 68159, Germany; Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Frauke Nees
- Department of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim 68159, Germany; Institute of Medical Psychology and Medical Sociology, University Medical Center Schleswig Holstein, Kiel University, Kiel 24105, Germany
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK; Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
9. Sumathi TA, Spinola O, Singh NC, Chakrabarti B. Perceived Closeness and Autistic Traits Modulate Interpersonal Vocal Communication. Front Psychiatry 2020; 11:50. PMID: 32180734; PMCID: PMC7059848; DOI: 10.3389/fpsyt.2020.00050.
Abstract
Vocal modulation is a critical component of interpersonal communication. It not only serves as a dynamic and flexible tool for self-expression and conveying linguistic information but also plays a key role in social behavior. Variation in vocal modulation can be driven by individual traits of the interlocutors as well as by factors relating to the dyad, such as the perceived closeness between interlocutors. In this study we examine both of these sources of variation. At the individual level, we examine the impact of autistic traits, since a lack of appropriate vocal modulation has often been associated with Autism Spectrum Disorders. At the dyadic level, we examine the role of perceived closeness between interlocutors on vocal modulation. The study was conducted in three separate samples from India, Italy, and the UK. Articulatory features were extracted from recorded conversations between a total of 85 same-sex pairs of participants, and the articulation space was calculated. A larger articulation space corresponds to a greater number of spectro-temporal modulations (articulatory variations) sampled by the speaker. Articulation space showed a positive association with interpersonal closeness and a weak negative association with autistic traits. This study thus provides novel insights into the individual and dyadic variation that can influence interpersonal vocal communication.
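The "articulation space" described above indexes how many spectro-temporal modulations a speaker samples. The sketch below shows one assumed way such a quantity could be approximated from audio (log-spectrogram, 2-D Fourier transform, thresholded modulation power); the pipeline, parameters, and threshold are illustrative assumptions, not the authors' published method.

```python
# Illustrative sketch only: approximate an "articulation space" as the
# proportion of spectro-temporal modulation bins with non-negligible power.
# Pipeline and threshold are assumptions, not the paper's procedure.
import numpy as np
from scipy.signal import spectrogram

def articulation_space_size(waveform, sample_rate, threshold_db=-30.0):
    # Time-frequency representation of the speech signal
    _, _, spec = spectrogram(waveform, fs=sample_rate, nperseg=512, noverlap=384)
    log_spec = np.log(spec + 1e-10)
    # Modulation power spectrum: 2-D FFT of the log-spectrogram
    mps = np.abs(np.fft.fftshift(np.fft.fft2(log_spec))) ** 2
    mps_db = 10.0 * np.log10(mps / mps.max() + 1e-12)
    # Fraction of modulation bins above threshold = "size" of the space
    return float(np.mean(mps_db > threshold_db))

# Hypothetical usage: one second of noise standing in for a speech recording
rng = np.random.default_rng(0)
print(articulation_space_size(rng.standard_normal(16000), 16000))
```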
Affiliation(s)
- T. A. Sumathi
- National Brain Research Centre, Language, Literacy and Music Laboratory, Manesar, India
- Olivia Spinola
- Department of Psychology, Università degli Studi di Milano Bicocca, Milan, Italy
- Centre for Autism, School of Psychology & Clinical Language Sciences, University of Reading, Reading, United Kingdom
- Department of Psychology, Sapienza University of Rome, Rome, Italy
- Bhismadev Chakrabarti
- Centre for Autism, School of Psychology & Clinical Language Sciences, University of Reading, Reading, United Kingdom
- Inter University Centre for Biomedical Research, Mahatma Gandhi University, Kottayam, India
- India Autism Center, Kolkata, India
10. Amiriparian S, Han J, Schmitt M, Baird A, Mallol-Ragolta A, Milling M, Gerczuk M, Schuller B. Synchronization in Interpersonal Speech. Front Robot AI 2019; 6:116. PMID: 33501131; PMCID: PMC7806071; DOI: 10.3389/frobt.2019.00116.
Abstract
During both positive and negative dyadic exchanges, individuals often unconsciously imitate their partner. A substantial amount of research has been conducted on this phenomenon, and such studies have shown that synchronization between communication partners can improve interpersonal relationships. Automatic computational approaches for recognizing synchrony are still in their infancy. In this study, we extend previous work in which we applied a novel method utilizing hand-crafted low-level acoustic descriptors and autoencoders (AEs) to analyse synchrony in the speech domain. For this purpose, a database consisting of 394 in-the-wild speakers from six different cultures is used. For each speaker in the dyadic exchange, two AEs are implemented. After the training phase, the acoustic features of one speaker are tested using the AE trained on their dyadic partner. In the same way, we also explore the benefits of deep representations from audio, implementing the state-of-the-art Deep Spectrum toolkit. For all speakers, at varied time-points during their interaction, the reconstruction error from the AE trained on their respective dyadic partner is calculated. The results of this acoustic analysis are then compared with linguistic experiments based on word counts and word embeddings generated by our word2vec approach. The results demonstrate that there is a degree of synchrony during all interactions, and that this degree varies across the six cultures in the investigated database. These findings are further substantiated through the use of 4,096-dimensional Deep Spectrum features.
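The cross-reconstruction idea described above can be sketched briefly: fit an autoencoder on one speaker's acoustic features and measure how well it reconstructs the partner's features, with lower error taken as greater synchrony. The sketch below is an assumed simplification (a small scikit-learn MLP autoencoder on random stand-in features), not the paper's implementation or feature set.

```python
# Minimal sketch (assumed simplification, not the paper's code): one
# autoencoder per speaker; synchrony indexed by how well speaker A's
# model reconstructs speaker B's per-frame acoustic descriptors.
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_autoencoder(features, hidden=16, seed=0):
    ae = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=2000, random_state=seed)
    ae.fit(features, features)  # train to reconstruct the input
    return ae

def reconstruction_error(ae, features):
    return float(np.mean((ae.predict(features) - features) ** 2))

# Hypothetical per-frame acoustic descriptors for two dyad partners
rng = np.random.default_rng(0)
speaker_a = rng.standard_normal((200, 40))
speaker_b = speaker_a + 0.3 * rng.standard_normal((200, 40))  # partially "synchronized"

ae_a = fit_autoencoder(speaker_a)
print("A-model error on B:", reconstruction_error(ae_a, speaker_b))
```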
Affiliation(s)
- Shahin Amiriparian
- ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany
- Jing Han
- ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany
- Maximilian Schmitt
- ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany
- Alice Baird
- ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany
- Adria Mallol-Ragolta
- ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany
- Manuel Milling
- ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany
- Maurice Gerczuk
- ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany
- Björn Schuller
- ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany; Group on Language, Audio & Music, Imperial College London, London, United Kingdom
11. Crow KM, van Mersbergen M, Payne AE. Vocal Congruence: The Voice and the Self Measured by Interoceptive Awareness. J Voice 2019; 35:324.e15-324.e28. PMID: 31558332; DOI: 10.1016/j.jvoice.2019.08.027.
Abstract
Voices are, by nature, idiosyncratic representations of individuals: each person's unique anatomical, physiological, and psychological characteristics contribute to vocal output and thus establish the voice as a salient marker of individuality. Experimental psychology and cognitive neuroscience have examined the psychological and neurological constructs that form one's sense of self and have employed measures of interoceptive and exteroceptive abilities to uncover its underlying constructs. This study employed measures of interoceptive awareness to assess the level of vocal congruence. The forty-one participants analyzed in this study underwent a heartbeat detection task designed to assess interoceptive awareness and were placed into two groups: those high and those low in interoceptive awareness. They completed two tasks, a speaking task comprising structured passages and conversation, and a listening task in which they listened to recordings of themselves from the speaking task. Following each task, they completed a Vocal Congruence Scale designed to assess the degree to which they identify with the sound of their own voice. Individuals scoring high in interoceptive awareness scored significantly higher in vocal congruence than those scoring low in interoceptive awareness. Additionally, when analyzed alongside other measures of personality, anxiety, mood, and voice handicap, the Vocal Congruence Scale appears to measure a unique aspect of vocal identity that encompasses interoceptive awareness.
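Heartbeat detection (counting) tasks of the kind mentioned above are commonly scored as the average agreement between counted and actually recorded heartbeats. The sketch below uses an assumed Schandry-style accuracy formula and a simple median split into high/low interoceptive awareness groups; the formula, cutoff, and numbers are illustrative, not the study's exact scoring.

```python
# Minimal sketch (assumed formula and cutoff, not the study's scoring):
# heartbeat-counting accuracy averaged over intervals, then a median split.
import numpy as np

def interoceptive_accuracy(recorded_beats, counted_beats):
    recorded = np.asarray(recorded_beats, dtype=float)
    counted = np.asarray(counted_beats, dtype=float)
    return float(np.mean(1.0 - np.abs(recorded - counted) / recorded))

# Hypothetical participants: recorded vs. counted beats across three intervals
scores = [
    interoceptive_accuracy([32, 45, 61], [30, 44, 58]),
    interoceptive_accuracy([35, 48, 66], [22, 30, 41]),
]
cutoff = np.median(scores)
groups = ["high" if s >= cutoff else "low" for s in scores]
print(list(zip(np.round(scores, 2), groups)))
```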
Affiliation(s)
- Karen M Crow
- School of Communication Sciences and Disorders, The University of Memphis, Memphis, Tennessee
- Miriam van Mersbergen
- School of Communication Sciences and Disorders, The University of Memphis, Memphis, Tennessee
- Alexis E Payne
- School of Communication Sciences and Disorders, The University of Memphis, Memphis, Tennessee
12. Oesch N. Music and Language in Social Interaction: Synchrony, Antiphony, and Functional Origins. Front Psychol 2019; 10:1514. PMID: 31312163; PMCID: PMC6614337; DOI: 10.3389/fpsyg.2019.01514.
Abstract
Music and language are universal human abilities with many apparent similarities relating to their acoustics, structure, and frequent use in social situations. We might therefore expect them to be understood and processed similarly, and indeed an emerging body of research suggests that this is the case. But the focus has historically been on the individual, looking at the passive listener or the isolated speaker or performer, even though social interaction is the primary site of use for both domains. Nonetheless, an important goal of emerging research is to compare music and language in terms of acoustics and structure, social interaction, and functional origins, in order to develop parallel accounts across the two domains. Indeed, a central aim of both evolutionary musicology and language evolution research is to understand the adaptive significance or functional origin of human music and language. An influential proposal to emerge in recent years has been referred to as the social bonding hypothesis. Here, within a comparative approach to animal communication systems, I review empirical studies in support of the social bonding hypothesis in humans, non-human primates, songbirds, and various other mammals. In support of this hypothesis, I review six research fields: (i) the functional origins of music; (ii) the functional origins of language; (iii) mechanisms of social synchrony for human social bonding; (iv) language and social bonding in humans; (v) music and social bonding in humans; and (vi) pitch, tone and emotional expression in human speech and music. I conclude that the comparative study of complex vocalizations and behaviors in various extant species can provide important insights into the adaptive function(s) of these traits in these species, as well as offer evidence-based speculations for the existence of "musilanguage" in our primate ancestors, and thus inform our understanding of the biology and evolution of human music and language.
Affiliation(s)
- Nathan Oesch
- Music and Neuroscience Lab, Department of Psychology, The Brain and Mind Institute, Western University, London, ON, Canada
- Cognitive Neuroscience of Communication and Hearing (CoNCH) Lab, Department of Psychology, The Brain and Mind Institute, Western University, London, ON, Canada
13. Smith M. Innovations for Supporting Communication: Opportunities and Challenges for People with Complex Communication Needs. Folia Phoniatr Logop 2019; 71:156-167. DOI: 10.1159/000496729.
Abstract
Individuals with complex communication needs have benefited greatly from technological innovations over the past two decades, as well as from social movements that have shifted focus from disability to functioning and participation in society. Three strands of technological innovation are reviewed in this paper: (1) innovations in the tools that have become available, specifically tablet technologies; (2) innovations in access methods (eye gaze technologies and brain-computer interfaces); and (3) innovations in output, specifically speech technologies. The opportunities these innovations offer are explored, as are some of the challenges that they imply, not only for individuals with complex communication needs, but also for families, professionals, and researchers.
14. Pullin G, Treviranus J, Patel R, Higginbotham J. Designing interaction, voice, and inclusion in AAC research. Augment Altern Commun 2017; 33:139-148. DOI: 10.1080/07434618.2017.1342690.
Affiliation(s)
- Graham Pullin
- Duncan of Jordanstone College of Art and Design, University of Dundee, Dundee, UK
- Jutta Treviranus
- Inclusive Design Research Centre, OCAD University, Toronto, ON, Canada
- Rupal Patel
- VocaliD, and Department of Communication Sciences and Disorders, Northeastern University, Boston, MA, USA
- Jeff Higginbotham
- Department of Communicative Disorders and Sciences, University at Buffalo, Buffalo, NY, USA
15. Roswandowitz C, Schelinski S, von Kriegstein K. Developmental phonagnosia: Linking neural mechanisms with the behavioural phenotype. Neuroimage 2017; 155:97-112. DOI: 10.1016/j.neuroimage.2017.02.064.
16. McGettigan C, Jasmin K, Eisner F, Agnew ZK, Josephs OJ, Calder AJ, Jessop R, Lawson RP, Spielmann M, Scott SK. You talkin' to me? Communicative talker gaze activates left-lateralized superior temporal cortex during perception of degraded speech. Neuropsychologia 2017; 100:51-63. PMID: 28400328; PMCID: PMC5446325; DOI: 10.1016/j.neuropsychologia.2017.04.013.
Abstract
Neuroimaging studies of speech perception have consistently indicated a left-hemisphere dominance in the temporal lobes' responses to intelligible auditory speech signals (McGettigan and Scott, 2012). However, there are important communicative cues that cannot be extracted from auditory signals alone, including the direction of the talker's gaze. Previous work has implicated the superior temporal cortices in processing gaze direction, with evidence for predominantly right-lateralized responses (Carlin & Calder, 2013). The aim of the current study was to investigate whether the lateralization of responses to talker gaze differs in an auditory communicative context. Participants in a functional MRI experiment watched and listened to videos of spoken sentences in which the auditory intelligibility and talker gaze direction were manipulated factorially. We observed a left-dominant temporal lobe sensitivity to the talker's gaze direction, in which the left anterior superior temporal sulcus/gyrus and temporal pole showed an enhanced response to direct gaze; further investigation revealed that this pattern of lateralization was modulated by auditory intelligibility. Our results suggest flexibility in the distribution of neural responses to social cues in the face within the context of a challenging speech perception task.
Highlights:
- Talker gaze is an important social cue during speech comprehension.
- Neural responses to gaze were measured during perception of degraded sentences.
- Gaze direction modulated activation in left-lateralized superior temporal cortex.
- Left lateralization became stronger when speech was less intelligible.
- Results suggest task-dependent flexibility in cortical responses to gaze.
Affiliation(s)
- Carolyn McGettigan
- Department of Psychology, Royal Holloway University of London, Egham Hill, Egham TW20 0EX, UK; Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
- Kyle Jasmin
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
- Frank Eisner
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Donders Institute, Radboud University, Montessorilaan 3, 6525 HR Nijmegen, Netherlands
- Zarinah K Agnew
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Department of Otolaryngology, University of California, San Francisco, 513 Parnassus Avenue, San Francisco, CA, USA
- Oliver J Josephs
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London WC1N 3BG, UK
- Andrew J Calder
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, UK
- Rosemary Jessop
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
- Rebecca P Lawson
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London WC1N 3BG, UK
- Mona Spielmann
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
- Sophie K Scott
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
17. Carey D, McGettigan C. Magnetic resonance imaging of the brain and vocal tract: Applications to the study of speech production and language learning. Neuropsychologia 2016; 98:201-211. PMID: 27288115; DOI: 10.1016/j.neuropsychologia.2016.06.003.
Abstract
The human vocal system is highly plastic, allowing for the flexible expression of language, mood and intentions. However, this plasticity is not stable throughout the life span, and it is well documented that adult learners encounter greater difficulty than children in acquiring the sounds of foreign languages. Researchers have used magnetic resonance imaging (MRI) to interrogate the neural substrates of vocal imitation and learning, and the correlates of individual differences in phonetic "talent". In parallel, a growing body of work using MR technology to directly image the vocal tract in real time during speech has offered primarily descriptive accounts of phonetic variation within and across languages. In this paper, we review the contribution of neural MRI to our understanding of vocal learning, and give an overview of vocal tract imaging and its potential to inform the field. We propose methods by which our understanding of speech production and learning could be advanced through the combined measurement of articulation and brain activity using MRI; specifically, we describe a novel paradigm, developed in our laboratory, that uses both MRI techniques to map directly, for the first time, between neural, articulatory and acoustic data in the investigation of vocalisation. This non-invasive, multimodal imaging method could be used to track central and peripheral correlates of spoken language learning and speech recovery in clinical settings, as well as provide insights into potential sites for targeted neural interventions.
Affiliation(s)
- Daniel Carey
- Department of Psychology, Royal Holloway, University of London, Egham, UK
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, Egham, UK
18.
Abstract
Purpose: (1) To explore the role of native voice and the effects of voice loss on self-concept and identity, and to survey the state of assistive voice technology; (2) to establish the moral case for developing personalized voice technology.
Methods: This narrative review examines published literature on the human significance of voice, the impact of voice loss on self-concept and identity, and the strengths and limitations of current voice technology. Based on the impact of voice loss on self and identity, and on the limitations of current voice technology, the moral case for personalized voice technology is developed.
Results: Given the richness of information conveyed by voice, loss of voice constrains expression of the self, but the full impact is poorly understood. Augmentative and alternative communication (AAC) devices facilitate communication but, despite advances in this field, voice output cannot yet express the unique nuances of an individual voice. The ethical principles of autonomy, beneficence and equality of opportunity establish the moral responsibility to invest in accessible, cost-effective, personalized voice technology.
Conclusions: Although further research is needed to elucidate the full effects of voice loss on self-concept, identity and social functioning, current understanding of the profoundly negative impact of voice loss establishes the moral case for developing personalized voice technology.
Implications for Rehabilitation: Rehabilitation of voice-disordered patients should facilitate self-expression, interpersonal connectedness and social/occupational participation. Proactive questioning about the psychological and social experiences of patients with voice loss is a valuable entry point for rehabilitation planning. Personalized voice technology would enhance sense of self, communicative participation and autonomy, and promote shared healthcare decision-making. Further research is needed to identify the best strategies to preserve and strengthen identity and sense of self.
Affiliation(s)
- Esther Nathanson
- The Neiswanger Institute for Bioethics, Loyola University Chicago Stritch School of Medicine, Maywood, IL, USA
19. Pisanski K, Cartei V, McGettigan C, Raine J, Reby D. Voice Modulation: A Window into the Origins of Human Vocal Control? Trends Cogn Sci 2016; 20:304-318. PMID: 26857619; DOI: 10.1016/j.tics.2016.01.002.
Abstract
An unresolved issue in comparative approaches to speech evolution is the apparent absence of an intermediate vocal communication system between human speech and the less flexible vocal repertoires of other primates. We argue that humans' ability to modulate nonverbal vocal features evolutionarily linked to expression of body size and sex (fundamental and formant frequencies) provides a largely overlooked window into the nature of this intermediate system. Recent behavioral and neural evidence indicates that humans' vocal control abilities, commonly assumed to subserve speech, extend to these nonverbal dimensions. This capacity appears in continuity with context-dependent frequency modulations recently identified in other mammals, including primates, and may represent a living relic of early vocal control abilities that led to articulated human speech.
Affiliation(s)
- Katarzyna Pisanski
- Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK; Institute of Psychology, University of Wrocław, Wrocław, Poland
- Valentina Cartei
- Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK
- Carolyn McGettigan
- Royal Holloway Vocal Communication Laboratory, Department of Psychology, Royal Holloway, University of London, Egham, UK
- Jordan Raine
- Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK
- David Reby
- Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK
20. Acoustic richness modulates the neural networks supporting intelligible speech processing. Hear Res 2015; 333:108-117. PMID: 26723103; DOI: 10.1016/j.heares.2015.12.008.
Abstract
The information contained in a sensory signal plays a critical role in determining what neural processes are engaged. Here we used interleaved silent steady-state (ISSS) functional magnetic resonance imaging (fMRI) to explore how human listeners cope with different degrees of acoustic richness during auditory sentence comprehension. Twenty-six healthy young adults underwent scanning while hearing sentences that varied in acoustic richness (high vs. low spectral detail) and syntactic complexity (subject-relative vs. object-relative center-embedded clause structures). We manipulated acoustic richness by presenting the stimuli as unprocessed full-spectrum speech, or noise-vocoded with 24 channels. Importantly, although the vocoded sentences were spectrally impoverished, all sentences were highly intelligible. These manipulations allowed us to test how intelligible speech processing was affected by orthogonal linguistic and acoustic demands. Acoustically rich speech showed stronger activation than acoustically less-detailed speech in a bilateral temporoparietal network with more pronounced activity in the right hemisphere. By contrast, listening to sentences with greater syntactic complexity resulted in increased activation of a left-lateralized network including left posterior lateral temporal cortex, left inferior frontal gyrus, and left dorsolateral prefrontal cortex. Significant interactions between acoustic richness and syntactic complexity occurred in left supramarginal gyrus, right superior temporal gyrus, and right inferior frontal gyrus, indicating that the regions recruited for syntactic challenge differed as a function of acoustic properties of the speech. Our findings suggest that the neural systems involved in speech perception are finely tuned to the type of information available, and that reducing the richness of the acoustic signal dramatically alters the brain's response to spoken language, even when intelligibility is high.
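Noise vocoding, as used for the spectrally impoverished condition above, replaces the fine structure in each frequency band with noise while preserving the band envelopes. The sketch below is an assumed, simplified 24-channel implementation; the band edges, filter order, and use of the raw Hilbert envelope are illustrative choices, not the study's stimulus-generation code.

```python
# Illustrative sketch of N-channel noise vocoding (assumed parameters,
# not the study's stimuli): filter speech and noise into matching bands,
# impose each speech-band envelope on the corresponding noise band, and sum.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_channels=24, f_lo=100.0, f_hi=7000.0):
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced band edges
    noise = np.random.default_rng(0).standard_normal(len(speech))
    out = np.zeros(len(speech), dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        envelope = np.abs(hilbert(band))       # speech-band envelope
        carrier = sosfiltfilt(sos, noise)      # band-limited noise carrier
        out += envelope * carrier
    return out / (np.max(np.abs(out)) + 1e-12)

# Hypothetical usage with a synthetic 1-s amplitude-modulated tone at 16 kHz
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 150 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
vocoded = noise_vocode(speech, fs)
```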