1. Kanber E, Lally C, Razin R, Rosi V, Garrido L, Lavan N, McGettigan C. Representations of personally familiar voices are better resolved in the brain. Curr Biol 2025; 35:2424-2432.e6. [PMID: 40252646] [DOI: 10.1016/j.cub.2025.03.081]
Abstract
The human voice is highly flexible, allowing for diverse expression during communication,1 but presents perceptual challenges through large acoustic variability.2,3,4,5,6,7,8,9,10,11 The ability to recognize an individual person's voice depends on the listener's ability to overcome this within-speaker variability to extract a single identity percept.2,18 Previous work has found that this process is greatly assisted by familiarity,6,9,13 with evidence suggesting that more extensive and varied exposure to a voice is associated with the formation of a more robust mental representation of it.4,8 Here, we used functional magnetic resonance imaging (fMRI) with representational similarity analysis14 to characterize how personal familiarity with a voice is reflected in neural representations. We measured and compared brain responses to voices of differing familiarity (a personally familiar voice, a voice familiarized through lab training, and a new, untrained voice) while listeners identified these voices from naturally varying, spontaneous speech clips. Personally familiar voices elicited brain response patterns in voice-, face-, and person-selective cortices that showed higher within- and between-speaker dissimilarity compared with the lower-familiarity lab-trained and untrained voices. These findings indicate that representations of the sounds of personally familiar voices are better resolved from each other in the brain, and they align with other research reporting intelligibility advantages for speech produced by familiar talkers.15,16,17,18 Overall, our findings suggest that extensive and varied exposure to personally familiar voices results in the development of finer-grained representations of those voices, which cannot be achieved via short-term lab training.
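The key quantity in an analysis like this is the representational dissimilarity matrix (RDM): pairwise distances between the multivoxel response patterns evoked by individual clips. A minimal sketch of the within- versus between-speaker comparison, using correlation distance on simulated patterns; the array shapes, speaker labels, and ROI framing are illustrative assumptions, not the authors' pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multivoxel patterns (clips x voxels) from one ROI for two
# speakers heard in a single familiarity condition; shapes are illustrative.
n_clips, n_vox = 8, 200
spk_a = rng.normal(size=(n_clips, n_vox))
spk_b = rng.normal(size=(n_clips, n_vox))

def corr_distance(x, y):
    """1 - Pearson correlation between every row of x and every row of y."""
    return 1.0 - np.corrcoef(x, y)[: len(x), len(x):]

# Within-speaker dissimilarity: clip-to-clip distances for the same speaker
# (upper triangle only, so self-comparisons on the diagonal are excluded).
within_a = corr_distance(spk_a, spk_a)[np.triu_indices(n_clips, k=1)].mean()
within_b = corr_distance(spk_b, spk_b)[np.triu_indices(n_clips, k=1)].mean()

# Between-speaker dissimilarity: distances between clips of different speakers.
between = corr_distance(spk_a, spk_b).mean()

# The study's claim corresponds to both quantities being larger for the
# personally familiar condition than for the trained/untrained conditions.
print(f"within-speaker:  {(within_a + within_b) / 2:.3f}")
print(f"between-speaker: {between:.3f}")
```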
Affiliation(s)
- Elise Kanber: Department of Speech, Hearing and Phonetic Sciences, UCL, Chandler House, 2 Wakefield Street, London WC1N 1PF, UK
- Clare Lally: Department of Speech, Hearing and Phonetic Sciences, UCL, Chandler House, 2 Wakefield Street, London WC1N 1PF, UK
- Raha Razin: Department of Speech, Hearing and Phonetic Sciences, UCL, Chandler House, 2 Wakefield Street, London WC1N 1PF, UK; Department of Experimental Psychology, UCL, 26 Bedford Way, London WC1H 0AP, UK
- Victor Rosi: Department of Speech, Hearing and Phonetic Sciences, UCL, Chandler House, 2 Wakefield Street, London WC1N 1PF, UK
- Lúcia Garrido: Department of Psychology, City St George's, University of London, Northampton Square, London EC1V 0HB, UK
- Nadine Lavan: School of Biological and Behavioural Sciences, Queen Mary University of London, Mile End Road, London E1 4NS, UK
- Carolyn McGettigan: Department of Speech, Hearing and Phonetic Sciences, UCL, Chandler House, 2 Wakefield Street, London WC1N 1PF, UK
2. Cordero G, Paredes-Paredes JR, von Kriegstein K, Díaz B. Perceiving speech from a familiar speaker engages the person identity network. PLoS One 2025; 20:e0322927. [PMID: 40367292] [PMCID: PMC12077772] [DOI: 10.1371/journal.pone.0322927]
Abstract
Numerous studies show that speaker familiarity influences speech perception. Here, we investigated the brain regions, and the changes in their functional connectivity, involved in the use of person-specific information during speech perception. We employed functional magnetic resonance imaging to study changes in functional connectivity and blood-oxygenation-level-dependent (BOLD) responses associated with speaker familiarity in human adults while they performed a speech perception task. Twenty-seven right-handed participants performed the speech task before and after being familiarized with the voice and numerous autobiographical details of one of the speakers featured in the task. We found that speech perception from a familiar speaker was associated with BOLD activity changes in regions of the person identity network: the right temporal pole, a voice-sensitive region, and the right supramarginal gyrus, a region sensitive to speaker-specific aspects of speech sound production. A speech-sensitive region located in the left superior temporal gyrus also exhibited sensitivity to speaker familiarity during speech perception. Lastly, speaker familiarity increased connectivity strength between the right temporal pole and the right superior frontal gyrus, a region associated with verbal working memory. Our findings show that speaker familiarity engages the person identity network during speech perception, extending the neural basis of speech processing beyond the canonical language network.
Affiliation(s)
- Gaël Cordero: Department of Psychology, Faculty of Medicine and Health Sciences, Universitat Internacional de Catalunya, Barcelona, Spain
- Jazmin R. Paredes-Paredes: Department of Psychology, Faculty of Medicine and Health Sciences, Universitat Internacional de Catalunya, Barcelona, Spain
- Begoña Díaz: Department of Psychology, Faculty of Medicine and Health Sciences, Universitat Internacional de Catalunya, Barcelona, Spain
3. Rupp KM, Hect JL, Harford EE, Holt LL, Ghuman AS, Abel TJ. A hierarchy of processing complexity and timescales for natural sounds in the human auditory cortex. Proc Natl Acad Sci U S A 2025; 122:e2412243122. [PMID: 40294254] [PMCID: PMC12067213] [DOI: 10.1073/pnas.2412243122]
Abstract
Efficient behavior is supported by humans' ability to rapidly recognize acoustically distinct sounds as members of a common category. Within the auditory cortex, critical unanswered questions remain regarding the organization and dynamics of sound categorization. We performed intracerebral recordings during epilepsy surgery evaluation as 20 patient-participants listened to natural sounds. We then built encoding models to predict neural responses using sound representations extracted from different layers within a deep neural network (DNN) pretrained to categorize sounds from acoustics. This approach yielded accurate models of neural responses throughout the auditory cortex. The complexity of a cortical site's representation (measured by the depth of the DNN layer that produced the best model) was closely related to its anatomical location, with shallow, middle, and deep layers associated with core (primary auditory cortex), lateral belt, and parabelt regions, respectively. Smoothly varying gradients of representational complexity existed within these regions, with complexity increasing along a posteromedial-to-anterolateral direction in core and lateral belt and along posterior-to-anterior and dorsal-to-ventral dimensions in parabelt. We then characterized the time (relative to sound onset) when feature representations emerged; this measure of temporal dynamics increased across the auditory hierarchy. Finally, we found separable effects of region and temporal dynamics on representational complexity: sites that took longer to begin encoding stimulus features had higher representational complexity independent of region, and downstream regions encoded more complex features independent of temporal dynamics. These findings suggest that hierarchies of timescales and complexity represent a functional organizational principle of the auditory stream underlying our ability to rapidly categorize sounds.
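The layer-depth logic is straightforward to sketch: fit one cross-validated encoding model per DNN layer for a given recording site, and take the depth of the best-predicting layer as that site's representational complexity. A minimal illustration on simulated features and responses; the layer count, feature dimensionality, and ridge penalty are placeholders, not the study's actual model:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Hypothetical data: one recording site's response to 200 natural sounds,
# plus DNN activations for those sounds from six layers (64 features each).
n_sounds = 200
layers = {f"layer_{i}": rng.normal(size=(n_sounds, 64)) for i in range(1, 7)}
response = rng.normal(size=n_sounds)  # stand-in for the measured neural response

# One ridge encoding model per layer; keep the cross-validated R^2.
scores = {
    name: cross_val_score(Ridge(alpha=1.0), feats, response,
                          cv=5, scoring="r2").mean()
    for name, feats in layers.items()
}

# The depth of the best-predicting layer indexes the site's
# representational complexity in the paper's sense.
best = max(scores, key=scores.get)
print(f"best layer for this site: {best} (cv R^2 = {scores[best]:.3f})")
```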
Affiliation(s)
- Kyle M. Rupp: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA 15213
- Jasmine L. Hect: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA 15213
- Emily E. Harford: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA 15213
- Lori L. Holt: Department of Psychology, The University of Texas at Austin, Austin, TX 78712
- Taylor J. Abel: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA 15213; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA 15261
4. Rupp KM, Hect JL, Harford EE, Holt LL, Ghuman AS, Abel TJ. A hierarchy of processing complexity and timescales for natural sounds in human auditory cortex. bioRxiv [Preprint] 2024:2024.05.24.595822. [PMID: 38826304] [PMCID: PMC11142240] [DOI: 10.1101/2024.05.24.595822]
Abstract
Efficient behavior is supported by humans' ability to rapidly recognize acoustically distinct sounds as members of a common category. Within auditory cortex, there are critical unanswered questions regarding the organization and dynamics of sound categorization. Here, we performed intracerebral recordings in the context of epilepsy surgery as 20 patient-participants listened to natural sounds. We built encoding models to predict neural responses using features of these sounds extracted from different layers within a sound-categorization deep neural network (DNN). This approach yielded highly accurate models of neural responses throughout auditory cortex. The complexity of a cortical site's representation (measured by the depth of the DNN layer that produced the best model) was closely related to its anatomical location, with shallow, middle, and deep layers of the DNN associated with core (primary auditory cortex), lateral belt, and parabelt regions, respectively. Smoothly varying gradients of representational complexity also existed within these regions, with complexity increasing along a posteromedial-to-anterolateral direction in core and lateral belt, and along posterior-to-anterior and dorsal-to-ventral dimensions in parabelt. When we estimated the time window over which each recording site integrates information, we found shorter integration windows in core relative to lateral belt and parabelt. Lastly, we found a relationship between the length of the integration window and the complexity of information processing within core (but not lateral belt or parabelt). These findings suggest hierarchies of timescales and processing complexity, and their interrelationship, represent a functional organizational principle of the auditory stream that underlies our perception of complex, abstract auditory information.
Affiliation(s)
- Kyle M. Rupp: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Jasmine L. Hect: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Emily E. Harford: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Lori L. Holt: Department of Psychology, The University of Texas at Austin, Austin, Texas, United States of America
- Avniel Singh Ghuman: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Taylor J. Abel: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America; Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
5. Harford EE, Holt LL, Abel TJ. Unveiling the development of human voice perception: Neurobiological mechanisms and pathophysiology. Curr Res Neurobiol 2024; 6:100127. [PMID: 38511174] [PMCID: PMC10950757] [DOI: 10.1016/j.crneur.2024.100127]
Abstract
The human voice is a critical stimulus for the auditory system that promotes social connection, informs the listener about identity and emotion, and acts as the carrier for spoken language. Research on voice processing in adults has informed our understanding of the unique status of the human voice in the mature auditory cortex and provided potential explanations for mechanisms that underlie voice selectivity and identity processing. There is evidence that voice perception undergoes developmental change starting in infancy and extending through early adolescence. While even young infants recognize the voice of their mother, there is an apparently protracted course of development to reach adult-like selectivity for the human voice over other sound categories and adult-like recognition of other talkers by voice. Gaps in the literature do not allow for an exact mapping of this trajectory or an adequate description of how voice processing abilities and their neural underpinnings evolve. This review provides a comprehensive account of developmental voice processing research published to date and discusses how this evidence fits with and contributes to current theoretical models proposed in the adult literature. We discuss how factors such as cognitive development, neural plasticity, perceptual narrowing, and language acquisition may contribute to the development of voice processing and its investigation in children. We also review evidence of voice processing abilities following premature birth and in autism spectrum disorder and phonagnosia to examine where and how deviations from the typical trajectory of development may manifest.
Affiliation(s)
- Emily E. Harford: Department of Neurological Surgery, University of Pittsburgh, USA
- Lori L. Holt: Department of Psychology, The University of Texas at Austin, USA
- Taylor J. Abel: Department of Neurological Surgery, University of Pittsburgh, USA; Department of Bioengineering, University of Pittsburgh, USA
6. Oganian Y, Bhaya-Grossman I, Johnson K, Chang EF. Vowel and formant representation in the human auditory speech cortex. Neuron 2023; 111:2105-2118.e4. [PMID: 37105171] [PMCID: PMC10330593] [DOI: 10.1016/j.neuron.2023.04.004]
Abstract
Vowels, a fundamental component of human speech across all languages, are cued acoustically by formants, the resonance frequencies of the vocal tract during speaking. An outstanding question in neurolinguistics is how formants are processed neurally during speech perception. To address this, we collected high-density intracranial recordings from the human speech cortex on the superior temporal gyrus (STG) while participants listened to continuous speech. We found that two-dimensional receptive fields based on the first two formants provided the best characterization of vowel sound representation. Neural activity at single sites was highly selective for zones in this formant space. Furthermore, formant tuning was adjusted dynamically for speaker-specific spectral context. However, the entire population of formant-encoding sites was required to accurately decode single vowels. Overall, our results reveal that complex acoustic tuning in the two-dimensional formant space underlies local vowel representations in STG. As a population code, this gives rise to phonological vowel perception.
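A two-dimensional formant receptive field of the kind described can be sketched as a Gaussian tuning surface over the (F1, F2) plane, fit to a site's responses. The simulated tuning centre, widths, and noise level below are illustrative assumptions, not values from the paper:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)

# Hypothetical vowel tokens: first two formants (Hz) and the response they
# evoke at one STG site; values are illustrative, not from the paper.
n_tokens = 400
f1 = rng.uniform(250, 900, n_tokens)
f2 = rng.uniform(800, 2600, n_tokens)

def gaussian_rf(X, amp, mu1, mu2, s1, s2):
    """2D Gaussian receptive field over the (F1, F2) formant plane."""
    ff1, ff2 = X
    return amp * np.exp(-0.5 * (((ff1 - mu1) / s1) ** 2
                                + ((ff2 - mu2) / s2) ** 2))

# Simulate a site selective for a zone in formant space, plus noise.
resp = gaussian_rf((f1, f2), 1.0, 350, 2200, 150, 400)
resp += rng.normal(0, 0.1, n_tokens)

# Recover the tuning zone by fitting the receptive-field model.
params, _ = curve_fit(gaussian_rf, (f1, f2), resp,
                      p0=[1.0, 400, 2000, 200, 500])
print("fitted (amp, F1 center, F2 center, F1 width, F2 width):")
print(np.round(params, 1))
```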
Affiliation(s)
- Yulia Oganian: Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Ilina Bhaya-Grossman: Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA; University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA 94720, USA
- Keith Johnson: Department of Linguistics, University of California, Berkeley, Berkeley, CA, USA
- Edward F Chang: Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
7. Luthra S, Magnuson JS, Myers EB. Right Posterior Temporal Cortex Supports Integration of Phonetic and Talker Information. Neurobiol Lang 2023; 4:145-177. [PMID: 37229142] [PMCID: PMC10205075] [DOI: 10.1162/nol_a_00091]
Abstract
Though the right hemisphere has been implicated in talker processing, it is thought to play a minimal role in phonetic processing, at least relative to the left hemisphere. Recent evidence suggests that the right posterior temporal cortex may support learning of phonetic variation associated with a specific talker. In the current study, listeners heard a male talker and a female talker, one of whom produced an ambiguous fricative in /s/-biased lexical contexts (e.g., epi?ode) and one who produced it in /ʃ/-biased contexts (e.g., friend?ip). Listeners in a behavioral experiment (Experiment 1) showed evidence of lexically guided perceptual learning, categorizing ambiguous fricatives in line with their previous experience. Listeners in an fMRI experiment (Experiment 2) showed differential phonetic categorization as a function of talker, allowing for an investigation of the neural basis of talker-specific phonetic processing, though they did not exhibit perceptual learning (likely due to characteristics of our in-scanner headphones). Searchlight analyses revealed that the patterns of activation in the right superior temporal sulcus (STS) contained information about who was talking and what phoneme they produced. We take this as evidence that talker information and phonetic information are integrated in the right STS. Functional connectivity analyses suggested that the process of conditioning phonetic identity on talker information depends on the coordinated activity of a left-lateralized phonetic processing system and a right-lateralized talker processing system. Overall, these results clarify the mechanisms through which the right hemisphere supports talker-specific phonetic processing.
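A searchlight analysis of this kind slides a small neighbourhood over the volume and asks, at each centre voxel, whether the local activation pattern supports above-chance classification. A minimal cubic-neighbourhood sketch on simulated beta maps; real searchlights use spherical neighbourhoods in registered brain volumes, and the classifier choice here is an assumption:

```python
import numpy as np
from itertools import product
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)

# Hypothetical per-trial beta maps on a small voxel grid, with a talker
# label per trial; real analyses use registered whole-brain volumes.
n_trials, grid = 60, (8, 8, 8)
betas = rng.normal(size=(n_trials, *grid))
talker = rng.integers(0, 2, n_trials)

radius = 1  # half-width of the (cubic) searchlight neighbourhood
accuracy = np.full(grid, np.nan)

for x, y, z in product(*(range(radius, g - radius) for g in grid)):
    # Local pattern around this centre voxel: trials x neighbourhood voxels.
    sphere = betas[:, x - radius:x + radius + 1,
                      y - radius:y + radius + 1,
                      z - radius:z + radius + 1].reshape(n_trials, -1)
    # Does the local pattern carry talker information above chance?
    accuracy[x, y, z] = cross_val_score(
        LogisticRegression(max_iter=1000), sphere, talker, cv=5).mean()

print("peak searchlight accuracy:", np.nanmax(accuracy).round(3))
```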
Affiliation(s)
- Sahil Luthra: Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- James S. Magnuson: Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA; Basque Center on Cognition Brain and Language (BCBL), Donostia-San Sebastián, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Emily B. Myers: Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA; Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT, USA
8. Rassili O, Michelas A, Dufour S. Does accentual variation in the pronunciation of French words influence their recognition? It depends on the ear of presentation. JASA Express Lett 2023; 3:035204. [PMID: 37003717] [DOI: 10.1121/10.0017516]
Abstract
This repetition priming study examined how word accentual variation in French is represented and processed during spoken word recognition. Primes that mismatched the target word's accentual pattern were less effective than matched primes in facilitating target word recognition when the targets were presented in the left ear, but not in the right ear. This indicates that in French, the accentual pattern of words influences their recognition when processing is constrained to the right hemisphere. These findings argue in favor of two memory systems: one retaining words in an abstract format, and the other retaining words in their various spoken forms.
Affiliation(s)
- Outhmane Rassili: Aix-Marseille Université, Centre National de la Recherche Scientifique, Laboratoire Parole et Langage, Unité Mixte de Recherche 7309, 13100 Aix-en-Provence, France
- Amandine Michelas: Aix-Marseille Université, Centre National de la Recherche Scientifique, Laboratoire Parole et Langage, Unité Mixte de Recherche 7309, 13100 Aix-en-Provence, France
- Sophie Dufour: Aix-Marseille Université, Centre National de la Recherche Scientifique, Laboratoire Parole et Langage, Unité Mixte de Recherche 7309, 13100 Aix-en-Provence, France
9. Distinct Neural Resource Involvements but Similar Hemispheric Lateralization Patterns in Pre-Attentive Processing of Speaker's Identity and Linguistic Information. Brain Sci 2023; 13:192. [PMID: 36831735] [PMCID: PMC9954658] [DOI: 10.3390/brainsci13020192]
Abstract
The speaker's identity (who the speaker is) and linguistic information (what the speaker is saying) are both essential to daily communication. However, it is unclear whether and how listeners process these two types of information differently in speech perception. The present study adopted a passive oddball paradigm to compare identity and linguistic information processing in terms of neural resource involvement and hemispheric lateralization patterns. We used real and pseudo-Mandarin words produced by two female native Mandarin speakers to differentiate identity information from linguistic (phonological and lexical) information. The results showed that, in real words, the phonological-lexical variation elicited larger MMN amplitudes than the identity variation. In contrast, there were no significant MMN amplitude differences between the identity and phonological variation in pseudo words. Regardless of real or pseudo words, the identity and linguistic variation did not elicit MMN amplitude differences between the left and right hemispheres. Taken together, findings from the present study indicate that identity information recruits neural resources similar to those for phonological information but different from those for lexical information. However, identity and linguistic information processing did not show a particular hemispheric lateralization pattern at an early, pre-attentive stage of speech perception. These findings reveal similarities and differences between linguistic and non-linguistic information processing, contributing to a better understanding of speech perception and spoken word recognition.
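The MMN measure underlying these comparisons is the deviant-minus-standard difference wave, quantified as the mean amplitude in a post-onset window. A minimal sketch on simulated single-channel epochs; the window bounds and sampling rate are typical conventions, not the study's exact parameters:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical single-channel epochs (trials x samples) at 500 Hz, stimulus
# onset at t = 0; real data would come from preprocessed EEG.
times = np.arange(-0.1, 0.5, 1 / 500)
standards = rng.normal(size=(400, times.size))
deviants = rng.normal(size=(80, times.size))

# The MMN is the deviant-minus-standard difference wave; its amplitude is
# typically taken as the mean voltage in a window around 150-250 ms.
difference = deviants.mean(axis=0) - standards.mean(axis=0)
window = (times >= 0.15) & (times <= 0.25)
mmn_amplitude = difference[window].mean()
print(f"MMN mean amplitude, 150-250 ms: {mmn_amplitude:.3f} (arbitrary units)")
```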
10. Trapeau R, Thoret E, Belin P. The Temporal Voice Areas are not "just" Speech Areas. Front Neurosci 2023; 16:1075288. [PMID: 36685244] [PMCID: PMC9846853] [DOI: 10.3389/fnins.2022.1075288]
Abstract
The Temporal Voice Areas (TVAs) respond more strongly to speech sounds than to non-speech vocal sounds, but does this make them Temporal "Speech" Areas? We provide a perspective on this issue by combining univariate, multivariate, and representational similarity analyses of fMRI activations to a balanced set of speech and non-speech vocal sounds. We find that while speech sounds activate the TVAs more than non-speech vocal sounds, which is likely related to their larger temporal modulations in syllabic rate, they do not appear to activate additional areas nor are they segregated from the non-speech vocal sounds when their higher activation is controlled. It seems safe, then, to continue calling these regions the Temporal Voice Areas.
Affiliation(s)
- Régis Trapeau: La Timone Neuroscience Institute, CNRS and Aix-Marseille University, UMR 7289, Marseille, France
- Etienne Thoret: Aix-Marseille University, CNRS, UMR7061 PRISM, UMR7020 LIS, Marseille, France; Institute of Language, Communication and the Brain (ILCB), Marseille, France
- Pascal Belin: La Timone Neuroscience Institute, CNRS and Aix-Marseille University, UMR 7289, Marseille, France; Department of Psychology, Montreal University, Montreal, QC, Canada
11. Sun Y, Ming L, Sun J, Guo F, Li Q, Hu X. Brain mechanism of unfamiliar and familiar voice processing: an activation likelihood estimation meta-analysis. PeerJ 2023; 11:e14976. [PMID: 36935917] [PMCID: PMC10019337] [DOI: 10.7717/peerj.14976]
Abstract
Interpersonal communication through vocal information is very important for human society. During verbal interactions, vocal cord vibrations convey important information regarding voice identity, which allows us to decide how to respond to speakers (e.g., neither greeting a stranger too warmly nor speaking to a friend too coldly). Numerous neural studies have shown that identifying familiar and unfamiliar voices may rely on different neural bases. However, the mechanism underlying voice identification for speakers of varying familiarity has not been determined, owing to vague definitions, confusion of terms, and differences in task design. To address this issue, the present study first categorized three kinds of voice identity processing (perception, recognition, and identification) for speakers with different degrees of familiarity. We defined voice identity perception as passively listening to a voice or determining if the voice was human, voice identity recognition as determining if the sound heard was acoustically familiar, and voice identity identification as ascertaining whether a voice is associated with a name or face. Of these, voice identity perception involves processing unfamiliar voices, while voice identity recognition and identification involve processing familiar voices. According to these three definitions, we performed activation likelihood estimation (ALE) on 32 studies and revealed different brain mechanisms underlying the processing of unfamiliar and familiar voice identities. The results were as follows: (1) familiar voice recognition/identification was supported by a network involving most regions in the temporal lobe, some regions in the frontal lobe, subcortical structures, and regions around the marginal lobes; (2) the bilateral superior temporal gyrus was recruited for voice identity perception of an unfamiliar voice; (3) voice identity recognition/identification of familiar voices was more likely to activate the right frontal lobe than voice identity perception of unfamiliar voices, while voice identity perception of an unfamiliar voice was more likely to activate the bilateral temporal lobe and left frontal lobe; and (4) the bilateral superior temporal gyrus served as a shared neural basis for unfamiliar voice identity perception and familiar voice identity recognition/identification. In general, the results of the current study address gaps in the literature, provide clear definitions of concepts, and indicate brain mechanisms for subsequent investigations.
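The core of ALE is to blur each study's reported activation foci with a Gaussian into a modeled-activation map and then combine the maps as a probabilistic union across studies. A toy sketch of that step; real ALE derives the Gaussian width from each study's sample size and thresholds the map by permutation, which is omitted here:

```python
import numpy as np

# Hypothetical foci (voxel coordinates) reported by three studies.
studies = [
    [(10, 12, 8), (11, 13, 9)],
    [(10, 11, 8)],
    [(30, 5, 20)],
]
shape, sigma = (40, 40, 40), 2.0  # analysis grid and Gaussian width (voxels)
coords = np.indices(shape).reshape(3, -1).T  # every voxel coordinate

def ma_map(foci):
    """Modeled-activation map for one study: max Gaussian over its foci."""
    d2 = np.stack([((coords - np.array(f)) ** 2).sum(axis=1) for f in foci])
    return np.exp(-d2 / (2 * sigma ** 2)).max(axis=0).reshape(shape)

# ALE value per voxel: probability that at least one study activates it,
# i.e., the probabilistic union of the modeled-activation maps.
ale = 1.0 - np.prod([1.0 - ma_map(f) for f in studies], axis=0)

peak = np.unravel_index(np.argmax(ale), shape)
print("peak ALE voxel:", peak, "value:", round(float(ale[peak]), 3))
```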
12. Pinheiro AP, Sarzedas J, Roberto MS, Kotz SA. Attention and emotion shape self-voice prioritization in speech processing. Cortex 2023; 158:83-95. [PMID: 36473276] [DOI: 10.1016/j.cortex.2022.10.006]
Abstract
Both the self-voice and emotional speech are salient signals that are prioritized in perception. Surprisingly, self-voice perception has been investigated to a lesser extent than the self-face. It therefore remains to be clarified whether self-voice prioritization is boosted by emotion, and whether self-relevance and emotion interact differently when attention is focused on who is speaking vs. what is being said. Thirty participants listened to 210 prerecorded words, spoken in their own or an unfamiliar voice and differing in emotional valence, in two tasks that manipulated the attentional focus on either speaker identity or speech emotion. Event-related potentials (ERPs) of the electroencephalogram (EEG) informed on the temporal dynamics of self-relevance, emotion, and attention effects. Words spoken in one's own voice elicited a larger N1 and late positive potential (LPP), but a smaller N400. Identity and emotion interactively modulated the P2 (self-positivity bias) and LPP (self-negativity bias). Attention to speaker identity more strongly modulated ERP responses within 600 ms post-word onset (N1, P2, N400), whereas attention to speech emotion altered the late component (LPP). However, attention did not modulate the interaction of self-relevance and emotion. These findings suggest that the self-voice is prioritized for neural processing at early sensory stages, and that both emotion and attention shape self-voice prioritization in speech processing. They also confirm involuntary processing of salient signals (self-relevance and emotion), even in situations in which attention is deliberately directed away from those cues. These findings have important implications for a better understanding of symptoms thought to arise from aberrant self-voice monitoring, such as auditory verbal hallucinations.
Affiliation(s)
- Ana P Pinheiro: CICPSI, Faculdade de Psicologia, Universidade de Lisboa, Lisboa, Portugal; Basic and Applied NeuroDynamics Lab, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands
- João Sarzedas: CICPSI, Faculdade de Psicologia, Universidade de Lisboa, Lisboa, Portugal
- Magda S Roberto: CICPSI, Faculdade de Psicologia, Universidade de Lisboa, Lisboa, Portugal
- Sonja A Kotz: Basic and Applied NeuroDynamics Lab, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands
13. Bestelmeyer PEG, Mühl C. Neural dissociation of the acoustic and cognitive representation of voice identity. Neuroimage 2022; 263:119647. [PMID: 36162634] [DOI: 10.1016/j.neuroimage.2022.119647]
Abstract
Recognising a speaker's identity by the sound of their voice is important for successful interaction. This skill depends on our ability to discriminate minute variations in the acoustics of the vocal signal. Performance on voice identity assessments varies widely across the population. The neural underpinnings of this ability and its individual differences, however, remain poorly understood. Here we provide critical tests of a theoretical framework for the neural processing stages of voice identity and address how individual differences in identity discrimination mediate activation in this neural network. We scanned 40 individuals on an fMRI adaptation task involving voices drawn from morphed continua between two personally familiar identities. Analyses dissociated neuronal effects induced by repetition of acoustically similar morphs from those induced by a switch in perceived identity. Activation in temporal voice-sensitive areas decreased with acoustic similarity between consecutive stimuli. This repetition suppression effect was mediated by performance on an independent voice assessment, a result that highlights an important functional role of adaptive coding in voice expertise. Bilateral anterior insulae and medial frontal gyri responded to a switch in perceived voice identity compared with an acoustically equidistant switch within identity. Our results support a multistep model of voice identity perception.
Affiliation(s)
- Constanze Mühl: Institute of Cognitive Neuroscience, Bangor University, UK
14. Drown L, Philip B, Francis AL, Theodore RM. Revisiting the left ear advantage for phonetic cues to talker identification. J Acoust Soc Am 2022; 152:3107. [PMID: 36456295] [PMCID: PMC9715276] [DOI: 10.1121/10.0015093]
Abstract
Previous research suggests that learning to use a phonetic property [e.g., voice-onset-time (VOT)] for talker identity supports a left ear processing advantage. Specifically, listeners trained to identify two "talkers" who differed only in characteristic VOTs showed faster talker identification for stimuli presented to the left ear compared to those presented to the right ear, which is interpreted as evidence of hemispheric lateralization consistent with task demands. Experiment 1 (n = 97) aimed to replicate this finding and identify predictors of performance; experiment 2 (n = 79) aimed to replicate this finding under conditions that better facilitate observation of laterality effects. Listeners completed a talker identification task during pretest, training, and posttest phases. Inhibition, category identification, and auditory acuity were also assessed in experiment 1. Listeners learned to use VOT for talker identity, and this learning was positively associated with auditory acuity. Talker identification was not influenced by ear of presentation, and Bayes factors indicated strong support for the null. These results suggest that talker-specific phonetic variation is not sufficient to induce a left ear advantage for talker identification; together with the extant literature, this instead suggests that hemispheric lateralization for talker-specific phonetic variation requires phonetic variation to be conditioned on talker differences in source characteristics.
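Support for the null, as reported here, is usually quantified with a Bayes factor rather than a nonsignificant p value. A minimal sketch of a paired Bayesian t-test using the pingouin library as one convenient implementation; the simulated accuracies are placeholders, and BF10 well below 1/3 (i.e., BF01 above 3) is conventionally read as evidence for no ear difference:

```python
import numpy as np
import pingouin as pg  # one convenient implementation of Bayesian t-tests

rng = np.random.default_rng(5)

# Hypothetical per-listener identification accuracy for left- and right-ear
# trials, simulated with no true ear difference.
left = rng.normal(0.75, 0.08, 79)
right = rng.normal(0.75, 0.08, 79)

# Paired t-test with a JZS Bayes factor reported alongside the p value.
res = pg.ttest(left, right, paired=True)
bf10 = float(res["BF10"].iloc[0])
print(f"BF10 = {bf10:.3f}  ->  BF01 = {1 / bf10:.1f}")
```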
Affiliation(s)
- Lee Drown: Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269-1085, USA
- Betsy Philip: Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269-1085, USA
- Alexander L Francis: Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907-2122, USA
- Rachel M Theodore: Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269-1085, USA
15. Rupp K, Hect JL, Remick M, Ghuman A, Chandrasekaran B, Holt LL, Abel TJ. Neural responses in human superior temporal cortex support coding of voice representations. PLoS Biol 2022; 20:e3001675. [PMID: 35900975] [PMCID: PMC9333263] [DOI: 10.1371/journal.pbio.3001675]
Abstract
The ability to recognize abstract features of voice during auditory perception is an intricate feat of human audition. For the listener, this occurs in near-automatic fashion to seamlessly extract complex cues from a highly variable auditory signal. Voice perception depends on specialized regions of auditory cortex, including superior temporal gyrus (STG) and superior temporal sulcus (STS). However, the nature of voice encoding at the cortical level remains poorly understood. We leverage intracerebral recordings across human auditory cortex during presentation of voice and nonvoice acoustic stimuli to examine voice encoding at the cortical level in 8 patient-participants undergoing epilepsy surgery evaluation. We show that voice selectivity increases along the auditory hierarchy from supratemporal plane (STP) to the STG and STS. Results show accurate decoding of vocalizations from human auditory cortical activity even in the complete absence of linguistic content. These findings show an early, less-selective temporal window of neural activity in the STG and STS followed by a sustained, strongly voice-selective window. Encoding models demonstrate divergence in the encoding of acoustic features along the auditory hierarchy, wherein STG/STS responses are best explained by voice category and acoustics, as opposed to acoustic features of voice stimuli alone. This is in contrast to neural activity recorded from STP, in which responses were accounted for by acoustic features. These findings support a model of voice perception that engages categorical encoding mechanisms within STG and STS to facilitate feature extraction. Voice perception occurs via specialized networks in higher order auditory cortex, but how voice features are encoded remains a central unanswered question. Using human intracerebral recordings of auditory cortex, this study provides evidence for categorical encoding of voice.
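Decoding claims of this kind rest on cross-validated classification of the stimulus category from trial-wise neural features. A minimal sketch using simulated high-gamma amplitudes per contact; the feature choice, window, and classifier are assumptions for illustration, not the study's exact pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)

# Hypothetical features: mean high-gamma amplitude per contact in an early
# post-onset window (trials x contacts), with a voice/nonvoice label per trial.
n_trials, n_contacts = 160, 40
high_gamma = rng.normal(size=(n_trials, n_contacts))
is_voice = rng.integers(0, 2, n_trials)

# Cross-validated decoding: accuracy reliably above chance would indicate
# that the population response separates vocal from nonvocal sounds.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
acc = cross_val_score(clf, high_gamma, is_voice, cv=5).mean()
print(f"voice vs. nonvoice decoding accuracy: {acc:.2f} (chance = 0.50)")
```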
Affiliation(s)
- Kyle Rupp: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Jasmine L. Hect: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Madison Remick: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Avniel Ghuman: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Bharath Chandrasekaran: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Lori L. Holt: Department of Psychology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Taylor J. Abel: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America; Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
16. Johnson JF, Belyk M, Schwartze M, Pinheiro AP, Kotz SA. Hypersensitivity to passive voice hearing in hallucination proneness. Front Hum Neurosci 2022; 16:859731. [PMID: 35966990] [PMCID: PMC9366353] [DOI: 10.3389/fnhum.2022.859731]
Abstract
Voices are a complex and rich acoustic signal processed in an extensive cortical brain network. Specialized regions within this network support voice perception and production and may be differentially affected in pathological voice processing. For example, the experience of hallucinating voices has been linked to hyperactivity in temporal and extra-temporal voice areas, possibly extending into regions associated with vocalization. Predominant self-monitoring hypotheses ascribe a primary role in auditory verbal hallucinations (AVH) to voice production regions. Alternative postulations view a generalized perceptual salience bias as causal to AVH. These theories are not mutually exclusive, as both ascribe the emergence and phenomenology of AVH to unbalanced top-down and bottom-up signal processing. The focus of the current study was to investigate the neurocognitive mechanisms underlying predisposing brain states for emergent hallucinations, detached from the effects of inner speech. Using the temporal voice area (TVA) localizer task, we explored putative hypersalient responses to passively presented sounds in relation to hallucination proneness (HP). Furthermore, to avoid confounds commonly found in clinical samples, we employed the Launay-Slade Hallucination Scale (LSHS) to quantify HP levels in healthy people across an experiential continuum spanning the general population. We report increased activation in the right posterior superior temporal gyrus (pSTG) during the perception of voice features that positively correlates with increased HP scores. In line with prior results, we propose that this right-lateralized pSTG activation might indicate early hypersensitivity to acoustic features coding speaker identity, extending beyond own-voice production to perception in healthy participants prone to experiencing AVH.
Affiliation(s)
- Joseph F. Johnson: Department of Neuropsychology and Psychopharmacology, University of Maastricht, Maastricht, Netherlands
- Michel Belyk: Department of Psychology, Edge Hill University, Ormskirk, United Kingdom
- Michael Schwartze: Department of Neuropsychology and Psychopharmacology, University of Maastricht, Maastricht, Netherlands
- Ana P. Pinheiro: Faculdade de Psicologia, Universidade de Lisboa, Lisbon, Portugal
- Sonja A. Kotz: Department of Neuropsychology and Psychopharmacology, University of Maastricht, Maastricht, Netherlands; Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
17. Schelinski S, Tabas A, von Kriegstein K. Altered processing of communication signals in the subcortical auditory sensory pathway in autism. Hum Brain Mapp 2022; 43:1955-1972. [PMID: 35037743] [PMCID: PMC8933247] [DOI: 10.1002/hbm.25766]
Abstract
Autism spectrum disorder (ASD) is characterised by social communication difficulties. These difficulties have been mainly explained by cognitive, motivational, and emotional alterations in ASD. The communication difficulties could, however, also be associated with altered sensory processing of communication signals. Here, we assessed the functional integrity of auditory sensory pathway nuclei in ASD in three independent functional magnetic resonance imaging experiments. We focused on two aspects of auditory communication that are impaired in ASD: voice identity perception, and recognising speech-in-noise. We found reduced processing in adults with ASD as compared to typically developed control groups (pairwise matched on sex, age, and full-scale IQ) in the central midbrain structure of the auditory pathway (inferior colliculus [IC]). The right IC responded less in the ASD as compared to the control group for voice identity, in contrast to speech recognition. The right IC also responded less in the ASD as compared to the control group when passively listening to vocal in contrast to non-vocal sounds. Within the control group, the left and right IC responded more when recognising speech-in-noise as compared to when recognising speech without additional noise. In the ASD group, this was only the case in the left, but not the right IC. The results show that communication signal processing in ASD is associated with reduced subcortical sensory functioning in the midbrain. The results highlight the importance of considering sensory processing alterations in explaining communication difficulties, which are at the core of ASD.
Affiliation(s)
- Stefanie Schelinski: Faculty of Psychology, Chair of Cognitive and Clinical Neuroscience, Technische Universität Dresden, Dresden, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Alejandro Tabas: Faculty of Psychology, Chair of Cognitive and Clinical Neuroscience, Technische Universität Dresden, Dresden, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Katharina von Kriegstein: Faculty of Psychology, Chair of Cognitive and Clinical Neuroscience, Technische Universität Dresden, Dresden, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
18. Sliwa J, Mallet M, Christiaens M, Takahashi DY. Neural basis of multi-sensory communication in primates. Ethol Ecol Evol 2022. [DOI: 10.1080/03949370.2021.2024266]
Affiliation(s)
- Julia Sliwa: Paris Brain Institute–Institut du Cerveau, Inserm, CNRS, APHP, Hôpital Pitié-Salpêtrière, Sorbonne Université, Paris, France
- Marion Mallet: Paris Brain Institute–Institut du Cerveau, Inserm, CNRS, APHP, Hôpital Pitié-Salpêtrière, Sorbonne Université, Paris, France
- Maëlle Christiaens: Paris Brain Institute–Institut du Cerveau, Inserm, CNRS, APHP, Hôpital Pitié-Salpêtrière, Sorbonne Université, Paris, France
19. Hierarchical cortical networks of "voice patches" for processing voices in human brain. Proc Natl Acad Sci U S A 2021; 118:e2113887118. [PMID: 34930846] [DOI: 10.1073/pnas.2113887118]
Abstract
Humans have an extraordinary ability to recognize and differentiate voices. It is yet unclear whether voices are uniquely processed in the human brain. To explore the underlying neural mechanisms of voice processing, we recorded electrocorticographic signals from intracranial electrodes in epilepsy patients while they listened to six different categories of voice and nonvoice sounds. Subregions in the temporal lobe exhibited preferences for distinct voice stimuli, which were defined as "voice patches." Latency analyses suggested a dual hierarchical organization of the voice patches. We also found that voice patches were functionally connected under both task-engaged and resting states. Furthermore, the left motor areas were coactivated and correlated with the temporal voice patches during the sound-listening task. Taken together, this work reveals hierarchical cortical networks in the human brain for processing human voices.
20. Renvall H, Seol J, Tuominen R, Sorger B, Riecke L, Salmelin R. Selective auditory attention within naturalistic scenes modulates reactivity to speech sounds. Eur J Neurosci 2021; 54:7626-7641. [PMID: 34697833] [PMCID: PMC9298413] [DOI: 10.1111/ejn.15504]
Abstract
Rapid recognition and categorization of sounds are essential for humans and animals alike, both for understanding and reacting to our surroundings and for daily communication and social interaction. For humans, perception of speech sounds is of crucial importance. In real life, this task is complicated by the presence of a multitude of meaningful non-speech sounds. The present behavioural, magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI) study set out to address how attention to speech versus attention to natural non-speech sounds within complex auditory scenes influences cortical processing. The stimuli were superimpositions of spoken words and environmental sounds, with parametric variation of the speech-to-environmental sound intensity ratio. The participants' task was to detect a repetition in either the speech or the environmental sound. We found that specifically when participants attended to speech within the superimposed stimuli, higher speech-to-environmental sound ratios resulted in shorter sustained MEG responses, stronger BOLD fMRI signals (especially in the left supratemporal auditory cortex), and improved behavioural performance. No such effects of speech-to-environmental sound ratio were observed when participants attended to the environmental sound part within the exact same stimuli. These findings suggest stronger saliency of speech compared with other meaningful sounds during processing of natural auditory scenes, likely linked to speech-specific top-down and bottom-up mechanisms activated during speech perception that are needed for tracking speech in real-life-like auditory environments.
Affiliation(s)
- Hanna Renvall: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland; BioMag Laboratory, HUS Diagnostic Center, Helsinki University Hospital, University of Helsinki and Aalto University School of Science, Helsinki, Finland
- Jaeho Seol: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland
- Riku Tuominen: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland
- Bettina Sorger: Department of Cognitive Neuroscience, Maastricht University, Maastricht, The Netherlands
- Lars Riecke: Department of Cognitive Neuroscience, Maastricht University, Maastricht, The Netherlands
- Riitta Salmelin: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland
21. Ito T, Ohashi H, Gracco VL. Somatosensory contribution to audio-visual speech processing. Cortex 2021; 143:195-204. [PMID: 34450567] [DOI: 10.1016/j.cortex.2021.07.013]
Abstract
Recent studies have demonstrated that the auditory speech perception of a listener can be modulated by somatosensory input applied to the facial skin, suggesting that perception is an embodied process. However, speech perception is a multisensory process involving both the auditory and visual modalities. It is unknown whether, and to what extent, somatosensory stimulation to the facial skin modulates audio-visual speech perception. If speech perception is an embodied process, then somatosensory stimulation applied to the perceiver should influence audio-visual speech processing. Using the McGurk effect (the perceptual illusion that occurs when a sound is paired with the visual representation of a different sound, resulting in the perception of a third sound), we tested this prediction using a simple behavioral paradigm and, at the neural level, using event-related potentials (ERPs) and their cortical sources. We recorded ERPs from 64 scalp sites in response to congruent and incongruent audio-visual speech randomly presented with and without somatosensory stimulation associated with facial skin deformation. Subjects judged whether the production was /ba/ or not under all stimulus conditions. In the congruent audio-visual condition, subjects identified the sound as /ba/, but not in the incongruent condition, consistent with the McGurk effect. Concurrent somatosensory stimulation improved the ability of participants to more correctly identify the production as /ba/ relative to the non-somatosensory condition in both congruent and incongruent conditions. ERPs in response to the somatosensory stimulation in the incongruent condition reliably diverged 220 ms after stimulation onset. Cortical sources were estimated around the left anterior temporal gyrus, the right middle temporal gyrus, the right posterior superior temporal lobe, and the right occipital region. The results demonstrate a clear multisensory convergence of somatosensory and audio-visual processing in both behavioral and neural measures, consistent with the perspective that speech perception is a self-referenced, sensorimotor process.
Affiliation(s)
- Takayuki Ito: University Grenoble-Alpes, CNRS, Grenoble-INP, GIPSA-Lab, Saint Martin D'heres Cedex, France; Haskins Laboratories, New Haven, CT, USA
- Vincent L Gracco: Haskins Laboratories, New Haven, CT, USA; McGill University, Montréal, QC, Canada
22. Papagno C, Pisoni A, Gainotti G. False alarms during recognition of famous people from faces and voices in patients with unilateral temporal lobe resection and normal participants tested after anodal tDCS over the left or right ATL. Neuropsychologia 2021; 159:107926. [PMID: 34216595] [DOI: 10.1016/j.neuropsychologia.2021.107926]
Abstract
Data gathered in experimental social psychology have shown that it is more difficult to recognize a person through his/her voice than through his/her face, and that false alarms (FA) are more frequent in voice than in face recognition. Furthermore, some neuropsychological investigations have suggested that in patients with damage to the right anterior temporal lobe (ATL), the number of FA could be higher for voice than for face recognition. In the present study we assessed FA during recognition of famous people from faces and voices in patients with unilateral ATL tumours and in normal participants tested after anodal transcranial direct current stimulation (tDCS) over the left or right ATL. The number of FA was significantly higher in patients with right than in those with left temporal tumours for both face and voice familiarity. Furthermore, lesion side did not differentially affect patients' sensitivity or response criterion when recognizing famous faces, but influenced both of these measures on a voice recognition task: patients with right temporal tumours showed a lower sensitivity index and a lower response criterion than those with left-sided lesions. In normal subjects, the greater right-sided involvement in voice than in face processing was confirmed by the observation that right ATL anodal stimulation significantly increased voice sensitivity but only marginally influenced face sensitivity. This asymmetry between face and voice processing in the right hemisphere could be due to the greater complexity of voice processing and to the difficulty of forming stable, well-structured voice representations that allow a listener to evaluate whether a presented voice matches an already known one.
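The sensitivity index and response criterion mentioned here are the signal detection theory measures d' and c, computed from hit and false-alarm rates. A small self-contained sketch; the trial counts are made up, and the log-linear correction is one standard way to avoid infinite z-scores:

```python
from scipy.stats import norm

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' and response criterion c from raw trial counts, with a
    log-linear correction so hit/FA rates of exactly 0 or 1 stay finite."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z_hit, z_fa = norm.ppf(hit_rate), norm.ppf(fa_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)

# Hypothetical counts from a famous-voice familiarity task: a low (liberal)
# criterion c < 0 means the participant says "familiar" too readily,
# producing the excess false alarms described above.
d_prime, criterion = sdt_measures(hits=32, misses=8,
                                  false_alarms=18, correct_rejections=22)
print(f"d' = {d_prime:.2f}, criterion c = {criterion:.2f}")
```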
Affiliation(s)
- C Papagno: CIMeC, Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy; Department of Psychology, University of Milano-Bicocca, Milano, Italy
- A Pisoni: Department of Psychology, University of Milano-Bicocca, Milano, Italy
- G Gainotti: Catholic University, Policlinico Gemelli, Roma, Italy
23. Fast Periodic Auditory Stimulation Reveals a Robust Categorical Response to Voices in the Human Brain. eNeuro 2021; 8:ENEURO.0471-20.2021. [PMID: 34016602] [PMCID: PMC8225406] [DOI: 10.1523/eneuro.0471-20.2021]
Abstract
Voices are arguably among the most relevant sounds in humans' everyday life, and several studies have suggested the existence of voice-selective regions in the human brain. Despite two decades of research, defining the human brain regions supporting voice recognition remains challenging. Moreover, whether neural selectivity to voices is merely driven by acoustic properties specific to human voices (e.g., spectrogram, harmonicity), or whether it also reflects a higher-level categorization response, is still under debate. Here, we objectively measured rapid automatic categorization responses to human voices with fast periodic auditory stimulation (FPAS) combined with electroencephalography (EEG). Participants were tested with stimulation sequences containing heterogeneous non-vocal sounds from different categories presented at 4 Hz (i.e., four stimuli/s), with vocal sounds appearing every three stimuli (1.333 Hz). A few minutes of stimulation are sufficient to elicit robust 1.333 Hz voice-selective focal brain responses over superior temporal regions of individual participants. This response is virtually absent for sequences using frequency-scrambled sounds, but is clearly observed when voices are presented among sounds from musical instruments matched for pitch and harmonicity-to-noise ratio (HNR). Overall, our FPAS paradigm demonstrates that the human brain seamlessly categorizes human voices relative to other sounds, including musical instrument sounds matched for low-level acoustic features, and that voice-selective responses are at least partially independent of low-level acoustic features, making FPAS a powerful and versatile tool for understanding human auditory categorization in general.
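In frequency-tagging designs like this one, the voice-selective response is typically quantified as the spectral amplitude at the oddball frequency (1.333 Hz) relative to neighbouring bins. A minimal sketch on synthetic data, assuming a simple neighbour-bin SNR definition rather than the study's exact pipeline:

```python
import numpy as np

def tag_snr(eeg, fs, f_target=1.333, n_neighbors=10, skip=1):
    """Signal-to-noise ratio at a tagged frequency: amplitude at the
    target FFT bin divided by the mean amplitude of surrounding bins
    (skipping the immediately adjacent bin on each side)."""
    amp = np.abs(np.fft.rfft(eeg)) / len(eeg)
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    i = int(np.argmin(np.abs(freqs - f_target)))
    neigh = np.r_[i - skip - n_neighbors : i - skip,
                  i + skip + 1 : i + skip + 1 + n_neighbors]
    return amp[i] / amp[neigh].mean()

# Synthetic example: 60 s at 250 Hz with a weak 1.333 Hz component in noise
fs = 250
t = np.arange(0, 60, 1 / fs)
eeg = 0.5 * np.sin(2 * np.pi * 1.333 * t) + np.random.randn(t.size)
print(f"SNR at 1.333 Hz = {tag_snr(eeg, fs):.2f}")
```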
Collapse
|
24
|
Maguinness C, von Kriegstein K. Visual mechanisms for voice-identity recognition flexibly adjust to auditory noise level. Hum Brain Mapp 2021; 42:3963-3982. [PMID: 34043249 PMCID: PMC8288083 DOI: 10.1002/hbm.25532] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2021] [Revised: 04/26/2021] [Accepted: 05/02/2021] [Indexed: 11/24/2022] Open
Abstract
Recognising the identity of voices is a key ingredient of communication. Visual mechanisms support this ability: recognition is better for voices previously learned with their corresponding face (compared to a control condition). This so‐called ‘face‐benefit’ is supported by the fusiform face area (FFA), a region sensitive to facial form and identity. Behavioural findings indicate that the face‐benefit increases in noisy listening conditions. The neural mechanisms for this increase are unknown. Here, using functional magnetic resonance imaging, we examined responses in face‐sensitive regions while participants recognised the identity of auditory‐only speakers (previously learned by face) in high (SNR −4 dB) and low (SNR +4 dB) levels of auditory noise. We observed a face‐benefit in both noise levels, for most participants (16 of 21). In high‐noise, the recognition of face‐learned speakers engaged the right posterior superior temporal sulcus motion‐sensitive face area (pSTS‐mFA), a region implicated in the processing of dynamic facial cues. The face‐benefit in high‐noise also correlated positively with increased functional connectivity between this region and voice‐sensitive regions in the temporal lobe in the group of 16 participants with a behavioural face‐benefit. In low‐noise, the face‐benefit was robustly associated with increased responses in the FFA and to a lesser extent the right pSTS‐mFA. The findings highlight the remarkably adaptive nature of the visual network supporting voice‐identity recognition in auditory‐only listening conditions.
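The high-noise (SNR −4 dB) and low-noise (SNR +4 dB) conditions described here correspond to mixing speech and noise at fixed power ratios. A generic sketch of such mixing, with synthetic signals standing in for the actual stimuli:

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that the speech-to-noise power ratio equals snr_db,
    then return the mixture (noise trimmed to the speech length)."""
    noise = noise[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10.0)))
    return speech + scale * noise

# Hypothetical example: bury a 1 s "speech" token in noise at -4 dB and +4 dB SNR
sr = 16000
speech = np.sin(2 * np.pi * 150 * np.arange(sr) / sr)
noise = np.random.randn(sr)
hard = mix_at_snr(speech, noise, -4)   # high-noise condition
easy = mix_at_snr(speech, noise, +4)   # low-noise condition
```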
Collapse
Affiliation(s)
- Corrina Maguinness
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany.,Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany.,Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
25
|
Zhang L, Li Y, Zhou H, Zhang Y, Shu H. Language-familiarity effect on voice recognition by blind listeners. JASA EXPRESS LETTERS 2021; 1:055201. [PMID: 36154110 DOI: 10.1121/10.0004848] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
The current study compared the language-familiarity effect on voice recognition in blind listeners and sighted individuals. Both groups performed better on the recognition of native voices than nonnative voices, but the language-familiarity effect was smaller in the blind than in the sighted group, with blind individuals performing better than their sighted counterparts only on the recognition of nonnative voices. Furthermore, recognition of native and nonnative voices was significantly correlated only in the blind group. These results indicate that language familiarity affects voice recognition by blind listeners, who differ to some extent from their sighted counterparts in the use of linguistic and nonlinguistic features during voice recognition.
Collapse
Affiliation(s)
- Linjun Zhang
- Beijing Advanced Innovation Center for Language Resources and College of Advanced Chinese Training, Beijing Language and Culture University, Beijing 100083, China
| | - Yu Li
- Division of Science and Technology, BNU-HKBU United International College, Zhuhai 519085, Guangdong, China
| | - Hong Zhou
- International Cultural Exchange School, Shanghai University of Finance and Economics, Shanghai 200433, China
| | - Yang Zhang
- Department of Speech-Language-Hearing Sciences and Center for Neurobehavioral Development, University of Minnesota, Minneapolis, Minnesota 55455, USA
| | - Hua Shu
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing 100875, China
| |
Collapse
|
26
|
Holmes E, Johnsrude IS. Speech-evoked brain activity is more robust to competing speech when it is spoken by someone familiar. Neuroimage 2021; 237:118107. [PMID: 33933598 DOI: 10.1016/j.neuroimage.2021.118107] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2020] [Revised: 04/19/2021] [Accepted: 04/25/2021] [Indexed: 10/21/2022] Open
Abstract
When speech is masked by competing sound, people are better at understanding what is said if the talker is familiar compared to unfamiliar. The benefit is robust, but how does processing of familiar voices facilitate intelligibility? We combined high-resolution fMRI with representational similarity analysis to quantify the difference in distributed activity between clear and masked speech. We demonstrate that brain representations of spoken sentences are less affected by a competing sentence when they are spoken by a friend or partner than by someone unfamiliar, effectively showing a cortical signal-to-noise ratio (SNR) enhancement for familiar voices. This effect correlated with the familiar-voice intelligibility benefit. We functionally parcellated auditory cortex, and found that the most prominent familiar-voice advantage was manifest along the posterior superior and middle temporal gyri. Overall, our results demonstrate that experience-driven improvements in intelligibility are associated with enhanced multivariate pattern activity in posterior temporal cortex.
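Representational similarity analysis, as used here, compares the geometry of multivoxel response patterns across conditions. A minimal generic sketch (random data; not the authors' pipeline) of building representational dissimilarity matrices (RDMs) and asking how well the clear-speech geometry survives masking:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between
    the multivoxel patterns of every pair of conditions."""
    return squareform(pdist(patterns, metric="correlation"))

# Hypothetical patterns: 8 sentence conditions x 200 voxels, clear vs masked
rng = np.random.default_rng(0)
clear = rng.standard_normal((8, 200))
masked = clear + 0.8 * rng.standard_normal((8, 200))  # noisier copies

# Compare the upper triangles of the two RDMs
triu = np.triu_indices(8, k=1)
rho, p = spearmanr(rdm(clear)[triu], rdm(masked)[triu])
print(f"RDM similarity (Spearman rho) = {rho:.2f}")
```

On this logic, a higher clear-masked RDM correlation for familiar than unfamiliar voices would index the cortical SNR enhancement the abstract describes.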
Collapse
Affiliation(s)
- Emma Holmes
- The Brain and Mind Institute, University of Western Ontario, London, Ontario, N6A 3K7, Canada.
| | - Ingrid S Johnsrude
- The Brain and Mind Institute, University of Western Ontario, London, Ontario, N6A 3K7, Canada; School of Communication Sciences and Disorders, University of Western Ontario, London, Ontario, London, N6G 1H1, Canada
| |
Collapse
|
27
|
Escitalopram enhances synchrony of brain responses during emotional narratives in patients with major depressive disorder. Neuroimage 2021; 237:118110. [PMID: 33933596 DOI: 10.1016/j.neuroimage.2021.118110] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2021] [Revised: 04/19/2021] [Accepted: 04/21/2021] [Indexed: 11/20/2022] Open
Abstract
One-week treatment with escitalopram decreases amygdala responses to fearful facial expressions in depressed patients, but it remains unknown whether it also modulates the processing of complex, freely processed emotional stimuli resembling daily-life emotional situations. Inter-subject correlation (ISC) offers a means to track brain activity during complex, dynamic stimuli in a model-free manner. Twenty-nine treatment-seeking patients with major depressive disorder were randomized in a double-blind study design to receive either escitalopram or placebo for one week, after which functional magnetic resonance imaging (fMRI) was performed. During fMRI the participants listened to spoken emotional narratives. The level of ISC was compared between the escitalopram and placebo groups across all the narratives and separately for the episodes with positive and negative valence. Across all the narratives, the escitalopram group had higher ISC in the default mode network of the brain as well as in the fronto-temporal narrative processing regions, whereas lower ISC was seen in the middle temporal cortex, hippocampus and occipital cortex. Escitalopram increased ISC during positive parts of the narratives in the precuneus, medial prefrontal cortex, anterior cingulate and fronto-insular cortex, whereas there was no significant synchronization in brain responses to positive vs negative events in the placebo group. Increased ISC may imply improved emotional synchronization with others, particularly during observation of positive events. Further studies are needed to test whether this contributes to the later therapeutic effect of escitalopram.
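ISC is commonly computed in leave-one-out form: each subject's regional time course is correlated with the average of all other subjects' time courses. A minimal sketch with synthetic data (a simplified stand-in for the group comparison described above):

```python
import numpy as np

def isc_loo(timeseries):
    """Leave-one-out inter-subject correlation for one region:
    correlate each subject's time course with the mean of all others,
    then average across subjects. `timeseries` is subjects x timepoints."""
    n = timeseries.shape[0]
    rs = []
    for s in range(n):
        others = np.delete(timeseries, s, axis=0).mean(axis=0)
        rs.append(np.corrcoef(timeseries[s], others)[0, 1])
    return float(np.mean(rs))

# Hypothetical data: 15 listeners x 300 TRs sharing a narrative-driven signal
rng = np.random.default_rng(1)
shared = rng.standard_normal(300)
data = shared + rng.standard_normal((15, 300))
print(f"ISC = {isc_loo(data):.2f}")
```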
Collapse
|
28
|
Roswandowitz C, Swanborough H, Frühholz S. Categorizing human vocal signals depends on an integrated auditory-frontal cortical network. Hum Brain Mapp 2021; 42:1503-1517. [PMID: 33615612 PMCID: PMC7927295 DOI: 10.1002/hbm.25309] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2020] [Revised: 11/20/2020] [Accepted: 11/25/2020] [Indexed: 11/30/2022] Open
Abstract
Voice signals are relevant for auditory communication and suggested to be processed in dedicated auditory cortex (AC) regions. While recent reports highlighted an additional role of the inferior frontal cortex (IFC), a detailed description of the integrated functioning of the AC-IFC network and its task relevance for voice processing is missing. Using neuroimaging, we tested sound categorization while human participants either focused on the higher-order vocal-sound dimension (voice task) or the feature-based intensity dimension (loudness task) while listening to the same sound material. We found differential involvement of the AC and IFC depending on the task performed and whether the voice dimension was of task relevance or not. First, when comparing neural vocal-sound processing in our task-based design with previously reported passive listening designs, we observed highly similar cortical activations in the AC and IFC. Second, during task-based vocal-sound processing we observed voice-sensitive responses in the AC and IFC, whereas intensity processing was restricted to distinct AC regions. Third, the IFC flexibly adapted to the vocal sounds' task relevance, being only active when the voice dimension was task relevant. Fourth and finally, connectivity modeling revealed that vocal signals, independent of their task relevance, provided significant input to bilateral AC. However, only when attention was on the voice dimension did we find significant modulations of auditory-frontal connections. Our findings suggest that an integrated auditory-frontal network is essential for behaviorally relevant vocal-sound processing. The IFC seems to be an important hub of the extended voice network when representing higher-order vocal objects and guiding goal-directed behavior.
Collapse
Affiliation(s)
- Claudia Roswandowitz
- Department of PsychologyUniversity of ZurichZurichSwitzerland
- Neuroscience Center ZurichUniversity of Zurich and ETH ZurichZurichSwitzerland
| | - Huw Swanborough
- Department of PsychologyUniversity of ZurichZurichSwitzerland
- Neuroscience Center ZurichUniversity of Zurich and ETH ZurichZurichSwitzerland
| | - Sascha Frühholz
- Department of PsychologyUniversity of ZurichZurichSwitzerland
- Neuroscience Center ZurichUniversity of Zurich and ETH ZurichZurichSwitzerland
- Center for Integrative Human Physiology (ZIHP)University of ZurichZurichSwitzerland
| |
Collapse
|
29
|
Johnson JF, Belyk M, Schwartze M, Pinheiro AP, Kotz SA. Expectancy changes the self-monitoring of voice identity. Eur J Neurosci 2021; 53:2681-2695. [PMID: 33638190 PMCID: PMC8252045 DOI: 10.1111/ejn.15162] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2020] [Revised: 01/18/2021] [Accepted: 02/20/2021] [Indexed: 12/02/2022]
Abstract
Self‐voice attribution can become difficult when voice characteristics are ambiguous, but functional magnetic resonance imaging (fMRI) investigations of such ambiguity are sparse. We utilized voice‐morphing (self‐other) to manipulate (un‐)certainty in self‐voice attribution in a button‐press paradigm. This allowed investigating how levels of self‐voice certainty alter brain activation in brain regions monitoring voice identity and unexpected changes in voice playback quality. FMRI results confirmed a self‐voice suppression effect in the right anterior superior temporal gyrus (aSTG) when self‐voice attribution was unambiguous. Although the right inferior frontal gyrus (IFG) was more active during a self‐generated compared to a passively heard voice, the putative role of this region in detecting unexpected self‐voice changes during the action was demonstrated only when hearing the voice of another speaker and not when attribution was uncertain. Further research on the link between right aSTG and IFG is required and may establish a threshold monitoring voice identity in action. The current results have implications for a better understanding of the altered experience of self‐voice feedback in auditory verbal hallucinations.
Collapse
Affiliation(s)
- Joseph F Johnson
- Department of Neuropsychology and Psychopharmacology, Maastricht University, Maastricht, the Netherlands
| | - Michel Belyk
- Division of Psychology and Language Sciences, University College London, London, UK
| | - Michael Schwartze
- Department of Neuropsychology and Psychopharmacology, Maastricht University, Maastricht, the Netherlands
| | - Ana P Pinheiro
- Faculdade de Psicologia, Universidade de Lisboa, Lisbon, Portugal
| | - Sonja A Kotz
- Department of Neuropsychology and Psychopharmacology, Maastricht University, Maastricht, the Netherlands.,Department of Neuropsychology, Max Planck Institute for Human and Cognitive Sciences, Leipzig, Germany
| |
Collapse
|
30
|
Michelas A, Dufour S. When native contrasts are perceived as non-native: the role of the ear of presentation in the discrimination of accentual contrasts. JOURNAL OF COGNITIVE PSYCHOLOGY 2021. [DOI: 10.1080/20445911.2021.1889569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
Affiliation(s)
- Amandine Michelas
- Laboratoire Parole et Langage, Aix-Marseille Université, CNRS, Aix-en-Provence, France
| | - Sophie Dufour
- Laboratoire Parole et Langage, Aix-Marseille Université, CNRS, Aix-en-Provence, France
| |
Collapse
|
31
|
FMRI-based identity classification accuracy in left temporal and frontal regions predicts speaker recognition performance. Sci Rep 2021; 11:489. [PMID: 33436825 PMCID: PMC7803954 DOI: 10.1038/s41598-020-79922-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2020] [Accepted: 12/14/2020] [Indexed: 01/29/2023] Open
Abstract
Speaker recognition is characterized by considerable inter-individual variability with poorly understood neural bases. This study was aimed at (1) clarifying the cerebral correlates of speaker recognition in humans, in particular the involvement of prefrontal areas, using multi-voxel pattern analysis (MVPA) applied to fMRI data from a relatively large group of participants, and (2) investigating the relationship across participants between fMRI-based classification and the group's variable behavioural performance on the speaker recognition task. A cohort of subjects (N = 40, 28 females), selected to present a wide distribution of voice recognition abilities, underwent an fMRI speaker identification task during which they were asked to recognize three previously learned speakers with finger button presses. The results showed that speaker identity could be significantly decoded based on fMRI patterns in voice-sensitive regions including bilateral temporal voice areas (TVAs) along the superior temporal sulcus/gyrus, but also in bilateral parietal and left inferior frontal regions. Furthermore, fMRI-based classification accuracy showed a significant correlation with individual behavioural performance in left anterior STG/STS and left inferior frontal gyrus. These results highlight the role of both temporal and extra-temporal regions in performing a speaker identity recognition task with motor responses.
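The MVPA approach described here amounts to cross-validated classification of speaker identity from trial-wise voxel patterns. A minimal single-subject sketch with simulated data (a generic linear classifier, not necessarily the one used in the study):

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

# Hypothetical single-subject data: 90 trials x 500 voxels, 3 learned speakers
rng = np.random.default_rng(2)
labels = np.repeat([0, 1, 2], 30)          # speaker identity per trial
X = rng.standard_normal((90, 500))
X[labels == 1, :20] += 0.4                 # inject weak identity information
X[labels == 2, 20:40] += 0.4

# Cross-validated classification accuracy within a region of interest;
# chance level is 1/3 for three speakers
acc = cross_val_score(LinearSVC(max_iter=5000), X, labels, cv=5).mean()
print(f"decoding accuracy = {acc:.2f} (chance = 0.33)")
```

Repeating this per subject and correlating each subject's accuracy with behavioural performance gives the brain-behaviour relationship the abstract reports.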
Collapse
|
32
|
Luthra S. The Role of the Right Hemisphere in Processing Phonetic Variability Between Talkers. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2021; 2:138-151. [PMID: 37213418 PMCID: PMC10174361 DOI: 10.1162/nol_a_00028] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/17/2020] [Accepted: 11/13/2020] [Indexed: 05/23/2023]
Abstract
Neurobiological models of speech perception posit that both left and right posterior temporal brain regions are involved in the early auditory analysis of speech sounds. However, frank deficits in speech perception are not readily observed in individuals with right hemisphere damage. Instead, damage to the right hemisphere is often associated with impairments in vocal identity processing. Herein lies an apparent paradox: The mapping between acoustics and speech sound categories can vary substantially across talkers, so why might right hemisphere damage selectively impair vocal identity processing without obvious effects on speech perception? In this review, I attempt to clarify the role of the right hemisphere in speech perception through a careful consideration of its role in processing vocal identity. I review evidence showing that right posterior superior temporal, right anterior superior temporal, and right inferior/middle frontal regions all play distinct roles in vocal identity processing. In considering the implications of these findings for neurobiological accounts of speech perception, I argue that the recruitment of right posterior superior temporal cortex during speech perception may specifically reflect the process of conditioning phonetic identity on talker information. I suggest that the relative lack of involvement of other right hemisphere regions in speech perception may be because speech perception does not necessarily place a high burden on talker processing systems, and I argue that the extant literature hints at potential subclinical impairments in the speech perception abilities of individuals with right hemisphere damage.
Collapse
|
33
|
Banellis L, Sokoliuk R, Wild CJ, Bowman H, Cruse D. Event-related potentials reflect prediction errors and pop-out during comprehension of degraded speech. Neurosci Conscious 2020; 2020:niaa022. [PMID: 33133640 PMCID: PMC7585676 DOI: 10.1093/nc/niaa022] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2020] [Revised: 07/08/2020] [Accepted: 08/06/2020] [Indexed: 11/20/2022] Open
Abstract
Comprehension of degraded speech requires higher-order expectations informed by prior knowledge. Accurate top-down expectations of incoming degraded speech cause a subjective semantic 'pop-out' or conscious breakthrough experience. Indeed, the same stimulus can be perceived as meaningless when no expectations are made in advance. We investigated the event-related potential (ERP) correlates of these top-down expectations, their error signals and the subjective pop-out experience in healthy participants. We manipulated expectations in a word-pair priming degraded (noise-vocoded) speech task and investigated the role of top-down expectation with a between-groups attention manipulation. Consistent with the role of expectations in comprehension, repetition priming significantly enhanced perceptual intelligibility of the noise-vocoded degraded targets for attentive participants. An early ERP was larger for mismatched (i.e. unexpected) targets than matched targets, indicative of an initial error signal not reliant on top-down expectations. Subsequently, a P3a-like ERP was larger to matched targets than mismatched targets only for attending participants-i.e. a pop-out effect-while a later ERP was larger for mismatched targets and did not significantly interact with attention. Rather than relying on complex post hoc interactions between prediction error and precision to explain this apredictive pattern, we consider our data to be consistent with prediction error minimization accounts for early stages of processing followed by Global Neuronal Workspace-like breakthrough and processing in service of task goals.
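Noise-vocoding, the degradation used in this study, replaces the fine spectral structure of speech with band-limited noise modulated by each band's amplitude envelope. A minimal sketch under common assumptions (log-spaced bands, Hilbert envelopes; the study's exact band count and filters may differ):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, sr, n_bands=4, f_lo=100.0, f_hi=7000.0):
    """Noise-vocode a waveform: split into log-spaced bands, extract each
    band's amplitude envelope, and use it to modulate band-limited noise."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    rng = np.random.default_rng(3)
    out = np.zeros_like(speech)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
        band = sosfiltfilt(sos, speech)
        env = np.abs(hilbert(band))                      # amplitude envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(speech)))
        out += env * carrier
    return out / (np.max(np.abs(out)) + 1e-9)            # normalise peak

# Example with a synthetic harmonic "speech" token
sr = 16000
t = np.arange(sr) / sr
tok = np.sin(2 * np.pi * 120 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))
voc = noise_vocode(tok, sr)
```

Fewer bands yield less intelligible speech, which is what makes prior expectations (and the resulting pop-out) so influential for these stimuli.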
Collapse
Affiliation(s)
- Leah Banellis
- School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK
| | - Rodika Sokoliuk
- School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK
| | - Conor J Wild
- Brain and Mind Institute, University of Western Ontario, London, ON N6A 3K7, Canada
| | - Howard Bowman
- School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK
- School of Computing, University of Kent, Canterbury, Kent CT2 7NF, UK
| | - Damian Cruse
- School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK
| |
Collapse
|
34
|
Koyama MS, Molfese PJ, Milham MP, Mencl WE, Pugh KR. Thalamus is a common locus of reading, arithmetic, and IQ: Analysis of local intrinsic functional properties. BRAIN AND LANGUAGE 2020; 209:104835. [PMID: 32738503 PMCID: PMC8087146 DOI: 10.1016/j.bandl.2020.104835] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/16/2019] [Revised: 06/24/2020] [Accepted: 06/28/2020] [Indexed: 05/04/2023]
Abstract
Neuroimaging studies of basic achievement skills - reading and arithmetic - often control for the effect of IQ to identify unique neural correlates of each skill. This may underestimate possible effects of common factors between achievement and IQ measures on neuroimaging results. Here, we simultaneously examined achievement (reading and arithmetic) and IQ measures in young adults, aiming to identify MRI correlates of their common factors. Resting-state fMRI (rs-fMRI) data were analyzed using two metrics assessing local intrinsic functional properties: regional homogeneity (ReHo) and fractional amplitude of low-frequency fluctuation (fALFF), measuring local intrinsic functional connectivity and intrinsic functional activity, respectively. ReHo highlighted the thalamus/pulvinar (a subcortical region implicated in selective attention) as a common locus for both achievement skills and IQ. More specifically, the higher the ReHo values, the lower the achievement and IQ scores. For fALFF, the left superior parietal lobule, part of the dorsal attention network, was positively associated with reading and IQ. Collectively, our results highlight attention-related regions, particularly the thalamus/pulvinar, as key regions related to individual differences in performance on all three measures. ReHo in the thalamus/pulvinar may serve as a tool to examine brain mechanisms underlying the comorbidity of reading and arithmetic difficulties, which can co-occur with weakness in general intellectual abilities.
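Of the two metrics, fALFF has a particularly compact definition: the fraction of a voxel's spectral amplitude falling in the low-frequency band (conventionally 0.01-0.08 Hz). A minimal sketch for one voxel time series (simulated data; simplified relative to any full preprocessing pipeline):

```python
import numpy as np

def falff(ts, tr, band=(0.01, 0.08)):
    """Fractional ALFF for one voxel: the sum of amplitude-spectrum values
    inside the low-frequency band divided by the sum across the whole
    detectable frequency range (excluding the DC bin)."""
    ts = ts - ts.mean()
    amp = np.abs(np.fft.rfft(ts))
    freqs = np.fft.rfftfreq(len(ts), d=tr)
    low = (freqs >= band[0]) & (freqs <= band[1])
    return amp[low].sum() / (amp[1:].sum() + 1e-12)

# Hypothetical resting-state voxel: 240 volumes at TR = 2 s
rng = np.random.default_rng(4)
t = np.arange(240) * 2.0
ts = np.sin(2 * np.pi * 0.05 * t) + rng.standard_normal(240)
print(f"fALFF = {falff(ts, tr=2.0):.2f}")
```

ReHo, by contrast, is typically computed as Kendall's coefficient of concordance between a voxel's time series and those of its immediate neighbours, indexing local synchrony rather than spectral content.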
Collapse
Affiliation(s)
- Maki S Koyama
- Haskins Laboratories, New Haven, CT, USA; Center for the Developing Brain, Child Mind Institute, New York, NY, USA.
| | - Peter J Molfese
- Haskins Laboratories, New Haven, CT, USA; Section on Functional Imaging Methods, Laboratory of Brain and Cognition, Department of Health and Human Services, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA.
| | - Michael P Milham
- Center for the Developing Brain, Child Mind Institute, New York, NY, USA; Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA.
| | | | - Kenneth R Pugh
- Haskins Laboratories, New Haven, CT, USA; Yale University School of Medicine, Department of Diagnostic Radiology, New Haven, CT, USA; University of Connecticut, Department of Psychology, Storrs, CT, USA.
| |
Collapse
|
35
|
The Role of the Left and Right Anterior Temporal Poles in People Naming and Recognition. Neuroscience 2020; 440:175-185. [DOI: 10.1016/j.neuroscience.2020.05.040] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2019] [Revised: 05/21/2020] [Accepted: 05/23/2020] [Indexed: 01/27/2023]
|
36
|
Borowiak K, von Kriegstein K. Intranasal oxytocin modulates brain responses to voice-identity recognition in typically developing individuals, but not in ASD. Transl Psychiatry 2020; 10:221. [PMID: 32636360 PMCID: PMC7341857 DOI: 10.1038/s41398-020-00903-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/10/2019] [Revised: 06/05/2020] [Accepted: 06/08/2020] [Indexed: 11/09/2022] Open
Abstract
Faces and voices are prominent cues for person-identity recognition. Face recognition behavior and associated brain responses can be enhanced by intranasal administration of oxytocin. It is unknown whether oxytocin can also augment voice-identity recognition mechanisms. Finding this out is particularly relevant for individuals who have difficulties recognizing voice identity, such as individuals diagnosed with autism spectrum disorder (ASD). We conducted a combined behavioral and functional magnetic resonance imaging (fMRI) study to investigate voice-identity recognition following intranasal administration of oxytocin or placebo in a group of adults diagnosed with ASD (full-scale intelligence quotient > 85) and pairwise-matched typically developing (TD) controls. A single dose of 24 IU oxytocin was administered in a randomized, double-blind, placebo-controlled and cross-over design. In the control group, but not in the ASD group, administration of oxytocin compared to placebo increased responses to recognition of voice identity in contrast to speech in the right posterior superior temporal sulcus/gyrus (pSTS/G) - a region implicated in the perceptual analysis of voice-identity information. In the ASD group, the right pSTS/G responses were positively correlated with voice-identity recognition accuracy in the oxytocin condition, but not in the placebo condition. Oxytocin did not improve voice-identity recognition performance at the group level. The ASD compared to the control group had lower right pSTS/G responses to voice-identity recognition. Since ASD is known to have atypical pSTS/G, the results indicate that the potential of intranasal oxytocin to enhance mechanisms for voice-identity recognition might be variable and dependent on the functional integrity of this brain region.
Collapse
Affiliation(s)
- Kamila Borowiak
- Technische Universität Dresden, Bamberger Straße 7, 01187, Dresden, Germany.
- Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103, Leipzig, Germany.
- Berlin School of Mind and Brain, Humboldt University of Berlin, Luisenstraße 56, 10117, Berlin, Germany.
| | - Katharina von Kriegstein
- Technische Universität Dresden, Bamberger Straße 7, 01187, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103, Leipzig, Germany
| |
Collapse
|
37
|
Adam-Darque A, Pittet MP, Grouiller F, Rihs TA, Leuchter RHV, Lazeyras F, Michel CM, Hüppi PS. Neural Correlates of Voice Perception in Newborns and the Influence of Preterm Birth. Cereb Cortex 2020; 30:5717-5730. [PMID: 32518940 DOI: 10.1093/cercor/bhaa144] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2020] [Revised: 05/01/2020] [Accepted: 05/01/2020] [Indexed: 12/30/2022] Open
Abstract
Maternal voice is a highly relevant stimulus for newborns. Adult voice processing occurs in specific brain regions. Voice-specific brain areas in newborns and the relevance of early vocal exposure for these networks have not been defined. This study investigates voice perception in newborns and the impact of prematurity on the underlying cerebral processes. Functional magnetic resonance imaging (fMRI) and high-density electroencephalography (EEG) were used to explore the brain responses to maternal and stranger female voices in full-term newborns and preterm infants at term-equivalent age (TEA). fMRI results and the EEG oddball paradigm showed enhanced processing of voices in preterm infants at TEA compared with full-term infants. Preterm infants showed additional cortical regions involved in voice processing in fMRI and a late mismatch response for maternal voice, considered a first trace of a recognition process based on memory representation. Full-term newborns showed increased cerebral activity to the stranger voice. Results from fMRI, oddball, and standard auditory EEG paradigms highlighted important change-detection responses to novelty after birth. These findings suggest that the main components of the adult voice-processing networks emerge early in development. Moreover, early postnatal exposure to voices in premature infants might enhance their capacity to process voices.
Collapse
Affiliation(s)
- Alexandra Adam-Darque
- Division of Development and Growth, Department of Pediatrics, Geneva University Hospitals, 1205 Geneva, Switzerland.,Laboratory of Cognitive Neurorehabilitation, Division of Neurorehabilitation, Department of Clinical Neuroscience, Geneva University Hospitals, 1205 Geneva, Switzerland
| | - Marie P Pittet
- Division of Development and Growth, Department of Pediatrics, Geneva University Hospitals, 1205 Geneva, Switzerland
| | - Frédéric Grouiller
- Department of Radiology and Medical Informatics, University of Geneva, 1205 Geneva, Switzerland.,Swiss Centre for Affective Sciences, University of Geneva, 1205 Geneva, Switzerland
| | - Tonia A Rihs
- Functional Brain Mapping Laboratory, Department of Neurosciences, University of Geneva, 1205 Geneva, Switzerland
| | - Russia Ha-Vinh Leuchter
- Division of Development and Growth, Department of Pediatrics, Geneva University Hospitals, 1205 Geneva, Switzerland
| | - François Lazeyras
- Department of Radiology and Medical Informatics, University of Geneva, 1205 Geneva, Switzerland
| | - Christoph M Michel
- Functional Brain Mapping Laboratory, Department of Neurosciences, University of Geneva, 1205 Geneva, Switzerland
| | - Petra S Hüppi
- Division of Development and Growth, Department of Pediatrics, Geneva University Hospitals, 1205 Geneva, Switzerland
| |
Collapse
|
38
|
Kroczek LOH, Gunter TC. Distinct Neural Networks Relate to Common and Speaker-Specific Language Priors. Cereb Cortex Commun 2020; 1:tgaa021. [PMID: 34296098 PMCID: PMC8153046 DOI: 10.1093/texcom/tgaa021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2020] [Revised: 04/23/2020] [Accepted: 05/27/2020] [Indexed: 11/13/2022] Open
Abstract
Effective natural communication requires listeners to incorporate not only very general linguistic principles which evolved during a lifetime but also other information like the specific individual language use of a particular interlocutor. Traditionally, research has focused on the general linguistic rules, and brain science has shown a left hemispheric fronto-temporal brain network related to this processing. The present fMRI research explores speaker-specific individual language use because it is unknown whether this processing is supported by similar or distinct neural structures. Twenty-eight participants listened to sentences from speakers who used either easier or more difficult language. This was done by manipulating the proportion of easy SOV vs. complex OSV sentences for each speaker. Furthermore, ambiguous probe sentences were included to test top-down influences of speaker information in the absence of syntactic structure information. We observed distinct neural processing for syntactic complexity and speaker-specific language use. Syntactic complexity correlated with left frontal and posterior temporal regions. Speaker-specific processing correlated with bilateral (right-dominant) fronto-parietal brain regions. Finally, the top-down influence of speaker information was found in frontal and striatal brain regions, suggesting a mechanism for controlled syntactic processing. These findings show distinct neural networks related to general language principles as well as speaker-specific individual language use.
Collapse
Affiliation(s)
- Leon O H Kroczek
- Department of Clinical Psychology and Psychotherapy, University of Regensburg, Regensburg 93053, Germany
| | - Thomas C Gunter
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany
| |
Collapse
|
39
|
Chien PJ, Friederici AD, Hartwigsen G, Sammler D. Neural correlates of intonation and lexical tone in tonal and non-tonal language speakers. Hum Brain Mapp 2020; 41:1842-1858. [PMID: 31957928 PMCID: PMC7268089 DOI: 10.1002/hbm.24916] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2019] [Revised: 12/10/2019] [Accepted: 12/18/2019] [Indexed: 12/31/2022] Open
Abstract
Intonation, the modulation of pitch in speech, is a crucial aspect of language that is processed in right‐hemispheric regions, beyond the classical left‐hemispheric language system. Whether or not this notion generalises across languages remains, however, unclear. Particularly, tonal languages are an interesting test case because of the dual linguistic function of pitch that conveys lexical meaning in form of tone, in addition to intonation. To date, only few studies have explored how intonation is processed in tonal languages, how this compares to tone and between tonal and non‐tonal language speakers. The present fMRI study addressed these questions by testing Mandarin and German speakers with Mandarin material. Both groups categorised mono‐syllabic Mandarin words in terms of intonation, tone, and voice gender. Systematic comparisons of brain activity of the two groups between the three tasks showed large cross‐linguistic commonalities in the neural processing of intonation in left fronto‐parietal, right frontal, and bilateral cingulo‐opercular regions. These areas are associated with general phonological, specific prosodic, and controlled categorical decision‐making processes, respectively. Tone processing overlapped with intonation processing in left fronto‐parietal areas, in both groups, but evoked additional activity in bilateral temporo‐parietal semantic regions and subcortical areas in Mandarin speakers only. Together, these findings confirm cross‐linguistic commonalities in the neural implementation of intonation processing but dissociations for semantic processing of tone only in tonal language speakers.
Collapse
Affiliation(s)
- Pei-Ju Chien
- International Max Planck Research School NeuroCom, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.,Otto Hahn Group "Neural Bases of Intonation in Speech and Music", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.,Lise Meitner Research Group "Cognition and Plasticity", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.,Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Angela D Friederici
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Gesa Hartwigsen
- Lise Meitner Research Group "Cognition and Plasticity", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Daniela Sammler
- Otto Hahn Group "Neural Bases of Intonation in Speech and Music", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
40
|
Burgering MA, van Laarhoven T, Baart M, Vroomen J. Fluidity in the perception of auditory speech: Cross-modal recalibration of voice gender and vowel identity by a talking face. Q J Exp Psychol (Hove) 2020; 73:957-967. [PMID: 31931664 DOI: 10.1177/1747021819900884] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Humans quickly adapt to variations in the speech signal. Adaptation may surface as recalibration, a learning effect driven by error-minimisation between a visual face and an ambiguous auditory speech signal, or as selective adaptation, a contrastive aftereffect driven by the acoustic clarity of the sound. Here, we examined whether these aftereffects occur for vowel identity and voice gender. Participants were exposed to male, female, or androgynous tokens of speakers pronouncing /e/ or /ø/ (embedded in words with a consonant-vowel-consonant structure), or to an ambiguous vowel halfway between /e/ and /ø/ dubbed onto the video of a male or female speaker pronouncing /e/ or /ø/. For both voice gender and vowel identity, we found assimilative aftereffects after exposure to auditory ambiguous adapter sounds, and contrastive aftereffects after exposure to auditory clear adapter sounds. This demonstrates that similar principles for adaptation in these dimensions are at play.
Collapse
Affiliation(s)
- Merel A Burgering
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
| | - Thijs van Laarhoven
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
| | - Martijn Baart
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands.,BCBL-Basque Center on Cognition, Brain and Language, Donostia-San Sebastián, Spain
| | - Jean Vroomen
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
| |
Collapse
|
41
|
Stevenage SV, Symons AE, Fletcher A, Coen C. Sorting through the impact of familiarity when processing vocal identity: Results from a voice sorting task. Q J Exp Psychol (Hove) 2019; 73:519-536. [PMID: 31658884 PMCID: PMC7074657 DOI: 10.1177/1747021819888064] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
The present article reports on one experiment designed to examine the importance of familiarity when processing vocal identity. A voice sorting task was used with participants who were either personally familiar or unfamiliar with three speakers. The results suggested that familiarity supported both the ability to tell different instances of the same voice together and the ability to tell similar instances of different voices apart. In addition, the results suggested differences between the three speakers in terms of the extent to which they were confusable, underlining the importance of vocal characteristics and stimulus selection within behavioural tasks. The results are discussed with reference to existing debates regarding the nature of stored representations as familiarity develops, and the greater difficulty of processing voices compared with faces more generally.
Collapse
Affiliation(s)
| | - Ashley E Symons
- School of Psychology, University of Southampton, Southampton, UK
| | - Abi Fletcher
- School of Psychology, University of Southampton, Southampton, UK
| | - Chantelle Coen
- School of Psychology, University of Southampton, Southampton, UK
| |
Collapse
|
42
|
Bodin C, Belin P. Exploring the cerebral substrate of voice perception in primate brains. Philos Trans R Soc Lond B Biol Sci 2019; 375:20180386. [PMID: 31735143 PMCID: PMC6895549 DOI: 10.1098/rstb.2018.0386] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open
Abstract
One can consider human language to be the Swiss army knife of the vast domain of animal communication. There is now growing evidence suggesting that this technology may have emerged from already operational material instead of being a sudden innovation. Sharing ideas and thoughts with conspecifics via language constitutes an amazing ability, but what value would it hold if our conspecifics were not first detected and recognized? Conspecific voice (CV) perception is fundamental to communication and widely shared across the animal kingdom. Two questions that arise then are: is this apparently shared ability reflected in common cerebral substrate? And, how has this substrate evolved? The paper addresses these questions by examining studies on the cerebral basis of CV perception in humans' closest relatives, non-human primates. Neuroimaging studies, in particular, suggest the existence of a ‘voice patch system’, a network of interconnected cortical areas that can provide a common template for the cerebral processing of CV in primates. This article is part of the theme issue ‘What can animal communication teach us about human language?’
Collapse
Affiliation(s)
- Clémentine Bodin
- Institut de Neurosciences de la Timone, UMR 7289 Centre National de la Recherche Scientifique and Aix-Marseille Université, Marseille, France
| | - Pascal Belin
- Institut de Neurosciences de la Timone, UMR 7289 Centre National de la Recherche Scientifique and Aix-Marseille Université, Marseille, France.,Département de Psychologie, Université de Montréal, Montréal, Canada
| |
Collapse
|
43
|
Jagiello R, Pomper U, Yoneya M, Zhao S, Chait M. Rapid Brain Responses to Familiar vs. Unfamiliar Music - an EEG and Pupillometry study. Sci Rep 2019; 9:15570. [PMID: 31666553 PMCID: PMC6821741 DOI: 10.1038/s41598-019-51759-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2019] [Accepted: 10/07/2019] [Indexed: 12/17/2022] Open
Abstract
Human listeners exhibit marked sensitivity to familiar music, perhaps most readily revealed by popular "name that tune" games, in which listeners often succeed in recognizing a familiar song based on extremely brief presentation. In this work, we used electroencephalography (EEG) and pupillometry to reveal the temporal signatures of the brain processes that allow differentiation between a familiar, well liked, and unfamiliar piece of music. In contrast to previous work, which has quantified gradual changes in pupil diameter (the so-called "pupil dilation response"), here we focus on the occurrence of pupil dilation events. This approach is substantially more sensitive in the temporal domain and allowed us to tap early activity within the putative salience network. Participants (N = 10) passively listened to snippets (750 ms) of a familiar, personally relevant song and of an acoustically matched, unfamiliar song, presented in random order. A group of control participants (N = 12), who were unfamiliar with all of the songs, was also tested. We reveal a rapid differentiation between snippets from familiar and unfamiliar songs: Pupil responses showed a greater dilation rate to familiar music from 100-300 ms post-stimulus-onset, consistent with a faster activation of the autonomic salience network. Brain responses measured with EEG showed a later differentiation between familiar and unfamiliar music from 350 ms post onset. Remarkably, the cluster pattern identified in the EEG response is very similar to that commonly found in classic old/new memory retrieval paradigms, suggesting that the recognition of brief, randomly presented music snippets draws on similar processes.
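Counting dilation events, rather than averaging pupil diameter, requires detecting the onsets of dilation episodes in the continuous trace. Below is a deliberately crude sketch on synthetic data (a smoothed-derivative threshold; the study's actual event-detection algorithm is more sophisticated, and all parameters here are invented):

```python
import numpy as np

def dilation_events(pupil, fs, thresh_z=1.0):
    """Crude dilation-event detector: mark samples where the smoothed
    first derivative of pupil size crosses a positive z-threshold."""
    d = np.gradient(pupil)
    k = max(int(0.1 * fs), 1)                          # 100 ms boxcar smoother
    d = np.convolve(d, np.ones(k) / k, mode="same")
    z = (d - d.mean()) / d.std()
    above = z > thresh_z
    onsets = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    return onsets / fs                                 # event times in seconds

# Hypothetical trace: 3 s at 500 Hz with one dilation ramp starting at ~1 s
fs = 500
t = np.arange(0, 3, 1 / fs)
pupil = np.random.randn(t.size) * 0.01 + np.clip(t - 1.0, 0, 0.4)
print(dilation_events(pupil, fs))
```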
Collapse
Affiliation(s)
- Robert Jagiello
- Ear Institute, University College London, London, UK.,Institute of Cognitive and Evolutionary Anthropology, University of Oxford, Oxford, UK
| | - Ulrich Pomper
- Ear Institute, University College London, London, UK.,Faculty of Psychology, University of Vienna, Vienna, Austria.
| | - Makoto Yoneya
- NTT Communication Science Laboratories, NTT Corporation, Atsugi, 243-0198, Japan
| | - Sijia Zhao
- Ear Institute, University College London, London, UK
| | - Maria Chait
- Ear Institute, University College London, London, UK.
| |
Collapse
|
44
|
Choi JY, Perrachione TK. Noninvasive neurostimulation of left temporal lobe disrupts rapid talker adaptation in speech processing. BRAIN AND LANGUAGE 2019; 196:104655. [PMID: 31310963 PMCID: PMC6688950 DOI: 10.1016/j.bandl.2019.104655] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/04/2019] [Revised: 06/28/2019] [Accepted: 07/02/2019] [Indexed: 06/10/2023]
Abstract
Talker adaptation improves speech processing efficiency by reducing possible mappings between talkers' speech acoustics and listeners' phonemic representations. We investigated the functional neuroanatomy of talker adaptation by applying noninvasive neurostimulation (high-definition transcranial direct current stimulation; HD-tDCS) to left superior temporal lobe while participants performed an auditory word identification task. We factorially manipulated talker variability (single vs. mixed talkers) and speech context (isolated words vs. connected speech), measuring listeners' speech processing efficiency under anodal, cathodal, or sham stimulation. Speech processing was faster for single talkers than mixed talkers, and connected speech reduced the additional processing costs associated with mixed-talker speech. However, the beneficial effect of connected speech in the mixed-talker condition was significantly attenuated under both anodal and cathodal stimulation versus sham. Stimulation of left superior temporal lobe disrupts the brain's ability to use local phonetic context to rapidly adapt to a talker, revealing this region's causal role in talker adaptation.
Collapse
Affiliation(s)
- Ja Young Choi
- Department of Speech, Language, and Hearing Sciences, Boston University, Boston, MA, United States; Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, United States
| | - Tyler K Perrachione
- Department of Speech, Language, and Hearing Sciences, Boston University, Boston, MA, United States.
| |
Collapse
|
45
|
Latinus M, Mofid Y, Kovarski K, Charpentier J, Batty M, Bonnet-Brilhault F. Atypical Sound Perception in ASD Explained by Inter-Trial (In)consistency in EEG. Front Psychol 2019; 10:1177. [PMID: 31214068 PMCID: PMC6558157 DOI: 10.3389/fpsyg.2019.01177] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2018] [Accepted: 05/06/2019] [Indexed: 12/27/2022] Open
Abstract
A relative indifference to the human voice is a characteristic of Autism Spectrum Disorder (ASD). Yet, studies of voice perception in ASD have provided contradictory results: one study described an absence of preferential response to voices in ASD, while another reported a larger activation to vocal sounds than environmental sounds, as seen in typically developed (TD) adults. In children with ASD, an absence of preferential response to vocal sounds was attributed to an atypical response to environmental sounds. To better understand these contradictions, we re-analyzed the data from sixteen children with ASD and sixteen age-matched TD children to evaluate both inter- and intra-subject variability. Intra-subject variability was estimated with a single-trial analysis of electroencephalographic data, through a measure of inter-trial consistency, which is the proportion of trials showing a positive activity in response to vocal and non-vocal sounds. Results demonstrate a larger inter-subject variability in response to non-vocal sounds, driven by a subset of children with ASD (7/16) who do not show the expected negative Tb peak in response to non-vocal sounds around 200 ms after the start of the stimulation, due to a reduced inter-trial consistency. A logistic regression model with age and clinical parameters showed that no single parameter discriminated the subgroups of ASD participants; rather, the electrophysiologically-based groups differed on a linear combination of parameters. Children with ASD showing a reduced inter-trial consistency were younger and characterized by a lower verbal developmental quotient and fewer attempts to communicate by voice. These data suggest that a lack of specialization for processing social signals may stem from an atypical processing of environmental sounds, linked to the development of general communication abilities. Discrepancies reported in the literature may arise from this heterogeneity, and it may be inadequate to divide children with ASD based only on intellectual quotient or language abilities. This analysis could be a useful tool providing complementary information for the functional diagnosis of ASD and for evaluating verbal communication impairment.
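The inter-trial consistency measure defined in this abstract - the proportion of trials showing positive activity in a given window - is simple to compute from epoched EEG. A minimal sketch with simulated epochs (the window and electrode choice are placeholders, not the study's):

```python
import numpy as np

def inter_trial_consistency(epochs, times, window=(0.15, 0.25)):
    """Proportion of trials whose mean amplitude in a time window is
    positive. `epochs` is trials x samples; `times` is in seconds."""
    mask = (times >= window[0]) & (times <= window[1])
    trial_means = epochs[:, mask].mean(axis=1)
    return float(np.mean(trial_means > 0))

# Hypothetical single-electrode data: 60 trials, 0-400 ms at 1 kHz
rng = np.random.default_rng(5)
times = np.arange(0, 0.4, 0.001)
epochs = rng.standard_normal((60, times.size)) * 2.0
epochs[:, (times > 0.15) & (times < 0.25)] += 1.0   # weak positive deflection
print(f"consistency = {inter_trial_consistency(epochs, times):.2f}")
```

Low consistency in this sense means a component can vanish from the trial average even when some individual trials contain it, which is the mechanism the authors propose for the missing Tb peak.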
Collapse
Affiliation(s)
| | - Yassine Mofid
- UMR 1253, iBrain, Université de Tours, INSERM, Tours, France
| | - Klara Kovarski
- Fondation Ophtalmologique Rothschild, Unité Vision et Cognition, Paris, France
- CNRS (Integrative Neuroscience and Cognition Center, UMR 8002), Paris, France
- Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| | | | - Magali Batty
- CERPPS, Université de Toulouse, Toulouse, France
| | - Frédérique Bonnet-Brilhault
- UMR 1253, iBrain, Université de Tours, INSERM, Tours, France
- CHRU de Tours, Centre Universitaire de Pédopsychiatrie, Tours, France
| |
Collapse
|
46
|
Ghaffarvand Mokari P, Werner S. On the Role of Cognitive Abilities in Second Language Vowel Learning. LANGUAGE AND SPEECH 2019; 62:260-280. [PMID: 29589808 DOI: 10.1177/0023830918764517] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
This study investigated the role of different cognitive abilities-inhibitory control, attention control, phonological short-term memory (PSTM), and acoustic short-term memory (AM)-in second language (L2) vowel learning. The participants were 40 Azerbaijani learners of Standard Southern British English. Their perception of L2 vowels was tested through a perceptual discrimination task before and after five sessions of high-variability phonetic training. Inhibitory control was significantly correlated with gains from training in the discrimination of L2 vowel pairs. However, there were no significant correlations between attention control, AM, PSTM, and gains from training. These findings suggest the potential role of inhibitory control in L2 phonological learning. We suggest that inhibitory control facilitates the processing of L2 sounds by allowing learners to ignore the interfering information from L1 during training, leading to better L2 segmental learning.
Collapse
Affiliation(s)
- Payam Ghaffarvand Mokari
- Department of General Linguistics and Language Technology, School of Humanities, University of Eastern Finland, Finland
| | - Stefan Werner
- Department of General Linguistics and Language Technology, School of Humanities, University of Eastern Finland, Finland
| |
Collapse
|
47
|
Payne H, Gutierrez-Sigut E, Woll B, MacSweeney M. Cerebral lateralisation during signed and spoken language production in children born deaf. Dev Cogn Neurosci 2019; 36:100619. [PMID: 30711882 PMCID: PMC6891228 DOI: 10.1016/j.dcn.2019.100619] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2018] [Revised: 01/21/2019] [Accepted: 01/21/2019] [Indexed: 01/26/2023] Open
Abstract
The effect of sensory experience on hemispheric specialisation for language production is not well understood. Children born deaf, including those who have cochlear implants, have drastically different perceptual experiences of language than their hearing peers. Using functional transcranial Doppler sonography (fTCD), we measured lateralisation during language production in a heterogeneous group of 19 deaf children and in 19 hearing children, matched on language ability. In children born deaf, we observed significant left lateralisation during language production (British Sign Language, spoken English, or a combination of languages). There was no difference in the strength of lateralisation between deaf and hearing groups. Comparable proportions of children were categorised as left-, right-, or not significantly-lateralised in each group. Moreover, an exploratory subgroup analysis showed no significant difference in lateralisation between deaf children with cochlear implants and those without. These data suggest that the processes underpinning language production remain robustly left lateralised regardless of sensory language experience.
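fTCD lateralisation is typically summarised as a left-minus-right difference in blood-flow velocity during a task window. A schematic sketch with simulated epochs (a simplified index; the study's exact fTCD analysis parameters are not given in the abstract):

```python
import numpy as np

def ftcd_li(left_cbfv, right_cbfv, fs, task_window=(5.0, 15.0)):
    """Simple fTCD lateralisation index: mean left-minus-right percent
    blood-flow-velocity difference within a task window, averaged across
    epochs. Positive values indicate left lateralisation."""
    diff = left_cbfv - right_cbfv                 # epochs x samples (% signal)
    i0, i1 = int(task_window[0] * fs), int(task_window[1] * fs)
    return float(diff[:, i0:i1].mean())

# Hypothetical data: 20 epochs of 20 s at 25 Hz, small left-dominant response
rng = np.random.default_rng(6)
left = rng.standard_normal((20, 500)) + 1.0
right = rng.standard_normal((20, 500)) + 0.6
print(f"LI = {ftcd_li(left, right, fs=25):.2f}")
```

Classifying each child as left-, right-, or not significantly lateralised then reduces to testing whether the per-epoch LI distribution differs reliably from zero.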
Collapse
Affiliation(s)
- Heather Payne, Deafness, Cognition & Language Research Centre, University College London, WC1H 0PD, UK; Institute of Cognitive Neuroscience, University College London, WC1N 3AZ, UK
- Eva Gutierrez-Sigut, Deafness, Cognition & Language Research Centre, University College London, WC1H 0PD, UK; Institute of Cognitive Neuroscience, University College London, WC1N 3AZ, UK; Departamento de Metodología de las Ciencias del Comportamiento, Universitat de València, Av. Blasco Ibáñez 21, 46010 València, Spain
- Bencie Woll, Deafness, Cognition & Language Research Centre, University College London, WC1H 0PD, UK
- Mairéad MacSweeney, Deafness, Cognition & Language Research Centre, University College London, WC1H 0PD, UK; Institute of Cognitive Neuroscience, University College London, WC1N 3AZ, UK
48
Ogg M, Moraczewski D, Kuchinsky SE, Slevc LR. Separable neural representations of sound sources: Speaker identity and musical timbre. Neuroimage 2019; 191:116-126. [PMID: 30731247 DOI: 10.1016/j.neuroimage.2019.01.075] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2018] [Revised: 12/14/2018] [Accepted: 01/30/2019] [Indexed: 11/28/2022] Open
Abstract
Human listeners can quickly and easily recognize different sound sources (objects and events) in their environment. Understanding how this impressive ability is accomplished can improve signal processing and machine intelligence applications, along with assistive listening technologies. However, it is not clear how the brain represents the many sounds that humans can recognize (such as speech and music) at the level of individual sources, categories, and acoustic features. To examine the cortical organization of these representations, we used patterns of fMRI responses to decode (1) four individual speakers and four instruments from one another (separately, within each category), (2) the superordinate category label associated with each stimulus (speech or instrument), and (3) a set of simple synthesized sounds that could be differentiated entirely by their acoustic features. Data were collected using an interleaved silent steady-state sequence to increase the temporal signal-to-noise ratio and mitigate issues with auditory stimulus presentation in fMRI. Largely separable clusters of voxels in the temporal lobes supported the decoding of individual speakers and instruments from other stimuli in the same category. Decoding the superordinate category of each sound was more accurate and involved a larger portion of the temporal lobes. However, these clusters all overlapped with areas that could decode the simple, acoustically separable stimuli. Thus, individual sound sources from different sound categories are represented in separate regions of the temporal lobes that are situated within regions implicated in more general acoustic processes. These results bridge an important gap in our understanding of cortical representations of sounds and their acoustics.
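The decoding approach in this abstract amounts to training a classifier on voxel response patterns and estimating its accuracy by cross-validation, one analysis per contrast (individual speakers, individual instruments, superordinate category). A minimal sketch follows, using simulated patterns; the dimensions, labels, and choice of a linear SVM are assumptions rather than the authors' exact methods.

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(2)
    n_trials, n_voxels = 160, 500  # hypothetical: trials x voxels in one ROI

    X = rng.normal(size=(n_trials, n_voxels))  # simulated response patterns
    y = np.repeat(np.arange(4), n_trials // 4)  # four speakers (or instruments)
    X[y == 0] += 0.15  # inject weak class structure so decoding exceeds chance

    # Linear classifier with 5-fold cross-validation; with four balanced
    # classes, chance accuracy is 25%.
    clf = make_pipeline(StandardScaler(), LinearSVC())
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"decoding accuracy = {scores.mean():.2f} (chance = 0.25)")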
Affiliation(s)
- Mattson Ogg, Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD 20742, USA; Department of Psychology, University of Maryland, College Park, MD 20742, USA
- Dustin Moraczewski, Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD 20742, USA; Department of Psychology, University of Maryland, College Park, MD 20742, USA
- Stefanie E Kuchinsky, Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD 20742, USA; Center for Advanced Study of Language, University of Maryland, College Park, MD 20742, USA; Maryland Neuroimaging Center, University of Maryland, College Park, MD 20742, USA
- L Robert Slevc, Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD 20742, USA; Department of Psychology, University of Maryland, College Park, MD 20742, USA
49
Walenski M, Europa E, Caplan D, Thompson CK. Neural networks for sentence comprehension and production: An ALE-based meta-analysis of neuroimaging studies. Hum Brain Mapp 2019; 40:2275-2304. [PMID: 30689268 DOI: 10.1002/hbm.24523] [Citation(s) in RCA: 90] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2018] [Revised: 12/14/2018] [Accepted: 12/26/2018] [Indexed: 12/24/2022] Open
Abstract
Comprehending and producing sentences is a complex endeavor requiring the coordinated activity of multiple brain regions. We examined three issues related to the brain networks underlying sentence comprehension and production in healthy individuals: First, which regions are recruited for sentence comprehension and sentence production? Second, are there differences for auditory sentence comprehension vs. visual sentence comprehension? Third, which regions are specifically recruited for the comprehension of syntactically complex sentences? Results from activation likelihood estimation (ALE) analyses (from 45 studies) implicated a sentence comprehension network occupying bilateral frontal and temporal lobe regions. Regions implicated in production (from 15 studies) overlapped with the set of regions associated with sentence comprehension in the left hemisphere, but did not include inferior frontal cortex, and did not extend to the right hemisphere. Modality differences between auditory and visual sentence comprehension were found principally in the temporal lobes. Results from the analysis of complex syntax (from 37 studies) showed engagement of left inferior frontal and posterior temporal regions, as well as the right insula. The involvement of the right hemisphere in the comprehension of these structures has potentially important implications for language treatment and recovery in individuals with agrammatic aphasia following left hemisphere brain damage.
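Activation likelihood estimation works by modelling each reported activation focus as a 3D Gaussian probability blob, combining the foci of one experiment into a modelled-activation (MA) map, and taking the voxelwise union across experiments: ALE = 1 - prod_i(1 - MA_i). The toy one-dimensional sketch below illustrates that union step; the foci coordinates and kernel width are invented for illustration.

    import numpy as np

    def ma_map(foci, grid, sigma=10.0):
        """Modelled-activation map for one experiment: voxelwise max over
        Gaussian kernels centred on that experiment's reported foci (mm)."""
        kernels = [np.exp(-((grid - f) ** 2) / (2 * sigma ** 2)) for f in foci]
        return np.max(kernels, axis=0)

    grid = np.arange(-60, 61)  # toy 1D "brain" axis in mm
    experiments = [[-45, 10], [-40], [-50, 30]]  # hypothetical foci per study

    # ALE = probability that at least one experiment activates the voxel.
    ma = np.array([ma_map(f, grid) for f in experiments])
    ale = 1.0 - np.prod(1.0 - ma, axis=0)
    print(f"peak ALE = {ale.max():.2f} at x = {grid[ale.argmax()]} mm")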
Affiliation(s)
- Matthew Walenski, Center for the Neurobiology of Language Recovery, Northwestern University, Evanston, Illinois; Department of Communication Sciences and Disorders, School of Communication, Northwestern University, Evanston, Illinois
- Eduardo Europa, Department of Neurology, University of California, San Francisco
- David Caplan, Department of Neurology, Harvard Medical School, Massachusetts General Hospital, Boston, Massachusetts
- Cynthia K Thompson, Center for the Neurobiology of Language Recovery, Northwestern University, Evanston, Illinois; Department of Communication Sciences and Disorders, School of Communication, Northwestern University, Evanston, Illinois; Department of Neurology, Feinberg School of Medicine, Northwestern University, Evanston, Illinois
50
Sliwa J, Takahashi D, Shepherd S. Mécanismes neuronaux pour la communication chez les primates [Neural mechanisms for communication in primates]. Revue de Primatologie 2018. [DOI: 10.4000/primatologie.2950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open