1
Bai Y, Tang Q, Zhao R, Liu H, Zhang S, Guo M, Guo M, Wang J, Wang C, Xing M, Ni G, Ming D. TMNRED, A Chinese Language EEG Dataset for Fuzzy Semantic Target Identification in Natural Reading Environments. Sci Data 2025; 12:701. [PMID: 40280929 PMCID: PMC12032204 DOI: 10.1038/s41597-025-05036-2]
Abstract
Semantic understanding is central to advanced cognitive functions, and the mechanisms by which the brain processes language information are still being explored. Existing EEG datasets often lack natural reading data specific to Chinese, limiting research on Chinese semantic decoding and natural language processing. This study aims to construct a Chinese natural reading EEG dataset, TMNRED, for semantic target identification in natural reading environments. TMNRED was collected from 30 participants reading sentences sourced from public internet resources and media reports. Each participant underwent 400-450 trials in a single day, resulting in a dataset with over 10 hours of continuous EEG data and more than 4000 trials. This dataset provides valuable physiological data for studying Chinese semantics and developing more accurate Chinese natural language processing models.
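A minimal sketch of how one might epoch a continuous reading-EEG recording of this kind with MNE-Python is shown below; the file name, reader function, filter band, and epoch window are illustrative assumptions, not the actual TMNRED layout or trigger scheme.

```python
# Hypothetical epoching of a continuous reading-EEG recording with MNE-Python.
# File name, format, and event scheme are placeholders; consult the dataset
# documentation for the real layout.
import mne

raw = mne.io.read_raw_eeglab("sub-01_task-reading_eeg.set", preload=True)
raw.filter(l_freq=0.5, h_freq=40.0)  # band-pass covering slow semantic components

# Build one epoch per sentence trial from annotation markers
events, event_id = mne.events_from_annotations(raw)
epochs = mne.Epochs(raw, events, event_id=event_id,
                    tmin=-0.2, tmax=1.0, baseline=(None, 0), preload=True)
print(epochs)
```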
Affiliation(s)
- Yanru Bai
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin, 300072, China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin, 300392, China
- Qi Tang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin, 300392, China
- Ran Zhao
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin, 300392, China
- Hongxing Liu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Shuming Zhang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Mingkun Guo
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Minghan Guo
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Junjie Wang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Changjian Wang
- National University of Defense Technology, Changsha, Hunan, 410000, China
- Mu Xing
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Guangjian Ni
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin, 300072, China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin, 300392, China
- Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin, 300072, China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin, 300392, China
2
He T, Wei M, Wang R, Wang R, Du S, Cai S, Tao W, Li H. VocalMind: A Stereotactic EEG Dataset for Vocalized, Mimed, and Imagined Speech in Tonal Language. Sci Data 2025; 12:657. [PMID: 40253415 PMCID: PMC12009324 DOI: 10.1038/s41597-025-04741-2]
Abstract
Speech brain-computer interfaces (BCIs) based on implanted electrodes hold significant promise for enhancing spoken communication through high temporal resolution and invasive neural sensing. Despite this potential, acquiring such data is challenging due to its invasive nature, and publicly available datasets, particularly for tonal languages, are limited. In this study, we introduce VocalMind, a stereotactic electroencephalography (sEEG) dataset focused on Mandarin Chinese, a tonal language. This dataset includes sEEG-speech parallel recordings from three distinct speech modes, namely vocalized speech, mimed speech, and imagined speech, at both word and sentence levels, totaling over one hour of intracranial neural recordings related to speech production. This paper also presents a baseline model as a reference for future studies, which at the same time verifies the integrity of the dataset. The diversity of tasks and the substantial data volume provide a valuable resource for developing advanced algorithms for speech decoding, thereby advancing BCI research for spoken communication.
Affiliation(s)
- Tianyu He
- School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P. R. China
- Mingyi Wei
- Department of Neurosurgery, South China Hospital, Medical School, Shenzhen University, Shenzhen, 518116, P. R. China
- Ruicong Wang
- School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P. R. China
- Renzhi Wang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P. R. China
- Shiwei Du
- Department of Neurosurgery, South China Hospital, Medical School, Shenzhen University, Shenzhen, 518116, P. R. China
- Siqi Cai
- School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P. R. China
- Wei Tao
- Department of Neurosurgery, South China Hospital, Medical School, Shenzhen University, Shenzhen, 518116, P. R. China
- Haizhou Li
- School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P. R. China
3
Gnanateja GN, Rupp K, Llanos F, Hect J, German JS, Teichert T, Abel TJ, Chandrasekaran B. Cortical processing of discrete prosodic patterns in continuous speech. Nat Commun 2025; 16:1947. [PMID: 40032850 PMCID: PMC11876672 DOI: 10.1038/s41467-025-56779-w]
Abstract
Prosody has a vital function in speech, structuring a speaker's intended message for the listener. The superior temporal gyrus (STG) is considered a critical hub for prosody, but the role of earlier auditory regions like Heschl's gyrus (HG), associated with pitch processing, remains unclear. Using intracerebral recordings in humans and non-human primate models, we investigated prosody processing in narrative speech, focusing on pitch accents-abstract phonological units that signal word prominence and communicative intent. In humans, HG encoded pitch accents as abstract representations beyond spectrotemporal features, distinct from segmental speech processing, and outperformed STG in disambiguating pitch accents. Multivariate models confirmed HG's unique representation of pitch accent categories. In the non-human primate, pitch accents were not abstractly encoded, despite robust spectrotemporal processing, highlighting the role of experience in shaping abstract representations. These findings emphasize a key role for HG in early prosodic abstraction and advance our understanding of human speech processing.
Affiliation(s)
- G Nike Gnanateja
- Speech Processing and Auditory Neuroscience Lab, Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, USA
- Kyle Rupp
- Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- Fernando Llanos
- UT Austin Neurolinguistics Lab, Department of Linguistics, The University of Texas at Austin, Austin, TX, USA
- Jasmine Hect
- Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- James S German
- Aix-Marseille University, CNRS, LPL, Aix-en-Provence, France
- Tobias Teichert
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Taylor J Abel
- Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA
- Bharath Chandrasekaran
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA
- Roxelyn and Richard Pepper Department of Communication Sciences & Disorders, Northwestern University, Evanston, IL, USA
- Knowles Hearing Center, Evanston, IL, 60208, USA
4
Keshishian M, Mischler G, Thomas S, Kingsbury B, Bickel S, Mehta AD, Mesgarani N. Parallel hierarchical encoding of linguistic representations in the human auditory cortex and recurrent automatic speech recognition systems. bioRxiv 2025:2025.01.30.635775. [PMID: 39975377 PMCID: PMC11838305 DOI: 10.1101/2025.01.30.635775]
Abstract
The human brain's ability to transform acoustic speech signals into rich linguistic representations has inspired advancements in automatic speech recognition (ASR) systems. While ASR systems now achieve human-level performance under controlled conditions, prior research on their parallels with the brain has been limited by the use of biologically implausible models, narrow feature sets, and comparisons that primarily emphasize predictability of brain activity without fully exploring shared underlying representations. Additionally, studies comparing the brain to text-based language models overlook the acoustic stages of speech processing, an essential step in transforming sound into meaning. Leveraging high-resolution intracranial recordings and a recurrent ASR model, this study bridges these gaps by uncovering a striking correspondence in the hierarchical encoding of linguistic features, from low-level acoustic signals to high-level semantic processing. Specifically, we demonstrate that neural activity in distinct regions of the auditory cortex aligns with representations in corresponding layers of the ASR model and, crucially, that both systems encode similar features at each stage of processing-from acoustic to phonetic, lexical, and semantic information. These findings suggest that both systems, despite their distinct architectures, converge on similar strategies for language processing, providing insight into the optimal computational principles underlying linguistic representation and the shared constraints shaping human and artificial speech processing.
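The layer-to-region comparison described above is commonly implemented as an encoding model. The sketch below, on simulated data standing in for DNN activations and an electrode response, illustrates the general recipe (ridge regression per layer, held-out prediction correlation); it is not the authors' pipeline.

```python
# Illustrative layer-wise encoding analysis on simulated data: ridge
# regression from each "layer's" activations to a synthetic neural response.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_time, n_units = 2000, 256
layers = {f"layer{i}": rng.standard_normal((n_time, n_units)) for i in range(4)}
# Synthetic electrode response driven by layer2, plus noise
neural = layers["layer2"] @ rng.standard_normal(n_units) + rng.standard_normal(n_time)

for name, X in layers.items():
    X_tr, X_te, y_tr, y_te = train_test_split(X, neural, test_size=0.25, random_state=0)
    model = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(X_tr, y_tr)
    r = np.corrcoef(model.predict(X_te), y_te)[0, 1]
    print(f"{name}: prediction r = {r:.2f}")  # peak expected at layer2 here
```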
Affiliation(s)
- Menoua Keshishian
- Department of Electrical Engineering, Columbia University, New York, NY, USA
- Zuckerman Institute, Columbia University, New York, NY, USA
- Gavin Mischler
- Department of Electrical Engineering, Columbia University, New York, NY, USA
- Zuckerman Institute, Columbia University, New York, NY, USA
- Stephan Bickel
- The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Department of Neurosurgery, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
- Ashesh D. Mehta
- The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Department of Neurosurgery, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
- Nima Mesgarani
- Department of Electrical Engineering, Columbia University, New York, NY, USA
- Zuckerman Institute, Columbia University, New York, NY, USA
5
Wang L, Pfordresher PQ, Jiang C, Liu F. Atypical vocal imitation of speech and song in autism spectrum disorder: Evidence from Mandarin speakers. Autism 2025; 29:408-423. [PMID: 39239838 PMCID: PMC11816480 DOI: 10.1177/13623613241275395]
Abstract
LAY ABSTRACT: Atypical vocal imitation has been identified in English-speaking autistic individuals, whereas the characteristics of vocal imitation in tone-language-speaking autistic individuals remain unexplored. By comparing speech and song imitation, the present study reveals a unique pattern of atypical vocal imitation across speech and music domains among Mandarin-speaking autistic individuals. The findings suggest that tone language experience does not compensate for difficulties in vocal imitation in autistic individuals and extend our understanding of vocal imitation in autism across different languages.
Affiliation(s)
- Li Wang
- The Chinese University of Hong Kong, China
- University of Reading, UK
6
Vaziri PA, McDougle SD, Clark DA. Humans can use positive and negative spectrotemporal correlations to detect rising and falling pitch. bioRxiv 2024:2024.08.03.606481. [PMID: 39131316 PMCID: PMC11312537 DOI: 10.1101/2024.08.03.606481]
Abstract
To discern speech or appreciate music, the human auditory system detects how pitch increases or decreases over time. However, the algorithms used to detect changes in pitch, or pitch motion, are incompletely understood. Here, using psychophysics, computational modeling, functional neuroimaging, and analysis of recorded speech, we ask if humans can detect pitch motion using computations analogous to those used by the visual system. We adapted stimuli from studies of vision to create novel auditory correlated noise stimuli that elicited robust pitch motion percepts. Crucially, these stimuli are inharmonic and possess no persistent features across frequency or time, but do possess positive or negative local spectrotemporal correlations in intensity. In psychophysical experiments, we found clear evidence that humans can judge pitch direction based only on positive or negative spectrotemporal intensity correlations. The key behavioral result-robust sensitivity to the negative spectrotemporal correlations-is a direct analogue of illusory "reverse-phi" motion in vision, and thus constitutes a new auditory illusion. Our behavioral results and computational modeling led us to hypothesize that human auditory processing may employ pitch direction opponency. fMRI measurements in auditory cortex supported this hypothesis. To link our psychophysical findings to real-world pitch perception, we analyzed recordings of English and Mandarin speech and found that pitch direction was robustly signaled by both positive and negative spectrotemporal correlations, suggesting that sensitivity to both types of correlations confers ecological benefits. Overall, this work reveals how motion detection algorithms sensitive to local correlations are deployed by the central nervous system across disparate modalities (vision and audition) and dimensions (space and frequency).
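One way to make the visual-motion analogy concrete is a Reichardt-style correlator applied to a spectrogram, as sketched below; the detector form and test stimulus are our own illustration, not the paper's model.

```python
# Toy Reichardt-style correlator over a spectrogram: local spectrotemporal
# intensity correlations as a signal of pitch direction (illustration only).
import numpy as np

def pitch_direction_signal(S):
    """S: spectrogram, shape (n_freq, n_time); returns up-minus-down energy."""
    up = S[:-1, :-1] * S[1:, 1:]    # correlate (f, t) with (f+1, t+1)
    down = S[1:, :-1] * S[:-1, 1:]  # correlate (f+1, t) with (f, t+1)
    return up.sum() - down.sum()    # >0 suggests rising pitch, <0 falling

rng = np.random.default_rng(1)
S = rng.random((64, 100))
for k in range(1, S.shape[1]):      # paint a rising diagonal ridge
    S[min(63, k * 64 // 100), k] += 5.0
print(pitch_direction_signal(S))    # positive for this rising stimulus
```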
7
Liu X, Wu X, Feng Y, Yang J, Gu N, Mei L. Neural representations of phonological information in bilingual language production. Cereb Cortex 2024; 34:bhae451. [PMID: 39545691 DOI: 10.1093/cercor/bhae451]
Abstract
Previous research has explored the neural mechanisms of bilinguals' language production, but most studies focused on the neural mechanisms of cognitive control during production. It therefore remains unclear which brain regions represent lexical information (especially phonological information) during production and how they are affected by language context. To address these questions, we used representational similarity analysis to explore neural representations of phonological information in the native language (L1) and the second language (L2) in single- and mixed-language contexts. Results showed that Chinese-English bilinguals performed worse behaviorally and exhibited more activation in brain regions associated with language processing and cognitive control in the mixed-language context relative to the single-language context. Further representational similarity analysis revealed that phonological representations of L1 were detected in the left pars opercularis, middle frontal gyrus, and anterior supramarginal gyrus, while phonological representations of L2 were detected in the bilateral occipitotemporal cortex, regardless of the target language. More interestingly, robust phonological representations of L1 were observed in brain areas related to phonological processing during L2 production, regardless of language context. These results provide direct neuroimaging evidence for the nonselective processing hypothesis and highlight the superiority of phonological representations in the dominant language during bilingual language production.
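For readers unfamiliar with representational similarity analysis, the sketch below shows the generic recipe (condensed dissimilarity matrices compared with Spearman correlation) on simulated activity patterns and a hypothetical phonological feature model; it is not the authors' code.

```python
# Schematic representational similarity analysis (RSA) on simulated data.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_items, n_voxels = 40, 120
patterns = rng.standard_normal((n_items, n_voxels))   # item x voxel activity
phon_features = rng.standard_normal((n_items, 20))    # phonological model

neural_rdm = pdist(patterns, metric="correlation")    # condensed RDM
model_rdm = pdist(phon_features, metric="correlation")
rho, p = spearmanr(neural_rdm, model_rdm)
print(f"RSA correlation: rho = {rho:.3f}, p = {p:.3f}")
```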
Affiliation(s)
- Xiaoyu Liu
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- School of Psychology, Zhejiang Normal University, 688 Yingbin Road, 321000 Jinhua, China
- Xiaoyan Wu
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Center for Studies of Psychological Application, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- School of Psychology, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Yuan Feng
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Center for Studies of Psychological Application, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- School of Psychology, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Jingyu Yang
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Center for Studies of Psychological Application, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- School of Psychology, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Nannan Gu
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Center for Studies of Psychological Application, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- School of Psychology, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Leilei Mei
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Center for Studies of Psychological Application, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
- School of Psychology, South China Normal University, 55 West of Zhongshan Avenue, 510631 Guangzhou, China
8
Linazi G, Li S, Qu M, Xi Y. Dynamic degree centrality in stroke-induced Broca's aphasia varies based on first language: A functional MRI study. J Neuroimaging 2024; 34:732-741. [PMID: 39175169 DOI: 10.1111/jon.13231]
Abstract
BACKGROUND AND PURPOSE This study sought to explore dynamic degree centrality (DC) variability in particular brain regions in patients with poststroke Broca aphasia (BA) using resting-state functional magnetic resonance imaging (rs-fMRI), comparing differences between Uyghur- and Chinese-speaking BA patients. METHODS The study investigated two factors, language and BA status, dividing participants into four groups that underwent rs-fMRI analysis: Uyghur aphasia patients (UA), Uyghur normal control subjects (UN), Chinese aphasia patients (CA), and Chinese normal subjects (CN). Two-way analysis of variance (ANOVA) was used to assess overall differences in dynamic DC among these four groups. Correlations between DC and language behavior were assessed with partial correlation analyses. RESULTS Two-way ANOVA revealed consistent pairwise differences in dynamic DC variability among the four groups in the right middle frontal gyrus/orbital part (ORBmid.R), right dorsolateral superior frontal gyrus, and right precuneus (PCUN.R), as follows: UA < UN, CA > CN, UA < CA, and UN > CN (p < .05, except for the UA-UN comparison in the dorsolateral superior frontal gyrus). In contrast, the opposite pattern was observed in the right calcarine fissure and surrounding cortex (CAL.R; p < .05). CONCLUSION The observed enhancement of dynamic DC variability in ORBmid.R and PCUN.R among Chinese BA patients, and in CAL.R among Uyghur BA patients, may be attributable to language network restructuring. Overall, these results suggest that BA patients who speak languages from different families may differ in the network mechanisms underlying their language impairments.
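The language-by-aphasia design maps onto a standard two-way ANOVA; a sketch with statsmodels on fabricated per-subject dynamic-DC values follows (group sizes and effect sizes are invented).

```python
# Two-way language-by-aphasia ANOVA on fabricated dynamic-DC values.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "language": np.repeat(["Uyghur", "Chinese"], 40),
    "aphasia": np.tile(np.repeat(["patient", "control"], 20), 2),
})
df["dc"] = rng.normal(0.5, 0.1, len(df)) + (df["aphasia"] == "patient") * 0.05

model = smf.ols("dc ~ C(language) * C(aphasia)", data=df).fit()
print(anova_lm(model, typ=2))  # main effects and the interaction term
```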
Affiliation(s)
- Gu Linazi
- Department of Rehabilitation Medicine, First Affiliated Hospital of Xinjiang Medical University, Wulumuqi, China
- Sijing Li
- Department of Rehabilitation Medicine, First Affiliated Hospital of Xinjiang Medical University, Wulumuqi, China
- Mei Qu
- Department of Rehabilitation Medicine, Shanghai Pudong New Area Guangming Hospital of Traditional Chinese Medicine, Shanghai, China
- Yanling Xi
- Department of Rehabilitation Medicine, Shanghai Pudong New Area Guangming Hospital of Traditional Chinese Medicine, Shanghai, China
9
Silva AB, Liu JR, Metzger SL, Bhaya-Grossman I, Dougherty ME, Seaton MP, Littlejohn KT, Tu-Chan A, Ganguly K, Moses DA, Chang EF. A bilingual speech neuroprosthesis driven by cortical articulatory representations shared between languages. Nat Biomed Eng 2024; 8:977-991. [PMID: 38769157 PMCID: PMC11554235 DOI: 10.1038/s41551-024-01207-5]
Abstract
Advancements in decoding speech from brain activity have focused on decoding a single language. Hence, the extent to which bilingual speech production relies on unique or shared cortical activity across languages has remained unclear. Here, we leveraged electrocorticography, along with deep-learning and statistical natural-language models of English and Spanish, to record and decode activity from speech-motor cortex of a Spanish-English bilingual with vocal-tract and limb paralysis into sentences in either language. This was achieved without requiring the participant to manually specify the target language. Decoding models relied on shared vocal-tract articulatory representations across languages, which allowed us to build a syllable classifier that generalized across a shared set of English and Spanish syllables. Transfer learning expedited training of the bilingual decoder by enabling neural data recorded in one language to improve decoding in the other language. Overall, our findings suggest shared cortical articulatory representations that persist after paralysis and enable the decoding of multiple languages without the need to train separate language-specific decoders.
Affiliation(s)
- Alexander B Silva
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
- Jessie R Liu
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
- Sean L Metzger
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
- Ilina Bhaya-Grossman
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
- Maximilian E Dougherty
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Margaret P Seaton
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Kaylo T Littlejohn
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Adelyn Tu-Chan
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Karunesh Ganguly
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- David A Moses
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
10
Kurumada C, Rivera R, Allen P, Bennetto L. Perception and adaptation of receptive prosody in autistic adolescents. Sci Rep 2024; 14:16409. [PMID: 39013983 PMCID: PMC11252140 DOI: 10.1038/s41598-024-66569-x]
Abstract
A fundamental aspect of language processing is inferring others' minds from subtle variations in speech. The same word or sentence can often convey different meanings depending on its tempo, timing, and intonation-features often referred to as prosody. Although autistic children and adults are known to experience difficulty in making such inferences, it remains unclear why. We hypothesize that detail-oriented perception in autism may interfere with the inference process if it lacks the adaptivity required to cope with the variability ubiquitous in human speech. Using a novel prosodic continuum that shifts the sentence meaning gradiently from a statement (e.g., "It's raining") to a question (e.g., "It's raining?"), we investigated the perception and adaptation of receptive prosody in autistic adolescents and two groups of non-autistic controls. Autistic adolescents showed attenuated adaptivity in categorizing prosody, whereas they were equivalent to controls in discrimination accuracy. Combined with recent findings in segmental (e.g., phoneme) recognition, the current results provide the basis for an emerging research framework in which attenuated flexibility and reduced influence of contextual feedback are a possible source of the deficits that hinder linguistic and social communication in autism.
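Categorization along such a continuum is typically summarized by fitting a psychometric function; the sketch below fits a logistic to invented question-response proportions, where a boundary shift or slope change would index adaptation.

```python
# Fitting a logistic psychometric function along a statement-to-question
# prosodic continuum; step values and response proportions are invented.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

steps = np.arange(1, 8)  # continuum steps from clear statement to clear question
p_question = np.array([0.02, 0.05, 0.2, 0.55, 0.8, 0.95, 0.99])
(x0, k), _ = curve_fit(logistic, steps, p_question, p0=[4.0, 1.0])
print(f"category boundary = {x0:.2f}, slope = {k:.2f}")
# A smaller boundary shift after exposure to shifted speech would indicate
# attenuated adaptation of the prosodic category.
```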
Affiliation(s)
- Chigusa Kurumada
- Brain and Cognitive Sciences, University of Rochester, Rochester, 14627, USA
- Rachel Rivera
- Psychology, University of Rochester, Rochester, 14627, USA
- Paul Allen
- Psychology, University of Rochester, Rochester, 14627, USA
- Otolaryngology, University of Rochester Medical Center, Rochester, 14642, USA
- Loisa Bennetto
- Psychology, University of Rochester, Rochester, 14627, USA
11
Roark CL, Paulon G, Rebaudo G, McHaney JR, Sarkar A, Chandrasekaran B. Individual differences in working memory impact the trajectory of non-native speech category learning. PLoS One 2024; 19:e0297917. [PMID: 38857268 PMCID: PMC11164376 DOI: 10.1371/journal.pone.0297917]
Abstract
What is the role of working memory over the course of non-native speech category learning? Prior work has predominantly focused on how working memory might influence learning assessed at a single timepoint. Here, we substantially extend this prior work by examining the role of working memory on speech learning performance over time (i.e., over several months) and leverage a multifaceted approach that provides key insights into how working memory influences learning accuracy, maintenance of knowledge over time, generalization ability, and decision processes. We found that the role of working memory in non-native speech learning depends on the timepoint of learning and whether individuals learned the categories at all. Among learners, across all stages of learning, working memory was associated with higher accuracy as well as faster and slightly more cautious decision making. Further, while learners and non-learners did not have substantially different working memory performance, learners had faster evidence accumulation and more cautious decision thresholds throughout all sessions. Working memory may enhance learning by facilitating rapid category acquisition in initial stages and enabling faster and slightly more careful decision-making strategies that may reduce the overall effort needed to learn. Our results have important implications for developing interventions to improve learning in naturalistic language contexts.
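The evidence-accumulation and decision-threshold account referenced above can be illustrated with a bare-bones drift-diffusion simulation; the parameter values below are arbitrary, chosen only to show how higher drift speeds responses.

```python
# Minimal drift-diffusion simulation: evidence accumulates with drift plus
# noise until a decision threshold is crossed (illustrative parameters).
import numpy as np

def simulate_ddm(drift, threshold, dt=0.001, noise=1.0, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    x, t = 0.0, 0.0
    while abs(x) < threshold:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return t, x > 0  # (response time, correct choice)

rng = np.random.default_rng(0)
for drift in (0.5, 1.5):  # low vs high evidence-accumulation rate
    trials = [simulate_ddm(drift, threshold=1.0, rng=rng) for _ in range(500)]
    rts, correct = zip(*trials)
    print(f"drift={drift}: mean RT={np.mean(rts):.2f}s, accuracy={np.mean(correct):.2f}")
```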
Affiliation(s)
- Casey L. Roark
- Communication Science & Disorders, University of Pittsburgh, Pittsburgh, PA, United States of America
- Center for the Neural Basis of Cognition, Pittsburgh, PA, United States of America
- Giorgio Paulon
- Statistics and Data Sciences, University of Texas at Austin, Austin, TX, United States of America
- Giovanni Rebaudo
- Statistics and Data Sciences, University of Texas at Austin, Austin, TX, United States of America
- Jacie R. McHaney
- Communication Science & Disorders, University of Pittsburgh, Pittsburgh, PA, United States of America
- Abhra Sarkar
- Statistics and Data Sciences, University of Texas at Austin, Austin, TX, United States of America
- Bharath Chandrasekaran
- Communication Science & Disorders, University of Pittsburgh, Pittsburgh, PA, United States of America
- Center for the Neural Basis of Cognition, Pittsburgh, PA, United States of America
12
Di Y, Mefford J, Rahmani E, Wang J, Ravi V, Gorla A, Alwan A, Zhu T, Flint J. Genetic association analysis of human median voice pitch identifies a common locus for tonal and non-tonal languages. Commun Biol 2024; 7:540. [PMID: 38714798 PMCID: PMC11076565 DOI: 10.1038/s42003-024-06198-2]
Abstract
The genetic influence on human vocal pitch in tonal and non-tonal languages remains largely unknown. In tonal languages, such as Mandarin Chinese, pitch changes differentiate word meanings, whereas in non-tonal languages, such as Icelandic, pitch is used to convey intonation. We addressed this question by searching for genetic associations with interindividual variation in median pitch in a Chinese major depression case-control cohort and compared our results with a genome-wide association study from Iceland. The same genetic variant, rs11046212-T in an intron of the ABCC9 gene, was one of the most strongly associated loci with median pitch in both samples. Our meta-analysis revealed four genome-wide significant hits, including two novel associations. The discovery of genetic variants influencing vocal pitch across both tonal and non-tonal languages suggests the possibility of a common genetic contribution to the human vocal system shared in two distinct populations with languages that differ in tonality (Icelandic and Mandarin).
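Median voice pitch, the phenotype analyzed here, can be extracted from a recording with a standard F0 tracker; the sketch below uses librosa's pYIN implementation with a placeholder file name, and is not the authors' pipeline.

```python
# Extracting a speaker's median F0 with librosa's pYIN tracker.
import librosa
import numpy as np

y, sr = librosa.load("speech_sample.wav", sr=16000)  # placeholder file
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr)
median_pitch = np.nanmedian(f0[voiced_flag])  # median over voiced frames only
print(f"median F0 = {median_pitch:.1f} Hz")
```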
Affiliation(s)
- Yazheng Di
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, 100101, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Joel Mefford
- Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
- Elior Rahmani
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Jinhan Wang
- Department of Electrical and Computer Engineering, University of California Los Angeles, Los Angeles, CA, USA
- Vijay Ravi
- Department of Electrical and Computer Engineering, University of California Los Angeles, Los Angeles, CA, USA
- Aditya Gorla
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Abeer Alwan
- Department of Electrical and Computer Engineering, University of California Los Angeles, Los Angeles, CA, USA
- Tingshao Zhu
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, 100101, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Jonathan Flint
- Department of Psychiatry and Biobehavioral Sciences, Brain Research Institute, University of California Los Angeles, Los Angeles, CA, USA
13
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. Sci Adv 2024; 10:eadk0010. [PMID: 38363839 PMCID: PMC10871532 DOI: 10.1126/sciadv.adk0010]
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception varies along several pitch-based dimensions: (i) the absolute pitch of notes, (ii) the difference in pitch between successive notes, and (iii) the statistical expectation of each note given prior context. How the brain represents these dimensions and whether their encoding is specialized for music remains unknown. We recorded high-density neurophysiological activity directly from the human auditory cortex while participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial map for representing distinct melodic dimensions. The same participants listened to spoken English, and we compared responses to music and speech. Cortical sites selective for music encoded expectation, while sites that encoded pitch and pitch-change in music used the same neural code to represent equivalent properties of speech. Findings reveal how the perception of melody recruits both music-specific and general-purpose sound representations.
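Note-level expectation is often quantified as surprisal under a statistical model of melodic continuations; the toy bigram version below is a much-simplified stand-in for the models used in studies like this one, with an invented miniature corpus.

```python
# Toy note-level expectation: bigram surprisal of each pitch given the
# previous one, estimated from a tiny invented corpus of MIDI pitches.
from collections import Counter
import math

corpus = [[60, 62, 64, 65, 67], [60, 62, 64, 62, 60], [67, 65, 64, 62, 60]]
bigrams, contexts = Counter(), Counter()
for melody in corpus:
    for prev, nxt in zip(melody, melody[1:]):
        bigrams[(prev, nxt)] += 1
        contexts[prev] += 1

def surprisal(prev, nxt, alpha=1.0, vocab=128):
    # Additive smoothing over the 128 MIDI pitches
    p = (bigrams[(prev, nxt)] + alpha) / (contexts[prev] + alpha * vocab)
    return -math.log2(p)  # high surprisal = unexpected note

phrase = [60, 62, 64, 71]  # the leap to 71 should be most surprising
print([round(surprisal(a, b), 2) for a, b in zip(phrase, phrase[1:])])
```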
Affiliation(s)
- Narayan Sankaran
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K. Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Frederic Theunissen
- Department of Psychology, University of California, Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
- Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
14
Silva Pereira S, Özer EE, Sebastian-Galles N. Complexity of STG signals and linguistic rhythm: a methodological study for EEG data. Cereb Cortex 2024; 34:bhad549. [PMID: 38236741 DOI: 10.1093/cercor/bhad549]
Abstract
The superior temporal and Heschl's gyri of the human brain play a fundamental role in speech processing. Neurons synchronize their activity to the amplitude envelope of the speech signal to extract acoustic and linguistic features, a process known as neural tracking/entrainment. Electroencephalography has been used extensively in language-related research due to its high temporal resolution and low cost, but it does not allow for precise source localization. Motivated by the lack of a unified methodology for interpreting source-reconstructed signals, we propose a method based on modularity and signal complexity. The procedure was tested on data from an experiment in which we investigated the impact of native language on tracking of linguistic rhythms in two groups: English natives and Spanish natives. In the experiment, we found no effect of native language but an effect of language rhythm. Here, we compare source-projected signals in the auditory areas of both hemispheres across conditions using nonparametric permutation tests, modularity, and a dynamical complexity measure. We found increasing complexity values for decreasing regularity in the stimuli, allowing us to conclude that languages with less complex rhythms are easier for the auditory cortex to track.
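The abstract names a dynamical complexity measure without specifying it; as one concrete (assumed) example from that family, the sketch below computes normalized Lempel-Ziv complexity of a median-binarized signal.

```python
# Normalized Lempel-Ziv (LZ76) complexity of a binarized signal; a generic
# sketch of a dynamical complexity measure, not necessarily the study's own.
import numpy as np

def lempel_ziv_complexity(binary):
    s = "".join(map(str, binary))
    i, c, n = 0, 0, len(s)
    while i < n:
        k = 1
        # extend the phrase while it still occurs in the preceding sequence
        while i + k <= n and s[i:i + k] in s[:i + k - 1]:
            k += 1
        c += 1
        i += k
    return c / (n / np.log2(n))  # normalize so random sequences approach 1

rng = np.random.default_rng(0)
signal = rng.standard_normal(4000)
binary = (signal > np.median(signal)).astype(int)  # median binarization
print(f"normalized LZ complexity = {lempel_ziv_complexity(binary):.2f}")
```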
Affiliation(s)
- Silvana Silva Pereira
- Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
- Ege Ekin Özer
- Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
- Nuria Sebastian-Galles
- Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
15
Li Y, Anumanchipalli GK, Mohamed A, Chen P, Carney LH, Lu J, Wu J, Chang EF. Dissecting neural computations in the human auditory pathway using deep neural networks for speech. Nat Neurosci 2023; 26:2213-2225. [PMID: 37904043 PMCID: PMC10689246 DOI: 10.1038/s41593-023-01468-4]
Abstract
The human auditory system extracts rich linguistic abstractions from speech signals. Traditional approaches to understanding this complex process have used linear feature-encoding models, with limited success. Artificial neural networks excel in speech recognition tasks and offer promising computational models of speech processing. We used speech representations in state-of-the-art deep neural network (DNN) models to investigate neural coding from the auditory nerve to the speech cortex. Representations in hierarchical layers of the DNN correlated well with the neural activity throughout the ascending auditory system. Unsupervised speech models performed at least as well as other purely supervised or fine-tuned models. Deeper DNN layers were better correlated with the neural activity in the higher-order auditory cortex, with computations aligned with phonemic and syllabic structures in speech. Accordingly, DNN models trained on either English or Mandarin predicted cortical responses in native speakers of each language. These results reveal convergence between DNN model representations and the biological auditory pathway, offering new approaches for modeling neural coding in the auditory cortex.
Affiliation(s)
- Yuanning Li
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China
- Gopala K Anumanchipalli
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Department of Electrical Engineering and Computer Science, University of California, Berkeley, Berkeley, CA, USA
- Peili Chen
- School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China
- Laurel H Carney
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA
- Junfeng Lu
- Neurologic Surgery Department, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
- Brain Function Laboratory, Neurosurgical Institute, Fudan University, Shanghai, China
- Jinsong Wu
- Neurologic Surgery Department, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
- Brain Function Laboratory, Neurosurgical Institute, Fudan University, Shanghai, China
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
16
Ni G, Xu Z, Bai Y, Zheng Q, Zhao R, Wu Y, Ming D. EEG-based assessment of temporal fine structure and envelope effect in Mandarin syllable and tone perception. Cereb Cortex 2023; 33:11287-11299. [PMID: 37804238 DOI: 10.1093/cercor/bhad366]
Abstract
In recent years, speech perception research has benefited from low-frequency neural entrainment tracking of the speech envelope. However, the respective roles of the speech envelope and the temporal fine structure in speech perception remain controversial, especially in Mandarin. This study aimed to examine how Mandarin syllable and tone perception depend on the speech envelope and the temporal fine structure. We recorded the electroencephalogram (EEG) of subjects under three acoustic conditions constructed with auditory chimera analysis: (i) the original speech; (ii) the speech envelope modulating a sinusoidal carrier; and (iii) the speech temporal fine structure modulated by a non-speech (white noise) envelope. We found that syllable perception mainly depended on the speech envelope, while tone perception depended on the temporal fine structure. Delta-band activity was prominent, and the parietal and prefrontal lobes were the main activated brain areas, for both syllable and tone perception. Finally, we decoded the spatiotemporal features of Mandarin perception from the microstate sequence. The spatiotemporal feature sequence of the EEG evoked by each speech material was found to be specific, suggesting a new perspective for future auditory brain-computer interfaces. These results provide a new scheme for the coding strategies of hearing aids for native Mandarin speakers.
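The envelope/fine-structure split behind auditory chimeras comes from the Hilbert decomposition; the single-band sketch below pairs one sound's envelope with another's temporal fine structure (published chimeras are usually built per band in a filterbank, and the stimuli here are synthetic).

```python
# Single-band auditory-chimera sketch via the Hilbert decomposition:
# envelope of one sound paired with the temporal fine structure of another.
import numpy as np
from scipy.signal import hilbert

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
speech_like = np.sin(2 * np.pi * 200 * t) * (1 + 0.8 * np.sin(2 * np.pi * 4 * t))
noise = np.random.default_rng(0).standard_normal(t.size)

def env_tfs(x):
    analytic = hilbert(x)
    return np.abs(analytic), np.cos(np.angle(analytic))  # envelope, TFS

env_speech, tfs_speech = env_tfs(speech_like)
env_noise, tfs_noise = env_tfs(noise)
chimera_env = env_speech * tfs_noise   # keeps the speech envelope
chimera_tfs = env_noise * tfs_speech   # keeps the speech fine structure
print(chimera_env.shape, chimera_tfs.shape)
```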
Affiliation(s)
- Guangjian Ni
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin, 300072, China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin, 300392, China
- Zihao Xu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin, 300072, China
- Yanru Bai
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin, 300072, China
- Qi Zheng
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Ran Zhao
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin, 300072, China
- Yubo Wu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin, 300072, China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin, 300392, China
17
Lu J, Li Y, Zhao Z, Liu Y, Zhu Y, Mao Y, Wu J, Chang EF. Neural control of lexical tone production in human laryngeal motor cortex. Nat Commun 2023; 14:6917. [PMID: 37903780 PMCID: PMC10616086 DOI: 10.1038/s41467-023-42175-9]
Abstract
In tonal languages, which are spoken by nearly one-third of the world's population, speakers precisely control the tension of vocal folds in the larynx to modulate pitch in order to distinguish words with completely different meanings. The specific pitch trajectories for a given tonal language are called lexical tones. Here, we used high-density direct cortical recordings to determine the neural basis of lexical tone production in native Mandarin-speaking participants. We found that instead of a tone category-selective coding, local populations in the bilateral laryngeal motor cortex (LMC) encode articulatory kinematic information to generate the pitch dynamics of lexical tones. Using a computational model of tone production, we discovered two distinct patterns of population activity in LMC commanding pitch rising and lowering. Finally, we showed that direct electrocortical stimulation of different local populations in LMC evoked pitch rising and lowering during tone production, respectively. Together, these results reveal the neural basis of vocal pitch control of lexical tones in tonal languages.
Affiliation(s)
- Junfeng Lu
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Yuanning Li
- School of Biomedical Engineering, ShanghaiTech University, Shanghai, 201210, China
- Department of Neurological Surgery, University of California, San Francisco, CA, 94143, USA
- Weill Institute for Neurosciences, University of California, San Francisco, CA, 94158, USA
- State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, 201210, China
- Zehao Zhao
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Yan Liu
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Yanming Zhu
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Speech and Hearing Bioscience & Technology Program, Division of Medical Sciences, Harvard University, Boston, MA, 02215, USA
- Ying Mao
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Jinsong Wu
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, CA, 94143, USA
- Weill Institute for Neurosciences, University of California, San Francisco, CA, 94158, USA
18
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. bioRxiv 2023:2023.10.17.562771. [PMID: 37905047 PMCID: PMC10614915 DOI: 10.1101/2023.10.17.562771]
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception of melody varies along several pitch-based dimensions: (1) the absolute pitch of notes, (2) the difference in pitch between successive notes, and (3) the higher-order statistical expectation of each note conditioned on its prior context. While humans readily perceive melody, how these dimensions are collectively represented in the brain and whether their encoding is specialized for music remains unknown. Here, we recorded high-density neurophysiological activity directly from the surface of human auditory cortex while Western participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial code for representing distinct dimensions of melody. The same participants listened to spoken English, and we compared evoked responses to music and speech. Cortical sites selective for music were systematically driven by the encoding of expectation. In contrast, sites that encoded pitch and pitch-change used the same neural code to represent equivalent properties of speech. These findings reveal the multidimensional nature of melody encoding, consisting of both music-specific and domain-general sound representations in auditory cortex. Teaser: The human brain contains both general-purpose and music-specific neural populations for processing distinct attributes of melody.
19
Schroeder ML, Sherafati A, Ulbrich RL, Wheelock MD, Svoboda AM, Klein ED, George TG, Tripathy K, Culver JP, Eggebrecht AT. Mapping cortical activations underlying covert and overt language production using high-density diffuse optical tomography. Neuroimage 2023; 276:120190. [PMID: 37245559 PMCID: PMC10760405 DOI: 10.1016/j.neuroimage.2023.120190]
Abstract
Gold standard neuroimaging modalities such as functional magnetic resonance imaging (fMRI), positron emission tomography (PET), and more recently electrocorticography (ECoG) have provided profound insights regarding the neural mechanisms underlying the processing of language, but they are limited in applications involving naturalistic language production, especially in developing brains, during face-to-face dialogues, or as a brain-computer interface. High-density diffuse optical tomography (HD-DOT) provides high-fidelity mapping of human brain function with spatial resolution comparable to that of fMRI but in a silent and open scanning environment similar to real-life social scenarios. HD-DOT therefore has potential for use in naturalistic settings where other neuroimaging modalities are limited. While HD-DOT has previously been validated against fMRI for mapping the neural correlates underlying language comprehension and covert (i.e., "silent") language production, it has not yet been established for mapping cortical responses to overt (i.e., "out loud") language production. In this study, we assessed the brain regions supporting a simple hierarchy of language tasks: silent reading of single words, covert production of verbs, and overt production of verbs in normal-hearing, right-handed native English speakers (n = 33). First, we found that HD-DOT brain mapping is resilient to the movement associated with overt speaking. Second, we observed that HD-DOT is sensitive to key activations and deactivations in brain function underlying the perception and naturalistic production of language. Specifically, after stringent cluster-extent-based thresholding, statistically significant results showed recruitment of regions in occipital, temporal, motor, and prefrontal cortices across all three tasks. Our findings lay the foundation for future HD-DOT studies of naturalistic language comprehension and production during real-life social interactions and for broader applications such as presurgical language assessment and brain-machine interfaces.
Affiliation(s)
- Mariel L Schroeder
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA; Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Arefeh Sherafati
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Rachel L Ulbrich
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA; University of Missouri School of Medicine, Columbia, MO, USA
- Muriah D Wheelock
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA
- Alexandra M Svoboda
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA; University of Cincinnati Medical Center, Cincinnati, OH, USA
- Emma D Klein
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA; Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tessa G George
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA
- Kalyan Tripathy
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA; Washington University School of Medicine, St Louis, MO, USA
- Joseph P Culver
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA; Division of Biology & Biomedical Sciences, Washington University School of Medicine, St Louis, MO, USA; Department of Physics, Washington University in St. Louis, St Louis, MO, USA; Department of Biomedical Engineering, Washington University in St. Louis, St Louis, MO, USA
- Adam T Eggebrecht
  Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, MO, USA; Division of Biology & Biomedical Sciences, Washington University School of Medicine, St Louis, MO, USA; Department of Biomedical Engineering, Washington University in St. Louis, St Louis, MO, USA
20
Tao DD, Shi B, Galvin JJ, Liu JS, Fu QJ. Frequency detection, frequency discrimination, and spectro-temporal pattern perception in older and younger typically hearing adults. Heliyon 2023; 9:e18922. [PMID: 37583764 PMCID: PMC10424075 DOI: 10.1016/j.heliyon.2023.e18922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2023] [Revised: 07/14/2023] [Accepted: 08/02/2023] [Indexed: 08/17/2023] Open
Abstract
Elderly adults often experience difficulties in speech understanding, possibly due to age-related deficits in frequency perception. It is unclear whether age-related deficits in frequency perception differ between the apical and basal regions of the cochlea. It is also unclear how aging might differently affect frequency discrimination or detection of a change in frequency within a stimulus. In the present study, pure-tone frequency thresholds were measured in 19 older (61-74 years) and 20 younger (22-28 years) typically hearing adults. Participants were asked to discriminate between reference and probe frequencies or to detect changes in frequency within a probe stimulus. Broadband spectro-temporal pattern perception was also measured using the spectro-temporal modulated ripple test (SMRT). Frequency thresholds were significantly poorer in the basal than in the apical region of the cochlea; the deficit in the basal region was 2 times larger for the older than for the younger group. Frequency thresholds were significantly poorer in the older group, especially in the basal region, where frequency detection thresholds were 3.9 times poorer for the older than for the younger group. SMRT thresholds were 1.5 times better for the younger than for the older group. Significant age effects were observed for SMRT thresholds and for frequency thresholds only in the basal region. SMRT thresholds were significantly correlated with frequency thresholds only in the older group. Poorer frequency and spectro-temporal pattern perception may contribute to age-related deficits in speech perception, even when audiometric thresholds are nearly normal.
Affiliation(s)
- Duo-Duo Tao
  Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Bin Shi
  Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- John J. Galvin
  House Institute Foundation, Los Angeles, CA, 90057, USA
  University Hospital Center of Tours, Tours, 37000, France
- Ji-Sheng Liu
  Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Qian-Jie Fu
  Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
21
Zheng Y, Li Q, Gong B, Xia Y, Lu X, Liu Y, Wu H, She S, Wu C. Negative-emotion-induced reduction in speech-in-noise recognition is associated with source-monitoring deficits and psychiatric symptoms in Mandarin-speaking patients with schizophrenia. Compr Psychiatry 2023; 124:152395. [PMID: 37216805 DOI: 10.1016/j.comppsych.2023.152395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/13/2022] [Revised: 05/03/2023] [Accepted: 05/15/2023] [Indexed: 05/24/2023] Open
Abstract
BACKGROUND Patients with schizophrenia (SCH) have deficits in source monitoring (SM), speech-in-noise recognition (SR), and auditory prosody recognition. This study aimed to test the covariation between SM and SR alterations induced by negative prosodies and their association with psychiatric symptoms in SCH. METHODS Fifty-four SCH patients and 59 healthy controls (HCs) underwent a speech SM task, an SR task, and assessment with the Positive and Negative Syndrome Scale (PANSS). We used multivariate partial least squares (PLS) regression to explore the associations among SM (external/internal/new attribution error [AE] and response bias [RB]), SR alteration/release induced by four negative-emotion (sad, angry, fear, and disgust) prosodies of the target speech, and psychiatric symptoms. RESULTS In SCH, but not HCs, a profile (linear combination) of SM measures (especially the external-source RB) was positively associated with a profile of SR reductions (induced especially by the angry prosody). Moreover, two SR reduction profiles (especially in the anger and sadness conditions) were related to two profiles of psychiatric symptoms (negative symptoms, lack of insight, and emotional disturbances). The two PLS components explained 50.4% of the total variance of the release-symptom association. CONCLUSION Compared to HCs, patients with SCH are more likely to perceive external-source speech as internal/new-source speech. The SM-related SR reduction induced by the angry prosody was mainly associated with negative symptoms. These findings help in understanding the psychopathology of SCH and may provide a potential direction for improving negative symptoms by minimizing emotional SR reduction in schizophrenia.
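The profile-to-profile association described here is the standard use case for PLS. A minimal sketch with scikit-learn's PLSRegression follows; the matrix shapes and random data are hypothetical placeholders for the source-monitoring and SR-reduction measures.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Hypothetical shapes: X = source-monitoring measures (n_patients x 4 AE/RB scores),
# Y = SR reductions under the four negative prosodies (n_patients x 4).
rng = np.random.default_rng(0)
X = rng.normal(size=(54, 4))
Y = rng.normal(size=(54, 4))

pls = PLSRegression(n_components=2)  # two PLS components, as in the abstract
pls.fit(X, Y)
X_scores, Y_scores = pls.transform(X, Y)

# Correlation between the paired latent profiles (linear combinations)
for c in range(2):
    r = np.corrcoef(X_scores[:, c], Y_scores[:, c])[0, 1]
    print(f"component {c + 1}: r = {r:.2f}")
```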
Affiliation(s)
- Yingjun Zheng
  The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou 510145, Guangdong, China
- Qiuhong Li
  Peking University School of Nursing, Beijing 100191, China
- Bingyan Gong
  Peking University School of Nursing, Beijing 100191, China
- Yu Xia
  The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou 510145, Guangdong, China
- Xiaohua Lu
  The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou 510145, Guangdong, China
- Yi Liu
  The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou 510145, Guangdong, China
- Huawang Wu
  The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou 510145, Guangdong, China
- Shenglin She
  The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou 510145, Guangdong, China
- Chao Wu
  Peking University School of Nursing, Beijing 100191, China
22
Liu Y, Zhao Z, Xu M, Yu H, Zhu Y, Zhang J, Bu L, Zhang X, Lu J, Li Y, Ming D, Wu J. Decoding and synthesizing tonal language speech from brain activity. SCIENCE ADVANCES 2023; 9:eadh0478. [PMID: 37294753 PMCID: PMC10256166 DOI: 10.1126/sciadv.adh0478] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Accepted: 05/03/2023] [Indexed: 06/11/2023]
Abstract
Recent studies have shown the feasibility of speech brain-computer interfaces (BCIs) as a clinically valid treatment for helping nontonal-language patients with communication disorders restore their speech ability. However, tonal language speech BCIs are challenging because producing lexical tones requires additional precise control of laryngeal movements. Thus, the model should emphasize features from tone-related cortical areas. Here, we designed a modularized multistream neural network that directly synthesizes tonal language speech from intracranial recordings. The network decoded lexical tones and base syllables independently via parallel streams of neural network modules inspired by neuroscience findings. The speech was synthesized by combining tonal syllable labels with nondiscriminant speech neural activity. Compared to commonly used baseline models, our proposed models achieved higher performance with modest training data and computational costs. These findings suggest a potential strategy for approaching tonal language speech restoration.
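The parallel-stream idea (separate modules decoding lexical tone and base syllable) can be sketched in a few lines of PyTorch. This is a minimal illustration of the concept only, not the authors' architecture; every layer size, channel split, and class count below is a hypothetical placeholder.

```python
import torch
import torch.nn as nn

class TwoStreamDecoder(nn.Module):
    """Minimal sketch of a parallel-stream decoder: one stream classifies the
    lexical tone, another the base syllable; the labels are combined downstream.
    All sizes are illustrative, not taken from the paper."""
    def __init__(self, n_tone_ch=64, n_syll_ch=64, n_tones=4, n_sylls=400):
        super().__init__()
        self.tone_stream = nn.Sequential(
            nn.Conv1d(n_tone_ch, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, n_tones))
        self.syll_stream = nn.Sequential(
            nn.Conv1d(n_syll_ch, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(64, n_sylls))

    def forward(self, x_tone, x_syll):
        # x_*: (batch, channels, time) neural activity from each electrode subset
        return self.tone_stream(x_tone), self.syll_stream(x_syll)

model = TwoStreamDecoder()
tone_logits, syll_logits = model(torch.randn(8, 64, 200), torch.randn(8, 64, 200))
```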
Affiliation(s)
- Yan Liu
  Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China
  National Center for Neurological Disorders, Shanghai 200052, China
  Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China
  Neurosurgical Institute of Fudan University, Shanghai 200052, China
- Zehao Zhao
  Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China
  National Center for Neurological Disorders, Shanghai 200052, China
  Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China
  Neurosurgical Institute of Fudan University, Shanghai 200052, China
- Minpeng Xu
  Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300041, China
  Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300041, China
- Haiqing Yu
  Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300041, China
- Yanming Zhu
  Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China
  National Center for Neurological Disorders, Shanghai 200052, China
  Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China
  Neurosurgical Institute of Fudan University, Shanghai 200052, China
- Jie Zhang
  Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China
  National Center for Neurological Disorders, Shanghai 200052, China
  Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China
  Neurosurgical Institute of Fudan University, Shanghai 200052, China
- Linghao Bu
  Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China
  National Center for Neurological Disorders, Shanghai 200052, China
  Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China
  Neurosurgical Institute of Fudan University, Shanghai 200052, China
  Department of Neurosurgery, First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310000, China
- Xiaoluo Zhang
  Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China
  National Center for Neurological Disorders, Shanghai 200052, China
  Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China
  Neurosurgical Institute of Fudan University, Shanghai 200052, China
- Junfeng Lu
  Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China
  National Center for Neurological Disorders, Shanghai 200052, China
  Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China
  Neurosurgical Institute of Fudan University, Shanghai 200052, China
  MOE Frontiers Center for Brain Science, Huashan Hospital, Fudan University, Shanghai 200040, China
- Yuanning Li
  School of Biomedical Engineering, ShanghaiTech University, Shanghai 201210, China
- Dong Ming
  Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300041, China
  Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300041, China
- Jinsong Wu
  Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China
  National Center for Neurological Disorders, Shanghai 200052, China
  Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai 200040, China
  Neurosurgical Institute of Fudan University, Shanghai 200052, China
23
Hullett PW, Kandahari N, Shih TT, Kleen JK, Knowlton RC, Rao VR, Chang EF. Intact speech perception after resection of dominant hemisphere primary auditory cortex for the treatment of medically refractory epilepsy: illustrative case. JOURNAL OF NEUROSURGERY. CASE LESSONS 2022; 4:CASE22417. [PMID: 36443954 PMCID: PMC9705521 DOI: 10.3171/case22417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/26/2022] [Accepted: 10/27/2022] [Indexed: 11/29/2022]
Abstract
BACKGROUND In classic speech network models, the primary auditory cortex is the source of auditory input to Wernicke's area in the posterior superior temporal gyrus (pSTG). Because resection of the primary auditory cortex in the dominant hemisphere removes inputs to the pSTG, there is a risk of speech impairment. However, recent research has shown the existence of other, nonprimary auditory cortex inputs to the pSTG, potentially reducing the risk of primary auditory cortex resection in the dominant hemisphere. OBSERVATIONS Here, the authors present a clinical case of a woman with severe medically refractory epilepsy with a lesional epileptic focus in the left (dominant) Heschl's gyrus. Analysis of neural responses to speech stimuli was consistent with primary auditory cortex localization to Heschl's gyrus. Although the primary auditory cortex was within the proposed resection margins, she underwent lesionectomy with total resection of Heschl's gyrus. Postoperatively, she had no speech deficits and her seizures were fully controlled. LESSONS While resection of the dominant hemisphere Heschl's gyrus/primary auditory cortex warrants caution, this case illustrates the ability to resect the primary auditory cortex without speech impairment and supports recent models of multiple parallel inputs to the pSTG.
Affiliation(s)
- Patrick W. Hullett
  Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
- Nazineen Kandahari
  Department of Neurosurgery, University of California San Francisco, San Francisco, California; Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
- Tina T. Shih
  Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
- Jonathan K. Kleen
  Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
- Robert C. Knowlton
  Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
- Vikram R. Rao
  Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
- Edward F. Chang
  Department of Neurosurgery, University of California San Francisco, San Francisco, California
24
Xu Z, Bai Y, Zhao R, Zheng Q, Ni G, Ming D. Auditory attention decoding from EEG-based Mandarin speech envelope reconstruction. Hear Res 2022; 422:108552. [PMID: 35714555 DOI: 10.1016/j.heares.2022.108552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Revised: 06/01/2022] [Accepted: 06/08/2022] [Indexed: 11/23/2022]
Abstract
In the cocktail party scenario, the human auditory system extracts information from a specific speaker of interest and ignores others. Many studies have focused on auditory attention decoding (AAD), but the stimulus materials have mainly been non-tonal languages. We used a tonal language (Mandarin) as the speech stimulus and constructed a Long Short-Term Memory (LSTM) architecture for speech envelope reconstruction based on electroencephalogram (EEG) data. The correlation coefficient between the reconstructed and candidate envelopes was calculated to determine the subject's auditory attention. The proposed LSTM architecture outperformed the linear models. The average decoding accuracy in cross-subject and inter-subject cases varied from 63.02% to 74.29%, with the highest accuracy rate of 89.1% in a decision window of 0.15 s. In addition, the beta-band rhythm was found to play an essential role in distinguishing the attention and non-attention states. These results provide a new AAD architecture to help develop neuro-steered hearing devices, especially for tonal languages.
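The decision rule described here, pick the candidate envelope that best correlates with the EEG-reconstructed envelope within a decision window, is simple to state in code. The sketch below assumes the reconstruction step has already been done; the LSTM itself is omitted and the toy signals are placeholders.

```python
import numpy as np

def decode_attention(reconstructed, env_a, env_b):
    """Pick the attended stream as the candidate envelope that correlates
    most strongly with the EEG-reconstructed envelope in this window."""
    r_a = np.corrcoef(reconstructed, env_a)[0, 1]
    r_b = np.corrcoef(reconstructed, env_b)[0, 1]
    return ("A", r_a) if r_a >= r_b else ("B", r_b)

# Toy example with a 0.15 s decision window at 1 kHz (150 samples)
fs, win = 1000, 150
rng = np.random.default_rng(1)
env_a = rng.random(win)
env_b = rng.random(win)
reconstructed = env_a + 0.5 * rng.random(win)  # stand-in for the LSTM output
print(decode_attention(reconstructed, env_a, env_b))
```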
Affiliation(s)
- Zihao Xu
  Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China; Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China
- Yanru Bai
  Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China; Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China
- Ran Zhao
  Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China; Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China
- Qi Zheng
  Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072, China
- Guangjian Ni
  Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China; Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China; Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072, China
- Dong Ming
  Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China; Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China; Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072, China
25
Brodbeck C, Simon JZ. Cortical tracking of voice pitch in the presence of multiple speakers depends on selective attention. Front Neurosci 2022; 16:828546. [PMID: 36003957 PMCID: PMC9393379 DOI: 10.3389/fnins.2022.828546] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2021] [Accepted: 07/08/2022] [Indexed: 11/13/2022] Open
Abstract
Voice pitch carries linguistic and non-linguistic information. Previous studies have described cortical tracking of voice pitch in clean speech, with responses reflecting both pitch strength and pitch value. However, pitch is also a powerful cue for auditory stream segregation, especially when competing streams have pitch differing in fundamental frequency, as is the case when multiple speakers talk simultaneously. We therefore investigated how cortical speech pitch tracking is affected in the presence of a second, task-irrelevant speaker. We analyzed human magnetoencephalography (MEG) responses to continuous narrative speech, presented either as a single talker in a quiet background or as a two-talker mixture of a male and a female speaker. In clean speech, voice pitch was associated with a right-dominant response, peaking at a latency of around 100 ms, consistent with previous electroencephalography and electrocorticography results. The response tracked both the presence of pitch and the relative value of the speaker's fundamental frequency. In the two-talker mixture, the pitch of the attended speaker was tracked bilaterally, regardless of whether or not there was simultaneously present pitch in the speech of the irrelevant speaker. Pitch tracking for the irrelevant speaker was reduced: only the right hemisphere still significantly tracked pitch of the unattended speaker, and only during intervals in which no pitch was present in the attended talker's speech. Taken together, these results suggest that pitch-based segregation of multiple speakers, at least as measured by macroscopic cortical tracking, is not entirely automatic but strongly dependent on selective attention.
Affiliation(s)
- Christian Brodbeck
  Department of Psychological Sciences, University of Connecticut, Storrs, CT, United States
  Institute for Systems Research, University of Maryland, College Park, College Park, MD, United States
- Jonathan Z. Simon
  Institute for Systems Research, University of Maryland, College Park, College Park, MD, United States
  Department of Electrical and Computer Engineering, University of Maryland, College Park, College Park, MD, United States
  Department of Biology, University of Maryland, College Park, College Park, MD, United States
26
Malik-Moraleda S, Ayyash D, Gallée J, Affourtit J, Hoffmann M, Mineroff Z, Jouravlev O, Fedorenko E. An investigation across 45 languages and 12 language families reveals a universal language network. Nat Neurosci 2022; 25:1014-1019. [PMID: 35856094 PMCID: PMC10414179 DOI: 10.1038/s41593-022-01114-5] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2021] [Accepted: 06/06/2022] [Indexed: 11/08/2022]
Abstract
To understand the architecture of human language, it is critical to examine diverse languages; however, most cognitive neuroscience research has focused on only a handful of primarily Indo-European languages. Here we report an investigation of the fronto-temporo-parietal language network across 45 languages and establish the robustness to cross-linguistic variation of its topography and key functional properties, including left-lateralization, strong functional integration among its brain regions and functional selectivity for language processing.
Affiliation(s)
- Saima Malik-Moraleda
  Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
  McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
  Program in Speech and Hearing Bioscience and Technology, Harvard University, Boston, MA, USA
- Dima Ayyash
  Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
  McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Jeanne Gallée
  Program in Speech and Hearing Bioscience and Technology, Harvard University, Boston, MA, USA
- Josef Affourtit
  Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
  McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Malte Hoffmann
  Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA
  Department of Radiology, Harvard Medical School, Boston, MA, USA
- Zachary Mineroff
  Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
  McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
  Eberly Center, Carnegie Mellon University, Pittsburgh, PA, USA
- Olessia Jouravlev
  Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
  McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
  Department of Cognitive Science, Carleton University, Ottawa, ON, Canada
- Evelina Fedorenko
  Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
  McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
  Program in Speech and Hearing Bioscience and Technology, Harvard University, Boston, MA, USA
27
Lee KW, Lee DH, Kim SJ, Lee SW. Decoding Neural Correlation of Language-Specific Imagined Speech using EEG Signals. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2022; 2022:1977-1980. [PMID: 36086641 DOI: 10.1109/embc48229.2022.9871721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Speech impairments due to cerebral lesions and degenerative disorders can be devastating. For humans with severe speech deficits, imagined speech in the brain-computer interface has been a promising hope for reconstructing the neural signals of speech production. However, studies in the EEG-based imagined speech domain still have some limitations due to high variability in spatial and temporal information and a low signal-to-noise ratio. In this paper, we investigated the neural signals of two groups of native speakers performing imagined speech tasks in two different languages, English and Chinese. Our assumption was that English, a non-tonal and phonogram-based language, would show spectral differences in neural computation compared to Chinese, a tonal and ideogram-based language. The results showed significant differences in relative power spectral density between English and Chinese in specific frequency band groups. Also, the spatial pattern for native Chinese speakers in the theta band was distinctive during the imagination task. Hence, this paper suggests key spectral and spatial features of language-specific word imagination for decoding the neural signals of speech. Clinical Relevance: Imagined speech-related studies lead to the development of assistive communication technology, especially for patients with speech disorders such as aphasia due to brain damage. This study suggests significant spectral features obtained by analyzing cross-language differences in EEG-based imagined speech using two widely used languages.
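Relative power spectral density per frequency band, the main feature type here, is commonly computed from a Welch PSD estimate; a minimal sketch follows. The band edges and sampling rate are illustrative conventions, not the paper's settings.

```python
import numpy as np
from scipy.signal import welch

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}  # illustrative band edges

def relative_band_power(eeg, fs=250.0):
    """Relative PSD per band for one channel: band power / total power.
    Sums over a uniform frequency grid are proportional to integrals."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
    keep = (freqs >= 1) & (freqs <= 45)
    total = psd[keep].sum()
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() / total
            for name, (lo, hi) in BANDS.items()}

rng = np.random.default_rng(6)
eeg = rng.normal(size=2500)  # 10 s of toy data at 250 Hz
print(relative_band_power(eeg))
```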
28
Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language. Brain Sci 2022; 12:brainsci12070818. [PMID: 35884626 PMCID: PMC9312762 DOI: 10.3390/brainsci12070818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2022] [Revised: 06/20/2022] [Accepted: 06/21/2022] [Indexed: 11/23/2022] Open
Abstract
Silent speech decoding (SSD), based on articulatory neuromuscular activity, has become a prevalent task for brain–computer interfaces (BCIs) in recent years. Many works have been devoted to decoding silent speech from surface electromyography (sEMG) recordings of articulatory neuromuscular activity. However, restoring silent speech in tonal languages such as Mandarin Chinese is still difficult. This paper proposes an optimized sequence-to-sequence (Seq2Seq) approach to synthesize voice from sEMG-based silent speech. We extract duration information to regulate the sEMG-based silent speech using the audio length. Then, we provide a deep-learning model with an encoder–decoder structure and a state-of-the-art vocoder to generate the audio waveform. Experiments based on six Mandarin Chinese speakers demonstrate that the proposed model can successfully decode silent speech in Mandarin Chinese and achieve a character error rate (CER) of 6.41% on average with human evaluation.
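The character error rate (CER) reported above is conventionally defined as the Levenshtein edit distance between reference and hypothesis transcripts divided by the reference length; a self-contained sketch of that standard definition:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance / reference length."""
    m, n = len(reference), len(hypothesis)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n] / max(m, 1)

print(cer("今天天气很好", "今天天气真好"))  # one substitution out of six -> ~0.17
```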
29
Chen Z, Ye N, Teng C, Li X. Alternations and Applications of the Structural and Functional Connectome in Gliomas: A Mini-Review. Front Neurosci 2022; 16:856808. [PMID: 35478847 PMCID: PMC9035851 DOI: 10.3389/fnins.2022.856808] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2022] [Accepted: 02/28/2022] [Indexed: 12/12/2022] Open
Abstract
In the central nervous system, gliomas are the most common, yet complex, primary tumors. Genome-based molecular and clinical studies have revealed different classifications and subtypes of gliomas. Neuroradiological approaches have non-invasively provided a macroscopic view for surgical resection and therapeutic effects. The connectome is a structural map of a physical object, the brain, which raises issues of spatial scale and definition; it is calculated through diffusion magnetic resonance imaging (MRI) and functional MRI. In this study, we reviewed the basic principles and attributes of the structural and functional connectome, followed by the alterations of connectomes and their influences on glioma. To extend the applications of the connectome, we demonstrated that a series of multi-center projects still need to be conducted to systematically investigate the connectome and the structural-functional coupling of glioma. Additionally, a brain-computer interface based on an accurate connectome could provide more precise structural and functional data, which are significant for surgery and postoperative recovery. Besides, integrating data from different sources, including the connectome and other omics information, and processing them with artificial intelligence, together with validated biological and clinical findings, will be significant for the development of a personalized surgical strategy.
Affiliation(s)
- Ziyan Chen
  Department of Neurosurgery, Xiangya Hospital, Central South University, Hunan, China
  Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
- Ningrong Ye
  Department of Neurosurgery, Xiangya Hospital, Central South University, Hunan, China
  Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
- Chubei Teng
  Department of Neurosurgery, Xiangya Hospital, Central South University, Hunan, China
  Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
  Department of Neurosurgery, The First Affiliated Hospital, University of South China, Hengyang, China
- Xuejun Li
  Department of Neurosurgery, Xiangya Hospital, Central South University, Hunan, China
  Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
30
Decoding Selective Auditory Attention with EEG using A Transformer Model. Methods 2022; 204:410-417. [DOI: 10.1016/j.ymeth.2022.04.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2022] [Revised: 03/20/2022] [Accepted: 04/14/2022] [Indexed: 11/23/2022] Open
31
Lin BF, Yeh SC, Kao YCJ, Lu CF, Tsai PY. Functional Remodeling Associated With Language Recovery After Repetitive Transcranial Magnetic Stimulation in Chronic Aphasic Stroke. Front Neurol 2022; 13:809843. [PMID: 35330805 PMCID: PMC8940300 DOI: 10.3389/fneur.2022.809843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2021] [Accepted: 01/24/2022] [Indexed: 11/22/2022] Open
Abstract
Background Repetitive transcranial magnetic stimulation (rTMS) has shown promising efficacy in improving language functions in poststroke aphasia. However, randomized controlled trials investigating the rTMS-related neuroimaging changes underlying the therapeutic effects on language improvement in chronic aphasia have been lacking. Objective In this study, we aimed to evaluate the effects of low-frequency rTMS (LF-rTMS) on chronic poststroke aphasia. We hypothesized that deactivation of the right pars triangularis could restore the balance of interhemispheric inhibition and hence facilitate the functional remodeling of language networks in both hemispheres. Furthermore, the rTMS-induced functional reorganization should underpin language recovery after rTMS. Methods A total of 33 patients (22 males; age: 58.70 ± 13.77 years) with chronic stroke in the left hemisphere and nonfluent aphasia were recruited in this randomized double-blinded study. The randomization ratio between the rTMS and sham groups was 17:16. All patients received real 1-Hz rTMS or sham stimulation (placebo coil delivering < 5% of magnetic output with a similar audible click-on discharge) at the right posterior pars triangularis for 10 consecutive weekdays (stroke onset to first stimulation: 10.97 ± 10.35 months). Functional connectivity of language networks measured by resting-state fMRI was calculated and correlated with scores on the Concise Chinese Aphasia Test using stepwise regression analysis. Results After the LF-rTMS intervention, significant improvement in language functions in terms of comprehension and expression abilities was observed compared with the sham group. The rTMS group showed a significant decrease in coupling strength between the right pars triangularis and pars opercularis, with a strengthened connection between the right pars orbitalis and angular gyrus. Furthermore, LF-rTMS significantly enhanced the coupling strength associated with the left Wernicke area. Regression analysis showed that the identified functional remodeling involving both hemispheres could support and predict language recovery after LF-rTMS treatment. Conclusion We report the therapeutic effects of LF-rTMS and the corresponding functional remodeling in chronic poststroke aphasia. Our results provide neuroimaging evidence reflecting the rebalancing of interhemispheric inhibition induced by LF-rTMS, which could facilitate future research on refining rTMS protocols to optimize neuromodulation efficacy and benefit the clinical management of patients with stroke.
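Coupling strength between regions in resting-state fMRI is most often operationalized as a Fisher-z-transformed Pearson correlation between ROI time series; the sketch below shows that generic computation. The ROI names and toy time series are placeholders, and the study's exact pipeline may differ.

```python
import numpy as np

def coupling_strength(ts_a, ts_b):
    """Functional coupling between two ROI time series: Fisher-z-transformed
    Pearson correlation (a common definition; the study's exact pipeline
    may differ)."""
    r = np.corrcoef(ts_a, ts_b)[0, 1]
    return np.arctanh(r)

# Toy resting-state time series for two right-hemisphere ROIs
rng = np.random.default_rng(2)
pars_triangularis = rng.normal(size=200)
pars_opercularis = 0.6 * pars_triangularis + rng.normal(size=200)
print(f"z = {coupling_strength(pars_triangularis, pars_opercularis):.2f}")
```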
Affiliation(s)
- Bing-Fong Lin
  Department of Biomedical Imaging and Radiological Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Shih-Ching Yeh
  Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Yu-Chieh Jill Kao
  Department of Biomedical Imaging and Radiological Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chia-Feng Lu
  Department of Biomedical Imaging and Radiological Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan; Institute of Biophotonics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Po-Yi Tsai
  Department of Physical Medicine and Rehabilitation, Taipei Veterans General Hospital, Taipei, Taiwan; School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
32
Chauvette L, Fournier P, Sharp A. The frequency-following response to assess the neural representation of spectral speech cues in older adults. Hear Res 2022; 418:108486. [DOI: 10.1016/j.heares.2022.108486] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/01/2021] [Revised: 03/12/2022] [Accepted: 03/15/2022] [Indexed: 11/04/2022]
33
Giampiccolo D, Duffau H. Controversy over the temporal cortical terminations of the left arcuate fasciculus: a reappraisal. Brain 2022; 145:1242-1256. [PMID: 35142842 DOI: 10.1093/brain/awac057] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2021] [Revised: 12/19/2021] [Accepted: 01/20/2022] [Indexed: 11/12/2022] Open
Abstract
The arcuate fasciculus has been considered a major dorsal fronto-temporal white matter pathway linking frontal language production regions with auditory perception in the superior temporal gyrus, the so-called Wernicke's area. In line with this tradition, both historical and contemporary models of language function have assigned primacy to superior temporal projections of the arcuate fasciculus. However, classical anatomical descriptions and emerging behavioural data are at odds with this assumption. On one hand, fronto-temporal projections to Wernicke's area may not be unique to the arcuate fasciculus. On the other hand, dorsal stream language deficits have been reported also for damage to the middle, inferior and basal temporal gyri, which may be linked to arcuate disconnection. These findings point to a reappraisal of arcuate projections in the temporal lobe. Here, we review anatomical and functional evidence regarding the temporal cortical terminations of the left arcuate fasciculus by incorporating dissection and tractography findings with stimulation data using cortico-cortical evoked potentials and direct electrical stimulation mapping in awake patients. Firstly, we discuss the fibers of the arcuate fasciculus projecting to the superior temporal gyrus and the functional rostro-caudal gradient in this region, where both phonological encoding and auditory-motor transformation may be performed. Caudal regions within the temporoparietal junction may be involved in articulation and associated with temporoparietal projections of the third branch of the superior longitudinal fasciculus, while more rostral regions may support encoding of acoustic phonetic features, supported by arcuate fibres. We then move to examine clinical data showing that multimodal phonological encoding is facilitated by projections of the arcuate fasciculus to superior, but also middle, inferior and basal temporal regions. Hence, we discuss how projections of the arcuate fasciculus may allow acoustic (middle-posterior superior and middle temporal gyri), visual (posterior inferior temporal/fusiform gyri comprising the visual word form area) and lexical (anterior-middle inferior temporal/fusiform gyri in the basal temporal language area) information in the temporal lobe to be processed, encoded and translated into a dorsal phonological route to the frontal lobe. Finally, we point out surgical implications for this model in terms of the prediction and avoidance of neurological deficit.
Affiliation(s)
- Davide Giampiccolo
  Section of Neurosurgery, Department of Neurosciences, Biomedicine and Movement Sciences, University Hospital, Verona, Italy; Institute of Neuroscience, Cleveland Clinic London, Grosvenor Place, London, UK; Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, University College London, London, UK; Victor Horsley Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, Queen Square, London, UK
- Hugues Duffau
  Department of Neurosurgery, Gui de Chauliac Hospital, Montpellier University Medical Center, Montpellier, France; Team "Neuroplasticity, Stem Cells and Low-grade Gliomas," INSERM U1191, Institute of Genomics of Montpellier, University of Montpellier, Montpellier, France
34
Bachmann FL, MacDonald EN, Hjortkjær J. Neural Measures of Pitch Processing in EEG Responses to Running Speech. Front Neurosci 2022; 15:738408. [PMID: 35002597 PMCID: PMC8729880 DOI: 10.3389/fnins.2021.738408] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2021] [Accepted: 11/01/2021] [Indexed: 11/13/2022] Open
Abstract
Linearized encoding models are increasingly employed to model cortical responses to running speech. Recent extensions to subcortical responses suggest clinical perspectives, potentially complementing auditory brainstem responses (ABRs) or frequency-following responses (FFRs) that are current clinical standards. However, while it is well-known that the auditory brainstem responds both to transient amplitude variations and the stimulus periodicity that gives rise to pitch, these features co-vary in running speech. Here, we discuss challenges in disentangling the features that drive the subcortical response to running speech. Cortical and subcortical electroencephalographic (EEG) responses to running speech from 19 normal-hearing listeners (12 female) were analyzed. Using forward regression models, we confirm that responses to the rectified broadband speech signal yield temporal response functions consistent with wave V of the ABR, as shown in previous work. Peak latency and amplitude of the speech-evoked brainstem response were correlated with standard click-evoked ABRs recorded at the vertex electrode (Cz). Similar responses could be obtained using the fundamental frequency (F0) of the speech signal as model predictor. However, simulations indicated that dissociating responses to temporal fine structure at the F0 from broadband amplitude variations is not possible given the high co-variance of the features and the poor signal-to-noise ratio (SNR) of subcortical EEG responses. In cortex, both simulations and data replicated previous findings indicating that envelope tracking on frontal electrodes can be dissociated from responses to slow variations in F0 (relative pitch). Yet, no association between subcortical F0-tracking and cortical responses to relative pitch could be detected. These results indicate that while subcortical speech responses are comparable to click-evoked ABRs, dissociating pitch-related processing in the auditory brainstem may be challenging with natural speech stimuli.
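The forward (encoding) model used here, regressing the EEG on time-lagged copies of a stimulus feature to obtain a temporal response function, can be sketched as ridge regression over a lag matrix. Below is a minimal version; the lag range, regularization strength, and synthetic "wave V" delay are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def fit_trf(stimulus, eeg, fs, tmin=-0.01, tmax=0.03, lam=1e2):
    """Estimate a temporal response function by ridge regression of the EEG
    on time-lagged copies of a stimulus feature (e.g., rectified speech).
    Uses circular shifts for simplicity; lag range and lam are illustrative."""
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    X = np.column_stack([np.roll(stimulus, lag) for lag in lags])
    # ridge solution: w = (X'X + lam*I)^-1 X'y
    w = np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ eeg)
    return lags / fs, w

fs = 4096  # subcortical responses need a high sampling rate
rng = np.random.default_rng(3)
stim = np.abs(rng.normal(size=fs * 10))  # stand-in for rectified speech
eeg = np.roll(stim, int(0.007 * fs)) + rng.normal(size=stim.size)  # ~7 ms "wave V"
times, trf = fit_trf(stim, eeg, fs)
print(f"peak latency: {times[np.argmax(trf)] * 1000:.1f} ms")
```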
Affiliation(s)
- Florine L Bachmann
  Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Ewen N MacDonald
  Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada
- Jens Hjortkjær
  Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital - Amager and Hvidovre, Copenhagen, Denmark
35
Wagner M, Ortiz-Mantilla S, Rusiniak M, Benasich AA, Shafer VL, Steinschneider M. Acoustic-level and language-specific processing of native and non-native phonological sequence onsets in the low gamma and theta-frequency bands. Sci Rep 2022; 12:314. [PMID: 35013345 PMCID: PMC8748887 DOI: 10.1038/s41598-021-03611-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2021] [Accepted: 11/08/2021] [Indexed: 11/15/2022] Open
Abstract
Acoustic structures associated with native-language phonological sequences are enhanced within auditory pathways for perception, although the underlying mechanisms are not well understood. To elucidate processes that facilitate perception, time-frequency (T-F) analyses of EEGs obtained from native speakers of English and Polish were conducted. Participants listened to same and different nonword pairs within counterbalanced attend and passive conditions. Nonwords contained the onsets /pt/, /pət/, /st/, and /sət/, which occur in both the Polish and English languages with the exception of /pt/, which never occurs word-initially in English. Measures of spectral power and inter-trial phase locking (ITPL) in the low gamma (LG) and theta-frequency bands were analyzed from two bilateral, auditory source-level channels created through source localization modeling. Results revealed significantly larger spectral power in LG for the English listeners to the unfamiliar /pt/ onsets from the right hemisphere at early cortical stages, during the passive condition. Further, ITPL values revealed distinctive responses in the high- and low-theta bands to acoustic characteristics of the onsets, which were modulated by language exposure. These findings, language-specific processing in LG and acoustic-level and language-specific processing in theta, support the view that multiscale temporal processing in the LG and theta-frequency bands facilitates speech perception.
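Inter-trial phase locking (also called inter-trial phase coherence) is the magnitude of the mean unit phase vector across trials. A minimal sketch using a band-pass filter and the Hilbert transform follows; the band edges, sampling rate, and simulated trials are illustrative assumptions, not the study's parameters.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def itpl(trials, fs, band=(4.0, 7.0)):
    """Inter-trial phase locking for one channel: magnitude of the mean unit
    phase vector across trials, per time point. `trials` is (n_trials, n_times);
    the band edges here (low theta) are illustrative."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, trials, axis=1)       # band-limit each trial
    phase = np.angle(hilbert(filtered, axis=1))     # instantaneous phase
    return np.abs(np.exp(1j * phase).mean(axis=0))  # 0 = random, 1 = fully locked

rng = np.random.default_rng(4)
fs, n_trials, n_times = 500, 60, 500
t = np.arange(n_times) / fs
trials = np.sin(2 * np.pi * 5 * t) + rng.normal(scale=2.0, size=(n_trials, n_times))
print(itpl(trials, fs).mean())  # high locking to the shared 5 Hz component
```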
Affiliation(s)
- Monica Wagner
  St. John's University, St. John's Hall, Room 344 e1, 8000 Utopia Parkway, Queens, NY, 11439, USA
- Valerie L Shafer
  The Graduate Center of the City University of New York, New York, NY, 10016, USA
36
Abstract
Human speech perception results from neural computations that transform external acoustic speech signals into internal representations of words. The superior temporal gyrus (STG) contains the nonprimary auditory cortex and is a critical locus for phonological processing. Here, we describe how speech sound representation in the STG relies on fundamentally nonlinear and dynamical processes, such as categorization, normalization, contextual restoration, and the extraction of temporal structure. A spatial mosaic of local cortical sites on the STG exhibits complex auditory encoding for distinct acoustic-phonetic and prosodic features. We propose that as a population ensemble, these distributed patterns of neural activity give rise to abstract, higher-order phonemic and syllabic representations that support speech perception. This review presents a multi-scale, recurrent model of phonological processing in the STG, highlighting the critical interface between auditory and language systems.
Affiliation(s)
- Ilina Bhaya-Grossman
  Department of Neurological Surgery, University of California, San Francisco, California 94143, USA
  Joint Graduate Program in Bioengineering, University of California, Berkeley and San Francisco, California 94720, USA
- Edward F Chang
  Department of Neurological Surgery, University of California, San Francisco, California 94143, USA
37
Learning nonnative speech sounds changes local encoding in the adult human cortex. Proc Natl Acad Sci U S A 2021; 118:2101777118. [PMID: 34475209 DOI: 10.1073/pnas.2101777118] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2021] [Accepted: 07/12/2021] [Indexed: 11/18/2022] Open
Abstract
Adults can learn to identify nonnative speech sounds with training, albeit with substantial variability in learning behavior. Increases in behavioral accuracy are associated with increased separability for sound representations in cortical speech areas. However, it remains unclear whether individual auditory neural populations all show the same types of changes with learning, or whether there are heterogeneous encoding patterns. Here, we used high-resolution direct neural recordings to examine local population response patterns while native English listeners learned to recognize unfamiliar vocal pitch patterns in Mandarin Chinese tones. We found a distributed set of neural populations in bilateral superior temporal gyrus and ventrolateral frontal cortex, where the encoding of Mandarin tones changed throughout training as a function of trial-by-trial accuracy ("learning effect"), including both increases and decreases in the separability of tones. These populations were distinct from populations that showed changes as a function of exposure to the stimuli regardless of trial-by-trial accuracy. These learning effects were driven in part by more variable neural responses to repeated presentations of acoustically identical stimuli. Finally, learning effects could be predicted from speech-evoked activity even before training, suggesting that intrinsic properties of these populations make them amenable to behavior-related changes. Together, these results demonstrate that nonnative speech sound learning involves a wide array of changes in neural representations across a distributed set of brain regions.
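One common way to quantify the "separability" of tone representations at a neural population is cross-validated classification of tone identity from the population response; the sketch below uses that operationalization. This is an assumed, generic metric (the study's own separability measure may differ), and the data are synthetic.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Hypothetical data: trials x features (e.g., high-gamma amplitude at time
# points within one recording site), with one of four Mandarin tone labels.
rng = np.random.default_rng(5)
X = rng.normal(size=(200, 40))
y = rng.integers(0, 4, size=200)
X += y[:, None] * 0.2  # inject weak tone-dependent structure

acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
print(f"cross-validated tone decoding accuracy: {acc:.2f} (chance = 0.25)")
```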