1
|
Berezutskaya J, Freudenburg ZV, Vansteensel MJ, Aarnoutse EJ, Ramsey NF, van Gerven MAJ. Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models. J Neural Eng 2023; 20:056010. [PMID: 37467739 PMCID: PMC10510111 DOI: 10.1088/1741-2552/ace8be] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2022] [Revised: 07/12/2023] [Accepted: 07/19/2023] [Indexed: 07/21/2023]
Abstract
Objective. Development of brain-computer interface (BCI) technology is key for enabling communication in individuals who have lost the faculty of speech due to severe motor paralysis. A BCI control strategy that is gaining attention employs speech decoding from neural data. Recent studies have shown that a combination of direct neural recordings and advanced computational models can provide promising results. Understanding which decoding strategies deliver best and directly applicable results is crucial for advancing the field. Approach. In this paper, we optimized and validated a decoding approach based on speech reconstruction directly from high-density electrocorticography recordings from sensorimotor cortex during a speech production task. Main results. We show that (1) dedicated machine learning optimization of reconstruction models is key for achieving the best reconstruction performance; (2) individual word decoding in reconstructed speech achieves 92%-100% accuracy (chance level is 8%); (3) direct reconstruction from sensorimotor brain activity produces intelligible speech. Significance. These results underline the need for model optimization in achieving best speech decoding results and highlight the potential that reconstruction-based speech decoding from sensorimotor cortex can offer for development of next-generation BCI technology for communication.
Affiliation(s)
- Julia Berezutskaya
  - Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
  - Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
- Zachary V Freudenburg
  - Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Mariska J Vansteensel
  - Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Erik J Aarnoutse
  - Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Nick F Ramsey
  - Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Marcel A J van Gerven
  - Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
|
2
|
|
3
|
Hickok G, Venezia J, Teghipco A. Beyond Broca: neural architecture and evolution of a dual motor speech coordination system. Brain 2023; 146:1775-1790. [PMID: 36746488 PMCID: PMC10411947 DOI: 10.1093/brain/awac454] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 08/26/2022] [Revised: 11/04/2022] [Accepted: 11/19/2022] [Indexed: 02/08/2023] Open
Abstract
Classical neural architecture models of speech production propose a single system centred on Broca's area coordinating all the vocal articulators from lips to larynx. Modern evidence has challenged both the idea that Broca's area is involved in motor speech coordination and that there is only one coordination network. Drawing on a wide range of evidence, here we propose a dual speech coordination model in which laryngeal control of pitch-related aspects of prosody and song are coordinated by a hierarchically organized dorsolateral system while supralaryngeal articulation at the phonetic/syllabic level is coordinated by a more ventral system posterior to Broca's area. We argue further that these two speech production subsystems have distinguishable evolutionary histories and discuss the implications for models of language evolution.
Affiliation(s)
- Gregory Hickok
  - Department of Cognitive Sciences, University of California, Irvine, CA 92697, USA
  - Department of Language Science, University of California, Irvine, CA 92697, USA
- Jonathan Venezia
  - Auditory Research Laboratory, VA Loma Linda Healthcare System, Loma Linda, CA 92357, USA
  - Department of Otolaryngology—Head and Neck Surgery, Loma Linda University School of Medicine, Loma Linda, CA 92350, USA
- Alex Teghipco
  - Department of Psychology, University of South Carolina, Columbia, SC 29208, USA
|
4
|
Silva AB, Liu JR, Zhao L, Levy DF, Scott TL, Chang EF. A Neurosurgical Functional Dissection of the Middle Precentral Gyrus during Speech Production. J Neurosci 2022; 42:8416-8426. [PMID: 36351829 PMCID: PMC9665919 DOI: 10.1523/jneurosci.1614-22.2022] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 08/24/2022] [Accepted: 08/30/2022] [Indexed: 11/17/2022] Open
Abstract
Classical models have traditionally focused on the left posterior inferior frontal gyrus (Broca's area) as a key region for motor planning of speech production. However, converging evidence suggests that it is not critical for either speech motor planning or execution. Alternative cortical areas supporting high-level speech motor planning have yet to be defined. In this review, we focus on the precentral gyrus, whose role in speech production is often thought to be limited to lower-level articulatory muscle control. In particular, we highlight neurosurgical investigations that have shed light on a cortical region anatomically located near the midpoint of the precentral gyrus, hence called the middle precentral gyrus (midPrCG). The midPrCG is functionally located between dorsal hand and ventral orofacial cortical representations and exhibits unique sensorimotor and multisensory functions relevant for speech processing. This includes motor control of the larynx, auditory processing, as well as a role in reading and writing. Furthermore, direct electrical stimulation of midPrCG can evoke complex movements, such as vocalization, and selective injury can cause deficits in verbal fluency, such as pure apraxia of speech. Based on these findings, we propose that midPrCG is essential to phonological-motoric aspects of speech production, especially syllabic-level speech sequencing, a role traditionally ascribed to Broca's area. The midPrCG is a cortical brain area that should be included in contemporary models of speech production with a unique role in speech motor planning and execution.
Affiliation(s)
- Alexander B Silva
  - Department of Neurological Surgery, University of California, San Francisco, California, 94158
  - Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
  - Medical Scientist Training Program, University of California, San Francisco, California, 94158
  - Graduate Program in Bioengineering, University of California, Berkeley, California 94720, & University of California, San Francisco, California, 94158
- Jessie R Liu
  - Department of Neurological Surgery, University of California, San Francisco, California, 94158
  - Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
  - Graduate Program in Bioengineering, University of California, Berkeley, California 94720, & University of California, San Francisco, California, 94158
- Lingyun Zhao
  - Department of Neurological Surgery, University of California, San Francisco, California, 94158
  - Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
- Deborah F Levy
  - Department of Neurological Surgery, University of California, San Francisco, California, 94158
  - Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
- Terri L Scott
  - Department of Neurological Surgery, University of California, San Francisco, California, 94158
  - Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
- Edward F Chang
  - Department of Neurological Surgery, University of California, San Francisco, California, 94158
  - Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
  - Graduate Program in Bioengineering, University of California, Berkeley, California 94720, & University of California, San Francisco, California, 94158
|
5
|
Vansteensel MJ, Branco MP, Leinders S, Freudenburg ZF, Schippers A, Geukes SH, Gaytant MA, Gosselaar PH, Aarnoutse EJ, Ramsey NF. Methodological Recommendations for Studies on the Daily Life Implementation of Implantable Communication-Brain-Computer Interfaces for Individuals With Locked-in Syndrome. Neurorehabil Neural Repair 2022; 36:666-677. [PMID: 36124975 PMCID: PMC11986352 DOI: 10.1177/15459683221125788] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022]
Abstract
Implantable brain-computer interfaces (BCIs) promise to be a viable means to restore communication in individuals with locked-in syndrome (LIS). In 2016, we presented the world-first fully implantable BCI system that uses subdural electrocorticography electrodes to record brain signals and a subcutaneous amplifier to transmit the signals to the outside world, and that enabled an individual with LIS to communicate via a tablet computer by selecting icons in spelling software. For future clinical implementation of implantable communication-BCIs, however, much work is still needed, for example, to validate these systems in daily life settings with more participants, and to improve the speed of communication. We believe the design and execution of future studies on these and other topics may benefit from the experience we have gained. Therefore, based on relevant literature and our own experiences, we here provide an overview of procedures, as well as recommendations, for recruitment, screening, inclusion, imaging, hospital admission, implantation, training, and support of participants with LIS, for studies on daily life implementation of implantable communication-BCIs. With this article, we not only aim to inform the BCI community about important topics of concern, but also hope to contribute to improved methodological standardization of implantable BCI research.
Affiliation(s)
- Mariska J Vansteensel
  - UMC Utrecht Brain Center, Department of Neurology & Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Mariana P Branco
  - UMC Utrecht Brain Center, Department of Neurology & Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Sacha Leinders
  - UMC Utrecht Brain Center, Department of Neurology & Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Zac F Freudenburg
  - UMC Utrecht Brain Center, Department of Neurology & Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Anouck Schippers
  - UMC Utrecht Brain Center, Department of Neurology & Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Simon H Geukes
  - UMC Utrecht Brain Center, Department of Neurology & Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Michael A Gaytant
  - Department of Pulmonary Diseases/Home Mechanical Ventilation, University Medical Center Utrecht, Utrecht, The Netherlands
- Peter H Gosselaar
  - UMC Utrecht Brain Center, Department of Neurology & Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Erik J Aarnoutse
  - UMC Utrecht Brain Center, Department of Neurology & Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Nick F Ramsey
  - UMC Utrecht Brain Center, Department of Neurology & Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
|
6
|
Kaestner E, Wu X, Friedman D, Dugan P, Devinsky O, Carlson C, Doyle W, Thesen T, Halgren E. The Precentral Gyrus Contributions to the Early Time-Course of Grapheme-to-Phoneme Conversion. Neurobiology of Language (Cambridge, Mass.) 2022; 3:18-45. [PMID: 37215328 PMCID: PMC10158576 DOI: 10.1162/nol_a_00047] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/29/2020] [Accepted: 06/16/2021] [Indexed: 05/24/2023]
Abstract
As part of silent reading models, visual orthographic information is transduced into an auditory phonological code in a process of grapheme-to-phoneme conversion (GPC). This process is often identified with lateral temporal-parietal regions associated with auditory phoneme encoding. However, the role of articulatory phonemic representations and the precentral gyrus in GPC is ambiguous. Though the precentral gyrus is implicated in many functional MRI studies of reading, it is not clear if the time course of activity in this region is consistent with the precentral gyrus being involved in GPC. We recorded cortical electrophysiology during a bimodal match/mismatch task from eight patients with perisylvian subdural electrodes to examine the time course of neural activity during a task that necessitated GPC. Patients made a match/mismatch decision between a 3-letter string and the following auditory bi-phoneme. We characterized the distribution and timing of evoked broadband high gamma (70-170 Hz) as well as phase-locking between electrodes. The precentral gyrus emerged with a high concentration of broadband high gamma responses to visual and auditory language as well as mismatch effects. The pars opercularis, supramarginal gyrus, and superior temporal gyrus were also involved. The precentral gyrus showed strong phase-locking with the caudal fusiform gyrus during letter-string presentation and with surrounding perisylvian cortex during the bimodal visual-auditory comparison period. These findings hint at a role for precentral cortex in transducing visual into auditory codes during silent reading.
Affiliation(s)
- Erik Kaestner
  - Center for Multimodal Imaging and Genetics, University of California, San Diego, USA
- Xiaojing Wu
  - Department of Neurology, NYU Langone School of Medicine, New York, USA
- Daniel Friedman
  - Department of Neurology, NYU Langone School of Medicine, New York, USA
- Patricia Dugan
  - Department of Neurology, NYU Langone School of Medicine, New York, USA
- Orrin Devinsky
  - Department of Neurology, NYU Langone School of Medicine, New York, USA
- Chad Carlson
  - Department of Neurology, Medical College of Wisconsin, Milwaukee, USA
- Werner Doyle
  - Department of Neurology, NYU Langone School of Medicine, New York, USA
  - Department of Neurosurgery, NYU Langone School of Medicine, New York, USA
- Thomas Thesen
  - Department of Neurology, NYU Langone School of Medicine, New York, USA
- Eric Halgren
  - Department of Neurosciences, University of California at San Diego, La Jolla, USA
  - Department of Radiology, University of California at San Diego, La Jolla, USA
|
7
|
Venezia JH, Richards VM, Hickok G. Speech-Driven Spectrotemporal Receptive Fields Beyond the Auditory Cortex. Hear Res 2021; 408:108307. [PMID: 34311190 PMCID: PMC8378265 DOI: 10.1016/j.heares.2021.108307] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/08/2021] [Revised: 06/15/2021] [Accepted: 06/30/2021] [Indexed: 10/20/2022]
Abstract
We recently developed a method to estimate speech-driven spectrotemporal receptive fields (STRFs) using fMRI. The method uses spectrotemporal modulation filtering, a form of acoustic distortion that renders speech sometimes intelligible and sometimes unintelligible. Using this method, we found significant STRF responses only in classic auditory regions throughout the superior temporal lobes. However, our analysis was not optimized to detect small clusters of STRFs as might be expected in non-auditory regions. Here, we re-analyze our data using a more sensitive multivariate statistical test for cross-subject alignment of STRFs, and we identify STRF responses in non-auditory regions including the left dorsal premotor cortex (dPM), left inferior frontal gyrus (IFG), and bilateral calcarine sulcus (calcS). All three regions responded more to intelligible than unintelligible speech, but left dPM and calcS responded significantly to vocal pitch and demonstrated strong functional connectivity with early auditory regions. Left dPM's STRF generated the best predictions of activation on trials rated as unintelligible by listeners, a hallmark auditory profile. IFG, on the other hand, responded almost exclusively to intelligible speech and was functionally connected with classic speech-language regions in the superior temporal sulcus and middle temporal gyrus. IFG's STRF was also (weakly) able to predict activation on unintelligible trials, suggesting the presence of a partial 'acoustic trace' in the region. We conclude that left dPM is part of the human dorsal laryngeal motor cortex, a region previously shown to be capable of operating in an 'auditory mode' to encode vocal pitch. Further, given previous observations that IFG is involved in syntactic working memory and/or processing of linear order, we conclude that IFG is part of a higher-order speech circuit that exerts a top-down influence on processing of speech acoustics. 
Finally, because calcS is modulated by emotion, we speculate that changes in the quality of vocal pitch may have contributed to its response.
Affiliation(s)
- Jonathan H Venezia
  - VA Loma Linda Healthcare System, Loma Linda, CA, United States; Dept. of Otolaryngology, Loma Linda University School of Medicine, Loma Linda, CA, United States
- Virginia M Richards
  - Depts. of Cognitive Sciences and Language Science, University of California, Irvine, Irvine, CA, United States
- Gregory Hickok
  - Depts. of Cognitive Sciences and Language Science, University of California, Irvine, Irvine, CA, United States
|
8
|
Berezutskaya J, Baratin C, Freudenburg ZV, Ramsey NF. High-density intracranial recordings reveal a distinct site in anterior dorsal precentral cortex that tracks perceived speech. Hum Brain Mapp 2020; 41:4587-4609. [PMID: 32744403 PMCID: PMC7555065 DOI: 10.1002/hbm.25144] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/17/2020] [Revised: 06/23/2020] [Accepted: 07/06/2020] [Indexed: 01/15/2023] Open
Abstract
Various brain regions are implicated in speech processing, and the specific function of some of them is better understood than others. In particular, involvement of the dorsal precentral cortex (dPCC) in speech perception remains debated, and attribution of the function of this region is more or less restricted to motor processing. In this study, we investigated high-density intracranial responses to speech fragments of a feature film, aiming to determine whether dPCC is engaged in perception of continuous speech. Our findings show that dPCC exhibited preference to speech over other tested sounds. Moreover, the identified area was involved in tracking of speech auditory properties including speech spectral envelope, its rhythmic phrasal pattern and pitch contour. DPCC also showed the ability to filter out noise from the perceived speech. Comparing these results to data from motor experiments showed that the identified region had a distinct location in dPCC, anterior to the hand motor area and superior to the mouth articulator region. The present findings uncovered with high-density intracranial recordings help elucidate the functional specialization of PCC and demonstrate the unique role of its anterior dorsal region in continuous speech perception.
Affiliation(s)
- Julia Berezutskaya
  - Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Clarissa Baratin
  - Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
  - Université Grenoble Alpes, Grenoble Institut des Neurosciences, Grenoble, France
- Zachary V. Freudenburg
  - Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Nicolas F. Ramsey
  - Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
|