1. Carney LH. Neural Fluctuation Contrast as a Code for Complex Sounds: The Role and Control of Peripheral Nonlinearities. Hear Res 2024;443:108966. PMID: 38310710; PMCID: PMC10923127; DOI: 10.1016/j.heares.2024.108966.
Abstract
The nonlinearities of the inner ear are often considered to be obstacles that the central nervous system must overcome to decode neural responses to sounds. This review describes how peripheral nonlinearities, such as saturation of the inner-hair-cell (IHC) response and of the IHC-auditory-nerve synapse, are instead beneficial to the neural encoding of complex sounds such as speech. These nonlinearities set up contrast in the depth of neural fluctuations in auditory-nerve responses along the tonotopic axis, referred to here as neural fluctuation contrast (NFC). Physiological support for the NFC coding hypothesis is reviewed, and predictions of several psychophysical phenomena, including masked detection and speech intelligibility, are presented. Lastly, a framework based on the NFC code for understanding how the medial olivocochlear (MOC) efferent system contributes to the coding of complex sounds is presented. By modulating cochlear gain control in response to both sound energy and fluctuations in neural responses, the MOC system is hypothesized to function not as a simple feedback gain-control device, but rather as a mechanism for enhancing NFC along the tonotopic axis, enabling robust encoding of complex sounds across a wide range of sound levels and in the presence of background noise. Effects of sensorineural hearing loss on the NFC code and on the MOC feedback system are also discussed.
Affiliation(s)
- Laurel H Carney
- Depts. of Biomedical Engineering, Neuroscience, and Electrical & Computer Engineering University of Rochester, Rochester, NY, USA.

2. Whalen DH. Direct neural coding of speech: Reconsideration of Whalen et al. (2006) (L). J Acoust Soc Am 2024;155:1704-1706. PMID: 38426833; PMCID: PMC10908555; DOI: 10.1121/10.0025125.
Abstract
Previous brain imaging results indicated that speech perception proceeded independently of the auditory primitives that are the product of primary auditory cortex [Whalen, Benson, Richardson, Swainson, Clark, Lai, Mencl, Fulbright, Constable, and Liberman (2006). J. Acoust. Soc. Am. 119, 575-581]. Recent evidence using electrocorticography [Hamilton, Oganian, Hall, and Chang (2021). Cell 184, 4626-4639] indicates that there is a more direct connection from subcortical regions to cortical speech regions than previous studies had shown. Although the mechanism differs, the Hamilton, Oganian, Hall, and Chang result supports the original conclusion even more strongly: Speech perception does not rely on the analysis of primitives from auditory analysis. Rather, the speech signal is processed as speech from the beginning.

3. Karunathilake IMD, Brodbeck C, Bhattasali S, Resnik P, Simon JZ. Neural Dynamics of the Processing of Speech Features: Evidence for a Progression of Features from Acoustic to Sentential Processing. bioRxiv [Preprint] 2024:2024.02.02.578603. PMID: 38352332; PMCID: PMC10862830; DOI: 10.1101/2024.02.02.578603.
Abstract
When we listen to speech, our brain's neurophysiological responses "track" its acoustic features, but it is less well understood how these auditory responses are modulated by linguistic content. Here, we recorded magnetoencephalography (MEG) responses while subjects listened to four types of continuous-speech-like passages: speech-envelope modulated noise, English-like non-words, scrambled words, and a narrative passage. Temporal response function (TRF) analysis provides strong neural evidence that the features of speech processing emerge in cortex as incremental steps, from acoustics to higher-level linguistics. Critically, we show a stepwise hierarchical progression to higher-order features over time, reflected in both bottom-up (early) and top-down (late) processing stages. Linguistically driven top-down mechanisms take the form of late N400-like responses, suggesting a central role of predictive coding mechanisms at multiple levels. As expected, the neural processing of lower-level acoustic feature responses is bilateral or right lateralized, with left lateralization emerging only for lexical-semantic features. Finally, our results identify potential neural markers of the computations underlying speech perception and comprehension.
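The TRF analysis mentioned above is, at its core, a regularized linear mapping from time-lagged stimulus features to the recorded neural response. Below is a minimal sketch of that idea using ridge regression on synthetic data; it is not the authors' pipeline, and the sampling rate, lag range, kernel, and regularization strength are illustrative assumptions.

```python
# Minimal sketch of temporal response function (TRF) estimation via ridge
# regression on synthetic data -- not the authors' pipeline.
import numpy as np

def lagged_design(stimulus, n_lags):
    """Design matrix whose columns are time-lagged copies of the stimulus."""
    n = len(stimulus)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = stimulus[:n - lag]
    return X

def fit_trf(stimulus, response, n_lags, lam=1.0):
    """Ridge solution h = (X'X + lam*I)^-1 X'y: the TRF over 0..n_lags-1 samples."""
    X = lagged_design(stimulus, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ response)

# Hypothetical 125 Hz speech envelope and a response made from a known kernel.
fs = 125
rng = np.random.default_rng(0)
envelope = rng.random(fs * 60)
true_kernel = np.exp(-np.arange(50) / 10.0)            # spans 0-400 ms
response = np.convolve(envelope, true_kernel)[:len(envelope)]
response += 0.1 * rng.standard_normal(len(envelope))   # measurement noise
trf = fit_trf(envelope, response, n_lags=50)           # recovers the kernel shape
```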
Affiliation(s)
- Christian Brodbeck
- Department of Computing and Software, McMaster University, Hamilton, ON, Canada
- Shohini Bhattasali
- Department of Language Studies, University of Toronto, Scarborough, Canada
- Philip Resnik
- Department of Linguistics and Institute for Advanced Computer Studies, University of Maryland, College Park, MD, USA
- Jonathan Z Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA
- Department of Biology, University of Maryland, College Park, MD, USA
- Institute for Systems Research, University of Maryland, College Park, MD, USA

4. Zoefel B, Kösem A. Neural tracking of continuous acoustics: properties, speech-specificity and open questions. Eur J Neurosci 2024;59:394-414. PMID: 38151889; DOI: 10.1111/ejn.16221.
Abstract
Human speech is a particularly relevant acoustic stimulus for our species, due to its role in information transmission during communication. Speech is inherently a dynamic signal, and a recent line of research has focused on neural activity that follows the temporal structure of speech. We review findings that characterise neural dynamics in the processing of continuous acoustics and that allow us to compare these dynamics with temporal aspects of human speech. We highlight properties and constraints shared by neural and speech dynamics, suggesting that auditory neural systems are optimised to process human speech. We then discuss the speech-specificity of neural dynamics and their potential mechanistic origins, and summarise open questions in the field.
Affiliation(s)
- Benedikt Zoefel
- Centre de Recherche Cerveau et Cognition (CerCo), CNRS UMR 5549, Toulouse, France
- Université de Toulouse III Paul Sabatier, Toulouse, France
- Anne Kösem
- Lyon Neuroscience Research Center (CRNL), INSERM U1028, Bron, France

5. Leonard MK, Gwilliams L, Sellers KK, Chung JE, Xu D, Mischler G, Mesgarani N, Welkenhuysen M, Dutta B, Chang EF. Large-scale single-neuron speech sound encoding across the depth of human cortex. Nature 2024;626:593-602. PMID: 38093008; PMCID: PMC10866713; DOI: 10.1038/s41586-023-06839-2.
Abstract
Understanding the neural basis of speech perception requires that we study the human brain both at the scale of the fundamental computational unit of neurons and in their organization across the depth of cortex. Here we used high-density Neuropixels arrays to record from 685 neurons across cortical layers at nine sites in a high-level auditory region that is critical for speech, the superior temporal gyrus, while participants listened to spoken sentences. Single neurons encoded a wide range of speech sound cues, including features of consonants and vowels, relative vocal pitch, onsets, amplitude envelope and sequence statistics. Each cross-laminar recording site exhibited dominant tuning to a primary speech feature while also containing a substantial proportion of neurons that encoded other features, contributing to heterogeneous selectivity. Spatially, neurons at similar cortical depths tended to encode similar speech features. Activity across all cortical layers was predictive of high-frequency field potentials (electrocorticography), providing a neuronal origin for macroelectrode recordings from the cortical surface. Together, these results establish single-neuron tuning across the cortical laminae as an important dimension of speech encoding in human superior temporal gyrus.
Affiliation(s)
- Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Laura Gwilliams
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Kristin K Sellers
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Jason E Chung
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Duo Xu
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Gavin Mischler
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Department of Electrical Engineering, Columbia University, New York, NY, USA
- Nima Mesgarani
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Department of Electrical Engineering, Columbia University, New York, NY, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA

6. Nourski KV, Steinschneider M, Rhone AE, Berger JI, Dappen ER, Kawasaki H, Howard MA III. Intracranial electrophysiology of spectrally degraded speech in the human cortex. Front Hum Neurosci 2024;17:1334742. PMID: 38318272; PMCID: PMC10839784; DOI: 10.3389/fnhum.2023.1334742.
Abstract
Introduction: Cochlear implants (CIs) are the treatment of choice for severe to profound hearing loss. Variability in CI outcomes remains despite advances in technology and is attributed in part to differences in cortical processing. Studying these differences in CI users is technically challenging. Spectrally degraded stimuli presented to normal-hearing individuals approximate the input to the central auditory system in CI users. This study used intracranial electroencephalography (iEEG) to investigate cortical processing of spectrally degraded speech.
Methods: Participants were adult neurosurgical epilepsy patients. Stimuli were the utterances /aba/ and /ada/, spectrally degraded using a noise vocoder (1-4 bands) or presented without vocoding. The stimuli were presented in a two-alternative forced choice task. Cortical activity was recorded using depth and subdural iEEG electrodes. Electrode coverage included auditory core in posteromedial Heschl's gyrus (HGPM), superior temporal gyrus (STG), ventral and dorsal auditory-related areas, and prefrontal and sensorimotor cortex. Analysis focused on high gamma (70-150 Hz) power augmentation and alpha (8-14 Hz) suppression.
Results: Task performance was at chance with 1-2 spectral bands and near ceiling for clear stimuli. Performance was variable with 3-4 bands, permitting identification of good and poor performers. There was no relationship between task performance and participants' demographic, audiometric, neuropsychological, or clinical profiles. Several response patterns were identified based on magnitude and differences between stimulus conditions. HGPM responded strongly to all stimuli. A preference for clear speech emerged within non-core auditory cortex. Good performers typically had strong responses to all stimuli along the dorsal stream, including posterior STG, supramarginal gyrus, and precentral gyrus; a minority of sites in STG and supramarginal gyrus had a preference for vocoded stimuli. In poor performers, responses were typically restricted to clear speech. Alpha suppression was more pronounced in good performers, whereas poor performers exhibited greater involvement of posterior middle temporal gyrus when listening to clear speech.
Discussion: Responses to noise-vocoded speech provide insights into potential factors underlying CI outcome variability. The results emphasize differences in the balance of neural processing along the dorsal and ventral streams between good and poor performers, identify specific cortical regions that may have diagnostic and prognostic utility, and suggest potential targets for neuromodulation-based CI rehabilitation strategies.
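The noise vocoding used to spectrally degrade the stimuli follows a standard recipe: split the signal into frequency bands, extract each band's envelope, and use it to modulate band-limited noise. The sketch below illustrates that recipe; the filter order, band edges, and log spacing are assumptions, not the study's exact parameters.

```python
# Minimal sketch of an N-band noise vocoder for spectrally degrading speech.
# Requires fs > 2 * f_hi; parameters here are illustrative assumptions.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_bands, f_lo=80.0, f_hi=6000.0):
    """Replace each band's temporal fine structure with envelope-modulated noise."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)              # log-spaced band edges
    noise = np.random.default_rng(0).standard_normal(len(speech))
    out = np.zeros(len(speech))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band_env = np.abs(hilbert(sosfiltfilt(sos, speech)))   # band envelope
        out += band_env * sosfiltfilt(sos, noise)              # modulate noise band
    return out / (np.max(np.abs(out)) + 1e-12)

# Example: degrade one second of hypothetical audio into a 4-band vocoding.
fs = 16000
audio = np.random.default_rng(1).standard_normal(fs)
degraded = noise_vocode(audio, fs, n_bands=4)
```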
Affiliation(s)
- Kirill V. Nourski
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA, United States
- Mitchell Steinschneider
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Departments of Neurology and Neuroscience, Albert Einstein College of Medicine, Bronx, NY, United States
- Ariane E. Rhone
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Joel I. Berger
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Emily R. Dappen
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA, United States
- Hiroto Kawasaki
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Matthew A. Howard III
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, United States
- Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA, United States
- Pappajohn Biomedical Institute, The University of Iowa, Iowa City, IA, United States

7. Kocsis Z, Jenison RL, Taylor PN, Calmus RM, McMurray B, Rhone AE, Sarrett ME, Deifelt Streese C, Kikuchi Y, Gander PE, Berger JI, Kovach CK, Choi I, Greenlee JD, Kawasaki H, Cope TE, Griffiths TD, Howard MA, Petkov CI. Immediate neural impact and incomplete compensation after semantic hub disconnection. Nat Commun 2023;14:6264. PMID: 37805497; PMCID: PMC10560235; DOI: 10.1038/s41467-023-42088-7.
Abstract
The human brain extracts meaning using an extensive neural system for semantic knowledge. Whether broadly distributed systems depend on, or can compensate after losing, a highly interconnected hub is controversial. We report intracranial recordings from two patients during a speech prediction task, obtained minutes before and after neurosurgical treatment requiring disconnection of the left anterior temporal lobe (ATL), a candidate semantic knowledge hub. Informed by modern diaschisis and predictive coding frameworks, we tested hypotheses ranging from solely neural network disruption to complete compensation by the indirectly affected language-related and speech-processing sites. Immediately after ATL disconnection, we observed neurophysiological alterations in the recorded frontal and auditory sites, providing direct evidence for the importance of the ATL as a semantic hub. We also obtained evidence for rapid, albeit incomplete, attempts at neural network compensation, with the neural impact largely taking the forms stipulated by the predictive coding framework specifically and the modern diaschisis framework more generally. The overall results validate these frameworks and reveal both the immediate impact of losing a brain hub and the human brain's capability to adjust after such a loss.
Affiliation(s)
- Zsuzsanna Kocsis
- Department of Neurosurgery, University of Iowa, Iowa City, IA, USA.
- Biosciences Institute, Newcastle University Medical School, Newcastle upon Tyne, UK.
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA.
- Rick L Jenison
- Departments of Neuroscience and Psychology, University of Wisconsin, Madison, WI, USA
- Peter N Taylor
- CNNP Lab, Interdisciplinary Computing and Complex BioSystems Group, School of Computing, Newcastle University, Newcastle upon Tyne, UK
- UCL Institute of Neurology, Queen Square, London, UK
- Ryan M Calmus
- Department of Neurosurgery, University of Iowa, Iowa City, IA, USA
- Biosciences Institute, Newcastle University Medical School, Newcastle upon Tyne, UK
- Bob McMurray
- Department of Psychological and Brain Science, University of Iowa, Iowa City, IA, USA
- Ariane E Rhone
- Department of Neurosurgery, University of Iowa, Iowa City, IA, USA
- Yukiko Kikuchi
- Biosciences Institute, Newcastle University Medical School, Newcastle upon Tyne, UK
- Phillip E Gander
- Department of Neurosurgery, University of Iowa, Iowa City, IA, USA
- Department of Radiology, University of Iowa, Iowa City, IA, USA
- Iowa Neuroscience Institute, University of Iowa, Iowa City, IA, USA
- Joel I Berger
- Department of Neurosurgery, University of Iowa, Iowa City, IA, USA
- Inyong Choi
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, USA
- Hiroto Kawasaki
- Department of Neurosurgery, University of Iowa, Iowa City, IA, USA
- Thomas E Cope
- Department of Clinical Neurosciences, Cambridge University, Cambridge, UK
- MRC Cognition and Brain Sciences Unit, Cambridge University, Cambridge, UK
- Timothy D Griffiths
- Biosciences Institute, Newcastle University Medical School, Newcastle upon Tyne, UK
- Matthew A Howard
- Department of Neurosurgery, University of Iowa, Iowa City, IA, USA
- Christopher I Petkov
- Department of Neurosurgery, University of Iowa, Iowa City, IA, USA
- Biosciences Institute, Newcastle University Medical School, Newcastle upon Tyne, UK

8. Yamoah EN, Pavlinkova G, Fritzsch B. The Development of Speaking and Singing in Infants May Play a Role in Genomics and Dementia in Humans. Brain Sci 2023;13:1190. PMID: 37626546; PMCID: PMC10452560; DOI: 10.3390/brainsci13081190.
Abstract
The development of the central auditory system, including the auditory cortex and other areas involved in processing sound, is shaped by genetic and environmental factors, enabling infants to learn to speak. Before explaining hearing in humans, a short overview of auditory dysfunction is provided. Environmental factors such as exposure to sound and language can impact the development and function of auditory sound processing, including speech perception, singing, and language processing. Infants can hear before birth, and sound exposure sculpts the structure and function of their developing auditory system. Exposing infants to singing and speaking can support their auditory and language development. In aging humans, the hippocampus and auditory nuclear centers are affected by neurodegenerative diseases such as Alzheimer's, resulting in memory and auditory processing difficulties. As the disease progresses, overt damage to the auditory nuclear centers occurs, leading to problems in processing auditory information. In conclusion, combined memory and auditory processing difficulties significantly impact people's ability to communicate and engage with society.
Affiliation(s)
- Ebenezer N. Yamoah
- Department of Physiology and Cell Biology, School of Medicine, University of Nevada, Reno, NV 89557, USA
- Bernd Fritzsch
- Department of Neurological Sciences, University of Nebraska Medical Center, Omaha, NE 68198, USA

9. Jia Z, Xu C, Li J, Gao J, Ding N, Luo B, Zou J. Phase Property of Envelope-Tracking EEG Response Is Preserved in Patients with Disorders of Consciousness. eNeuro 2023;10:ENEURO.0130-23.2023. PMID: 37500493; PMCID: PMC10420405; DOI: 10.1523/eneuro.0130-23.2023.
Abstract
When listening to speech, the low-frequency cortical response below 10 Hz can track the speech envelope. Previous studies have demonstrated that the phase lag between the speech envelope and the cortical response can reflect the mechanism by which the envelope-tracking response is generated. Here, we analyze whether this mechanism is modulated by the level of consciousness, by studying how the stimulus-response phase lag is affected by disorders of consciousness (DoC). DoC patients in general show less reliable neural tracking of speech. Nevertheless, for DoC patients who show reliable cortical tracking of speech, the stimulus-response phase lag changes linearly with frequency between 3.5 and 8 Hz, regardless of the consciousness state. The mean phase lag is also consistent across these DoC patients. These results suggest that the envelope-tracking response to speech can be generated by an automatic process that is barely modulated by the consciousness state.
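The linear phase-frequency relation reported here is the signature of a roughly constant response latency: if the response is the envelope delayed by τ, the phase lag is φ(f) = 2πfτ, so τ can be read off the slope of phase against frequency. A minimal sketch of that estimate, using synthetic values with an assumed 120 ms latency:

```python
# Minimal sketch linking a linear phase-frequency relation to a fixed latency:
# if phase_lag(f) = 2*pi*f*tau, the slope of unwrapped phase against frequency
# gives tau. The 120 ms latency is an illustrative assumption.
import numpy as np

def latency_from_phase(freqs_hz, phase_lag_rad):
    """Fit phase = 2*pi*f*tau + c and return tau in seconds."""
    slope, _ = np.polyfit(freqs_hz, np.unwrap(phase_lag_rad), 1)
    return slope / (2 * np.pi)

freqs = np.linspace(3.5, 8.0, 10)                 # Hz, the range analyzed here
phase = 2 * np.pi * freqs * 0.120                 # lags produced by a 120 ms delay
print(latency_from_phase(freqs, phase))           # ~0.120 s
```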
Affiliation(s)
- Ziting Jia
- The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan 250033, China
- Chuan Xu
- Department of Neurology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou 310019, China
- Jingqi Li
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou 311215, China
- Jian Gao
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou 311215, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Benyan Luo
- Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310003, China
- Jiajie Zou
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China

10. Banks MI, Krause BM, Berger DG, Campbell DI, Boes AD, Bruss JE, Kovach CK, Kawasaki H, Steinschneider M, Nourski KV. Functional geometry of auditory cortical resting state networks derived from intracranial electrophysiology. PLoS Biol 2023;21:e3002239. PMID: 37651504; PMCID: PMC10499207; DOI: 10.1371/journal.pbio.3002239.
Abstract
Understanding central auditory processing critically depends on defining the underlying auditory cortical networks and their relationship to the rest of the brain. We addressed these questions using resting state functional connectivity derived from human intracranial electroencephalography. Mapping recording sites into a low-dimensional space, where proximity represents functional similarity, revealed a hierarchical organization. At a fine scale, a group of auditory cortical regions excluded several higher-order auditory areas and segregated maximally from the prefrontal cortex. On a mesoscale, the proximity of limbic structures to the auditory cortex suggested a limbic stream that parallels the classically described ventral and dorsal auditory processing streams. The identities of global hubs in anterior temporal and cingulate cortex depended on frequency band, consistent with diverse roles in semantic and cognitive processing. On a macroscale, the observed hemispheric asymmetries were not specific to speech and language networks. This approach can be applied to multivariate brain data with respect to development, behavior, and disorders.
Affiliation(s)
- Matthew I. Banks
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Neuroscience, University of Wisconsin, Madison, Wisconsin, United States of America
- Bryan M. Krause
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- D. Graham Berger
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- Declan I. Campbell
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- Aaron D. Boes
- Department of Neurology, The University of Iowa, Iowa City, Iowa, United States of America
- Joel E. Bruss
- Department of Neurology, The University of Iowa, Iowa City, Iowa, United States of America
- Christopher K. Kovach
- Department of Neurosurgery, The University of Iowa, Iowa City, Iowa, United States of America
- Hiroto Kawasaki
- Department of Neurosurgery, The University of Iowa, Iowa City, Iowa, United States of America
- Mitchell Steinschneider
- Department of Neurology, Albert Einstein College of Medicine, New York, New York, United States of America
- Department of Neuroscience, Albert Einstein College of Medicine, New York, New York, United States of America
- Kirill V. Nourski
- Department of Neurosurgery, The University of Iowa, Iowa City, Iowa, United States of America
- Iowa Neuroscience Institute, The University of Iowa, Iowa City, Iowa, United States of America

11. Simon JZ, Commuri V, Kulasingham JP. Time-locked auditory cortical responses in the high-gamma band: A window into primary auditory cortex. Front Neurosci 2022;16:1075369. PMID: 36570848; PMCID: PMC9773383; DOI: 10.3389/fnins.2022.1075369.
Abstract
Primary auditory cortex is a critical stage in the human auditory pathway, a gateway between subcortical and higher-level cortical areas. Receiving the output of all subcortical processing, it sends its output on to higher-level cortex. Non-invasive physiological recordings of primary auditory cortex using electroencephalography (EEG) and magnetoencephalography (MEG), however, may not have sufficient specificity to separate responses generated in primary auditory cortex from those generated in underlying subcortical areas or neighboring cortical areas. This limitation is important for investigations of effects of top-down processing (e.g., selective-attention-based) on primary auditory cortex: higher-level areas are known to be strongly influenced by top-down processes, but subcortical areas are often assumed to perform strictly bottom-up processing. Fortunately, recent advances have made it easier to isolate the neural activity of primary auditory cortex from other areas. In this perspective, we focus on time-locked responses to stimulus features in the high gamma band (70-150 Hz) and with early cortical latency (∼40 ms), intermediate between subcortical and higher-level areas. We review recent findings from physiological studies employing either repeated simple sounds or continuous speech, obtaining either a frequency following response (FFR) or temporal response function (TRF). The potential roles of top-down processing are underscored, and comparisons with invasive intracranial EEG (iEEG) and animal model recordings are made. We argue that MEG studies employing continuous speech stimuli may offer particular benefits, in that only a few minutes of speech generates robust high gamma responses from bilateral primary auditory cortex, and without measurable interference from subcortical or higher-level areas.
Affiliation(s)
- Jonathan Z. Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States
- Department of Biology, University of Maryland, College Park, MD, United States
- Institute for Systems Research, University of Maryland, College Park, MD, United States
- Vrishab Commuri
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States

12. Peter V, van Ommen S, Kalashnikova M, Mazuka R, Nazzi T, Burnham D. Language specificity in cortical tracking of speech rhythm at the mora, syllable, and foot levels. Sci Rep 2022;12:13477. PMID: 35931787; PMCID: PMC9356059; DOI: 10.1038/s41598-022-17401-x.
Abstract
Recent research shows that adults' neural oscillations track the rhythm of the speech signal. However, the extent to which this tracking is driven by the acoustics of the signal, or by language-specific processing, remains unknown. Here, adult native listeners of three rhythmically different languages (English, French, Japanese) were compared on their cortical tracking of speech envelopes synthesized in the three languages, which allowed for coding at each language's dominant rhythmic unit: the foot (2.5 Hz), syllable (5 Hz), and mora (10 Hz), respectively. The three language groups were also tested with a sequence in a non-native language, Polish, and a non-speech vocoded equivalent, to investigate possible differential speech/nonspeech processing. The results first showed that cortical tracking was most prominent at 5 Hz (the syllable rate) for all three groups, but the French listeners showed enhanced tracking at 5 Hz compared to the English and Japanese groups. Second, across groups, there were no differences in responses for speech versus non-speech at 5 Hz (the syllable rate), but there was better tracking for speech than for non-speech at 10 Hz (not the syllable rate). Together, these results provide evidence for both language-general and language-specific influences on cortical tracking.
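Cortical tracking at a specific rhythmic rate is commonly quantified as evoked spectral amplitude: average the EEG across trials so that non-phase-locked activity cancels, then read the amplitude spectrum at the target frequency. A minimal sketch on synthetic data; the sampling rate, trial count, and duration are assumptions, not this study's parameters.

```python
# Minimal sketch of quantifying cortical tracking at a target rhythmic rate
# (foot ~2.5 Hz, syllable ~5 Hz, mora ~10 Hz) as evoked spectral amplitude.
import numpy as np

def evoked_amplitude(trials, fs, target_hz):
    """Amplitude of the trial-averaged spectrum at the bin nearest target_hz."""
    evoked = trials.mean(axis=0)                   # phase-locked activity survives
    spectrum = np.abs(np.fft.rfft(evoked)) / len(evoked)
    freqs = np.fft.rfftfreq(len(evoked), d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - target_hz))]

fs, n_trials, duration = 250, 40, 8.0              # hypothetical recording
t = np.arange(int(fs * duration)) / fs
rng = np.random.default_rng(2)
trials = np.sin(2 * np.pi * 5.0 * t) + rng.standard_normal((n_trials, len(t)))
print(evoked_amplitude(trials, fs, 5.0))           # peaks at the 5 Hz syllable rate
```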
Affiliation(s)
- Varghese Peter
- MARCS Institute for Brain Behaviour and Development, Western Sydney University, Penrith, NSW, Australia
- School of Health and Behavioural Sciences, University of the Sunshine Coast, Sippy Downs, Australia
- Sandrien van Ommen
- Integrative Neuroscience and Cognition Center, CNRS-Université Paris Cité, Paris, France
- Neurosciences Fondamentales, University of Geneva, Geneva, Switzerland
- Marina Kalashnikova
- MARCS Institute for Brain Behaviour and Development, Western Sydney University, Penrith, NSW, Australia
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Guipuzcoa, Spain
- IKERBASQUE, Basque Foundation for Science, Bilbao, Bizcaya, Spain
- Reiko Mazuka
- Laboratory for Language Development, RIKEN Center for Brain Science, Saitama, Japan
- Department of Psychology and Neuroscience, Duke University, Durham, NC, USA
- Thierry Nazzi
- Integrative Neuroscience and Cognition Center, CNRS-Université Paris Cité, Paris, France
- Denis Burnham
- MARCS Institute for Brain Behaviour and Development, Western Sydney University, Penrith, NSW, Australia

13. Nourski KV, Steinschneider M, Rhone AE, Kovach CK, Kawasaki H, Howard MA. Gamma Activation and Alpha Suppression within Human Auditory Cortex during a Speech Classification Task. J Neurosci 2022;42:5034-5046. PMID: 35534226; PMCID: PMC9233444; DOI: 10.1523/jneurosci.2187-21.2022.
Abstract
The dynamics of information flow within the auditory cortical hierarchy associated with speech processing and the emergence of hemispheric specialization remain incompletely understood. To study these questions with high spatiotemporal resolution, intracranial recordings were obtained in 29 human neurosurgical patients of both sexes while subjects performed a semantic classification task. Neural activity was recorded from the posteromedial portion of Heschl's gyrus (HGPM), the anterolateral portion of Heschl's gyrus (HGAL), planum temporale (PT), planum polare, insula, and superior temporal gyrus (STG). Responses to monosyllabic words exhibited early gamma power increases and a later suppression of alpha power, envisioned to represent feedforward activity and decreased feedback signaling, respectively. Gamma activation and alpha suppression had distinct magnitude and latency profiles. HGPM and PT had the strongest gamma responses with the shortest onset latencies, indicating that they are the earliest auditory cortical processing stages. The origin of attenuated top-down influences in auditory cortex, as indexed by alpha suppression, was in STG and HGAL. Gamma responses and alpha suppression were typically larger to nontarget words than tones. Alpha suppression was uniformly greater to target versus nontarget stimuli. Hemispheric bias for words versus tones and for target versus nontarget words, when present, was left lateralized. Better task performance was associated with increased gamma activity in the left PT and greater alpha suppression in HGPM and HGAL bilaterally. The prominence of alpha suppression during semantic classification and its accessibility for noninvasive electrophysiologic studies suggest that this measure is a promising index of auditory cortical speech processing.
SIGNIFICANCE STATEMENT: Understanding the dynamics of cortical speech processing requires the use of active tasks. This is the first comprehensive intracranial electroencephalography study to examine cortical activity within the superior temporal plane, lateral superior temporal gyrus, and the insula during a semantic classification task. Distinct gamma activation and alpha suppression profiles clarify the functional organization of feedforward and feedback processing within the auditory cortical hierarchy. Asymmetries in cortical speech processing emerge at early processing stages. Relationships between cortical activity and task performance are interpreted in the context of current models of speech processing. Results lay the groundwork for iEEG studies using connectivity measures of the bidirectional information flow within the auditory processing hierarchy.
Affiliation(s)
- Kirill V Nourski
- Department of Neurosurgery, University of Iowa, Iowa City, Iowa 52242
- Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa 52242
- Mitchell Steinschneider
- Departments of Neurology and Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
- Ariane E Rhone
- Department of Neurosurgery, University of Iowa, Iowa City, Iowa 52242
- Hiroto Kawasaki
- Department of Neurosurgery, University of Iowa, Iowa City, Iowa 52242
- Matthew A Howard
- Department of Neurosurgery, University of Iowa, Iowa City, Iowa 52242
- Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa 52242
- Pappajohn Biomedical Institute, University of Iowa, Iowa City, Iowa 52242

14. Mc Laughlin M, Khatoun A, Asamoah B. Detection of tACS Entrainment Critically Depends on Epoch Length. Front Cell Neurosci 2022;16:806556. PMID: 35360495; PMCID: PMC8963722; DOI: 10.3389/fncel.2022.806556.
Abstract
Neural entrainment is the phase synchronization of a population of neurons to an external rhythmic stimulus, such as that applied in transcranial alternating current stimulation (tACS). tACS can cause profound effects on human behavior. However, a significant number of studies find no behavioral effect when tACS is applied to human subjects. To investigate this discrepancy, we applied a time-sensitive, phase-locking-value (PLV) based analysis to single-unit data from the rat motor cortex. The analysis revealed that detection of neural entrainment depends critically on the epoch length within which spiking information is accumulated: increasing the epoch length allowed for detection of progressively weaker levels of neural entrainment. Based on this single-unit analysis, we hypothesized that tACS effects on human behavior would be more easily detected in behavioral paradigms that utilize longer epoch lengths. We tested this by using tACS to entrain tremor in patients and healthy volunteers. When the behavioral data were analyzed using short-duration epochs, tremor entrainment effects were not detectable. However, as the epoch length was progressively increased, weak tremor entrainment became detectable. These results suggest that tACS behavioral paradigms that rely on the accumulation of information over long epochs will tend to succeed at detecting behavioral effects, whereas paradigms that rely on short epochs are less likely to detect effects.
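The PLV analysis referenced here summarizes spike phases as the length of their mean resultant vector, and the epoch-length dependence follows from statistics: under a Rayleigh-type test, the smallest PLV distinguishable from uniform phases shrinks roughly as 1/√N with the number of accumulated spikes. A minimal sketch; the weakly entrained phase distribution below is an illustrative assumption, not the study's data.

```python
# Minimal sketch of the phase-locking value (PLV) and of why epoch length
# matters for detecting weak entrainment.
import numpy as np

def plv(phases):
    """Length of the mean resultant vector of spike phases (radians)."""
    return np.abs(np.exp(1j * phases).mean())

def rayleigh_plv_floor(n_spikes, alpha=0.05):
    """Approximate smallest PLV detectable against uniform phases."""
    return np.sqrt(-np.log(alpha) / n_spikes)

rng = np.random.default_rng(3)
weak = rng.vonmises(mu=0.0, kappa=0.1, size=5000)      # true PLV is ~0.05
for n in (50, 500, 5000):                              # longer epoch -> more spikes
    print(n, round(plv(weak[:n]), 3), round(rayleigh_plv_floor(n), 3))
# Only at large n does the measured PLV clear the detection floor.
```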

15. Margiotoudi K, Bohn M, Schwob N, Taglialatela J, Pulvermüller F, Epping A, Schweller K, Allritz M. Bo-NO-bouba-kiki: picture-word mapping but no spontaneous sound symbolic speech-shape mapping in a language trained bonobo. Proc Biol Sci 2022;289:20211717. PMID: 35105236; PMCID: PMC8808101; DOI: 10.1098/rspb.2021.1717.
Abstract
Humans share the ability to intuitively map 'sharp' or 'round' pseudowords, such as 'bouba' versus 'kiki', to abstract edgy versus round shapes, respectively. This effect, known as sound symbolism, appears early in human development. The phylogenetic origin of this phenomenon, however, is unclear: are humans the only species capable of experiencing correspondences between speech sounds and shapes, or could similar effects be observed in other animals? Thus far, an implicit matching experiment failed to find evidence of sound symbolic matching in great apes, suggesting it is uniquely human. However, explicit tests of sound symbolism have never been conducted with nonhuman great apes. In the present study, a language-competent bonobo completed a cross-modal matching-to-sample task in which he was asked to match spoken English words to pictures, as well as 'sharp' or 'round' pseudowords to shapes. Sound symbolic trials were interspersed among English words. The bonobo matched English words to pictures with high accuracy but did not show any evidence of spontaneous sound symbolic matching. Our results suggest that speech exposure/comprehension alone cannot explain sound symbolism. This lends plausibility to the hypothesis that biological differences between human and nonhuman primates could account for the putative human specificity of this effect.
Affiliation(s)
- Konstantina Margiotoudi
- Brain Language Laboratory, Department of Philosophy and Humanities, WE4, Freie Universität Berlin, Berlin, Germany
- Berlin School of Mind and Brain, Humboldt Universität Berlin, Berlin, Germany
- Laboratory of Cognitive Psychology, CNRS and Aix-Marseille University, Marseille, France
- Manuel Bohn
- Department of Comparative Cultural Psychology, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Natalie Schwob
- Department of Psychology, The Pennsylvania State University, University Park, PA, USA
- Jared Taglialatela
- Ape Cognition and Conservation Initiative, Des Moines, IA, USA
- Department of Ecology, Evolution and Organismal Biology, Kennesaw State University, Kennesaw, GA, USA
- Friedemann Pulvermüller
- Brain Language Laboratory, Department of Philosophy and Humanities, WE4, Freie Universität Berlin, Berlin, Germany
- Berlin School of Mind and Brain, Humboldt Universität Berlin, Berlin, Germany
- Einstein Center for Neurosciences Berlin, Berlin, Germany
- Cluster of Excellence ‘Matters of Activity’, Humboldt-Universität zu Berlin, Berlin, Germany
- Amanda Epping
- Ape Cognition and Conservation Initiative, Des Moines, IA, USA
- Ken Schweller
- Ape Cognition and Conservation Initiative, Des Moines, IA, USA
- Matthias Allritz
- School of Psychology and Neuroscience, University of St Andrews, St Andrews, Fife KY16 9JP, UK

16. Glanz O, Hader M, Schulze-Bonhage A, Auer P, Ball T. A Study of Word Complexity Under Conditions of Non-experimental, Natural Overt Speech Production Using ECoG. Front Hum Neurosci 2022;15:711886. PMID: 35185491; PMCID: PMC8854223; DOI: 10.3389/fnhum.2021.711886.
Abstract
The linguistic complexity of words has largely been studied on the behavioral level and in experimental settings, and little is known about the neural processes underlying it in uninstructed, spontaneous conversations. To address this phenomenon based on uninstructed, spontaneous speech production, we built a multimodal neurolinguistic corpus composed of synchronized audio, video, and electrocorticographic (ECoG) recordings from the fronto-temporo-parietal cortex. We performed extensive linguistic annotations of the language material and calculated word complexity using several numeric parameters. We orthogonalized the parameters with the help of a linear regression model, then correlated the spectral components of neural activity with the individual linguistic parameters and with the residuals of the linear regression model, and compared the results. The proportional relation between the number of consonants and vowels, which was the most informative parameter with regard to the neural representation of word complexity, showed effects in two areas: a frontal one at the junction of the premotor cortex, the prefrontal cortex, and Brodmann area 44, and a postcentral one lying directly above the lateral sulcus and comprising the ventral central sulcus, the parietal operculum, and the adjacent inferior parietal cortex. Beyond the physiological findings summarized here, our methods may be useful for those interested in studying neural effects of natural language production and in surmounting the intrinsic problem of collinearity between multiple features of spontaneously spoken material.
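Orthogonalizing collinear parameters with a linear regression model is commonly done by residualization: regress each parameter on the others and keep the residual, which removes the shared variance. The sketch below illustrates that general approach; it is not the authors' exact procedure, and the parameter matrix is hypothetical.

```python
# Minimal sketch of orthogonalizing collinear word-level parameters by
# residualization -- an illustration, not the authors' exact procedure.
import numpy as np

def residualize(params):
    """Replace each column with its residual after regressing on the rest."""
    n, k = params.shape
    out = np.empty_like(params, dtype=float)
    for j in range(k):
        X = np.column_stack([np.ones(n), np.delete(params, j, axis=1)])
        beta, *_ = np.linalg.lstsq(X, params[:, j], rcond=None)
        out[:, j] = params[:, j] - X @ beta            # shared variance removed
    return out

# Hypothetical parameters: word length, consonant/vowel ratio, log frequency.
rng = np.random.default_rng(4)
word_params = rng.random((500, 3))
orthogonal_params = residualize(word_params)
```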
Affiliation(s)
- Olga Glanz
- GRK 1624 “Frequency Effects in Language,” University of Freiburg, Freiburg, Germany
- Department of German Linguistics, University of Freiburg, Freiburg, Germany
- The Hermann Paul School of Linguistics, University of Freiburg, Freiburg, Germany
- BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany
- Neurobiology and Biophysics, Faculty of Biology, University of Freiburg, Freiburg, Germany
- Translational Neurotechnology Lab, Department of Neurosurgery, Faculty of Medicine, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Marina Hader
- BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany
- Translational Neurotechnology Lab, Department of Neurosurgery, Faculty of Medicine, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Andreas Schulze-Bonhage
- Department of Neurosurgery, Faculty of Medicine, Epilepsy Center, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany
- Peter Auer
- GRK 1624 “Frequency Effects in Language,” University of Freiburg, Freiburg, Germany
- Department of German Linguistics, University of Freiburg, Freiburg, Germany
- The Hermann Paul School of Linguistics, University of Freiburg, Freiburg, Germany
- Tonio Ball
- BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany
- Translational Neurotechnology Lab, Department of Neurosurgery, Faculty of Medicine, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany

17. Ruthig P, Schönwiesner M. Common principles in the lateralisation of auditory cortex structure and function for vocal communication in primates and rodents. Eur J Neurosci 2022;55:827-845. PMID: 34984748; DOI: 10.1111/ejn.15590.
Abstract
This review summarises recent findings on the lateralisation of communicative sound processing in the auditory cortex (AC) of humans, non-human primates, and rodents. Functional imaging in humans has demonstrated a left hemispheric preference for some acoustic features of speech, but it is unclear to what degree this is caused by bottom-up acoustic feature selectivity or top-down modulation from language areas. Although non-human primates show a less pronounced functional lateralisation in AC, the properties of AC fields and behavioural asymmetries are qualitatively similar. Rodent studies demonstrate microstructural circuits that might underlie bottom-up acoustic feature selectivity in both hemispheres. Functionally, the left AC in the mouse appears to be specifically tuned to communication calls, whereas the right AC may have a more 'generalist' role. Rodents also show anatomical AC lateralisation, such as differences in size and connectivity. Several of these functional and anatomical characteristics are also lateralised in human AC. Thus, complex vocal communication processing shares common features among rodents and primates. We argue that a synthesis of results from humans, non-human primates, and rodents is necessary to identify the neural circuitry of vocal communication processing. However, data from different species and methods are often difficult to compare. Recent advances may enable better integration of methods across species, and efforts to standardise data formats and analysis tools would benefit comparative research and enable synergies between psychological and biological research in the area of vocal communication processing.
Affiliation(s)
- Philip Ruthig
- Faculty of Life Sciences, Leipzig University, Leipzig, Sachsen
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig

18. Landemard A, Bimbard C, Demené C, Shamma S, Norman-Haignere S, Boubenec Y. Distinct higher-order representations of natural sounds in human and ferret auditory cortex. eLife 2021;10:e65566. PMID: 34792467; PMCID: PMC8601661; DOI: 10.7554/elife.65566.
Abstract
Little is known about how neural representations of natural sounds differ across species. For example, speech and music play a unique role in human hearing, yet it is unclear how auditory representations of speech and music differ between humans and other animals. Using functional ultrasound imaging, we measured responses in ferrets to a set of natural and spectrotemporally matched synthetic sounds previously tested in humans. Ferrets showed similar lower-level frequency and modulation tuning to that observed in humans. But while humans showed substantially larger responses to natural vs. synthetic speech and music in non-primary regions, ferret responses to natural and synthetic sounds were closely matched throughout primary and non-primary auditory cortex, even when tested with ferret vocalizations. This finding reveals that auditory representations in humans and ferrets diverge sharply at late stages of cortical processing, potentially driven by higher-order processing demands in speech and music.
Affiliation(s)
- Agnès Landemard
- Laboratoire des Systèmes Perceptifs, Département d’Études Cognitives, École Normale Supérieure, PSL Research University, CNRS, Paris, France
- Célian Bimbard
- Laboratoire des Systèmes Perceptifs, Département d’Études Cognitives, École Normale Supérieure, PSL Research University, CNRS, Paris, France
- University College London, London, United Kingdom
- Charlie Demené
- Physics for Medicine Paris, Inserm, ESPCI Paris, PSL Research University, CNRS, Paris, France
- Shihab Shamma
- Laboratoire des Systèmes Perceptifs, Département d’Études Cognitives, École Normale Supérieure, PSL Research University, CNRS, Paris, France
- Institute for Systems Research, Department of Electrical and Computer Engineering, University of Maryland, College Park, United States
- Sam Norman-Haignere
- Laboratoire des Systèmes Perceptifs, Département d’Études Cognitives, École Normale Supérieure, PSL Research University, CNRS, Paris, France
- HHMI Postdoctoral Fellow of the Life Sciences Research Foundation, Baltimore, United States
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, United States
- Yves Boubenec
- Laboratoire des Systèmes Perceptifs, Département d’Études Cognitives, École Normale Supérieure, PSL Research University, CNRS, Paris, France

19. Hamilton LS, Oganian Y, Hall J, Chang EF. Parallel and distributed encoding of speech across human auditory cortex. Cell 2021;184:4626-4639.e13. PMID: 34411517; DOI: 10.1016/j.cell.2021.07.019.
Abstract
Speech perception is thought to rely on a cortical feedforward serial transformation of acoustic into linguistic representations. Using intracranial recordings across the entire human auditory cortex, electrocortical stimulation, and surgical ablation, we show that cortical processing across areas is not consistent with a serial hierarchical organization. Instead, response latency and receptive field analyses demonstrate parallel and distinct information processing in the primary and nonprimary auditory cortices. This functional dissociation was also observed with electrocortical stimulation: stimulating the primary auditory cortex evoked auditory hallucinations but did not distort or interfere with speech perception, whereas opposite effects were observed during stimulation of nonprimary cortex in the superior temporal gyrus. Ablation of the primary auditory cortex did not affect speech perception. These results establish a distributed functional organization of parallel information processing throughout the human auditory cortex and demonstrate an essential, independent role for nonprimary auditory cortex in speech processing.
Affiliation(s)
- Liberty S Hamilton
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Yulia Oganian
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Jeffery Hall
- Department of Neurology and Neurosurgery, McGill University Montreal Neurological Institute, Montreal, QC, H3A 2B4, Canada
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA

20. Soni S, Tata MS. Brain electrical dynamics in speech segmentation depends upon prior experience with the language. Brain Lang 2021;219:104967. PMID: 34022679; DOI: 10.1016/j.bandl.2021.104967.
Abstract
It remains unclear whether the process of speech tracking, which facilitates speech segmentation, reflects top-down mechanisms related to prior linguistic models, stimulus-driven mechanisms, or both. To address this, we recorded electroencephalography (EEG) responses from native and non-native speakers of English who had different prior experience with the English language but heard acoustically identical stimuli. Despite a significant difference in the ability to segment and perceive speech, our EEG results showed that theta-band tracking of the speech envelope did not depend significantly on prior experience with the language. However, theta-band tracking did change across repetitions of the same sentence, suggesting a priming effect. Furthermore, native and non-native speakers showed different phase dynamics at word boundaries, suggesting differences in segmentation mechanisms. Finally, we found that the correlation between higher-frequency dynamics reflecting phoneme-level processing and perceptual segmentation of words might depend on prior experience with the spoken language.
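Theta-band tracking of the speech envelope is typically computed by extracting the broadband envelope of the audio, band-passing both the envelope and the EEG to the theta range, and quantifying their dependence. A minimal sketch using correlation; the 4-8 Hz band, filter order, and the assumption that audio and EEG share one sampling rate are all simplifications, not this study's exact method.

```python
# Minimal sketch of theta-band tracking of the speech envelope.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def theta_tracking(eeg, audio, fs, band=(4.0, 8.0)):
    """Correlation between theta-band EEG and the theta-band speech envelope."""
    envelope = np.abs(hilbert(audio))                  # broadband speech envelope
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    return np.corrcoef(sosfiltfilt(sos, eeg), sosfiltfilt(sos, envelope))[0, 1]

# Example on synthetic signals sampled at a hypothetical 250 Hz.
fs = 250
rng = np.random.default_rng(5)
audio = rng.standard_normal(fs * 30)
eeg = rng.standard_normal(fs * 30)
print(theta_tracking(eeg, audio, fs))
```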
Affiliation(s)
- Shweta Soni
- The University of Lethbridge, Lethbridge, AB, Canada.

21. Erkens J, Schulte M, Vormann M, Wilsch A, Herrmann CS. Hearing Impaired Participants Improve More Under Envelope-Transcranial Alternating Current Stimulation When Signal to Noise Ratio Is High. Neurosci Insights 2021;16:2633105520988854. PMID: 33709079; PMCID: PMC7907945; DOI: 10.1177/2633105520988854.
Abstract
An issue commonly expressed by hearing aid users is difficulty understanding speech in complex hearing scenarios, that is, when speech is presented together with background noise or in situations with multiple speakers. Conventional hearing aids are already designed with these issues in mind, using beamforming to enhance sound from a specific direction only, but they are limited in solving these issues because they can only modulate incoming sound at the cochlear level. However, there is evidence that age-related hearing loss might partially arise later in the hearing process, as brain processes slow down and become less efficient. In this study, we tested whether it would be possible to improve hearing at the cortical level by improving neural tracking of speech. The speech envelopes of target sentences were transformed into an electrical signal and applied to elderly participants' cortices using transcranial alternating current stimulation (tACS). We compared 2 different signal-to-noise ratios (SNRs) and 5 different delays between sound presentation and stimulation ranging from 50 ms to 150 ms, as well as the differences in effects between elderly normal-hearing and elderly hearing-impaired participants. When the task was performed at a high SNR, hearing-impaired participants appeared to gain more from envelope-tACS than when the task was performed at a lower SNR. This was not the case for normal-hearing participants. Furthermore, a post hoc analysis of the different time-lags suggests that participants were significantly better at a stimulation time-lag of 150 ms when the task was presented at a high SNR. In this paper, we outline why these effects are worth exploring further, and what they tell us about the optimal tACS time-lag.
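The core manipulation in envelope-tACS is turning a speech envelope into a stimulation current delivered at a chosen time-lag. The Python sketch below shows one plausible way to construct such a signal; the sampling rate, low-pass cutoff, current scaling, and the `speech_envelope`/`envelope_tacs` helpers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000  # shared audio/stimulation sampling rate in Hz (assumed)

def speech_envelope(audio, fs, cutoff=8.0, order=4):
    """Broadband temporal envelope: magnitude of the analytic signal, low-pass filtered."""
    env = np.abs(hilbert(audio))
    b, a = butter(order, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, env)

def envelope_tacs(audio, fs, lag_ms=100.0, peak_ma=1.0):
    """Zero-mean tACS current shaped like the speech envelope, delayed by lag_ms."""
    env = speech_envelope(audio, fs)
    env = env - env.mean()                      # stimulation current must be zero-mean
    env = peak_ma * env / np.max(np.abs(env))   # scale to +/- peak_ma milliamps
    pad = int(round(lag_ms / 1000 * fs))
    return np.concatenate([np.zeros(pad), env])  # stimulation trails the audio onset

rng = np.random.default_rng(1)
audio = rng.standard_normal(fs * 3)  # stand-in for a spoken sentence
current = envelope_tacs(audio, fs, lag_ms=150.0)
```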
Collapse
Affiliation(s)
- Jules Erkens
- Department of Psychology, Cluster of Excellence “Hearing4All,” European Medical School, Carl von Ossietzky University, Oldenburg, Germany
| | | | | | - Anna Wilsch
- Department of Psychology, Cluster of Excellence “Hearing4All,” European Medical School, Carl von Ossietzky University, Oldenburg, Germany
| | - Christoph S Herrmann
- Department of Psychology, Cluster of Excellence “Hearing4All,” European Medical School, Carl von Ossietzky University, Oldenburg, Germany
- Research Center Neurosensory Science, Carl von Ossietzky University, Oldenburg, Germany
| |
Collapse
|
22
|
Reetzke R, Gnanateja GN, Chandrasekaran B. Neural tracking of the speech envelope is differentially modulated by attention and language experience. BRAIN AND LANGUAGE 2021; 213:104891. [PMID: 33290877 PMCID: PMC7856208 DOI: 10.1016/j.bandl.2020.104891] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/23/2020] [Revised: 09/22/2020] [Accepted: 11/18/2020] [Indexed: 05/13/2023]
Abstract
The ability to selectively attend to a speech signal amid competing sounds is a significant challenge, especially for listeners trying to comprehend non-native speech. Attention is critical to direct neural processing resources to the most essential information. Here, neural tracking of the speech envelope of an English story narrative and cortical auditory evoked potentials (CAEPs) to non-speech stimuli were simultaneously assayed in native and non-native listeners of English. Although native listeners exhibited higher narrative comprehension accuracy, non-native listeners exhibited enhanced neural tracking of the speech envelope and heightened CAEP magnitudes. These results support an emerging view that although attention to a target speech signal enhances neural tracking of the speech envelope, this mechanism itself may not confer speech comprehension advantages. Our findings suggest that non-native listeners may engage neural attentional processes that enhance low-level acoustic features, regardless of whether the target signal contains speech or non-speech information.
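Neural tracking of the speech envelope, as assayed here, is commonly quantified with a temporal response function (TRF) estimated by regularized regression. The following Python sketch fits a ridge-regression TRF on toy data; the lag range, regularization strength, and simulated signals are assumptions for illustration and do not reproduce the study's analysis.

```python
import numpy as np

def lagged_design(stim, lags):
    """Design matrix whose columns are the stimulus shifted by each lag (in samples)."""
    X = np.zeros((stim.size, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stim[: stim.size - lag]
        else:
            X[:lag, j] = stim[-lag:]
    return X

def fit_trf(stim, eeg, fs, tmin=0.0, tmax=0.4, alpha=1.0):
    """Ridge-regression temporal response function mapping envelope -> EEG."""
    lags = np.arange(int(tmin * fs), int(tmax * fs))
    X = lagged_design(stim, lags)
    w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ eeg)
    return lags / fs, w

fs = 100  # Hz, assumed
rng = np.random.default_rng(2)
env = rng.standard_normal(fs * 120)                       # toy speech envelope
true_kernel = np.exp(-np.arange(20) / 5.0)                # hidden "neural" response
eeg = np.convolve(env, true_kernel)[: env.size] + rng.standard_normal(env.size)
times, trf = fit_trf(env, eeg, fs)                        # trf should recover the kernel
```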
Collapse
Affiliation(s)
- Rachel Reetzke
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, United States; Center for Autism and Related Disorders, Kennedy Krieger Institute, United States
| | - G Nike Gnanateja
- Department of Communication Science and Disorders, University of Pittsburgh, United States
| | - Bharath Chandrasekaran
- Department of Communication Science and Disorders, University of Pittsburgh, United States.
| |
Collapse
|
23
|
Kulasingham JP, Brodbeck C, Presacco A, Kuchinsky SE, Anderson S, Simon JZ. High gamma cortical processing of continuous speech in younger and older listeners. Neuroimage 2020; 222:117291. [PMID: 32835821 PMCID: PMC7736126 DOI: 10.1016/j.neuroimage.2020.117291] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2020] [Revised: 08/12/2020] [Accepted: 08/16/2020] [Indexed: 12/11/2022] Open
Abstract
Neural processing along the ascending auditory pathway is often associated with a progressive reduction in characteristic processing rates. For instance, the well-known frequency-following response (FFR) of the auditory midbrain, as measured with electroencephalography (EEG), is dominated by frequencies from ∼100 Hz to several hundred Hz, phase-locking to the acoustic stimulus at those frequencies. In contrast, cortical responses, whether measured by EEG or magnetoencephalography (MEG), are typically characterized by frequencies of a few Hz to a few tens of Hz, time-locking to acoustic envelope features. In this study we investigated a crossover case, cortically generated responses time-locked to continuous speech features at FFR-like rates. Using MEG, we analyzed responses in the high gamma range of 70-200 Hz to continuous speech using neural source-localized reverse correlation and the corresponding temporal response functions (TRFs). Continuous speech stimuli were presented to 40 subjects (17 younger, 23 older adults) with clinically normal hearing and their MEG responses were analyzed in the 70-200 Hz band. Consistent with the relative insensitivity of MEG to many subcortical structures, the spatiotemporal profile of these response components indicated a cortical origin with ∼40 ms peak latency and a right hemisphere bias. TRF analysis was performed using two separate aspects of the speech stimuli: a) the 70-200 Hz carrier of the speech, and b) the 70-200 Hz temporal modulations in the spectral envelope of the speech stimulus. The response was dominantly driven by the envelope modulation, with a much weaker contribution from the carrier. Age-related differences were also analyzed to investigate a reversal previously seen along the ascending auditory pathway, whereby older listeners show weaker midbrain FFR responses than younger listeners, but, paradoxically, have stronger cortical low frequency responses. In contrast to both these earlier results, this study did not find clear age-related differences in high gamma cortical responses to continuous speech. Cortical responses at FFR-like frequencies shared some properties with midbrain responses at the same frequencies and with cortical responses at much lower frequencies.
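A rough Python sketch of the two stimulus representations contrasted in this study, the 70-200 Hz carrier and the 70-200 Hz modulations of the spectral envelope, is given below. The filter design and sampling rate are assumptions; the actual analysis used source-localized MEG and reverse correlation.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def ffr_like_predictors(audio, fs):
    """Two 70-200 Hz regressors: the speech carrier and the modulations of its envelope."""
    carrier = bandpass(audio, 70, 200, fs)        # high-gamma-band acoustic carrier
    envelope = np.abs(hilbert(audio))             # crude proxy for the spectral envelope
    env_mod = bandpass(envelope, 70, 200, fs)     # 70-200 Hz fluctuations of the envelope
    return carrier, env_mod

fs = 1000  # Hz, assumed
rng = np.random.default_rng(3)
audio = rng.standard_normal(fs * 5)  # stand-in for continuous speech
carrier, env_mod = ffr_like_predictors(audio, fs)
```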
Collapse
Affiliation(s)
- Joshua P Kulasingham
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States.
| | - Christian Brodbeck
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States.
| | - Alessandro Presacco
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States.
| | - Stefanie E Kuchinsky
- Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, Maryland, United States.
| | - Samira Anderson
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland, United States.
| | - Jonathan Z Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States; Institute for Systems Research, University of Maryland, College Park, Maryland, United States; Department of Biology, University of Maryland, College Park, Maryland, United States.
| |
Collapse
|
24
|
Nourski KV, Steinschneider M, Rhone AE, Kovach CK, Banks MI, Krause BM, Kawasaki H, Howard MA. Electrophysiology of the Human Superior Temporal Sulcus during Speech Processing. Cereb Cortex 2020; 31:1131-1148. [PMID: 33063098 DOI: 10.1093/cercor/bhaa281] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2020] [Revised: 08/06/2020] [Accepted: 09/01/2020] [Indexed: 12/20/2022] Open
Abstract
The superior temporal sulcus (STS) is a crucial hub for speech perception and can be studied with high spatiotemporal resolution using electrodes targeting mesial temporal structures in epilepsy patients. Goals of the current study were to clarify functional distinctions between the upper (STSU) and the lower (STSL) bank, hemispheric asymmetries, and activity during self-initiated speech. Electrophysiologic properties were characterized using semantic categorization and dialog-based tasks. Gamma-band activity and alpha-band suppression were used as complementary measures of STS activation. Gamma responses to auditory stimuli were weaker in STSL compared with STSU and had longer onset latencies. Activity in anterior STS was larger during speaking than listening; the opposite pattern was observed more posteriorly. Opposite hemispheric asymmetries were found for alpha suppression in STSU and STSL. Alpha suppression in the STS emerged earlier than in core auditory cortex, suggesting feedback signaling within the auditory cortical hierarchy. STSL was the only region where gamma responses to words presented in the semantic categorization tasks were larger in subjects with superior task performance. More pronounced alpha suppression was associated with better task performance in Heschl's gyrus, superior temporal gyrus, and STS. Functional differences between STSU and STSL warrant their separate assessment in future studies.
Collapse
Affiliation(s)
- Kirill V Nourski
- Department of Neurosurgery, The University of Iowa, Iowa City, IA 52242, USA; Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA 52242, USA
| | - Mitchell Steinschneider
- Departments of Neurology and Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Ariane E Rhone
- Department of Neurosurgery, The University of Iowa, Iowa City, IA 52242, USA
| | | | - Matthew I Banks
- Department of Anesthesiology, University of Wisconsin-Madison, Madison, WI 53705, USA; Department of Neuroscience, University of Wisconsin-Madison, Madison, WI 53705, USA
| | - Bryan M Krause
- Department of Anesthesiology, University of Wisconsin-Madison, Madison, WI 53705, USA
| | - Hiroto Kawasaki
- Department of Neurosurgery, The University of Iowa, Iowa City, IA 52242, USA
| | - Matthew A Howard
- Department of Neurosurgery, The University of Iowa, Iowa City, IA 52242, USA; Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA 52242, USA; Pappajohn Biomedical Institute, The University of Iowa, Iowa City, IA 52242, USA
| |
Collapse
|
25
|
Dynamic Time-Locking Mechanism in the Cortical Representation of Spoken Words. eNeuro 2020; 7:ENEURO.0475-19.2020. [PMID: 32513662 PMCID: PMC7470935 DOI: 10.1523/eneuro.0475-19.2020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2019] [Revised: 05/15/2020] [Accepted: 06/01/2020] [Indexed: 11/21/2022] Open
Abstract
Human speech has a unique capacity to carry and communicate rich meanings. However, it is not known how the highly dynamic and variable perceptual signal is mapped to existing linguistic and semantic representations. In this novel approach, we used the natural acoustic variability of sounds and mapped them to magnetoencephalography (MEG) data using physiologically-inspired machine-learning models. We aimed at determining how well the models, differing in their representation of temporal information, serve to decode and reconstruct spoken words from MEG recordings in 16 healthy volunteers. We discovered that dynamic time-locking of the cortical activation to the unfolding speech input is crucial for the encoding of the acoustic-phonetic features of speech. In contrast, time-locking was not highlighted in cortical processing of non-speech environmental sounds that conveyed the same meanings as the spoken words, including human-made sounds with temporal modulation content similar to speech. The amplitude envelope of the spoken words was particularly well reconstructed based on cortical evoked responses. Our results indicate that speech is encoded cortically with especially high temporal fidelity. This speech tracking by evoked responses may partly reflect the same underlying neural mechanism as the frequently reported entrainment of the cortical oscillations to the amplitude envelope of speech. Furthermore, the phoneme content was reflected in cortical evoked responses simultaneously with the spectrotemporal features, pointing to an instantaneous transformation of the unfolding acoustic features into linguistic representations during speech processing.
Collapse
|
26
|
Fox NP, Leonard M, Sjerps MJ, Chang EF. Transformation of a temporal speech cue to a spatial neural code in human auditory cortex. eLife 2020; 9:e53051. [PMID: 32840483 PMCID: PMC7556862 DOI: 10.7554/elife.53051] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2019] [Accepted: 08/21/2020] [Indexed: 11/28/2022] Open
Abstract
In speech, listeners extract continuously-varying spectrotemporal cues from the acoustic signal to perceive discrete phonetic categories. Spectral cues are spatially encoded in the amplitude of responses in phonetically-tuned neural populations in auditory cortex. It remains unknown whether similar neurophysiological mechanisms encode temporal cues like voice-onset time (VOT), which distinguishes sounds like /b/ and /p/. We used direct brain recordings in humans to investigate the neural encoding of temporal speech cues with a VOT continuum from /ba/ to /pa/. We found that distinct neural populations respond preferentially to VOTs from one phonetic category, and are also sensitive to sub-phonetic VOT differences within a population's preferred category. In a simple neural network model, simulated populations tuned to detect either temporal gaps or coincidences between spectral cues captured encoding patterns observed in real neural data. These results demonstrate that a spatial/amplitude neural code underlies the cortical representation of both spectral and temporal speech cues.
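The gap/coincidence idea behind the study's neural network model can be caricatured in a few lines of Python. The tuning widths, threshold, and readout rule below are invented for illustration and are not the published model.

```python
import numpy as np

def population_responses(vot_ms, sigma=10.0, gap_threshold=20.0):
    """Toy readout of two populations responding to a stop consonant with a given VOT.

    The 'coincidence' population fires most when burst and voicing onsets are nearly
    simultaneous (short VOT, voiced /b/); the 'gap' population fires when voicing lags
    the burst by more than gap_threshold ms (long VOT, voiceless /p/).
    """
    coincidence = np.exp(-(vot_ms ** 2) / (2 * sigma ** 2))
    gap = 1.0 / (1.0 + np.exp(-(vot_ms - gap_threshold) / sigma))
    return coincidence, gap

for vot in [0, 10, 20, 30, 50]:  # steps along a /ba/-to-/pa/ continuum, in ms
    c, g = population_responses(vot)
    label = "/pa/" if g > c else "/ba/"
    print(f"VOT {vot:2d} ms -> coincidence {c:.2f}, gap {g:.2f}, percept {label}")
```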
Collapse
Affiliation(s)
- Neal P Fox
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
| | - Matthew Leonard
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
| | - Matthias J Sjerps
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, United States
| |
Collapse
|
27
|
Erkens J, Schulte M, Vormann M, Herrmann CS. Lacking Effects of Envelope Transcranial Alternating Current Stimulation Indicate the Need to Revise Envelope Transcranial Alternating Current Stimulation Methods. Neurosci Insights 2020; 15:2633105520936623. [PMID: 32685924 PMCID: PMC7343360 DOI: 10.1177/2633105520936623] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2020] [Accepted: 05/28/2020] [Indexed: 12/13/2022] Open
Abstract
In recent years, several studies have reported beneficial effects of transcranial alternating current stimulation (tACS) in experiments regarding sound and speech perception. A new development in this field is envelope-tACS: The goal of this method is to improve cortical entrainment to the speech signal by stimulating with a waveform based on the speech envelope. One challenge of this stimulation method is timing; the electrical stimulation needs to be phase-aligned with the naturally occurring cortical entrainment to the auditory stimuli. Due to individual differences in anatomy and processing speed, the optimal time-lag between presentation of sound and applying envelope-tACS varies between participants. To better investigate the effects of envelope-tACS, we performed a speech comprehension task with a larger number of time-lags than previous experiments, as well as an equal number of sham conditions. No significant difference between the optimal stimulation time-lag condition and the best sham condition was found. Further investigation of the data revealed a significant difference between the positive and negative half-cycles of the stimulation conditions but not for sham. However, we also found a significant learning effect over the course of the experiment, which was of comparable size to the effects of envelope-tACS found in previous auditory tACS studies. In this article, we discuss possible explanations for why our findings did not match up with those of previous studies and the issues that come with researching and developing envelope-tACS.
Collapse
Affiliation(s)
- Jules Erkens
- Experimental Psychology Lab, Department of Psychology, Cluster of Excellence 'Hearing4All', European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
| | | | | | - Christoph S Herrmann
- Experimental Psychology Lab, Department of Psychology, Cluster of Excellence 'Hearing4All', European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany; Research Center Neurosensory Science, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
| |
Collapse
|
28
|
McFayden TC, Baskin P, Stephens JDW, He S. Cortical Auditory Event-Related Potentials and Categorical Perception of Voice Onset Time in Children With an Auditory Neuropathy Spectrum Disorder. Front Hum Neurosci 2020; 14:184. [PMID: 32523521 PMCID: PMC7261872 DOI: 10.3389/fnhum.2020.00184] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2020] [Accepted: 04/27/2020] [Indexed: 11/13/2022] Open
Abstract
Objective: This study evaluated cortical encoding of voice onset time (VOT) in quiet and noise, and their potential associations with the behavioral categorical perception of VOT in children with auditory neuropathy spectrum disorder (ANSD). Design: Subjects were 11 children with ANSD ranging in age between 6.4 and 16.2 years. The stimulus was an /aba/-/apa/ vowel-consonant-vowel continuum comprising eight tokens with VOTs ranging from 0 ms (voiced endpoint) to 88 ms (voiceless endpoint). For speech in noise, speech tokens were mixed with the speech-shaped noise from the Hearing In Noise Test at a signal-to-noise ratio (SNR) of +5 dB. Speech-evoked auditory event-related potentials (ERPs) and behavioral categorical perception of VOT were measured in quiet in all subjects, and at an SNR of +5 dB in seven subjects. The stimuli were presented at 35 dB SL (re: pure tone average) or at 115 dB SPL if this limit was less than 35 dB SL. In addition to the onset response, the auditory change complex (ACC) elicited by VOT was recorded in eight subjects. Results: Speech-evoked ERPs recorded in all subjects consisted of a vertex positive peak (i.e., P1), followed by a trough occurring approximately 100 ms later (i.e., N2). For results measured in quiet, there was no significant difference in categorical boundaries estimated using ERP measures and behavioral procedures. Categorical boundaries estimated in quiet using both ERP and behavioral measures closely correlated with the most recently measured Phonetically Balanced Kindergarten (PBK) scores. Adding a competing background noise did not affect categorical boundaries estimated using either behavioral or ERP procedures in three subjects. For the other four subjects, categorical boundaries estimated in noise using behavioral measures were prolonged. However, adding background noise only increased categorical boundaries measured using ERPs in three out of these four subjects. Conclusions: The VCV continuum can be used to evaluate behavioral identification and the neural encoding of VOT in children with ANSD. In quiet, categorical boundaries of VOT estimated using behavioral measures and ERP recordings are closely associated with speech recognition performance in children with ANSD. Underlying mechanisms for excessive speech perception deficits in noise may vary for individual patients with ANSD.
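Categorical boundaries along a VOT continuum, as estimated behaviorally here, are typically read off as the midpoint of a fitted psychometric function. A minimal Python sketch with hypothetical identification data follows; the token spacing and response proportions are made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(vot, boundary, slope):
    """Probability of a 'voiceless' (/apa/) response along the VOT continuum."""
    return 1.0 / (1.0 + np.exp(-slope * (vot - boundary)))

# hypothetical identification proportions for an 8-token 0-88 ms continuum
vots = np.array([0, 12, 25, 38, 50, 63, 75, 88], dtype=float)
p_voiceless = np.array([0.02, 0.05, 0.15, 0.45, 0.80, 0.93, 0.97, 0.99])

(boundary, slope), _ = curve_fit(logistic, vots, p_voiceless, p0=[40.0, 0.1])
print(f"estimated categorical boundary: {boundary:.1f} ms VOT")
```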
Collapse
Affiliation(s)
- Tyler C McFayden
- Department of Psychology, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Paola Baskin
- Department of Anesthesiology, School of Medicine, University of California, San Diego, San Diego, CA, United States
| | - Joseph D W Stephens
- Department of Psychology, North Carolina Agricultural and Technical State University, Greensboro, NC, United States
| | - Shuman He
- Department of Otolaryngology-Head and Neck Surgery, Wexner Medical Center, The Ohio State University, Columbus, OH, USA; Department of Audiology, Nationwide Children's Hospital, Columbus, OH, USA
| |
Collapse
|
29
|
Ortiz-Mantilla S, Realpe-Bonilla T, Benasich AA. Early Interactive Acoustic Experience with Non-speech Generalizes to Speech and Confers a Syllabic Processing Advantage at 9 Months. Cereb Cortex 2020; 29:1789-1801. [PMID: 30722000 PMCID: PMC6418390 DOI: 10.1093/cercor/bhz001] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2018] [Revised: 12/04/2018] [Accepted: 01/07/2019] [Indexed: 12/19/2022] Open
Abstract
During early development, the infant brain is highly plastic and sensory experiences modulate emerging cortical maps, enhancing processing efficiency as infants set up key linguistic precursors. Early interactive acoustic experience (IAE) with spectrotemporally-modulated non-speech has been shown to facilitate optimal acoustic processing and generalizes to novel non-speech sounds at 7-months-of-age. Here we demonstrate that effects of non-speech IAE endure well beyond the immediate training period and robustly generalize to speech processing. Infants who received non-speech IAE differed at 9-months-of-age from both naïve controls and those with only passive acoustic exposure, demonstrating broad modulation of oscillatory dynamics. For the standard syllable, increased high-gamma (>70 Hz) power within auditory cortices indicates that IAE fosters native speech processing, facilitating establishment of phonemic representations. The higher left beta power may reflect increased linking of sensory information and corresponding articulatory patterns, while bilateral decreases in theta power suggest more mature automatized speech processing, as fewer neuronal resources were allocated to process syllabic information. For the deviant syllable, left-lateralized gamma (<70 Hz) enhancement suggests IAE promotes phonemic-related discrimination abilities. Theta power increases in right auditory cortex, known for favoring slow-rate decoding, implies IAE facilitates the more demanding processing of the sporadic deviant syllable.
Collapse
Affiliation(s)
- Silvia Ortiz-Mantilla
- Center for Molecular & Behavioral Neuroscience, Rutgers University-Newark, 197 University Avenue, Newark, NJ, USA
| | - Teresa Realpe-Bonilla
- Center for Molecular & Behavioral Neuroscience, Rutgers University-Newark, 197 University Avenue, Newark, NJ, USA
| | - April A Benasich
- Center for Molecular & Behavioral Neuroscience, Rutgers University-Newark, 197 University Avenue, Newark, NJ, USA
| |
Collapse
|
30
|
Wang Y, Zhang J, Zou J, Luo H, Ding N. Prior Knowledge Guides Speech Segregation in Human Auditory Cortex. Cereb Cortex 2020; 29:1561-1571. [PMID: 29788144 DOI: 10.1093/cercor/bhy052] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2017] [Revised: 01/22/2018] [Accepted: 02/15/2018] [Indexed: 11/12/2022] Open
Abstract
Segregating concurrent sound streams is a computationally challenging task that requires integrating bottom-up acoustic cues (e.g. pitch) and top-down prior knowledge about sound streams. In a multi-talker environment, the brain can segregate different speakers in about 100 ms in auditory cortex. Here, we used magnetoencephalographic (MEG) recordings to investigate the temporal and spatial signature of how the brain utilizes prior knowledge to segregate 2 speech streams from the same speaker, which can hardly be separated based on bottom-up acoustic cues. In a primed condition, the participants knew the target speech stream in advance, while in an unprimed condition no such prior knowledge was available. Neural encoding of each speech stream was characterized by the MEG responses tracking the speech envelope. We demonstrate that neural tracking in bilateral superior temporal gyrus and superior temporal sulcus is much stronger in the primed condition than in the unprimed condition. Priming effects are observed at about 100 ms latency and last more than 600 ms. Interestingly, prior knowledge about the target stream facilitates speech segregation mainly by suppressing the neural tracking of the non-target speech stream. In sum, prior knowledge leads to reliable speech segregation in auditory cortex, even in the absence of reliable bottom-up speech segregation cues.
Collapse
Affiliation(s)
- Yuanye Wang
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China; McGovern Institute for Brain Research, Peking University, Beijing, China; Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
| | - Jianfeng Zhang
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Jiajie Zou
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Huan Luo
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China; McGovern Institute for Brain Research, Peking University, Beijing, China; Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
| | - Nai Ding
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China; Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou, Zhejiang, China; State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, Zhejiang, China; Interdisciplinary Center for Social Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| |
Collapse
|
31
|
Musicians use speech-specific areas when processing tones: The key to their superior linguistic competence? Behav Brain Res 2020; 390:112662. [PMID: 32442547 DOI: 10.1016/j.bbr.2020.112662] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2019] [Revised: 04/21/2020] [Accepted: 04/22/2020] [Indexed: 11/23/2022]
Abstract
Musicians are known to have superior speech and language competence compared to non-musicians, yet the mechanisms by which musical training leads to this advantage are not well specified. This event-related fMRI study confirmed that musicians outperformed non-musicians in processing not only musical tones but also syllables, and identified a network differentiating musicians from non-musicians during processing of linguistic sounds. Within this network, the activation of bilateral superior temporal gyrus was shared with all subjects during processing of the acoustically well-matched musical and linguistic sounds, and with the activation distinguishing tones with a complex harmonic spectrum (bowed tone) from a simpler one (plucked tone). These results confirm that better speech processing in musicians relies on improved cross-domain spectral analysis. Activation of left posterior superior temporal sulcus (pSTS), premotor cortex, inferior frontal and fusiform gyrus (FG), which also distinguished musicians from non-musicians during syllable processing, overlapped with the activation segregating linguistic from musical sounds in all subjects. Since these brain regions were not involved during tone processing in non-musicians, they could code for functions that are specialized for speech. Musicians recruited pSTS and FG during tone processing; thus, these speech-specialized brain areas processed musical sounds in the presence of musical training. This study shows that the linguistic advantage of musicians is linked not only to improved cross-domain spectral analysis, but also to the functional adaptation of brain resources that are specialized for speech, but accessible to the domain of music in the presence of musical training.
Collapse
|
32
|
Early lexical influences on sublexical processing in speech perception: Evidence from electrophysiology. Cognition 2020; 197:104162. [DOI: 10.1016/j.cognition.2019.104162] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2019] [Revised: 12/11/2019] [Accepted: 12/16/2019] [Indexed: 11/17/2022]
|
33
|
Ratnanather JT. Structural neuroimaging of the altered brain stemming from pediatric and adolescent hearing loss-Scientific and clinical challenges. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2019; 12:e1469. [PMID: 31802640 DOI: 10.1002/wsbm.1469] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/12/2019] [Revised: 10/01/2019] [Accepted: 10/13/2019] [Indexed: 12/20/2022]
Abstract
There has been a spurt in structural neuroimaging studies of the effect of hearing loss on the brain. Specifically, magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) technologies provide an opportunity to quantify changes in gray and white matter structures at the macroscopic scale. To date, there have been 32 MRI and 23 DTI studies that have analyzed structural differences accruing from pre- or peri-lingual pediatric hearing loss with congenital or early onset etiology and postlingual hearing loss in pre-to-late adolescence. Additionally, there have been 15 prospective clinical structural neuroimaging studies of children and adolescents being evaluated for cochlear implants. The results of the 70 studies are summarized in two figures and three tables. Plastic changes in the brain are seen to be multifocal rather than diffuse, that is, differences are consistent across regions implicated in the hearing, speech and language networks regardless of modes of communication and amplification. Structures that play an important role in cognition are affected to a lesser extent. A limitation of these studies is the emphasis on volumetric measures and on homogeneous groups of subjects with hearing loss. It is suggested that additional measures of morphometry and connectivity could contribute to a greater understanding of the effect of hearing loss on the brain. An interpretation of the observed macroscopic structural differences is then given. This is followed by discussion of how structural imaging can be combined with functional imaging to provide biomarkers for longitudinal tracking of amplification. This article is categorized under: Developmental Biology > Developmental Processes in Health and Disease Translational, Genomic, and Systems Medicine > Translational Medicine Laboratory Methods and Technologies > Imaging.
Collapse
Affiliation(s)
- J Tilak Ratnanather
- Center for Imaging Science, Johns Hopkins University, Baltimore, Maryland; Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland
| |
Collapse
|
34
|
O'Sullivan J, Herrero J, Smith E, Schevon C, McKhann GM, Sheth SA, Mehta AD, Mesgarani N. Hierarchical Encoding of Attended Auditory Objects in Multi-talker Speech Perception. Neuron 2019; 104:1195-1209.e3. [PMID: 31648900 DOI: 10.1016/j.neuron.2019.09.007] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2019] [Revised: 07/11/2019] [Accepted: 09/06/2019] [Indexed: 11/15/2022]
Abstract
Humans can easily focus on one speaker in a multi-talker acoustic environment, but how different areas of the human auditory cortex (AC) represent the acoustic components of mixed speech is unknown. We obtained invasive recordings from the primary and nonprimary AC in neurosurgical patients as they listened to multi-talker speech. We found that neural sites in the primary AC responded to individual speakers in the mixture and were relatively unchanged by attention. In contrast, neural sites in the nonprimary AC were less discerning of individual speakers but selectively represented the attended speaker. Moreover, the encoding of the attended speaker in the nonprimary AC was invariant to the degree of acoustic overlap with the unattended speaker. Finally, this emergent representation of attended speech in the nonprimary AC was linearly predictable from the primary AC responses. Our results reveal the neural computations underlying the hierarchical formation of auditory objects in human AC during multi-talker speech perception.
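The claim that the attended representation in nonprimary AC is linearly predictable from primary AC responses can be illustrated with a simple least-squares mapping between two response matrices. The Python sketch below uses synthetic data; the electrode counts, noise levels, and evaluation metric are assumptions, not the study's analysis.

```python
import numpy as np

# toy dimensions: 50 primary-AC sites, 20 nonprimary sites, 5000 time samples
rng = np.random.default_rng(7)
primary = rng.standard_normal((5000, 50))
true_map = 0.2 * rng.standard_normal((50, 20))
nonprimary = primary @ true_map + 0.1 * rng.standard_normal((5000, 20))

# least-squares spatial mapping: predict nonprimary responses from primary ones
weights, *_ = np.linalg.lstsq(primary, nonprimary, rcond=None)
predicted = primary @ weights
r = [np.corrcoef(predicted[:, i], nonprimary[:, i])[0, 1] for i in range(20)]
print(f"mean prediction accuracy r = {np.mean(r):.2f}")
```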
Collapse
Affiliation(s)
- James O'Sullivan
- Department of Electrical Engineering, Columbia University, New York, NY, USA
| | - Jose Herrero
- Department of Neurosurgery, Hofstra-Northwell School of Medicine and Feinstein Institute for Medical Research, Manhasset, New York, NY, USA
| | - Elliot Smith
- Department of Neurological Surgery, The Neurological Institute, New York, NY, USA; Department of Neurosurgery, University of Utah, Salt Lake City, UT, USA
| | - Catherine Schevon
- Department of Neurological Surgery, The Neurological Institute, New York, NY, USA
| | - Guy M McKhann
- Department of Neurological Surgery, The Neurological Institute, New York, NY, USA
| | - Sameer A Sheth
- Department of Neurological Surgery, The Neurological Institute, New York, NY, USA; Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
| | - Ashesh D Mehta
- Department of Neurosurgery, Hofstra-Northwell School of Medicine and Feinstein Institute for Medical Research, Manhasset, New York, NY, USA
| | - Nima Mesgarani
- Department of Electrical Engineering, Columbia University, New York, NY, USA.
| |
Collapse
|
35
|
Margiotoudi K, Allritz M, Bohn M, Pulvermüller F. Sound symbolic congruency detection in humans but not in great apes. Sci Rep 2019; 9:12705. [PMID: 31481655 PMCID: PMC6722092 DOI: 10.1038/s41598-019-49101-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2019] [Accepted: 08/15/2019] [Indexed: 11/20/2022] Open
Abstract
Theories on the evolution of language highlight iconicity as one of the unique features of human language. One important manifestation of iconicity is sound symbolism, the intrinsic relationship between meaningless speech sounds and visual shapes, as exemplified by the famous correspondences between the pseudowords ‘maluma’ vs. ‘takete’ and abstract curved and angular shapes. Although sound symbolism has been studied extensively in humans including young children and infants, it has never been investigated in non-human primates lacking language. In the present study, we administered the classic “takete-maluma” paradigm in both humans (N = 24 and N = 31) and great apes (N = 8). In a forced choice matching task, humans but not great apes, showed crossmodal sound symbolic congruency effects, whereby effects were more pronounced for shape selections following round-sounding primes than following edgy-sounding primes. These results suggest that the ability to detect sound symbolic correspondences is the outcome of a phylogenetic process, whose underlying emerging mechanism may be relevant to symbolic ability more generally.
Collapse
Affiliation(s)
- Konstantina Margiotoudi
- Brain Language Laboratory, Department of Philosophy and Humanities, WE4, Freie Universität Berlin, 14195, Berlin, Germany; Berlin School of Mind and Brain, Humboldt Universität zu Berlin, 10099, Berlin, Germany
| | - Matthias Allritz
- School of Psychology & Neuroscience, University of St. Andrews, St. Andrews, Fife, UK
| | - Manuel Bohn
- Leipziger Forschungszentrum für frühkindliche Entwicklung, Universität Leipzig, Leipzig, Germany; Department of Psychology, Stanford University, Stanford, USA
| | - Friedemann Pulvermüller
- Brain Language Laboratory, Department of Philosophy and Humanities, WE4, Freie Universität Berlin, 14195, Berlin, Germany; Berlin School of Mind and Brain, Humboldt Universität zu Berlin, 10099, Berlin, Germany; Cluster of Excellence "Matters of Activity", Humboldt Universität zu Berlin, 10099, Berlin, Germany; Einstein Center for Neurosciences Berlin, 10117, Berlin, Germany
| |
Collapse
|
36
|
Jain S, Nataraja NP. The Relationship between Temporal Integration and Temporal Envelope Perception in Noise by Males with Mild Sensorineural Hearing Loss. J Int Adv Otol 2019; 15:257-262. [PMID: 31418715 DOI: 10.5152/iao.2019.6555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
OBJECTIVES A growing body of literature indicates that temporal integration and temporal envelope perception contribute substantially to the perception of speech. The literature also shows that the perception of speech based on temporal integration and temporal envelope cues in noise may be affected by sensorineural hearing loss, but to a varying degree. Because temporal integration and the temporal envelope share similar physiological processing at the cochlear level, the present study aimed to identify the relationship between temporal integration and temporal envelope perception in noise in individuals with mild sensorineural hearing loss. MATERIALS AND METHODS Thirty adult males with mild sensorineural hearing loss and thirty age- and gender-matched normal-hearing individuals volunteered to participate in the study. Temporal integration was measured using synthetic consonant-vowel-consonant syllables, varied for onset, offset, and onset-offset of the second and third formant frequencies of the vowel following and preceding the consonants in six equal steps, thus forming six-step onset, offset, and onset-offset continua. The duration of the transition was kept short (40 ms) in one set of continua and long (80 ms) in another. Temporal integration scores were calculated as the differences in the identification of the categorical boundary between short- and long-transition continua. Temporal envelope perception was measured using sentences processed in quiet and at 0 dB and -5 dB signal-to-noise ratios with 4, 8, 16, and 32 frequency channels; the temporal envelope of each sentence was extracted using the Hilbert transformation. RESULTS A significant effect of hearing loss was observed on temporal integration, but not on temporal envelope perception. However, when temporal integration abilities were controlled, a variable effect of hearing loss on temporal envelope perception was noted. CONCLUSION It is important to measure temporal integration to accurately account for envelope perception in individuals with normal hearing and those with hearing loss.
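Envelope extraction with the Hilbert transformation across a bank of frequency channels, as used for the sentence processing here, resembles a noise-excited envelope vocoder. The following Python sketch shows that general scheme; the channel edges, filter order, and noise carriers are illustrative assumptions rather than the study's exact processing.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def envelope_vocoder(speech, fs, n_channels=8, fmin=100.0, fmax=6000.0):
    """Resynthesize speech from per-channel Hilbert envelopes modulating noise carriers."""
    edges = np.geomspace(fmin, fmax, n_channels + 1)  # log-spaced channel edges
    rng = np.random.default_rng(0)
    out = np.zeros_like(speech)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = bandpass(speech, lo, hi, fs)
        env = np.abs(hilbert(band))                               # channel temporal envelope
        carrier = bandpass(rng.standard_normal(speech.size), lo, hi, fs)
        out += env * carrier                                      # envelope re-imposed on noise
    return out

fs = 16000  # Hz, assumed
speech = np.random.default_rng(4).standard_normal(fs * 2)  # stand-in for a sentence
processed = envelope_vocoder(speech, fs, n_channels=4)
```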
Collapse
Affiliation(s)
- Saransh Jain
- Department of Audiology, JSS Institute of Speech and Hearing, JSS Research Foundation, Mysuru, India
| | | |
Collapse
|
37
|
Schmidt-Kassow M, Thöne K, Kaiser J. Auditory-motor coupling affects phonetic encoding. Brain Res 2019; 1716:39-49. [PMID: 29191770 DOI: 10.1016/j.brainres.2017.11.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2017] [Revised: 10/24/2017] [Accepted: 11/21/2017] [Indexed: 10/18/2022]
Abstract
Recent studies have shown that moving in synchrony with auditory stimuli boosts attention allocation and verbal learning. Furthermore rhythmic tones are processed more efficiently than temporally random tones ('timing effect'), and this effect is increased when participants actively synchronize their motor performance with the rhythm of the tones, resulting in auditory-motor synchronization. Here, we investigated whether this applies also to sequences of linguistic stimuli (syllables). We compared temporally irregular syllable sequences with two temporally regular conditions where either the interval between syllable onsets (stimulus onset asynchrony, SOA) or the interval between the syllables' vowel onsets was kept constant. Entrainment to the stimulus presentation frequency (1 Hz) and event-related potentials were assessed in 24 adults who were instructed to detect pre-defined deviant syllables while they either pedaled or sat still on a stationary exercise bike. We found larger 1 Hz entrainment and P300 amplitudes for the SOA presentation during motor activity. Furthermore, the magnitude of the P300 component correlated with the motor variability in the SOA condition and 1 Hz entrainment, while in turn 1 Hz entrainment correlated with auditory-motor synchronization performance. These findings demonstrate that acute auditory-motor coupling facilitates phonetic encoding.
Collapse
Affiliation(s)
| | - Katharina Thöne
- Institute of Medical Psychology, Goethe University, Frankfurt, Germany
| | - Jochen Kaiser
- Institute of Medical Psychology, Goethe University, Frankfurt, Germany
| |
Collapse
|
38
|
Burghard A, Voigt MB, Kral A, Hubka P. Categorical processing of fast temporal sequences in the guinea pig auditory brainstem. Commun Biol 2019; 2:265. [PMID: 31341964 PMCID: PMC6642126 DOI: 10.1038/s42003-019-0472-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2018] [Accepted: 05/23/2019] [Indexed: 11/21/2022] Open
Abstract
Discrimination of temporal sequences is crucial for auditory object recognition, phoneme categorization and speech understanding. The present study shows that, in guinea pigs, auditory brainstem responses (ABR) to pairs of noise bursts separated by a short gap can be classified into two distinct groups based on the ratio of gap duration to initial noise burst duration. If this ratio was smaller than 0.5, the ABR to the trailing noise burst was strongly suppressed. On the other hand, if the initial noise burst duration was short compared to the gap duration (a ratio greater than 0.5), a release from suppression and/or enhancement of the trailing ABR was observed. Consequently, initial noise bursts of shorter duration caused a faster transition between response classes than initial noise bursts of longer duration. We propose that the described findings represent a neural correlate of subcortical categorical preprocessing of temporal sequences in the auditory system.
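The reported classification rule reduces to a single ratio criterion, which the toy Python function below makes explicit; the 0.5 criterion follows the abstract, while the example durations are arbitrary.

```python
def trailing_abr_class(gap_ms: float, leading_burst_ms: float, criterion: float = 0.5) -> str:
    """Classify the predicted trailing-burst ABR from the gap/leading-burst duration ratio."""
    ratio = gap_ms / leading_burst_ms
    return "suppressed" if ratio < criterion else "released/enhanced"

for burst, gap in [(100, 20), (100, 60), (25, 20), (25, 60)]:
    print(f"burst {burst} ms, gap {gap} ms -> {trailing_abr_class(gap, burst)}")
```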
Collapse
Affiliation(s)
- Alice Burghard
- Institute of Audioneurotechnology & Department of Experimental Otology, ENT Clinics, Hannover Medical School, Hannover, D-30625 Germany
- Department of Neuroscience, University of Connecticut Health Center, Farmington, CT 06030 USA
| | - Mathias Benjamin Voigt
- Institute of Audioneurotechnology & Department of Experimental Otology, ENT Clinics, Hannover Medical School, Hannover, D-30625 Germany
| | - Andrej Kral
- Institute of Audioneurotechnology & Department of Experimental Otology, ENT Clinics, Hannover Medical School, Hannover, D-30625 Germany
| | - Peter Hubka
- Institute of Audioneurotechnology & Department of Experimental Otology, ENT Clinics, Hannover Medical School, Hannover, D-30625 Germany
| |
Collapse
|
39
|
Yi HG, Leonard MK, Chang EF. The Encoding of Speech Sounds in the Superior Temporal Gyrus. Neuron 2019; 102:1096-1110. [PMID: 31220442 PMCID: PMC6602075 DOI: 10.1016/j.neuron.2019.04.023] [Citation(s) in RCA: 170] [Impact Index Per Article: 34.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2019] [Revised: 04/08/2019] [Accepted: 04/16/2019] [Indexed: 01/02/2023]
Abstract
The human superior temporal gyrus (STG) is critical for extracting meaningful linguistic features from speech input. Local neural populations are tuned to acoustic-phonetic features of all consonants and vowels and to dynamic cues for intonational pitch. These populations are embedded throughout broader functional zones that are sensitive to amplitude-based temporal cues. Beyond speech features, STG representations are strongly modulated by learned knowledge and perceptual goals. Currently, a major challenge is to understand how these features are integrated across space and time in the brain during natural speech comprehension. We present a theory that temporally recurrent connections within STG generate context-dependent phonological representations, spanning longer temporal sequences relevant for coherent percepts of syllables, words, and phrases.
Collapse
Affiliation(s)
- Han Gyol Yi
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
| | - Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA.
| |
Collapse
|
40
|
Zou J, Feng J, Xu T, Jin P, Luo C, Zhang J, Pan X, Chen F, Zheng J, Ding N. Auditory and language contributions to neural encoding of speech features in noisy environments. Neuroimage 2019; 192:66-75. [DOI: 10.1016/j.neuroimage.2019.02.047] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2018] [Revised: 01/31/2019] [Accepted: 02/19/2019] [Indexed: 11/28/2022] Open
|
41
|
Tamura S, Ito K, Hirose N, Mori S. Effects of Manipulating the Amplitude of Consonant Noise Portion on Subcortical Representation of Voice Onset Time and Voicing Perception in Stop Consonants. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2019; 62:434-441. [PMID: 30950688 DOI: 10.1044/2018_jslhr-h-18-0102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Purpose The purpose of this study was to investigate whether speech perception would reflect small latency changes in subcortical speech representation. Method Twelve native Japanese listeners participated in the experiment. They completed a speech identification task and an auditory brainstem response (ABR) measurement using /d/-/t/ continuum stimuli varying in voice onset time (VOT), with manipulation of the amplitude of the initial noise (consonant) portion, the duration of which corresponded to VOT. Results Increasing the noise portion amplitude lengthened the subcortical representation of VOT, that is, the latency difference between ABRs synchronizing to the onsets of the initial noise and the following periodic (vowel) portions (VOTABR), and made listeners more likely to perceive stimuli with ambiguous VOT as the voiceless stop /t/. In addition, the amount of VOTABR lengthening was close to that of the VOT boundary shortening. Conclusion A few milliseconds of difference in subcortical speech representation are important for the perception of speech sounds with ambiguous acoustic cues. Supplemental Material https://doi.org/10.23641/asha.7728695.
Collapse
Affiliation(s)
- Shunsuke Tamura
- Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
| | - Kazuhito Ito
- Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
| | - Nobuyuki Hirose
- Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
| | - Shuji Mori
- Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
| |
Collapse
|
42
|
Steadman MA, Sumner CJ. Changes in Neuronal Representations of Consonants in the Ascending Auditory System and Their Role in Speech Recognition. Front Neurosci 2018; 12:671. [PMID: 30369863 PMCID: PMC6194309 DOI: 10.3389/fnins.2018.00671] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2018] [Accepted: 09/06/2018] [Indexed: 11/25/2022] Open
Abstract
A fundamental task of the ascending auditory system is to produce representations that facilitate the recognition of complex sounds. This is particularly challenging in the context of acoustic variability, such as that between different talkers producing the same phoneme. These representations are transformed as information is propagated throughout the ascending auditory system from the inner ear to the auditory cortex (AI). Investigating these transformations and their role in speech recognition is key to understanding hearing impairment and the development of future clinical interventions. Here, we obtained neural responses to an extensive set of natural vowel-consonant-vowel phoneme sequences, each produced by multiple talkers, in three stages of the auditory processing pathway. Auditory nerve (AN) representations were simulated using a model of the peripheral auditory system and extracellular neuronal activity was recorded in the inferior colliculus (IC) and primary auditory cortex (AI) of anaesthetized guinea pigs. A classifier was developed to examine the efficacy of these representations for recognizing the speech sounds. Individual neurons convey progressively less information from AN to AI. Nonetheless, at the population level, representations are sufficiently rich to facilitate recognition of consonants with a high degree of accuracy at all stages indicating a progression from a dense, redundant representation to a sparse, distributed one. We examined the timescale of the neural code for consonant recognition and found that optimal timescales increase throughout the ascending auditory system from a few milliseconds in the periphery to several tens of milliseconds in the cortex. Despite these longer timescales, we found little evidence to suggest that representations up to the level of AI become increasingly invariant to across-talker differences. Instead, our results support the idea that the role of the subcortical auditory system is one of dimensionality expansion, which could provide a basis for flexible classification of arbitrary speech sounds.
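A classifier over binned neural responses, of the general kind used here, can be sketched as nearest-template matching at a candidate timescale. The Python example below uses synthetic spike times; the bin width, response duration, and template construction are assumptions and do not reproduce the study's classifier.

```python
import numpy as np

def bin_spikes(spike_times, duration, bin_ms):
    """Response vector: spike counts in fixed bins (a single neuron here)."""
    edges = np.arange(0.0, duration + 1e-9, bin_ms / 1000.0)
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts

def nearest_template_classify(trial, templates):
    """Assign a response to the consonant whose mean template is closest (Euclidean)."""
    labels = list(templates)
    dists = [np.linalg.norm(trial - templates[k]) for k in labels]
    return labels[int(np.argmin(dists))]

# toy example at one candidate timescale (10-ms bins over a 300-ms response)
rng = np.random.default_rng(5)
templates = {c: bin_spikes(rng.uniform(0, 0.3, 30), 0.3, 10).astype(float)
             for c in ["b", "d", "g"]}
trial = templates["d"] + rng.normal(0, 0.5, templates["d"].size)  # noisy repeat of /d/
print(nearest_template_classify(trial, templates))
```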
Collapse
Affiliation(s)
- Mark A. Steadman
- MRC Institute of Hearing Research, School of Medicine, The University of Nottingham, Nottingham, United Kingdom
- Department of Bioengineering, Imperial College London, London, United Kingdom
| | - Christian J. Sumner
- MRC Institute of Hearing Research, School of Medicine, The University of Nottingham, Nottingham, United Kingdom
| |
Collapse
|
43
|
Hamilton LS, Edwards E, Chang EF. A Spatial Map of Onset and Sustained Responses to Speech in the Human Superior Temporal Gyrus. Curr Biol 2018; 28:1860-1871.e4. [DOI: 10.1016/j.cub.2018.04.033] [Citation(s) in RCA: 98] [Impact Index Per Article: 16.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2017] [Revised: 03/04/2018] [Accepted: 04/10/2018] [Indexed: 01/05/2023]
|
44
|
Cueing listeners to attend to a target talker progressively improves word report as the duration of the cue-target interval lengthens to 2,000 ms. Atten Percept Psychophys 2018; 80:1520-1538. [PMID: 29696570 DOI: 10.3758/s13414-018-1531-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Endogenous attention is typically studied by presenting instructive cues in advance of a target stimulus array. For endogenous visual attention, task performance improves as the duration of the cue-target interval increases up to 800 ms. Less is known about how endogenous auditory attention unfolds over time or the mechanisms by which an instructive cue presented in advance of an auditory array improves performance. The current experiment used five cue-target intervals (0, 250, 500, 1,000, and 2,000 ms) to compare four hypotheses for how preparatory attention develops over time in a multi-talker listening task. Young adults were cued to attend to a target talker who spoke in a mixture of three talkers. Visual cues indicated the target talker's spatial location or their gender. Participants directed attention to location and gender simultaneously ("objects") at all cue-target intervals. Participants were consistently faster and more accurate at reporting words spoken by the target talker when the cue-target interval was 2,000 ms than 0 ms. In addition, the latency of correct responses progressively shortened as the duration of the cue-target interval increased from 0 to 2,000 ms. These findings suggest that the mechanisms involved in preparatory auditory attention develop gradually over time, taking at least 2,000 ms to reach optimal configuration, yet providing cumulative improvements in speech intelligibility as the duration of the cue-target interval increases from 0 to 2,000 ms. These results demonstrate an improvement in performance for cue-target intervals longer than those that have been reported previously in the visual or auditory modalities.
Collapse
|
45
|
Wilsch A, Neuling T, Obleser J, Herrmann CS. Transcranial alternating current stimulation with speech envelopes modulates speech comprehension. Neuroimage 2018; 172:766-774. [PMID: 29355765 DOI: 10.1016/j.neuroimage.2018.01.038] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2017] [Revised: 12/11/2017] [Accepted: 01/15/2018] [Indexed: 02/03/2023] Open
Abstract
Cortical entrainment of the auditory cortex to the broadband temporal envelope of a speech signal is crucial for speech comprehension. Entrainment results in phases of high and low neural excitability, which structure and decode the incoming speech signal. Entrainment to speech is strongest in the theta frequency range (4-8 Hz), the average frequency of the speech envelope. If a speech signal is degraded, entrainment to the speech envelope is weaker and speech intelligibility declines. Besides perceptually evoked cortical entrainment, transcranial alternating current stimulation (tACS) entrains neural oscillations by applying an electric signal to the brain. Accordingly, tACS-induced entrainment in auditory cortex has been shown to improve auditory perception. The aim of the current study was to modulate speech intelligibility externally by means of tACS such that the electric current corresponds to the envelope of the presented speech stream (i.e., envelope-tACS). Participants performed the Oldenburg sentence test with sentences presented in noise in combination with envelope-tACS. Critically, tACS was induced at time lags of 0-250 ms in 50-ms steps relative to sentence onset (auditory stimuli were simultaneous with or preceded the tACS). We performed single-subject sinusoidal, linear, and quadratic fits to the sentence comprehension performance across the time lags. The sinusoidal fit described the modulation of sentence comprehension best. Importantly, the average frequency of the sinusoidal fit was 5.12 Hz, corresponding to the peaks of the amplitude spectrum of the stimulated envelopes. This finding was supported by a significant 5-Hz peak in the average power spectrum of individual performance time series. Altogether, envelope-tACS modulates intelligibility of speech in noise, presumably by enhancing and disrupting (time lag with in- or out-of-phase stimulation, respectively) cortical entrainment to the speech envelope in auditory cortex.
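The key analysis, fitting a sinusoid to comprehension scores across audio-tACS time lags and reading off its frequency, can be sketched as follows in Python. The per-lag scores, starting values, and parameter bounds are hypothetical; only the 0-250 ms lag grid follows the abstract.

```python
import numpy as np
from scipy.optimize import curve_fit

def sinusoid(lag_s, amp, freq, phase, offset):
    """Sinusoidal model of comprehension as a function of audio-tACS time lag."""
    return amp * np.sin(2 * np.pi * freq * lag_s + phase) + offset

lags = np.arange(0.0, 0.251, 0.05)  # 0-250 ms in 50-ms steps
# hypothetical per-lag sentence-comprehension scores (proportion correct)
scores = np.array([0.62, 0.55, 0.49, 0.54, 0.61, 0.58])

(amp, freq, phase, offset), _ = curve_fit(
    sinusoid, lags, scores, p0=[0.05, 5.0, 0.0, 0.55],
    bounds=([0, 2, -np.pi, 0], [0.5, 10, np.pi, 1]))
print(f"best-fitting modulation frequency: {freq:.2f} Hz")
```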
Collapse
Affiliation(s)
- Anna Wilsch
- Experimental Psychology Lab, Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University, 26129 Oldenburg, Germany
| | - Toralf Neuling
- Department of Psychology, University of Salzburg, 5020 Salzburg, Austria
| | - Jonas Obleser
- Department of Psychology, University of Lübeck, 23562 Lübeck, Germany
| | - Christoph S Herrmann
- Experimental Psychology Lab, Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University, 26129 Oldenburg, Germany; Research Center Neurosensory Science, Carl von Ossietzky University, 26129 Oldenburg, Germany.
| |
Collapse
|
46
|
Attention Is Required for Knowledge-Based Sequential Grouping: Insights from the Integration of Syllables into Words. J Neurosci 2017; 38:1178-1188. [PMID: 29255005 DOI: 10.1523/jneurosci.2606-17.2017] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2017] [Revised: 11/08/2017] [Accepted: 12/05/2017] [Indexed: 11/21/2022] Open
Abstract
How the brain groups sequential sensory events into chunks is a fundamental question in cognitive neuroscience. This study investigates whether top-down attention or specific tasks are required for the brain to apply lexical knowledge to group syllables into words. Neural responses tracking the syllabic and word rhythms of a rhythmic speech sequence were concurrently monitored using electroencephalography (EEG). The participants performed different tasks, attending to either the rhythmic speech sequence or a distractor, which was another speech stream or a nonlinguistic auditory/visual stimulus. Attention to speech, but not a lexical-meaning-related task, was required for reliable neural tracking of words, even when the distractor was a nonlinguistic stimulus presented cross-modally. Neural tracking of syllables, however, was reliably observed in all tested conditions. These results strongly suggest that neural encoding of individual auditory events (i.e., syllables) is automatic, while knowledge-based construction of temporal chunks (i.e., words) crucially relies on top-down attention.
SIGNIFICANCE STATEMENT: Why we cannot understand speech when not paying attention is an old question in psychology and cognitive neuroscience. Speech processing is a complex process that involves multiple stages, e.g., hearing and analyzing the speech sound, recognizing words, and combining words into phrases and sentences. The current study investigates which speech-processing stage is blocked when we do not listen carefully. We show that the brain can reliably encode syllables, basic units of speech sounds, even when we do not pay attention. Nevertheless, when distracted, the brain cannot group syllables into multisyllabic words, which are basic units for speech meaning. Therefore, the process of converting speech sound into meaning crucially relies on attention.
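The frequency-tagging logic behind concurrent tracking of syllable and word rhythms can be sketched as below. The EEG is simulated, and the 4-Hz syllable and 2-Hz word rates are assumptions for illustration that may not match the study's stimuli.

```python
# Minimal sketch (assumed rates, simulated data): test for spectral peaks at
# the syllable and word presentation rates in EEG -- the frequency-tagging
# logic behind "neural tracking of syllables vs. words".
import numpy as np

fs = 250                      # EEG sampling rate (Hz)
t = np.arange(0, 40, 1 / fs)  # one 40-s trial
rng = np.random.default_rng(0)
# Simulated response: syllable-rate (4 Hz) and word-rate (2 Hz) components in noise.
eeg = 1.0 * np.sin(2 * np.pi * 4 * t) + 0.5 * np.sin(2 * np.pi * 2 * t) \
      + rng.normal(0, 2, t.size)

spectrum = np.abs(np.fft.rfft(eeg)) / t.size
freqs = np.fft.rfftfreq(t.size, 1 / fs)

for rate, label in [(4.0, "syllable"), (2.0, "word")]:
    k = np.argmin(np.abs(freqs - rate))
    # Compare the tagged bin with neighboring bins to judge whether a peak exists.
    neighbors = np.r_[spectrum[k - 5:k - 1], spectrum[k + 2:k + 6]]
    snr = spectrum[k] / neighbors.mean()
    print(f"{label}-rate bin at {freqs[k]:.2f} Hz: SNR = {snr:.1f}")
```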
Collapse
|
47
|
Bocquelet F, Hueber T, Girin L, Chabardès S, Yvert B. Key considerations in designing a speech brain-computer interface. J Physiol Paris 2017; 110:392-401. [PMID: 28756027 DOI: 10.1016/j.jphysparis.2017.07.002] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2017] [Revised: 06/21/2017] [Accepted: 07/19/2017] [Indexed: 01/08/2023]
Abstract
Restoring communication in cases of aphasia is a key challenge for neurotechnologies. To this end, brain-computer strategies can be envisioned that allow artificial speech synthesis from the continuous decoding of neural signals underlying speech imagination. Such speech brain-computer interfaces do not yet exist, and their design must address three key choices: the choice of appropriate brain regions from which to record neural activity, the choice of an appropriate recording technique, and the choice of a neural decoding scheme paired with an appropriate speech synthesis method. These key considerations are discussed here in light of (1) the current understanding of the functional neuroanatomy of cortical areas underlying overt and covert speech production, (2) the available literature making use of a variety of brain recording techniques to better characterize and address the challenge of decoding cortical speech signals, and (3) the different speech synthesis approaches that can be considered depending on the level of speech representation (phonetic, acoustic or articulatory) envisioned to be decoded at the core of a speech BCI paradigm.
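As a toy illustration of the decoder-plus-synthesizer decomposition discussed here, and not the authors' system, a minimal regression from simulated neural features to acoustic parameters might look like the following. All sizes and signals are invented.

```python
# Minimal sketch (simulated data): one common decoding scheme for a speech
# BCI is a regression from neural features to acoustic parameters that a
# synthesizer can render. This only illustrates the decomposition; it does
# not reproduce any published system.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_frames, n_channels, n_acoustic = 2000, 64, 25   # hypothetical sizes
neural = rng.normal(size=(n_frames, n_channels))  # e.g., high-gamma features
true_map = rng.normal(size=(n_channels, n_acoustic))
acoustic = neural @ true_map + rng.normal(0, 0.5, (n_frames, n_acoustic))

train, test = slice(0, 1500), slice(1500, None)
decoder = Ridge(alpha=1.0).fit(neural[train], acoustic[train])
predicted = decoder.predict(neural[test])   # frames a vocoder could synthesize
r = np.corrcoef(predicted.ravel(), acoustic[test].ravel())[0, 1]
print(f"held-out correlation: {r:.2f}")
```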
Collapse
Affiliation(s)
- Florent Bocquelet
- INSERM, BrainTech Laboratory U1205, F-38000 Grenoble, France; Univ. Grenoble Alpes, BrainTech Laboratory U1205, F-38000 Grenoble, France
| | - Thomas Hueber
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France
| | - Laurent Girin
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France
| | | | - Blaise Yvert
- INSERM, BrainTech Laboratory U1205, F-38000 Grenoble, France; Univ. Grenoble Alpes, BrainTech Laboratory U1205, F-38000 Grenoble, France.
| |
Collapse
|
48
|
Pulvermüller F. Neural reuse of action perception circuits for language, concepts and communication. Prog Neurobiol 2017; 160:1-44. [PMID: 28734837 DOI: 10.1016/j.pneurobio.2017.07.001] [Citation(s) in RCA: 96] [Impact Index Per Article: 13.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2017] [Revised: 05/12/2017] [Accepted: 07/13/2017] [Indexed: 10/19/2022]
Abstract
Neurocognitive and neurolinguistic theories make explicit statements relating specialized cognitive and linguistic processes to specific brain loci. These linking hypotheses are in need of neurobiological justification and explanation. Recent mathematical models of human language mechanisms constrained by fundamental neuroscience principles and established knowledge about comparative neuroanatomy offer explanations for where, when and how language is processed in the human brain. In these models, network structure and connectivity along with action- and perception-induced correlation of neuronal activity co-determine neurocognitive mechanisms. Language learning leads to the formation of action perception circuits (APCs) with specific distributions across cortical areas. Cognitive and linguistic processes such as speech production, comprehension, verbal working memory and prediction are modelled by activity dynamics in these APCs, and combinatorial and communicative-interactive knowledge is organized in the dynamics within, and the connections between, APCs. The network models and, in particular, the concept of distributionally-specific circuits, can account for several previously poorly understood facts about the cortical 'hubs' for semantic processing and the motor system's role in language understanding and speech sound recognition. A review of experimental data evaluates predictions of the APC model and alternative theories, also providing detailed discussion of some seemingly contradictory findings. Throughout, recent disputes about the role of mirror neurons and grounded cognition in language and communication are assessed critically.
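A toy sketch of the Hebbian correlation learning that such models invoke for APC formation is given below. The network sizes, rates, and the fixed articulatory-to-acoustic mapping are all invented for illustration and do not reproduce the reviewed models.

```python
# Toy sketch (invented sizes, not the authors' model): Hebbian correlation
# learning links co-active "articulatory" and "auditory" units, so that a
# heard pattern later reactivates its articulatory counterpart -- the
# co-activation principle behind action perception circuits (APCs).
import numpy as np

rng = np.random.default_rng(2)
n_units = 20
w = np.zeros((n_units, n_units))   # auditory-to-motor link weights
eta = 0.05                         # learning rate

# Each "syllable" pairs a motor pattern with the auditory pattern it causes.
syllables = [(rng.random(n_units) < 0.2).astype(float) for _ in range(5)]
for _ in range(200):
    motor = syllables[rng.integers(5)]
    auditory = np.roll(motor, 3)           # fixed articulatory-to-acoustic mapping
    w += eta * np.outer(auditory, motor)   # Hebbian co-activation update

# Perception test: the auditory pattern alone now drives its motor pattern.
heard = np.roll(syllables[0], 3)
drive = heard @ w
recovered = (drive > 0.5 * drive.max()).astype(float)
print("overlap with original motor pattern:",
      (recovered * syllables[0]).sum() / max(syllables[0].sum(), 1))
```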
Collapse
Affiliation(s)
- Friedemann Pulvermüller
- Brain Language Laboratory, Department of Philosophy & Humanities, WE4, Freie Universität Berlin, 14195 Berlin, Germany; Berlin School of Mind and Brain, Humboldt Universität zu Berlin, 10099 Berlin, Germany; Einstein Center for Neurosciences, Berlin 10117 Berlin, Germany.
| |
Collapse
|
49
|
Fuglsang SA, Dau T, Hjortkjær J. Noise-robust cortical tracking of attended speech in real-world acoustic scenes. Neuroimage 2017; 156:435-444. [PMID: 28412441 DOI: 10.1016/j.neuroimage.2017.04.026] [Citation(s) in RCA: 95] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2016] [Revised: 04/07/2017] [Accepted: 04/10/2017] [Indexed: 11/30/2022] Open
Abstract
Selectively attending to one speaker in a multi-speaker scenario is thought to synchronize low-frequency cortical activity to the attended speech signal. In recent studies, reconstruction of speech from single-trial electroencephalogram (EEG) data has been used to decode which talker a listener is attending to in a two-talker situation. It is currently unclear how this generalizes to more complex sound environments. Behaviorally, speech perception is robust to the acoustic distortions that listeners typically encounter in everyday life, but it is unknown whether this is mirrored by a noise-robust neural tracking of attended speech. Here we used advanced acoustic simulations to recreate real-world acoustic scenes in the laboratory. In virtual acoustic realities with varying amounts of reverberation and number of interfering talkers, listeners selectively attended to the speech stream of a particular talker. Across the different listening environments, we found that the attended talker could be accurately decoded from single-trial EEG data irrespective of the different distortions in the acoustic input. For highly reverberant environments, speech envelopes reconstructed from neural responses to the distorted stimuli resembled the original clean signal more than the distorted input. With reverberant speech, we observed a late cortical response to the attended speech stream that encoded temporal modulations in the speech signal without its reverberant distortion. Single-trial attention decoding accuracies based on 40-50-s blocks of data from 64 scalp electrodes were equally high (80-90% correct) in all considered listening environments and remained statistically significant with as few as 10 scalp electrodes and short (<30 s) unaveraged EEG segments. In contrast to the robust decoding of the attended talker, we found that decoding of the unattended talker deteriorated with the acoustic distortions. These results suggest that cortical activity tracks an attended speech signal in a way that is invariant to acoustic distortions encountered in real-life sound environments. Noise-robust attention decoding additionally suggests a potential utility of stimulus reconstruction techniques in attention-controlled brain-computer interfaces.
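The backward-model logic of such attention decoding, reconstruct an envelope from multichannel EEG and then pick the talker whose envelope correlates best with the reconstruction, can be sketched as follows. All signals are simulated; real decoders also use time-lagged EEG features, which are omitted here for brevity.

```python
# Minimal sketch (fully simulated): ridge-regression stimulus reconstruction
# for EEG attention decoding. Reconstruct a speech envelope from EEG, then
# assign attention to whichever talker's envelope matches the reconstruction.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
fs, dur, n_ch = 64, 50, 64          # 64-Hz envelopes, 50-s block, 64 electrodes
n = fs * dur

def make_envelope():
    # smooth positive signal standing in for a talker's speech envelope
    return np.convolve(np.abs(rng.normal(size=n)), np.ones(32) / 32, mode="same")

env_attended, env_ignored = make_envelope(), make_envelope()
# Simulated EEG: every channel carries the attended envelope plus noise.
eeg = np.outer(env_attended, rng.normal(size=n_ch)) + rng.normal(0, 1.0, (n, n_ch))

half = n // 2
model = Ridge(alpha=10.0).fit(eeg[:half], env_attended[:half])
recon = model.predict(eeg[half:])

r_att = np.corrcoef(recon, env_attended[half:])[0, 1]
r_ign = np.corrcoef(recon, env_ignored[half:])[0, 1]
print("decoded talker:", "attended" if r_att > r_ign else "ignored")
```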
Collapse
Affiliation(s)
- Søren Asp Fuglsang
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Kgs. Lyngby, Denmark.
| | - Torsten Dau
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Kgs. Lyngby, Denmark
| | - Jens Hjortkjær
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Kgs. Lyngby, Denmark; Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Hvidovre, Kettegaard Allé 30, 2650 Hvidovre, Denmark.
| |
Collapse
|
50
|
Nourski KV. Auditory processing in the human cortex: An intracranial electrophysiology perspective. Laryngoscope Investig Otolaryngol 2017; 2:147-156. [PMID: 28894834 PMCID: PMC5562943 DOI: 10.1002/lio2.73] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2016] [Revised: 01/22/2017] [Accepted: 02/02/2017] [Indexed: 12/11/2022] Open
Abstract
Objective: Direct electrophysiological recordings in epilepsy patients offer an opportunity to study human auditory cortical processing with unprecedented spatiotemporal resolution. This review highlights recent intracranial studies of human auditory cortex and focuses on its basic response properties as well as modulation of cortical activity during the performance of active behavioral tasks.
Data Sources: Literature review.
Review Methods: A review of the literature was conducted to summarize the functional organization of human auditory and auditory-related cortex as revealed using intracranial recordings.
Results: The tonotopically organized core auditory cortex within the posteromedial portion of Heschl's gyrus represents spectrotemporal features of sounds with high temporal precision and short response latencies. At this level of processing, high gamma (70-150 Hz) activity is minimally modulated by task demands. Non-core cortex on the lateral surface of the superior temporal gyrus also maintains representation of stimulus acoustic features and, for speech, subserves transformation of acoustic inputs into phonemic representations. High gamma responses in this region are modulated by task requirements. Prefrontal cortex exhibits complex response patterns related to stimulus intelligibility and task relevance. At this level of auditory processing, activity is strongly modulated by task requirements and reflects behavioral performance.
Conclusions: Direct recordings from the human brain reveal a hierarchical organization of sound processing within auditory and auditory-related cortex.
Level of Evidence: Level V.
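A minimal sketch of the standard high gamma extraction underlying such analyses, band-pass filtering to 70-150 Hz and taking the Hilbert envelope, is shown below. The signal is simulated, and the filter order and parameters are illustrative assumptions.

```python
# Minimal sketch (simulated signal): extract the high gamma (70-150 Hz)
# amplitude used as an index of cortical activation -- zero-phase band-pass
# filter, then the Hilbert envelope.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000                   # intracranial sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)
rng = np.random.default_rng(4)
# Simulated channel: a burst of 100-Hz activity between 0.5 and 1.0 s in noise.
burst = np.sin(2 * np.pi * 100 * t) * ((t > 0.5) & (t < 1.0))
signal = burst + rng.normal(0, 0.5, t.size)

b, a = butter(4, [70, 150], btype="bandpass", fs=fs)
high_gamma = filtfilt(b, a, signal)      # zero-phase band-pass, 70-150 Hz
envelope = np.abs(hilbert(high_gamma))   # instantaneous amplitude
print(f"peak high-gamma amplitude at t = {t[envelope.argmax()]:.3f} s")
```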
Collapse
Affiliation(s)
- Kirill V Nourski
- Department of Neurosurgery The University of Iowa Iowa City IA U.S.A
| |
Collapse
|