1
Karunathilake IMD, Brodbeck C, Bhattasali S, Resnik P, Simon JZ. Neural Dynamics of the Processing of Speech Features: Evidence for a Progression of Features from Acoustic to Sentential Processing. J Neurosci 2025; 45:e1143242025. PMID: 39809543; PMCID: PMC11905352; DOI: 10.1523/jneurosci.1143-24.2025. Received 06/17/2024; revised 12/30/2024; accepted 01/03/2025. Open access.
Abstract
When we listen to speech, our brain's neurophysiological responses "track" its acoustic features, but it is less well understood how these auditory responses are enhanced by linguistic content. Here, we recorded magnetoencephalography responses while subjects of both sexes listened to four types of continuous speechlike passages: speech envelope-modulated noise, English-like nonwords, scrambled words, and a narrative passage. Temporal response function (TRF) analysis provides strong neural evidence for the emergent features of speech processing in the cortex, from acoustics to higher-level linguistics, as incremental steps in neural speech processing. Critically, we show a stepwise hierarchical progression of progressively higher-order features over time, reflected in both bottom-up (early) and top-down (late) processing stages. Linguistically driven top-down mechanisms take the form of late N400-like responses, suggesting a central role of predictive coding mechanisms at multiple levels. As expected, the neural processing of lower-level acoustic feature responses is bilateral or right lateralized, with left lateralization emerging only for lexicosemantic features. Finally, our results identify potential neural markers, linguistic-level late responses, derived from TRF components modulated by linguistic content, suggesting that these markers are indicative of speech comprehension rather than mere speech perception.
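The temporal response function (TRF) at the core of this analysis is a linear kernel relating a stimulus feature (such as the speech envelope) to the neural response. The study's actual pipeline is more elaborate (source-localized MEG, multiple competing predictors, boosting-based estimation), so the following is only a minimal sketch of TRF estimation via ridge regression on synthetic data; every variable and parameter below is illustrative, not taken from the paper.

```python
import numpy as np

def estimate_trf(stimulus, response, n_lags, alpha=1.0):
    """Estimate a temporal response function (TRF) by ridge regression.

    The TRF maps a stimulus feature to the neural response via
    convolution; it is recovered here from a time-lagged design matrix.
    """
    n = len(stimulus)
    # Lagged design matrix: column k holds the stimulus delayed by k samples.
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = stimulus[:n - k]
    # Ridge solution: (X'X + alpha*I)^-1 X'y
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_lags), X.T @ response)

# Synthetic check: a response generated by a known kernel should be recovered.
rng = np.random.default_rng(0)
stim = rng.standard_normal(5000)
true_kernel = np.array([0.0, 0.5, 1.0, 0.5, 0.0])
resp = np.convolve(stim, true_kernel)[:5000] + 0.01 * rng.standard_normal(5000)
trf = estimate_trf(stim, resp, n_lags=5, alpha=0.1)
```

The recovered `trf` peaks at the same lag as `true_kernel`, which is the sense in which TRF components (and their latencies) can be read off and compared across conditions.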
Affiliation(s)
- Christian Brodbeck
- Department of Computing and Software, McMaster University, Hamilton, Ontario L8S 4L8, Canada
- Shohini Bhattasali
- Department of Language Studies, University of Toronto, Scarborough, Ontario M1C 1A4, Canada
- Philip Resnik
- Department of Linguistics and Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland 20742
- Jonathan Z Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742
- Department of Biology, University of Maryland, College Park, Maryland 20742
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742
2
Karunathilake ID, Brodbeck C, Bhattasali S, Resnik P, Simon JZ. Neural Dynamics of the Processing of Speech Features: Evidence for a Progression of Features from Acoustic to Sentential Processing. bioRxiv [preprint] 2024:2024.02.02.578603. PMID: 38352332; PMCID: PMC10862830; DOI: 10.1101/2024.02.02.578603.
Abstract
When we listen to speech, our brain's neurophysiological responses "track" its acoustic features, but it is less well understood how these auditory responses are enhanced by linguistic content. Here, we recorded magnetoencephalography (MEG) responses while subjects listened to four types of continuous-speech-like passages: speech-envelope modulated noise, English-like non-words, scrambled words, and a narrative passage. Temporal response function (TRF) analysis provides strong neural evidence for the emergent features of speech processing in cortex, from acoustics to higher-level linguistics, as incremental steps in neural speech processing. Critically, we show a stepwise hierarchical progression of progressively higher order features over time, reflected in both bottom-up (early) and top-down (late) processing stages. Linguistically driven top-down mechanisms take the form of late N400-like responses, suggesting a central role of predictive coding mechanisms at multiple levels. As expected, the neural processing of lower-level acoustic feature responses is bilateral or right lateralized, with left lateralization emerging only for lexical-semantic features. Finally, our results identify potential neural markers, linguistic level late responses, derived from TRF components modulated by linguistic content, suggesting that these markers are indicative of speech comprehension rather than mere speech perception.
Affiliation(s)
- Christian Brodbeck
- Department of Computing and Software, McMaster University, Hamilton, ON, Canada
- Shohini Bhattasali
- Department of Language Studies, University of Toronto, Scarborough, Canada
- Philip Resnik
- Department of Linguistics and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742
- Jonathan Z. Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742
- Department of Biology, University of Maryland, College Park, MD, USA
- Institute for Systems Research, University of Maryland, College Park, MD 20742
3
Puffay C, Vanthornhout J, Gillis M, De Clercq P, Accou B, Van Hamme H, Francart T. Classifying coherent versus nonsense speech perception from EEG using linguistic speech features. Sci Rep 2024; 14:18922. PMID: 39143297; PMCID: PMC11324895; DOI: 10.1038/s41598-024-69568-0. Received 04/15/2024; accepted 08/06/2024. Open access.
Abstract
When a person listens to natural speech, the relation between features of the speech signal and the corresponding evoked electroencephalogram (EEG) is indicative of neural processing of the speech signal. Using linguistic representations of speech, we investigate the differences in neural processing between speech in a native and foreign language that is not understood. We conducted experiments using three stimuli: a comprehensible language, an incomprehensible language, and randomly shuffled words from a comprehensible language, while recording the EEG signal of native Dutch-speaking participants. We modeled the neural tracking of linguistic features of the speech signals using a deep-learning model in a match-mismatch task that relates EEG signals to speech, while accounting for lexical segmentation features reflecting acoustic processing. The deep learning model effectively classifies coherent versus nonsense languages. We also observed significant differences in tracking patterns between comprehensible and incomprehensible speech stimuli within the same language. It demonstrates the potential of deep learning frameworks in measuring speech understanding objectively.
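The match-mismatch task underlying this result pairs an EEG segment with two candidate speech segments, one time-aligned with the EEG and one taken from elsewhere, and asks a model to pick the aligned one. The authors use a deep-learning model for this decision; as a hedged illustration only, the sketch below makes the same decision with plain Pearson correlation on synthetic data, which stands in for the learned similarity.

```python
import numpy as np

def match_mismatch(eeg, candidate_a, candidate_b):
    """Decide which of two speech-feature segments matches the EEG segment.

    Returns 0 if candidate_a correlates better with the EEG, else 1.
    (The paper trains a deep network for this decision; correlation is
    a minimal linear stand-in.)
    """
    r_a = np.corrcoef(eeg, candidate_a)[0, 1]
    r_b = np.corrcoef(eeg, candidate_b)[0, 1]
    return 0 if r_a > r_b else 1

# Synthetic data: EEG that noisily "tracks" the attended speech feature.
rng = np.random.default_rng(1)
envelope = rng.standard_normal(1000)              # feature of the attended speech
eeg = envelope + 1.5 * rng.standard_normal(1000)  # noisy neural response
imposter = rng.standard_normal(1000)              # segment from elsewhere
```

Classification accuracy on such pairs, computed per linguistic feature, is the quantity the study compares between comprehensible and incomprehensible stimuli.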
Affiliation(s)
- Corentin Puffay
- Department of Neurosciences, KU Leuven, ExpORL, Leuven, Belgium
- Department of Electrical Engineering (ESAT), KU Leuven, PSI, Leuven, Belgium
- Marlies Gillis
- Department of Neurosciences, KU Leuven, ExpORL, Leuven, Belgium
- Bernd Accou
- Department of Neurosciences, KU Leuven, ExpORL, Leuven, Belgium
- Department of Electrical Engineering (ESAT), KU Leuven, PSI, Leuven, Belgium
- Hugo Van Hamme
- Department of Electrical Engineering (ESAT), KU Leuven, PSI, Leuven, Belgium
- Tom Francart
- Department of Neurosciences, KU Leuven, ExpORL, Leuven, Belgium
4
MacIntyre AD, Carlyon RP, Goehring T. Neural Decoding of the Speech Envelope: Effects of Intelligibility and Spectral Degradation. Trends Hear 2024; 28:23312165241266316. PMID: 39183533; PMCID: PMC11345737; DOI: 10.1177/23312165241266316. Received 03/12/2024; revised 05/23/2024; accepted 06/16/2024. Open access.
Abstract
During continuous speech perception, endogenous neural activity becomes time-locked to acoustic stimulus features, such as the speech amplitude envelope. This speech-brain coupling can be decoded using non-invasive brain imaging techniques, including electroencephalography (EEG). Neural decoding may provide clinical use as an objective measure of stimulus encoding by the brain-for example during cochlear implant listening, wherein the speech signal is severely spectrally degraded. Yet, interplay between acoustic and linguistic factors may lead to top-down modulation of perception, thereby complicating audiological applications. To address this ambiguity, we assess neural decoding of the speech envelope under spectral degradation with EEG in acoustically hearing listeners (n = 38; 18-35 years old) using vocoded speech. We dissociate sensory encoding from higher-order processing by employing intelligible (English) and non-intelligible (Dutch) stimuli, with auditory attention sustained using a repeated-phrase detection task. Subject-specific and group decoders were trained to reconstruct the speech envelope from held-out EEG data, with decoder significance determined via random permutation testing. Whereas speech envelope reconstruction did not vary by spectral resolution, intelligible speech was associated with better decoding accuracy in general. Results were similar across subject-specific and group analyses, with less consistent effects of spectral degradation in group decoding. Permutation tests revealed possible differences in decoder statistical significance by experimental condition. In general, while robust neural decoding was observed at the individual and group level, variability within participants would most likely prevent the clinical use of such a measure to differentiate levels of spectral degradation and intelligibility on an individual basis.
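The backward-model ("decoder") approach described here regresses from multichannel EEG onto the speech envelope and assesses significance by permutation. The study's decoders include time lags, cross-validation on held-out data, and subject-specific versus group training; the sketch below strips all of that away and shows only the core idea (instantaneous linear decoding plus a circular-shift permutation null) on synthetic data, with all names and parameters invented for illustration.

```python
import numpy as np

def decode_envelope(eeg, envelope, alpha=1.0):
    """Fit a linear backward model mapping multichannel EEG to the speech
    envelope; return the correlation of the reconstruction with the truth."""
    # Ridge solution over channels: w = (E'E + alpha*I)^-1 E'y
    w = np.linalg.solve(eeg.T @ eeg + alpha * np.eye(eeg.shape[1]),
                        eeg.T @ envelope)
    reconstruction = eeg @ w
    return np.corrcoef(reconstruction, envelope)[0, 1]

def permutation_pvalue(eeg, envelope, n_perm=200, seed=0):
    """Compare the observed decoding accuracy against a null distribution
    built from circularly shifted envelopes (random permutation testing)."""
    rng = np.random.default_rng(seed)
    observed = decode_envelope(eeg, envelope)
    null = [decode_envelope(eeg, np.roll(envelope,
                                         int(rng.integers(100, len(envelope) - 100))))
            for _ in range(n_perm)]
    return observed, float(np.mean([r >= observed for r in null]))

# Synthetic data: an envelope "tracked" by 8 noisy EEG channels.
rng = np.random.default_rng(2)
env = rng.standard_normal(2000)
eeg = np.column_stack([env + rng.standard_normal(2000) for _ in range(8)])
obs, p = permutation_pvalue(eeg, env)
```

A decoder counts as significant when the observed reconstruction accuracy exceeds most of the shifted-envelope null, which is the logic behind the per-condition significance differences reported above.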
Affiliation(s)
- Robert P. Carlyon
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Tobias Goehring
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
5
Van Hirtum T, Somers B, Dieudonné B, Verschueren E, Wouters J, Francart T. Neural envelope tracking predicts speech intelligibility and hearing aid benefit in children with hearing loss. Hear Res 2023; 439:108893. PMID: 37806102; DOI: 10.1016/j.heares.2023.108893. Received 06/30/2023; revised 09/01/2023; accepted 09/27/2023.
Abstract
Early assessment of hearing aid benefit is crucial, as the extent to which hearing aids provide audible speech information predicts speech and language outcomes. A growing body of research has proposed neural envelope tracking as an objective measure of speech intelligibility, particularly for individuals unable to provide reliable behavioral feedback. However, its potential for evaluating speech intelligibility and hearing aid benefit in children with hearing loss remains unexplored. In this study, we investigated neural envelope tracking in children with permanent hearing loss through two separate experiments. EEG data were recorded while children listened to age-appropriate stories (Experiment 1) or an animated movie (Experiment 2) under aided and unaided conditions (using personal hearing aids) at multiple stimulus intensities. Neural envelope tracking was evaluated using a linear decoder reconstructing the speech envelope from the EEG in the delta band (0.5-4 Hz). Additionally, we calculated temporal response functions (TRFs) to investigate the spatio-temporal dynamics of the response. In both experiments, neural tracking increased with increasing stimulus intensity, but only in the unaided condition. In the aided condition, neural tracking remained stable across a wide range of intensities, as long as speech intelligibility was maintained. Similarly, TRF amplitudes increased with increasing stimulus intensity in the unaided condition, while in the aided condition significant differences were found in TRF latency rather than TRF amplitude. This suggests that decreasing stimulus intensity does not necessarily impact neural tracking. Furthermore, the use of personal hearing aids significantly enhanced neural envelope tracking, particularly in challenging speech conditions that would be inaudible when unaided. Finally, we found a strong correlation between neural envelope tracking and behaviorally measured speech intelligibility for both narrated stories (Experiment 1) and movie stimuli (Experiment 2). Altogether, these findings indicate that neural envelope tracking could be a valuable tool for predicting speech intelligibility benefits derived from personal hearing aids in hearing-impaired children. Incorporating narrated stories or engaging movies expands the accessibility of these methods even in clinical settings, offering new avenues for using objective speech measures to guide pediatric audiology decision-making.
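The delta-band (0.5-4 Hz) restriction used for the decoder above can be illustrated with a minimal zero-phase band-selection step. The study itself uses proper EEG filtering and lagged linear decoders; the FFT mask below is only a sketch of isolating the band in which envelope tracking was evaluated, on synthetic sinusoids rather than EEG.

```python
import numpy as np

def delta_band(signal, fs):
    """Keep only the 0.5-4 Hz (delta) band of a signal via an FFT mask.

    A zero-phase illustration: real EEG pipelines use band-pass filters,
    but the idea -- restrict analysis to the delta band -- is the same.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
    spectrum[(freqs < 0.5) | (freqs > 4.0)] = 0  # zero everything outside delta
    return np.fft.irfft(spectrum, n=len(signal))

fs = 64                                  # Hz, a typical EEG analysis rate
t = np.arange(0, 30, 1 / fs)             # 30 s of signal
inside = np.sin(2 * np.pi * 2 * t)       # 2 Hz component: inside the delta band
outside = np.sin(2 * np.pi * 10 * t)     # 10 Hz component: outside the band
filtered = delta_band(inside + outside, fs)
```

After band selection, only the 2 Hz component survives; the decoder's reconstruction accuracy is then computed on such band-limited signals.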
Affiliation(s)
- Tilde Van Hirtum
- KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Ben Somers
- KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Benjamin Dieudonné
- KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Eline Verschueren
- KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Jan Wouters
- KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Tom Francart
- KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium