1. Daeglau M, Otten J, Grimm G, Mirkovic B, Hohmann V, Debener S. Neural speech tracking in a virtual acoustic environment: audio-visual benefit for unscripted continuous speech. Front Hum Neurosci 2025; 19:1560558. PMID: 40270565; PMCID: PMC12014754; DOI: 10.3389/fnhum.2025.1560558.
Abstract
The audio-visual benefit in speech perception, whereby congruent visual input enhances auditory processing, is well documented across age groups, particularly in challenging listening conditions and among individuals with varying hearing abilities. However, most studies rely on highly controlled laboratory environments with scripted stimuli. Here, we examine the audio-visual benefit using unscripted, natural speech from untrained speakers within a virtual acoustic environment. Using electroencephalography (EEG) and cortical speech tracking, we assessed neural responses across audio-visual, audio-only, visual-only, and masked-lip conditions to isolate the role of lip movements. Additionally, we analysed individual differences in acoustic and visual features of the speakers, including pitch, jitter, and lip openness, to explore their influence on the audio-visual speech tracking benefit. Results showed a significant audio-visual enhancement in speech tracking with background noise, with the masked-lip condition performing similarly to the audio-only condition, emphasizing the importance of lip movements in adverse listening situations. Our findings demonstrate the feasibility of cortical speech tracking with naturalistic stimuli and underscore the impact of individual speaker characteristics on audio-visual integration in real-world listening contexts.
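Cortical speech tracking of this kind starts from a stimulus representation, most commonly the broadband amplitude envelope, that is regressed against the EEG. The paper's exact pipeline is not reproduced here; the following is a minimal sketch of a standard envelope front end, with the Hilbert-transform approach, the cutoff frequency, and the target rate as illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert, resample_poly

def speech_envelope(audio, fs_audio, fs_eeg=64, lp_cutoff=8.0):
    """Broadband amplitude envelope of a speech waveform, low-pass
    filtered and downsampled to the EEG analysis rate (integer rates assumed)."""
    env = np.abs(hilbert(audio))               # analytic-signal magnitude
    b, a = butter(2, lp_cutoff, fs=fs_audio)   # low-pass well below fs_eeg / 2
    env = filtfilt(b, a, env)                  # zero-phase filtering
    env = resample_poly(env, fs_eeg, fs_audio) # e.g. 44100 Hz -> 64 Hz
    return env / env.std()                     # unit variance for regression
```

The resulting envelope would then enter an encoding model such as the mTRF regression sketched under entry 3 below.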
Affiliation(s)
- Mareike Daeglau: Neuropsychology Lab, Department of Psychology, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Jürgen Otten: Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Giso Grimm: Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Bojana Mirkovic: Neuropsychology Lab, Department of Psychology, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Volker Hohmann: Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Stefan Debener: Neuropsychology Lab, Department of Psychology, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
2. Koprowska A, Wendt D, Serman M, Dau T, Marozeau J. The effect of auditory training on listening effort in hearing-aid users: insights from a pupillometry study. Int J Audiol 2025; 64:59-69. PMID: 38289621; DOI: 10.1080/14992027.2024.2307415.
Abstract
OBJECTIVE: The study investigated how auditory training affects the effort exerted by hearing-impaired listeners in a speech-in-noise task. DESIGN: Pupillometry was used to characterise listening effort during a hearing-in-noise test (HINT) before and after phoneme-in-noise identification training. Half of the study participants completed the training, while the other half formed an active control group. STUDY SAMPLE: Twenty experienced hearing-aid users aged 63-79 years. RESULTS: Higher peak pupil dilations (PPDs) were obtained at the end of the study than at the beginning in both groups of participants. Analysis of pupil dilation in an extended time window revealed, however, that the magnitude of the pupillary response increased more in the training group than in the control group. The effect of training on effort was observed in pupil responses even when no improvement in HINT performance was found. CONCLUSION: The results demonstrate that a listening-effort metric offers additional insight into the effectiveness of auditory training compared to considering speech-in-noise performance alone. Trends observed in the pupil responses suggested increased effort, both after the training and after the placebo intervention, most likely reflecting the effect of individual motivation.
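The PPD metric reported above is conventionally the maximum baseline-corrected pupil dilation within a fixed window after sentence onset. A minimal sketch of that computation follows; the baseline duration and window boundaries are illustrative assumptions, not the study's parameters.

```python
import numpy as np

def peak_pupil_dilation(trace, fs, baseline_s=1.0, win_s=(0.0, 3.0)):
    """Peak pupil dilation (PPD) for one trial.

    trace : pupil diameter samples starting `baseline_s` seconds before
            stimulus onset; fs : sampling rate in Hz.
    """
    n_base = int(baseline_s * fs)
    baseline = np.nanmean(trace[:n_base])      # mean pre-onset diameter
    dilation = trace - baseline                # baseline-corrected trace
    i0 = n_base + int(win_s[0] * fs)
    i1 = n_base + int(win_s[1] * fs)
    return np.nanmax(dilation[i0:i1])          # peak within analysis window
```

An extended-window analysis of the kind described above would replace the final `np.nanmax` with a mean over a longer interval.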
Affiliation(s)
- Aleksandra Koprowska: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Copenhagen Hearing and Balance Center, Rigshospitalet, Copenhagen, Denmark
- Dorothea Wendt: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Eriksholm Research Centre, Snekkersten, Denmark
- Torsten Dau: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Copenhagen Hearing and Balance Center, Rigshospitalet, Copenhagen, Denmark
- Jeremy Marozeau: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Copenhagen Hearing and Balance Center, Rigshospitalet, Copenhagen, Denmark
3. Bolt E, Giroud N. Neural encoding of linguistic speech cues is unaffected by cognitive decline, but decreases with increasing hearing impairment. Sci Rep 2024; 14:19105. PMID: 39154048; PMCID: PMC11330478; DOI: 10.1038/s41598-024-69602-1.
Abstract
The multivariate temporal response function (mTRF) is an effective tool for investigating the neural encoding of acoustic and complex linguistic features in natural continuous speech. In this study, we investigated how neural representations of speech features derived from natural stimuli relate to early signs of cognitive decline in older adults, taking into account the effects of hearing. Participants without (n = 25) and with (n = 19) early signs of cognitive decline listened to an audiobook while their electroencephalography responses were recorded. Using the mTRF framework, we modeled the relationship between speech input and neural response via different acoustic, segmented, and linguistic encoding models and examined the response functions in terms of encoding accuracy, signal power, peak amplitudes, and latencies. Our results showed no significant effect of cognitive decline or hearing ability on the neural encoding of acoustic and linguistic speech features. However, we found a significant interaction between hearing ability and the word-level segmentation model, suggesting that hearing impairment specifically affects encoding accuracy for this model, while the other features were unaffected by hearing ability. These results suggest that, while speech-processing markers remain unaffected by cognitive decline and hearing loss per se, the neural encoding of word-level segmented speech features in older adults is affected by hearing loss but not by cognitive decline. This study underlines the effectiveness of mTRF analysis for studying the neural encoding of speech and argues for extending this research to its clinical implications for hearing loss and cognition.
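At its core, an mTRF encoding model is ridge regression from time-lagged stimulus features to each EEG channel, with encoding accuracy taken as the correlation between predicted and held-out EEG. A single-feature sketch follows; the lag range and regularisation constant are illustrative assumptions rather than the study's settings.

```python
import numpy as np

def lagged_design(stim, lags):
    """Design matrix with one column per time lag of the stimulus feature."""
    X = np.zeros((len(stim), len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stim[:len(stim) - lag]
        else:
            X[:lag, j] = stim[-lag:]
    return X

def fit_trf(stim, eeg, fs, tmin=-0.1, tmax=0.4, lam=1e3):
    """Ridge-regularised TRF mapping one stimulus feature to one EEG channel."""
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    X = lagged_design(stim, lags)
    w = np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ eeg)
    return lags / fs, w                     # TRF time axis (s) and weights

def encoding_accuracy(stim_test, eeg_test, fs, w, tmin=-0.1, tmax=0.4):
    """Pearson r between TRF-predicted and measured EEG on held-out data."""
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    pred = lagged_design(stim_test, lags) @ w
    return np.corrcoef(pred, eeg_test)[0, 1]
```

In the multivariate case, the design matrix simply concatenates the lagged columns of all features (for example the envelope, word onsets, and linguistic values placed at word onsets).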
Affiliation(s)
- Elena Bolt: Computational Neuroscience of Speech and Hearing, Department of Computational Linguistics, University of Zurich, 8050 Zurich, Switzerland; International Max Planck Research School on the Life Course (IMPRS LIFE), University of Zurich, 8050 Zurich, Switzerland
- Nathalie Giroud: Computational Neuroscience of Speech and Hearing, Department of Computational Linguistics, University of Zurich, 8050 Zurich, Switzerland; International Max Planck Research School on the Life Course (IMPRS LIFE), University of Zurich, 8050 Zurich, Switzerland; Language and Medicine Centre Zurich, Competence Centre of Medical Faculty and Faculty of Arts and Sciences, University of Zurich, 8050 Zurich, Switzerland
4. Kulasingham JP, Innes-Brown H, Enqvist M, Alickovic E. Level-Dependent Subcortical Electroencephalography Responses to Continuous Speech. eNeuro 2024; 11:ENEURO.0135-24.2024. PMID: 39142822; DOI: 10.1523/eneuro.0135-24.2024.
Abstract
The auditory brainstem response (ABR) is a measure of subcortical activity in response to auditory stimuli. The wave V peak of the ABR depends on the stimulus intensity level and has been widely used for clinical hearing assessment. Conventional methods estimate the ABR by averaging electroencephalography (EEG) responses to short, unnatural stimuli such as clicks. Recent work has moved toward more ecologically relevant continuous speech stimuli using linear deconvolution models called temporal response functions (TRFs). Investigating whether the TRF waveform changes with stimulus intensity is a crucial step toward the use of natural speech stimuli for hearing assessments involving subcortical responses. Here, we develop methods to estimate level-dependent subcortical TRFs using EEG data collected from 21 participants listening to continuous speech presented at four intensity levels. We find that level-dependent changes can be detected in the wave V peak of the subcortical TRF for almost all participants and are consistent with level-dependent changes in click-ABR wave V. We also investigate the most suitable peripheral auditory model for generating predictors for level-dependent subcortical TRFs and find that simple gammatone filterbanks perform best. Additionally, around 6 min of data may be sufficient for detecting level-dependent effects and wave V peaks above the noise floor for speech segments with higher intensity. Finally, we show a proof of concept that level-dependent subcortical TRFs can be detected even for the inherent intensity fluctuations in natural continuous speech.
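A gammatone-filterbank predictor of the kind found most suitable here can be sketched with SciPy's gammatone filter design: band-filter the stimulus, half-wave rectify each band, sum across bands, and resample to the analysis rate. The centre frequencies, band count, and rates below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from scipy.signal import gammatone, lfilter, resample_poly

def gammatone_predictor(audio, fs_audio, fs_out=4096,
                        cfs=(250, 500, 1000, 2000, 4000, 8000)):
    """Rectified gammatone-filterbank predictor for subcortical TRF models."""
    bands = []
    for cf in cfs:
        b, a = gammatone(cf, 'iir', fs=fs_audio)  # 4th-order IIR gammatone
        band = lfilter(b, a, audio)
        bands.append(np.maximum(band, 0.0))       # half-wave rectification
    pred = np.sum(bands, axis=0)                  # collapse across bands
    return resample_poly(pred, fs_out, fs_audio)  # match the EEG analysis rate
```

Because wave V sits at millisecond latencies, subcortical TRFs are estimated at far higher sampling rates than cortical ones; the 4096 Hz output rate here is only a placeholder.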
Affiliation(s)
- Joshua P Kulasingham: Automatic Control, Department of Electrical Engineering, Linköping University, 581 83 Linköping, Sweden
- Hamish Innes-Brown: Eriksholm Research Centre, DK-3070 Snekkersten, Denmark; Department of Health Technology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Martin Enqvist: Automatic Control, Department of Electrical Engineering, Linköping University, 581 83 Linköping, Sweden
- Emina Alickovic: Automatic Control, Department of Electrical Engineering, Linköping University, 581 83 Linköping, Sweden; Eriksholm Research Centre, DK-3070 Snekkersten, Denmark
5. Bolt E, Giroud N. Auditory Encoding of Natural Speech at Subcortical and Cortical Levels Is Not Indicative of Cognitive Decline. eNeuro 2024; 11:ENEURO.0545-23.2024. PMID: 38658138; PMCID: PMC11082929; DOI: 10.1523/eneuro.0545-23.2024.
Abstract
A growing number of patients worldwide are diagnosed with dementia, which emphasizes the urgent need for early detection markers. In this study, we built on the auditory hypersensitivity theory of a previous study, which postulated that responses to auditory input are enhanced in cognitive decline at both the subcortical and the cortical level, and examined the auditory encoding of natural continuous speech at both neural levels for its potential as an indicator of cognitive decline. We recruited participants aged 60 years and older, who were divided into two groups based on the Montreal Cognitive Assessment: one group with low scores (n = 19, participants with signs of cognitive decline) and a control group (n = 25). Participants completed an audiometric assessment, and we then recorded their electroencephalography while they listened to an audiobook and to click sounds. We derived temporal response functions and evoked potentials from the data and examined response amplitudes for their potential to predict cognitive decline, controlling for hearing ability and age. Contrary to our expectations, no evidence of auditory hypersensitivity was observed in participants with signs of cognitive decline; response amplitudes were comparable in both cognitive groups. Moreover, the combination of response amplitudes showed no predictive value for cognitive decline. These results challenge the proposed hypothesis and emphasize the need for further research to identify reliable auditory markers for the early detection of cognitive decline.
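The prediction analysis described above, testing whether response amplitudes predict group membership while controlling for hearing and age, can be sketched as a cross-validated classifier. This is a generic sketch of that kind of analysis, not the study's actual model or feature set.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def group_predictability(X, y):
    """Cross-validated accuracy of predicting cognitive group (y: 0/1,
    from the MoCA split) from per-participant response amplitudes plus
    the covariates hearing threshold and age (columns of X)."""
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return cross_val_score(clf, X, y, cv=5).mean()  # chance level mirrors a null result
```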
Affiliation(s)
- Elena Bolt: Computational Neuroscience of Speech and Hearing, Department of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland; International Max Planck Research School on the Life Course (IMPRS LIFE), University of Zurich, Zurich 8050, Switzerland
- Nathalie Giroud: Computational Neuroscience of Speech and Hearing, Department of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland; International Max Planck Research School on the Life Course (IMPRS LIFE), University of Zurich, Zurich 8050, Switzerland; Language & Medicine Centre Zurich, Competence Centre of Medical Faculty and Faculty of Arts and Sciences, University of Zurich, Zurich 8050, Switzerland
6. Kulasingham JP, Bachmann FL, Eskelund K, Enqvist M, Innes-Brown H, Alickovic E. Predictors for estimating subcortical EEG responses to continuous speech. PLoS One 2024; 19:e0297826. PMID: 38330068; PMCID: PMC10852227; DOI: 10.1371/journal.pone.0297826.
Abstract
Perception of sounds and speech involves structures in the auditory brainstem that rapidly process ongoing auditory stimuli. The role of these structures in speech processing can be investigated by measuring their electrical activity using scalp-mounted electrodes. However, typical analysis methods involve averaging neural responses to many short repetitive stimuli that bear little relevance to daily listening environments. Recently, subcortical responses to more ecologically relevant continuous speech were detected using linear encoding models. These methods estimate the temporal response function (TRF), a regression model that minimises the error between the measured neural signal and its prediction from a stimulus-derived predictor. Using predictors that model the highly non-linear peripheral auditory system may improve linear TRF estimation accuracy and peak detection. Here, we compare predictors from both simple and complex peripheral auditory models for estimating brainstem TRFs on electroencephalography (EEG) data from 24 participants listening to continuous speech. We also investigate the data length required for estimating subcortical TRFs, and find that around 12 minutes of data is sufficient for clear wave V peaks (>3 dB SNR) to be seen in nearly all participants. Interestingly, predictors derived from simple filterbank-based models of the peripheral auditory system yield TRF wave V peak SNRs that are not significantly different from those estimated using a complex model of the auditory nerve, provided that the non-linear effects of adaptation in the auditory system are appropriately modelled. Crucially, computing predictors from these simpler models is more than 50 times faster than with the complex model. This work paves the way for efficient modelling and detection of subcortical processing of continuous speech, which may lead to improved diagnosis metrics for hearing impairment and assistive hearing technology.
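The >3 dB SNR criterion above compares TRF power in a window around the expected wave V latency with power at lags where no response is expected. A sketch of that computation follows; both window placements are illustrative assumptions.

```python
import numpy as np

def wave_v_snr_db(trf, times, sig_win=(0.005, 0.009), noise_win=(-0.05, -0.01)):
    """SNR (dB) of a subcortical TRF's wave V: mean power in a window
    around the expected peak latency over mean power at pre-zero lags."""
    sig = trf[(times >= sig_win[0]) & (times < sig_win[1])]
    noise = trf[(times >= noise_win[0]) & (times < noise_win[1])]
    return 10 * np.log10(np.mean(sig ** 2) / np.mean(noise ** 2))
```

A wave V peak would then count as detectable once wave_v_snr_db(trf, times) exceeds 3.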
Affiliation(s)
- Joshua P. Kulasingham: Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden
- Martin Enqvist: Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden
- Hamish Innes-Brown: Eriksholm Research Centre, Snekkersten, Denmark; Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Emina Alickovic: Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden; Eriksholm Research Centre, Snekkersten, Denmark
7. Shan T, Cappelloni MS, Maddox RK. Subcortical responses to music and speech are alike while cortical responses diverge. Sci Rep 2024; 14:789. PMID: 38191488; PMCID: PMC10774448; DOI: 10.1038/s41598-023-50438-0.
Abstract
Music and speech are encountered daily and are unique to human beings. Both are transformed by the auditory pathway from an initial acoustical encoding to higher-level cognition. Studies of the cortex have revealed distinct brain responses to music and speech, but the differences may emerge in the cortex or may be inherited from different subcortical encoding. In the first part of this study, we derived the human auditory brainstem response (ABR), a measure of subcortical encoding, to recorded music and speech using two analysis methods. The first method, described previously and based on stimulus acoustics, yielded very different ABRs for the two sound classes. The second method, developed here and based on a physiological model of the auditory periphery, gave highly correlated responses to music and speech. We established the superiority of the second method through several metrics, suggesting there is no appreciable impact of stimulus class (i.e., music vs. speech) on the way stimulus acoustics are encoded subcortically. In the second part of this study, we considered the cortex. Our new analysis method made the cortical responses to music and speech more similar, but differences remained. Taken together, the subcortical and cortical results suggest that there is evidence for stimulus-class-dependent processing of music and speech at the cortical but not the subcortical level.
Affiliation(s)
- Tong Shan: Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA; Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA; Center for Visual Science, University of Rochester, Rochester, NY, USA
- Madeline S Cappelloni: Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA; Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA; Center for Visual Science, University of Rochester, Rochester, NY, USA
- Ross K Maddox: Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA; Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA; Center for Visual Science, University of Rochester, Rochester, NY, USA; Department of Neuroscience, University of Rochester, Rochester, NY, USA
8. Bachmann FL, Kulasingham JP, Eskelund K, Enqvist M, Alickovic E, Innes-Brown H. Extending Subcortical EEG Responses to Continuous Speech to the Sound-Field. Trends Hear 2024; 28:23312165241246596. PMID: 38738341; PMCID: PMC11092544; DOI: 10.1177/23312165241246596.
Abstract
The auditory brainstem response (ABR) is a valuable clinical tool for objective hearing assessment and is conventionally detected by averaging neural responses to thousands of short stimuli. Progressing beyond these unnatural stimuli, brainstem responses to continuous speech presented via earphones have recently been detected using linear temporal response functions (TRFs). Here, we extend earlier studies by measuring subcortical responses to continuous speech presented in the sound-field, and assess the amount of data needed to estimate brainstem TRFs. Electroencephalography (EEG) was recorded from 24 normal-hearing participants while they listened to clicks and to stories presented via earphones and loudspeakers. Subcortical TRFs were computed after accounting for non-linear processing in the auditory periphery by either stimulus rectification or an auditory nerve model. Our results demonstrate that subcortical responses to continuous speech can be reliably measured in the sound-field. TRFs estimated using auditory nerve models outperformed simple rectification, and 16 minutes of data was sufficient for the TRFs of all participants to show clear wave V peaks for both earphone and sound-field stimuli. Subcortical TRFs to continuous speech were highly consistent across the earphone and sound-field conditions, and with click ABRs. However, sound-field TRFs required slightly more data (16 minutes) to achieve clear wave V peaks than earphone TRFs (12 minutes), possibly due to the effects of room acoustics. By investigating subcortical responses to sound-field speech stimuli, this study lays the groundwork for bringing objective hearing assessment closer to real-life conditions, which may lead to improved hearing evaluations and smart hearing technologies.
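The data-length question above, how many minutes of recording yield a clear wave V, amounts to re-estimating the TRF on growing subsets of the data until the SNR criterion is met. A generic sketch, reusing TRF-fitting and SNR helpers of the kind sketched under entries 3 and 6; the step size and ceiling are arbitrary choices.

```python
import numpy as np

def minutes_for_clear_wave_v(stim, eeg, fs, fit, snr_db,
                             step_min=2, max_min=30, thresh_db=3.0):
    """Smallest amount of data (minutes) whose TRF shows a clear wave V.

    fit(stim, eeg, fs) -> (times, trf);  snr_db(trf, times) -> SNR in dB.
    """
    for minutes in range(step_min, max_min + 1, step_min):
        n = int(minutes * 60 * fs)           # samples in this subset
        times, trf = fit(stim[:n], eeg[:n], fs)
        if snr_db(trf, times) >= thresh_db:  # e.g. the >3 dB criterion
            return minutes
    return None                              # no clear peak within max_min
```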
Affiliation(s)
- Joshua P. Kulasingham: Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden
- Martin Enqvist: Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden
- Emina Alickovic: Eriksholm Research Centre, Snekkersten, Denmark; Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden
- Hamish Innes-Brown: Eriksholm Research Centre, Snekkersten, Denmark; Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
9. Koprowska A, Marozeau J, Dau T, Serman M. The effect of phoneme-based auditory training on speech intelligibility in hearing-aid users. Int J Audiol 2023; 62:1048-1058. PMID: 36301675; DOI: 10.1080/14992027.2022.2135032.
Abstract
OBJECTIVE: Hearing loss commonly causes difficulties in understanding speech in the presence of background noise, and the benefits of hearing-aids in terms of speech intelligibility in challenging listening scenarios remain limited. The present study investigated whether phoneme-in-noise discrimination training improves phoneme identification and sentence intelligibility in noise in hearing-aid users. DESIGN: Two groups of participants received either a two-week training program or a control intervention. Three phoneme categories were trained: onset consonants (C1), vowels (V), and post-vowel consonants (C2) in C1-V-C2-/i/ logatomes from the Danish nonsense word corpus (DANOK). A phoneme identification test and a hearing-in-noise test (HINT) were administered before and after the respective interventions and, for the training group only, after three months. STUDY SAMPLE: Twenty individuals aged 63-79 years with mild-to-moderate sensorineural hearing loss and at least one year of experience using hearing-aids. RESULTS: The training yielded an improvement in phoneme identification scores for vowels and post-vowel consonants, which was retained over three months. No significant performance improvement in HINT was found. CONCLUSION: The study demonstrates that the training induced a robust refinement of auditory perception at the phoneme level but provides no evidence of generalisation to an untrained sentence intelligibility task.
Affiliation(s)
- Aleksandra Koprowska: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Copenhagen Hearing and Balance Center, Rigshospitalet, Copenhagen, Denmark
- Jeremy Marozeau: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Copenhagen Hearing and Balance Center, Rigshospitalet, Copenhagen, Denmark
- Torsten Dau: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Copenhagen Hearing and Balance Center, Rigshospitalet, Copenhagen, Denmark
10. Kegler M, Weissbart H, Reichenbach T. The neural response at the fundamental frequency of speech is modulated by word-level acoustic and linguistic information. Front Neurosci 2022; 16:915744. PMID: 35942153; PMCID: PMC9355803; DOI: 10.3389/fnins.2022.915744.
Abstract
Spoken language comprehension requires rapid and continuous integration of information, from lower-level acoustic to higher-level linguistic features. Much of this processing occurs in the cerebral cortex, whose neural activity exhibits, for instance, correlates of predictive processing that emerge at delays of a few hundred milliseconds. However, the auditory pathways are also characterized by extensive feedback loops from higher-level cortical areas to lower-level ones as well as to subcortical structures. Early neural activity can therefore be influenced by higher-level cognitive processes, but it remains unclear whether such feedback contributes to linguistic processing. Here, we investigated early speech-evoked neural activity that emerges at the fundamental frequency. We analyzed EEG recordings obtained while subjects listened to a story read by a single speaker. We identified a response tracking the speaker's fundamental frequency that occurred at a delay of 11 ms, while another response, elicited by the high-frequency modulation of the envelope of higher harmonics, exhibited a larger magnitude and a longer latency of about 18 ms, with an additional significant component at around 40 ms. Notably, while the earlier components of the response likely originate from subcortical structures, the latter presumably involves contributions from cortical regions. Subsequently, we determined the magnitude of these early neural responses for each individual word in the story. We then quantified the context-independent frequency of each word and used a language model to compute context-dependent word surprisal and precision. Word surprisal represents how predictable a word is given the previous context, and word precision reflects the confidence about predicting the next word from the past context. We found that the word-level neural responses at the fundamental frequency were predominantly influenced by the acoustic features: the average fundamental frequency and its variability. Amongst the linguistic features, only context-independent word frequency showed a weak but significant modulation of the neural response to the high-frequency envelope modulation. Our results show that the early neural response at the fundamental frequency is already influenced by acoustic as well as linguistic information, suggesting top-down modulation of this neural response.
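Context-dependent word surprisal as used above is the negative log probability a language model assigns to each word given its preceding context. A sketch with a generic causal language model from the Hugging Face transformers library; GPT-2 is an illustrative stand-in, not the model used in the paper.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def token_surprisal(text):
    """Surprisal (-log2 p) of each token given all preceding tokens."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits
    # logits at position t predict token t+1: shift to align with targets
    logp = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    s = -logp[torch.arange(len(targets)), targets] / math.log(2)
    return list(zip(tok.convert_ids_to_tokens(targets), s.tolist()))
```

Word-level surprisal would then be the sum of a word's sub-word token surprisals; word precision could additionally be derived from the entropy of the model's predictive distribution.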
Affiliation(s)
- Mikolaj Kegler: Department of Bioengineering, Centre for Neurotechnology, Imperial College London, London, United Kingdom
- Hugo Weissbart: Donders Centre for Cognitive Neuroimaging, Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Tobias Reichenbach (corresponding author): Department of Bioengineering, Centre for Neurotechnology, Imperial College London, London, United Kingdom; Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany