1. Gnanateja GN, Rupp K, Llanos F, Hect J, German JS, Teichert T, Abel TJ, Chandrasekaran B. Cortical processing of discrete prosodic patterns in continuous speech. Nat Commun 2025;16:1947. PMID: 40032850; PMCID: PMC11876672; DOI: 10.1038/s41467-025-56779-w.
Abstract
Prosody has a vital function in speech, structuring a speaker's intended message for the listener. The superior temporal gyrus (STG) is considered a critical hub for prosody, but the role of earlier auditory regions such as Heschl's gyrus (HG), associated with pitch processing, remains unclear. Using intracerebral recordings in humans and a non-human primate model, we investigated prosody processing in narrative speech, focusing on pitch accents: abstract phonological units that signal word prominence and communicative intent. In humans, HG encoded pitch accents as abstract representations beyond spectrotemporal features, distinct from segmental speech processing, and outperformed STG in disambiguating pitch accents. Multivariate models confirmed HG's unique representation of pitch accent categories. In the non-human primate, pitch accents were not abstractly encoded despite robust spectrotemporal processing, highlighting the role of experience in shaping abstract representations. These findings emphasize a key role for HG in early prosodic abstraction and advance our understanding of human speech processing.
Affiliation(s)
- G Nike Gnanateja: Speech Processing and Auditory Neuroscience Lab, Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, USA
- Kyle Rupp: Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- Fernando Llanos: UT Austin Neurolinguistics Lab, Department of Linguistics, The University of Texas at Austin, Austin, TX, USA
- Jasmine Hect: Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- James S German: Aix-Marseille University, CNRS, LPL, Aix-en-Provence, France
- Tobias Teichert: Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Taylor J Abel: Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA; Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA
- Bharath Chandrasekaran: Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA; Roxelyn and Richard Pepper Department of Communication Sciences & Disorders, Northwestern University, Evanston, IL, USA; Knowles Hearing Center, Evanston, IL 60208, USA
2. Llanos F, Stump T, Crowhurst M. Investigating the Neural Basis of the Loud-first Principle of the Iambic-Trochaic Law. J Cogn Neurosci 2025;37:14-27. PMID: 39231274; DOI: 10.1162/jocn_a_02241.
Abstract
The perception of rhythmic patterns is crucial for the recognition of words in spoken languages, yet it remains unclear how these patterns are represented in the brain. Here, we tested the hypothesis that rhythmic patterns are encoded by neural activity phase-locked to the temporal modulation of these patterns in the speech signal. To do so, we analyzed EEGs evoked by long sequences of alternating syllables acoustically manipulated to be perceived as a series of different rhythmic groupings in English. We found that the magnitude of the EEG at the syllable and grouping rates of each sequence was significantly higher than the noise baseline, indicating that the neural parsing of syllables and rhythmic groupings operates at different timescales. Distributional differences between the scalp topographies associated with each timescale suggest a further mechanistic dissociation between the neural segmentation of syllables and groupings. In addition, we observed that the neural tracking of louder syllables, which in trochaic languages like English are associated with the beginning of rhythmic groupings, was more robust than the neural tracking of softer syllables. The results of further bootstrapping and brain-behavior analyses indicate that the perception of rhythmic patterns is modulated by the magnitude of grouping alternations in the neural signal. These findings suggest that the temporal coding of rhythmic patterns in stress-based languages like English is supported by temporal regularities that are linguistically relevant in the speech signal.
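The magnitude comparison described in this abstract is a frequency-tagging analysis: the EEG spectrum is evaluated at the known syllable and grouping rates and compared against neighboring frequency bins as a noise baseline. A minimal sketch of that comparison in Python (NumPy), with synthetic data and hypothetical rates standing in for the study's actual stimuli and bootstrapping procedure:

import numpy as np

def tagged_magnitude(eeg, fs, target_hz, n_neighbors=10):
    # Spectral magnitude at the tagged frequency vs. a local noise baseline
    # formed by neighboring bins (skipping the immediately adjacent bins).
    n = len(eeg)
    spectrum = np.abs(np.fft.rfft(eeg)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - target_hz)))
    neighbors = np.r_[k - n_neighbors:k - 1, k + 2:k + n_neighbors + 1]
    return spectrum[k], spectrum[neighbors].mean()

# Hypothetical example: a 4 Hz syllable rate and a 2 Hz grouping rate tagged
# in 60 s of synthetic "EEG" sampled at 250 Hz.
fs = 250
t = np.arange(fs * 60) / fs
eeg = (0.5 * np.sin(2 * np.pi * 4 * t) + 0.3 * np.sin(2 * np.pi * 2 * t)
       + np.random.randn(len(t)))
for rate in (4.0, 2.0):
    sig, base = tagged_magnitude(eeg, fs, rate)
    print(f"{rate} Hz: magnitude {sig:.3f} vs. baseline {base:.3f}")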
3. Naeije G, Niesen M, Vander Ghinst M, Bourguignon M. Simultaneous EEG recording of cortical tracking of speech and movement kinematics. Neuroscience 2024;561:1-10. PMID: 39395635; DOI: 10.1016/j.neuroscience.2024.10.013.
Abstract
RATIONALE: Cortical activity is coupled with streams of sensory stimulation. The coupling with the temporal envelope of heard speech is known as cortical tracking of speech (CTS), and the coupling with movement kinematics is known as corticokinematic coupling (CKC). Simultaneous measurement of both couplings is desirable in clinical settings, but it is unknown whether the inherent dual-task condition affects CTS or CKC.
AIM: To determine whether and how CTS and CKC levels are affected when recorded simultaneously.
METHODS: Twenty-three healthy young adults underwent 64-channel EEG recordings while listening to stories and performing repetitive finger-tapping movements in three conditions: separately (audio-only or tapping-only) or simultaneously (audio-tapping). CTS and CKC values were estimated using coherence analysis between each EEG signal and the speech temporal envelope (CTS) or finger acceleration (CKC). CTS was also estimated as the reconstruction accuracy of a decoding model.
RESULTS: Across recordings, CTS assessed with reconstruction accuracy was significant in 85% of the subjects at the phrasal frequency (0.5 Hz) and in 68% at syllabic frequencies (4-8 Hz), and CKC was significant in over 85% of the subjects at the movement frequency and its first harmonic. Comparing CTS and CKC values from separate recordings with those from simultaneous recordings revealed no significant difference and moderate-to-high levels of correlation.
CONCLUSION: Despite subtle behavioral effects, CTS and CKC are not evidently altered by the dual-task setting inherent to recording them simultaneously and can be evaluated together using EEG in clinical settings.
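Both couplings reduce to the same computation: magnitude-squared coherence between an EEG channel and a reference time series, the speech envelope for CTS or finger acceleration for CKC. A minimal sketch using SciPy's Welch-based coherence estimate; the sampling rate, window length, and synthetic signals are illustrative assumptions, not the study's parameters:

import numpy as np
from scipy.signal import coherence, hilbert

def cortical_coupling(eeg, reference, fs, seg_len_s=2.0):
    # Magnitude-squared coherence between one EEG channel and a reference
    # signal (speech envelope for CTS, finger acceleration for CKC).
    # The reference must be resampled to the EEG rate beforehand.
    f, coh = coherence(eeg, reference, fs=fs, nperseg=int(seg_len_s * fs))
    return f, coh

# Hypothetical usage with a synthetic envelope (magnitude of the analytic
# signal of white noise) weakly embedded in simulated EEG.
fs = 200
envelope = np.abs(hilbert(np.random.randn(fs * 120)))
eeg = 0.1 * envelope + np.random.randn(len(envelope))
f, coh = cortical_coupling(eeg, envelope, fs)
print("coherence at 0.5 Hz:", coh[np.argmin(np.abs(f - 0.5))])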
Affiliation(s)
- Gilles Naeije: Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Centre de Référence Neuromusculaire, Department of Neurology, HUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Maxime Niesen: Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Service d'ORL et de chirurgie cervico-faciale, HUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Marc Vander Ghinst: Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Service d'ORL et de chirurgie cervico-faciale, HUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Mathieu Bourguignon: Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Laboratory of Neurophysiology and Movement Biomechanics, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
4. Keding O, Alickovic E, Skoglund MA, Sandsten M. Novel bias-reduced coherence measure for EEG-based speech tracking in listeners with hearing impairment. Front Neurosci 2024;18:1415397. PMID: 39568664; PMCID: PMC11577966; DOI: 10.3389/fnins.2024.1415397.
Abstract
In the literature, auditory attention is explored through neural speech tracking, primarily by modeling and analyzing electroencephalography (EEG) responses to natural speech via linear filtering. Our study takes a novel approach, introducing an enhanced coherence estimation technique to assess the strength of neural speech tracking, which enables effective discrimination between attended and ignored speech. To mitigate the impact of colored noise in EEG, we address two biases: overall coherence-level bias and spectral peak-shifting bias. In a listening study involving 32 participants with hearing impairment, tasked with attending to competing talkers in background noise, our coherence-based method effectively discerns EEG representations of attended and ignored speech. We comprehensively analyze frequency bands, individual frequencies, and EEG channels. The delta, theta, and alpha bands prove most important, and the central EEG channels carry the strongest effects. Lastly, we showcase coherence differences across different noise reduction settings implemented in hearing aids (HAs), underscoring our method's potential to objectively assess auditory attention and enhance HA efficacy.
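As background to the overall coherence-level bias mentioned above: a Welch coherence estimate averaged over L independent segments has an expected value of roughly 1/L even for unrelated signals, so raw coherence is inflated for short recordings. The sketch below applies that crude first-order correction; it illustrates the problem being addressed, not the bias-reduced estimator proposed in the paper:

import numpy as np
from scipy.signal import coherence

def bias_corrected_coherence(x, y, fs, seg_len_s=2.0):
    # Welch coherence over L non-overlapping segments, minus the ~1/L level
    # expected for unrelated signals; clipped at zero. A crude first-order
    # correction for illustration only.
    nperseg = int(seg_len_s * fs)
    n_segments = len(x) // nperseg
    f, coh = coherence(x, y, fs=fs, nperseg=nperseg, noverlap=0)
    return f, np.clip(coh - 1.0 / n_segments, 0.0, None)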
Affiliation(s)
- Oskar Keding: Centre for Mathematical Sciences, Lund University, Lund, Sweden
- Emina Alickovic: Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark; Department of Electrical Engineering, Linköping University, Linköping, Sweden
- Martin A Skoglund: Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark; Department of Electrical Engineering, Linköping University, Linköping, Sweden
- Maria Sandsten: Centre for Mathematical Sciences, Lund University, Lund, Sweden
5. Brilliant, Yaar-Soffer Y, Herrmann CS, Henkin Y, Kral A. Theta and alpha oscillatory signatures of auditory sensory and cognitive loads during complex listening. Neuroimage 2024;289:120546. PMID: 38387743; DOI: 10.1016/j.neuroimage.2024.120546.
Abstract
The neuronal signatures of sensory and cognitive load provide access to brain activities related to complex listening situations. Sensory and cognitive loads are typically reflected in measures like response time (RT) and event-related potential (ERP) components. It is difficult, however, to distinguish the underlying brain processes from these measures alone. In this study, along with RT and ERP analyses, we performed time-frequency analysis and source localization of oscillatory activity in participants performing two auditory tasks with varying degrees of complexity, and related the results to sensory and cognitive load. We studied neuronal oscillatory activity both before the behavioral response (pre-response) and after it (post-response). Robust oscillatory activities were found in both periods and were differentially affected by sensory and cognitive load. Oscillatory activity under sensory load was characterized by a decrease in pre-response (early) theta activity and an increase in alpha activity. Oscillatory activity under cognitive load was characterized by increased theta activity, mainly in the post-response (late) period. Furthermore, source localization revealed specific brain regions responsible for processing these loads, including the temporal and frontal lobes, cingulate cortex, and precuneus. The results provide evidence that in complex listening situations the brain processes sensory and cognitive loads differently. These neural processes have specific oscillatory signatures and are long-lasting, extending beyond the behavioral response.
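Time-frequency analysis of the kind used here is typically done by convolving the EEG with complex Morlet wavelets and taking the squared magnitude as power. A minimal sketch, with the sampling rate, frequency grid, and cycle count chosen for illustration rather than taken from the study:

import numpy as np

def morlet_power(signal, fs, freqs, n_cycles=7):
    # Time-frequency power: convolve with complex Morlet wavelets and take
    # the squared magnitude of the result.
    power = np.empty((len(freqs), len(signal)))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2 * np.pi * f)       # temporal width of the wavelet
        t = np.arange(-4 * sigma_t, 4 * sigma_t, 1.0 / fs)
        wavelet = np.exp(2j * np.pi * f * t - t**2 / (2 * sigma_t**2))
        wavelet /= np.sum(np.abs(wavelet))         # rough amplitude normalization
        power[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return power

# Hypothetical single-trial usage: theta (4-8 Hz) power in a 5 s epoch.
fs = 250
eeg = np.random.randn(fs * 5)
tf = morlet_power(eeg, fs, freqs=np.arange(4.0, 13.0))
theta_power = tf[:5].mean(axis=0)                  # rows for 4-8 Hz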
Affiliation(s)
- Brilliant: Department of Experimental Otology, Hannover Medical School, 30625 Hannover, Germany
- Y Yaar-Soffer: Department of Communication Disorders, Tel Aviv University, 5262657 Tel Aviv, Israel; Hearing, Speech and Language Center, Sheba Medical Center, 5265601 Tel Hashomer, Israel
- C S Herrmann: Experimental Psychology Division, University of Oldenburg, 26111 Oldenburg, Germany
- Y Henkin: Department of Communication Disorders, Tel Aviv University, 5262657 Tel Aviv, Israel; Hearing, Speech and Language Center, Sheba Medical Center, 5265601 Tel Hashomer, Israel
- A Kral: Department of Experimental Otology, Hannover Medical School, 30625 Hannover, Germany
6. Park S, Park KH, Han W. The Effects of Music-Based Auditory Training on Hearing-Impaired Older Adults With Mild Cognitive Impairment. Clin Exp Otorhinolaryngol 2024;17:26-36. PMID: 38062716; PMCID: PMC10933806; DOI: 10.21053/ceo.2023.00815.
Abstract
OBJECTIVES: The present study aimed to determine the effect of music-based auditory training on older adults with hearing loss and decreased cognitive ability, two common conditions in the older population.
METHODS: In total, 20 older adults diagnosed with both mild-to-moderately severe hearing loss and mild cognitive impairment (MCI) participated. Half were randomly assigned to the auditory training group (ATG) and half to the control group (CG). For the ATG, a 40-minute training session (10 minutes of singing a song, 15 minutes of playing instruments, and 15 minutes of playing music-discrimination games) was conducted twice a week for 8 weeks (16 sessions in total). To confirm the training effects, all participants were given tests pre- and post-training, plus a follow-up test 2 weeks after the training, using various auditory and cognitive tests and a self-reporting questionnaire.
RESULTS: The ATG demonstrated significant improvement in all auditory test scores compared to the CG. There was also a notable enhancement in cognitive test scores post-training, except for the digit span tests. However, there was no statistically significant difference in questionnaire scores between the two groups, although the ATG did score higher post-training.
CONCLUSION: The music-based auditory training produced a significant improvement in auditory function and a partial enhancement of cognitive ability in elderly patients with hearing loss and MCI. We anticipate that this music-based approach will be adopted for auditory training in clinical settings due to its engaging and easy-to-follow nature.
Affiliation(s)
- Sihun Park: Division of Speech Pathology and Audiology, College of Natural Sciences, Hallym University, Chuncheon, Korea; Laboratory of Hearing and Technology, Research Institute of Audiology and Speech Pathology, College of Natural Sciences, Hallym University, Chuncheon, Korea
- Kyoung Ho Park: Department of Otolaryngology-Head and Neck Surgery, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Woojae Han: Division of Speech Pathology and Audiology, College of Natural Sciences, Hallym University, Chuncheon, Korea; Laboratory of Hearing and Technology, Research Institute of Audiology and Speech Pathology, College of Natural Sciences, Hallym University, Chuncheon, Korea
7. MacIntyre AD, Carlyon RP, Goehring T. Neural Decoding of the Speech Envelope: Effects of Intelligibility and Spectral Degradation. Trends Hear 2024;28:23312165241266316. PMID: 39183533; PMCID: PMC11345737; DOI: 10.1177/23312165241266316.
Abstract
During continuous speech perception, endogenous neural activity becomes time-locked to acoustic stimulus features, such as the speech amplitude envelope. This speech-brain coupling can be decoded using non-invasive brain imaging techniques, including electroencephalography (EEG). Neural decoding may have clinical use as an objective measure of stimulus encoding by the brain, for example during cochlear implant listening, wherein the speech signal is severely spectrally degraded. Yet interplay between acoustic and linguistic factors may lead to top-down modulation of perception, complicating audiological applications. To address this ambiguity, we assess neural decoding of the speech envelope under spectral degradation with EEG in acoustically hearing listeners (n = 38; 18-35 years old) using vocoded speech. We dissociate sensory encoding from higher-order processing by employing intelligible (English) and non-intelligible (Dutch) stimuli, with auditory attention sustained using a repeated-phrase detection task. Subject-specific and group decoders were trained to reconstruct the speech envelope from held-out EEG data, with decoder significance determined via random permutation testing. Whereas speech envelope reconstruction did not vary by spectral resolution, intelligible speech was associated with better decoding accuracy in general. Results were similar across subject-specific and group analyses, with less consistent effects of spectral degradation in group decoding. Permutation tests revealed possible differences in decoder statistical significance by experimental condition. In general, while robust neural decoding was observed at the individual and group level, within-participant variability would most likely prevent the clinical use of such a measure to differentiate levels of spectral degradation and intelligibility on an individual basis.
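The decoding pipeline described here, reconstructing the speech envelope from lagged EEG with significance from permutation testing, can be sketched as a ridge-regression backward model. Everything below, including the regularization constant, lag range, and circular-shift null, is a simplified illustration; the paper's subject-specific and group decoders, held-out evaluation, and exact permutation scheme are not reproduced:

import numpy as np

def lagged_design(eeg, max_lag):
    # Stack time-lagged copies of the EEG (channels x time) so that each row
    # of X holds the samples used to reconstruct the envelope at that time.
    n_ch, n_t = eeg.shape
    X = np.zeros((n_t, n_ch * (max_lag + 1)))
    for lag in range(max_lag + 1):
        X[lag:, lag * n_ch:(lag + 1) * n_ch] = eeg[:, :n_t - lag].T
    return X

def train_decoder(eeg, envelope, max_lag, lam=1e3):
    # Ridge-regression backward model mapping lagged EEG to the envelope.
    X = lagged_design(eeg, max_lag)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)

def accuracy(eeg, envelope, w, max_lag):
    # Reconstruction accuracy: Pearson correlation between the decoded
    # and actual envelopes.
    return np.corrcoef(lagged_design(eeg, max_lag) @ w, envelope)[0, 1]

def permutation_p(eeg, envelope, w, max_lag, n_perm=1000, seed=0):
    # Null distribution from circularly shifted envelopes, standing in for
    # the paper's random permutation testing.
    rng = np.random.default_rng(seed)
    obs = accuracy(eeg, envelope, w, max_lag)
    null = np.array([accuracy(eeg, np.roll(envelope, rng.integers(1, len(envelope))),
                              w, max_lag) for _ in range(n_perm)])
    return obs, (np.sum(null >= obs) + 1) / (n_perm + 1)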
Affiliation(s)
- Robert P. Carlyon: MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Tobias Goehring: MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
8. Quique YM, Gnanateja GN, Dickey MW, Evans WS, Chandrasekaran B. Examining cortical tracking of the speech envelope in post-stroke aphasia. Front Hum Neurosci 2023;17:1122480. PMID: 37780966; PMCID: PMC10538638; DOI: 10.3389/fnhum.2023.1122480.
Abstract
Introduction: People with aphasia have been shown to benefit from rhythmic elements for language production during aphasia rehabilitation. However, it is unknown whether rhythmic processing is associated with such benefits. Cortical tracking of the speech envelope (CTenv) may provide a measure of the encoding of speech rhythmic properties and serve as a predictor of candidacy for rhythm-based aphasia interventions.
Methods: Electroencephalography was used to capture electrophysiological responses while Spanish speakers with aphasia (n = 9) listened to a continuous speech narrative (audiobook). The temporal response function was used to estimate CTenv in the delta band (associated with word- and phrase-level properties), the theta band (syllable-level properties), and the alpha band (attention-related properties). CTenv estimates were used to predict aphasia severity, performance in rhythmic perception and production tasks, and treatment response in a sentence-level rhythm-based intervention.
Results: CTenv in the delta and theta bands, but not the alpha band, predicted aphasia severity. CTenv in none of the three bands predicted performance in rhythmic perception or production tasks. Some evidence supported that CTenv in theta could predict sentence-level learning in aphasia, but delta and alpha did not.
Conclusion: CTenv of syllable-level properties was relatively preserved in individuals with less language impairment. In contrast, encoding of higher-level word- and phrase-level properties was relatively impaired and was predictive of more severe language impairments. The relationship between CTenv and treatment response to sentence-level rhythm-based interventions needs further investigation.
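The temporal response function used here is, in its simplest form, a regularized linear mapping from the lagged speech envelope to the EEG, estimated separately for band-filtered signals (delta, theta, alpha). A minimal single-channel sketch, with hypothetical filter settings and regularization; published toolboxes such as the mTRF-Toolbox implement the full multivariate, cross-validated version:

import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(x, fs, lo, hi, order=4):
    b, a = butter(order, [lo, hi], btype="bandpass", fs=fs)
    return filtfilt(b, a, x)

def estimate_trf(envelope, eeg_channel, fs, tmin=-0.1, tmax=0.4, lam=1e2):
    # Ridge regression from the lagged stimulus envelope to one EEG channel.
    # np.roll gives a circular shift; edge effects are ignored in this sketch.
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    X = np.stack([np.roll(envelope, lag) for lag in lags], axis=1)
    w = np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ eeg_channel)
    return lags / fs, w

# Hypothetical band-specific use: delta-band (1-4 Hz) CTenv for one channel.
fs = 128
envelope = np.abs(np.random.randn(fs * 300))   # stand-in for a narrative envelope
eeg_channel = np.random.randn(fs * 300)
times, trf_delta = estimate_trf(bandpass(envelope, fs, 1, 4),
                                bandpass(eeg_channel, fs, 1, 4), fs)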
Affiliation(s)
- Yina M. Quique: Center for Education in Health Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
- G. Nike Gnanateja: Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, United States
- Michael Walsh Dickey: VA Pittsburgh Healthcare System, Pittsburgh, PA, United States; Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Bharath Chandrasekaran: Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA, United States; Roxelyn and Richard Pepper Department of Communication Science and Disorders, School of Communication, Northwestern University, Evanston, IL, United States
9. Mohammadi Y, Graversen C, Østergaard J, Andersen OK, Reichenbach T. Phase-locking of Neural Activity to the Envelope of Speech in the Delta Frequency Band Reflects Differences between Word Lists and Sentences. J Cogn Neurosci 2023;35:1301-1311. DOI: 10.1162/jocn_a_02016.
Abstract
The envelope of a speech signal is tracked by neural activity in the cerebral cortex. This cortical tracking occurs mainly in two frequency bands, theta (4-8 Hz) and delta (1-4 Hz). Tracking in the faster theta band has mostly been associated with lower-level acoustic processing, such as the parsing of syllables, whereas the slower tracking in the delta band relates to higher-level linguistic information about words and word sequences. However, much regarding the specific associations between cortical tracking and acoustic as well as linguistic processing remains to be uncovered. Here, we recorded EEG responses to both meaningful sentences and random word lists at different signal-to-noise ratios (SNRs), leading to different levels of speech comprehension and listening effort. We then related the neural signals to the acoustic stimuli by computing the phase-locking value (PLV) between the EEG recordings and the speech envelope. We found that the PLV in the delta band increases with increasing SNR for sentences but not for random word lists, showing that the PLV in this frequency band reflects linguistic information. When attempting to disentangle the effects of SNR, speech comprehension, and listening effort, we observed a trend for the PLV in the delta band to reflect listening effort rather than the other two variables, although the effect was not statistically significant. In summary, our study shows that the PLV in the delta band reflects linguistic information and might be related to listening effort.
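The phase-locking value between EEG and speech envelope is the magnitude of the time-averaged unit phasor of the instantaneous phase difference, PLV = |(1/N) * sum_n exp(i * (phi_EEG[n] - phi_env[n]))|. A minimal delta-band sketch, with the filter order, band edges, and synthetic signals as illustrative assumptions:

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def plv(eeg, envelope, fs, band=(1.0, 4.0)):
    # PLV = |mean over time of exp(i * (phase_eeg - phase_env))|, computed
    # here for the delta band via Hilbert-transform instantaneous phase.
    b, a = butter(4, band, btype="bandpass", fs=fs)
    phi_eeg = np.angle(hilbert(filtfilt(b, a, eeg)))
    phi_env = np.angle(hilbert(filtfilt(b, a, envelope)))
    return np.abs(np.mean(np.exp(1j * (phi_eeg - phi_env))))

# Hypothetical usage: delta-band PLV for one channel and one 10 s stimulus.
fs = 250
eeg, envelope = np.random.randn(fs * 10), np.abs(np.random.randn(fs * 10))
print("delta PLV:", plv(eeg, envelope, fs))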
10. Baese-Berk MM, Chandrasekaran B, Roark CL. The nature of non-native speech sound representations. J Acoust Soc Am 2022;152:3025. PMID: 36456300; PMCID: PMC9671621; DOI: 10.1121/10.0015230.
Abstract
Most current theories and models of second language speech perception are grounded in the notion that learners acquire speech sound categories in their target language. In this paper, this classic idea in speech perception is revisited, given that clear evidence for formation of such categories is lacking in previous research. To understand the debate on the nature of speech sound representations in a second language, an operational definition of "category" is presented, and the issues of categorical perception and current theories of second language learning are reviewed. Following this, behavioral and neuroimaging evidence for and against acquisition of categorical representations is described. Finally, recommendations for future work are discussed. The paper concludes with a recommendation for integration of behavioral and neuroimaging work and theory in this area.
Affiliation(s)
- Bharath Chandrasekaran: Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
- Casey L Roark: Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
11. Zinszer BD, Yuan Q, Zhang Z, Chandrasekaran B, Guo T. Continuous speech tracking in bilinguals reflects adaptation to both language and noise. Brain Lang 2022;230:105128. PMID: 35537247; DOI: 10.1016/j.bandl.2022.105128.
Abstract
Listeners regularly comprehend continuous speech despite noisy conditions. Previous studies show that neural tracking of speech degrades under noise, predicts comprehension, and increases for non-native listeners. We tested the hypothesis that listeners similarly increase tracking for both L2 and noisy L1 speech, after adjusting for comprehension. Twenty-four Chinese-English bilinguals underwent EEG while they listened to one hour of an audiobook in Mandarin and English, mixed with three levels of noise, and answered comprehension questions. We estimated tracking of the speech envelope in the EEG for each one-minute segment using the multivariate temporal response function (mTRF). Contrary to our prediction, L2 tracking was significantly lower than L1 tracking, while L1 tracking significantly increased with noise maskers without reduced comprehension. However, greater L2 proficiency was positively associated with greater L2 tracking. We discuss how studies of speech envelope tracking using noise and bilingualism might be reconciled through a focus on exerted rather than demanded effort.
Affiliation(s)
- Qiming Yuan: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, China
- Zhaoqi Zhang: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, China
- Taomei Guo: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, China
12. Gnanateja GN, Devaraju DS, Heyne M, Quique YM, Sitek KR, Tardif MC, Tessmer R, Dial HR. On the Role of Neural Oscillations Across Timescales in Speech and Music Processing. Front Comput Neurosci 2022;16:872093. PMID: 35814348; PMCID: PMC9260496; DOI: 10.3389/fncom.2022.872093.
Abstract
This mini review is aimed at a clinician-scientist seeking to understand the role of oscillations in neural processing and their functional relevance in speech and music perception. We present an overview of neural oscillations, methods used to study them, and their functional relevance with respect to music processing, aging, hearing loss, and disorders affecting speech and language. We first review the oscillatory frequency bands and their associations with speech and music processing. Next we describe commonly used metrics for quantifying neural oscillations, briefly touching upon the still-debated mechanisms underpinning oscillatory alignment. Following this, we highlight key findings from research on neural oscillations in speech and music perception, as well as contributions of this work to our understanding of disordered perception in clinical populations. Finally, we conclude with a look toward the future of oscillatory research in speech and music perception, including promising methods and potential avenues for future work. We note that the intention of this mini review is not to systematically review all literature on cortical tracking of speech and music. Rather, we seek to provide the clinician-scientist with foundational information that can be used to evaluate and design research studies targeting the functional role of oscillations in speech and music processing in typical and clinical populations.
Affiliation(s)
- G. Nike Gnanateja: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Dhatri S. Devaraju: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Matthias Heyne: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Yina M. Quique: Center for Education in Health Sciences, Northwestern University, Chicago, IL, United States
- Kevin R. Sitek: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Monique C. Tardif: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Rachel Tessmer: Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin, Austin, TX, United States
- Heather R. Dial: Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin, Austin, TX, United States; Department of Communication Sciences and Disorders, University of Houston, Houston, TX, United States
13. Muncke J, Kuruvila I, Hoppe U. Prediction of Speech Intelligibility by Means of EEG Responses to Sentences in Noise. Front Neurosci 2022;16:876421. PMID: 35720724; PMCID: PMC9198593; DOI: 10.3389/fnins.2022.876421.
Abstract
Objective: Understanding speech in noisy conditions is challenging even for people with mild hearing loss, and intelligibility for an individual is usually evaluated using several subjective test methods. In the last few years, a method has been developed to determine a temporal response function (TRF) between the speech envelope and simultaneous electroencephalographic (EEG) measurements. With this TRF it is possible to predict the EEG signal for any speech signal. Recent studies have suggested that the accuracy of this prediction varies with the level of noise added to the speech signal and can objectively predict individual speech intelligibility. Here we assess variations in the TRF itself when it is calculated from measurements at different signal-to-noise ratios, and apply these variations to predict speech intelligibility.
Methods: For 18 normal-hearing subjects, the individual threshold of 50% speech intelligibility was determined using a speech-in-noise test. Additionally, subjects listened passively to the speech material of the test at different signal-to-noise ratios close to their individual 50% intelligibility threshold while an EEG was recorded. Afterwards, the shape of the TRF for each signal-to-noise ratio and subject was compared with the derived intelligibility.
Results: The strongest effect of variations in stimulus signal-to-noise ratio on TRF shape occurred close to 100 ms after stimulus presentation and was located in the left central scalp region. The investigated variations in TRF morphology showed a strong correlation with speech intelligibility, and we were able to predict the individual threshold of 50% speech intelligibility with a mean deviation of less than 1.5 dB.
Conclusion: The intelligibility of speech in noise can be predicted by analyzing the shape of the TRF derived from different stimulus signal-to-noise ratios. Because TRFs are interpretable, in a manner similar to auditory evoked potentials, this method offers new options for clinical diagnostics.
Affiliation(s)
- Jan Muncke: Department of Audiology, ENT-Clinic, University Hospital Erlangen, Erlangen, Germany
- Ivine Kuruvila: Department of Audiology, ENT-Clinic, University Hospital Erlangen, Erlangen, Germany; WS Audiology, Erlangen, Germany
- Ulrich Hoppe: Department of Audiology, ENT-Clinic, University Hospital Erlangen, Erlangen, Germany
14. Devaraju DS, Kemp A, Eddins DA, Shrivastav R, Chandrasekaran B, Hampton Wray A. Effects of Task Demands on Neural Correlates of Acoustic and Semantic Processing in Challenging Listening Conditions. J Speech Lang Hear Res 2021;64:3697-3706. PMID: 34403278; DOI: 10.1044/2021_jslhr-21-00006.
Abstract
Purpose: Listeners shift their listening strategies between lower-level acoustic information and higher-level semantic information to prioritize maximum speech intelligibility in challenging listening conditions. Although increasing task demands via acoustic degradation modulates lexical-semantic processing, the neural mechanisms underlying different listening strategies are unclear. The current study examined the extent to which encoding of lower-level acoustic cues is modulated by task demand, and its associations with lexical-semantic processes.
Method: Electroencephalography was acquired while participants listened to sentences in the presence of four-talker babble that contained either higher or lower probability final words. Task difficulty was modulated by the time available to process responses. Cortical tracking of speech, a neural correlate of acoustic temporal envelope processing, was estimated using temporal response functions.
Results: Task difficulty did not affect cortical tracking of the temporal envelope of speech under challenging listening conditions. Neural indices of lexical-semantic processing (N400 amplitudes) were larger with increased task difficulty. No correlations were observed between cortical tracking of the temporal envelope of speech and lexical-semantic processes, even after controlling for the effect of individualized signal-to-noise ratios.
Conclusions: Cortical tracking of the temporal envelope of speech and semantic processing are differentially influenced by task difficulty. While increased task demands modulated higher-level semantic processing, cortical tracking of the temporal envelope of speech may be influenced by task difficulty primarily when the demand is manipulated in terms of the acoustic properties of the stimulus, consistent with an emerging perspective in speech perception.
Affiliation(s)
- Dhatri S Devaraju: Department of Communication Science and Disorders, University of Pittsburgh, PA
- Amy Kemp: Department of Communication Sciences and Special Education, University of Georgia, Athens
- David A Eddins: Department of Communication Sciences & Disorders, University of South Florida, Tampa
- Amanda Hampton Wray: Department of Communication Science and Disorders, University of Pittsburgh, PA
15. Llanos F, German JS, Gnanateja GN, Chandrasekaran B. The neural processing of pitch accents in continuous speech. Neuropsychologia 2021;158:107883. PMID: 33989647; DOI: 10.1016/j.neuropsychologia.2021.107883.
Abstract
Pitch accents are local pitch patterns that convey differences in word prominence and modulate the information structure of the discourse. Despite their importance to discourse in languages like English, the neural processing of pitch accents remains understudied. The current study investigates the neural processing of pitch accents by native and non-native English speakers as they listen to or ignore 45 min of continuous, natural speech. Leveraging an approach used to study phonemes in natural speech, we analyzed thousands of electroencephalography (EEG) segments time-locked to pitch accents in a prosodic transcription. The optimal neural discrimination between pitch accent categories emerged at latencies between 100 and 200 ms. During these latencies, we found a strong structural alignment between neural and phonetic representations of pitch accent categories. In the same latencies, native listeners exhibited more robust processing of pitch accent contrasts than non-native listeners. However, these group differences attenuated when the speech signal was ignored. We can reliably capture the neural processing of discrete and contrastive pitch accent categories in continuous speech. Our analytic approach also captures how language-specific knowledge and selective attention influence the neural processing of pitch accent categories.
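The segment-level analysis described here rests on two steps: extracting EEG epochs time-locked to pitch-accent onsets from a prosodic transcription, and measuring how well the accent categories can be discriminated from those epochs. A minimal sketch assuming a channels-by-time EEG array and onset indices already aligned to the recording; the classifier, analysis window, and cross-validation below are illustrative stand-ins for the paper's actual analysis:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def epochs_at_events(eeg, onsets, fs, tmin=0.0, tmax=0.3):
    # Extract EEG segments (channels x time) time-locked to pitch-accent
    # onsets taken from a prosodic transcription aligned to the recording.
    i0, i1 = int(tmin * fs), int(tmax * fs)
    return np.stack([eeg[:, s + i0:s + i1] for s in onsets
                     if s + i0 >= 0 and s + i1 <= eeg.shape[1]])

def decode_accents(epochs, labels, fs, window=(0.1, 0.2)):
    # Discriminate accent categories from the 100-200 ms window via
    # cross-validated logistic regression on flattened epochs.
    j0, j1 = int(window[0] * fs), int(window[1] * fs)
    X = epochs[:, :, j0:j1].reshape(len(epochs), -1)
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X, labels, cv=5).mean()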
Affiliation(s)
- Fernando Llanos: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA; Department of Linguistics, The University of Texas at Austin, Austin, TX, USA
- James S German: Aix-Marseille University, CNRS, LPL, Aix-en-Provence, France
- G Nike Gnanateja: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA
- Bharath Chandrasekaran: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA
16. Alickovic E, Ng EHN, Fiedler L, Santurette S, Innes-Brown H, Graversen C. Effects of Hearing Aid Noise Reduction on Early and Late Cortical Representations of Competing Talkers in Noise. Front Neurosci 2021;15:636060. PMID: 33841081; PMCID: PMC8032942; DOI: 10.3389/fnins.2021.636060.
Abstract
OBJECTIVES: Previous research using non-invasive (magnetoencephalography, MEG) and invasive (electrocorticography, ECoG) neural recordings has demonstrated the progressive and hierarchical representation and processing of complex multi-talker auditory scenes in the auditory cortex. Early responses (<85 ms) in primary-like areas appear to represent the individual talkers with almost equal fidelity and are independent of attention in normal-hearing (NH) listeners. However, late responses (>85 ms) in higher-order non-primary areas selectively represent the attended talker with significantly higher fidelity than unattended talkers in NH and hearing-impaired (HI) listeners. Motivated by these findings, the objective of this study was to investigate the effect of a noise reduction (NR) scheme in a commercial hearing aid (HA) on the representation of complex multi-talker auditory scenes in distinct hierarchical stages of the auditory cortex, using high-density electroencephalography (EEG).
DESIGN: We investigated early (<85 ms) and late (>85 ms) EEG responses recorded in 34 HI subjects fitted with HAs. The HA NR was either on or off while the participants listened to a complex auditory scene. Participants were instructed to attend to one of two simultaneous talkers in the foreground while multi-talker babble noise played in the background (+3 dB SNR). After each trial, a two-choice question about the content of the attended speech was presented.
RESULTS: Using a stimulus reconstruction approach, our results suggest that the attention-related enhancement of neural representations of the target and masker talkers in the foreground, as well as suppression of the background noise, is significantly affected by the NR scheme at distinct hierarchical stages. The NR scheme enhanced the representation of the foreground and of the entire acoustic scene in the early responses, an enhancement driven by better representation of the target speech. In the late responses, the target talker was selectively represented, and use of the NR scheme enhanced the representations of the target and masker speech in the foreground while suppressing the representation of the background noise. The EEG time window also had a significant effect on the strength of the cortical representations of the target and masker.
CONCLUSION: Together, our analyses of the early and late responses obtained from HI listeners support the existing view of hierarchical processing in the auditory cortex. Our findings demonstrate the benefits of an NR scheme on the representation of complex multi-talker auditory scenes in different areas of the auditory cortex in HI listeners.
Affiliation(s)
- Emina Alickovic: Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark; Department of Electrical Engineering, Linköping University, Linköping, Sweden
- Elaine Hoi Ning Ng: Centre for Applied Audiology Research, Oticon A/S, Smørum, Denmark; Department of Behavioral Sciences and Learning, Linköping University, Linköping, Sweden
- Lorenz Fiedler: Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark
- Sébastien Santurette: Centre for Applied Audiology Research, Oticon A/S, Smørum, Denmark; Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
17. Reetzke R, Gnanateja GN, Chandrasekaran B. Neural tracking of the speech envelope is differentially modulated by attention and language experience. Brain Lang 2021;213:104891. PMID: 33290877; PMCID: PMC7856208; DOI: 10.1016/j.bandl.2020.104891.
Abstract
The ability to selectively attend to a speech signal amid competing sounds is a significant challenge, especially for listeners trying to comprehend non-native speech. Attention is critical for directing neural processing resources to the most essential information. Here, neural tracking of the speech envelope of an English story narrative and cortical auditory evoked potentials (CAEPs) to non-speech stimuli were simultaneously assayed in native and non-native listeners of English. Although native listeners exhibited higher narrative comprehension accuracy, non-native listeners exhibited enhanced neural tracking of the speech envelope and heightened CAEP magnitudes. These results support an emerging view that although attention to a target speech signal enhances neural tracking of the speech envelope, this mechanism itself may not confer speech comprehension advantages. Our findings suggest that non-native listeners may engage neural attentional processes that enhance low-level acoustic features, regardless of whether the target signal contains speech or non-speech information.
Affiliation(s)
- Rachel Reetzke: Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, United States; Center for Autism and Related Disorders, Kennedy Krieger Institute, United States
- G Nike Gnanateja: Department of Communication Science and Disorders, University of Pittsburgh, United States
- Bharath Chandrasekaran: Department of Communication Science and Disorders, University of Pittsburgh, United States
18. Dial HR, Gnanateja GN, Tessmer RS, Gorno-Tempini ML, Chandrasekaran B, Henry ML. Cortical Tracking of the Speech Envelope in Logopenic Variant Primary Progressive Aphasia. Front Hum Neurosci 2021;14:597694. PMID: 33488371; PMCID: PMC7815818; DOI: 10.3389/fnhum.2020.597694.
Abstract
Logopenic variant primary progressive aphasia (lvPPA) is a neurodegenerative language disorder primarily characterized by impaired phonological processing. Sentence repetition and comprehension deficits are observed in lvPPA and linked to impaired phonological working memory, but recent evidence also implicates impaired speech perception. Currently, neural encoding of the speech envelope, which forms the scaffolding for perception, is not clearly understood in lvPPA. We leveraged recent analytical advances in electrophysiology to examine speech envelope encoding in lvPPA. We assessed cortical tracking of the speech envelope and in-task comprehension of two spoken narratives in individuals with lvPPA (n = 10) and age-matched controls (n = 10). Despite markedly reduced narrative comprehension relative to controls, individuals with lvPPA had increased cortical tracking of the speech envelope in theta oscillations, which track low-level features (e.g., syllables), but not in delta oscillations, which track speech units that unfold across a longer time scale (e.g., words, phrases, prosody). This neural signature was highly correlated across narratives. The results indicate an increased reliance on acoustic cues during speech encoding, which may reflect inefficient encoding of bottom-up speech cues, likely as a consequence of dysfunctional temporoparietal cortex.
Affiliation(s)
- Heather R. Dial: Aphasia Research and Treatment Lab, Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- G. Nike Gnanateja: SoundBrain Lab, Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Rachel S. Tessmer: Aphasia Research and Treatment Lab, Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Maria Luisa Gorno-Tempini: Language Neurobiology Laboratory, Department of Neurology, Memory and Aging Center, University of California, San Francisco, San Francisco, CA, United States
- Bharath Chandrasekaran: SoundBrain Lab, Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, United States; Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, United States
- Maya L. Henry: Aphasia Research and Treatment Lab, Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States; Department of Neurology, Dell Medical School, University of Texas at Austin, Austin, TX, United States