1. Sivridag F, Schürholz J, Hoehl S, Mani N. Children's cortical speech tracking in face-to-face and online video communication. Sci Rep 2025; 15:20134. PMID: 40542072. DOI: 10.1038/s41598-025-04778-8.
Abstract
In today's digital age, online video communication has become an important way for children to interact with their social partners, especially given the increased use of such tools during the pandemic. While previous studies suggest that children can learn and engage well in virtual settings, there is limited evidence examining the neural mechanisms supporting speech processing in face-to-face and video interactions. This study examines 5-year-old German-speaking children's cortical speech tracking (n = 29), a measure of how their brains process speech, in both scenarios. Our findings indicate comparable levels of cortical speech tracking in both conditions, albeit with subtle differences. This implies that children exhibit similar neural responses to speech in both situations and may adopt different strategies to overcome potential challenges in video communication. These neural results align with previous behavioural findings, supporting the notion that live online video interactions can serve as an effective communication medium for children.
Affiliation(s)
- Fatih Sivridag
  - University of Göttingen, Psychology of Language, 37073, Göttingen, Germany
  - Leibniz Science Campus Primate Cognition, 37077, Göttingen, Germany
- Josefine Schürholz
  - University of Göttingen, Psychology of Language, 37073, Göttingen, Germany
- Stefanie Hoehl
  - University of Vienna, Faculty of Psychology, 1010, Vienna, Austria
- Nivedita Mani
  - University of Göttingen, Psychology of Language, 37073, Göttingen, Germany
  - Leibniz Science Campus Primate Cognition, 37077, Göttingen, Germany

2. Gagné N, Greenlaw KM, Coffey EBJ. Sound degradation type differentially affects neural indicators of cognitive workload and speech tracking. Hear Res 2025; 464:109303. PMID: 40412301. DOI: 10.1016/j.heares.2025.109303.
Abstract
Hearing-in-noise (HIN) is a challenging task that is essential to human functioning in social, vocational, and educational contexts. Successful speech perception in noisy settings is thought to rely in part on the brain's ability to enhance neural representations of attended speech. In everyday HIN situations, important features of speech (i.e., pitch, rhythm) may be degraded in addition to being embedded in noise. Understanding how these differences in sound quality affect the experience of workload and the neural representation of speech is important for characterizing the cognitive demands imposed by everyday difficult listening situations. We investigated HIN perception in 20 healthy adults using continuous speech that was either clean, spectrally degraded, or temporally degraded. Each sound condition was presented both with and without pink noise. Participants engaged in a selective listening task, in which a short story was presented with varying sound quality, while EEG data were recorded. Neural correlates of cognitive workload were obtained using power levels of two frequency bands sensitive to task difficulty manipulations: alpha (8-12 Hz) and theta (4-8 Hz). Acoustic and linguistic features (speech envelope, word onsets, word surprisal) were decoded to reveal the degree to which speech was successfully encoded. Overall, alpha-theta power increased significantly when noise was added across sound conditions, while prediction accuracy of speech tracking decreased, suggesting that more effort was required to listen and that the speech was not as successfully encoded. Temporal degradation also resulted in greater EEG power, possibly reflecting a compensatory mechanism to restore the temporal information required for speech comprehension. Our findings suggest that measures related to cognitive workload and successful speech encoding are differentially affected by noise and sound degradations, which may help to inform future interventions that aim to mitigate these everyday challenges.
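For readers unfamiliar with the band-power measure referred to above, the sketch below shows one common way to estimate alpha and theta power from a single EEG channel with SciPy's Welch estimator. It is illustrative only; the sampling rate, window length, and synthetic data are assumptions, not the authors' pipeline.

```python
import numpy as np
from scipy.signal import welch

fs = 250.0                                   # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
eeg = rng.standard_normal(int(60 * fs))      # placeholder for one EEG channel

# Welch PSD with 4-s windows gives ~0.25 Hz resolution, enough for 4-12 Hz bands
freqs, psd = welch(eeg, fs=fs, nperseg=int(4 * fs))

def band_power(freqs, psd, lo, hi):
    """Sum PSD bins between lo and hi Hz, scaled by the bin width."""
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].sum() * (freqs[1] - freqs[0])

theta = band_power(freqs, psd, 4.0, 8.0)     # theta band, as in the abstract
alpha = band_power(freqs, psd, 8.0, 12.0)    # alpha band
print(f"theta power: {theta:.3e}  alpha power: {alpha:.3e}")
```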
Affiliation(s)
- Nathan Gagné
  - Department of Psychology, Concordia University, Montréal, Canada; International Laboratory for Brain, Music and Sound Research (BRAMS); The Centre for Research on Brain, Language and Music (CRBLM)
- Keelin M Greenlaw
  - Department of Psychology, Concordia University, Montréal, Canada; International Laboratory for Brain, Music and Sound Research (BRAMS); The Centre for Research on Brain, Language and Music (CRBLM)
- Emily B J Coffey
  - Department of Psychology, Concordia University, Montréal, Canada; International Laboratory for Brain, Music and Sound Research (BRAMS); The Centre for Research on Brain, Language and Music (CRBLM)

3. Osorio S, Assaneo MF. Anatomically distinct cortical tracking of music and speech by slow (1-8Hz) and fast (70-120Hz) oscillatory activity. PLoS One 2025; 20:e0320519. PMID: 40341725. PMCID: PMC12061428. DOI: 10.1371/journal.pone.0320519.
Abstract
Music and speech encode hierarchically organized structural complexity at the service of human expressiveness and communication. Previous research has shown that populations of neurons in auditory regions track the envelope of acoustic signals within the range of slow and fast oscillatory activity. However, the extent to which cortical tracking is influenced by the interplay between stimulus type, frequency band, and brain anatomy remains an open question. In this study, we reanalyzed intracranial recordings from thirty subjects implanted with electrocorticography (ECoG) grids in the left cerebral hemisphere, drawn from an existing open-access ECoG database. Participants passively watched a movie where visual scenes were accompanied by either music or speech stimuli. Cross-correlation between brain activity and the envelope of music and speech signals, along with density-based clustering analyses and linear mixed-effects modeling, revealed both anatomically overlapping and functionally distinct mapping of the tracking effect as a function of stimulus type and frequency band. We observed widespread left-hemisphere tracking of music and speech signals in the Slow Frequency Band (SFB, the band-pass filtered low-frequency signal between 1 and 8 Hz), with near-zero temporal lags. In contrast, cortical tracking in the High Frequency Band (HFB, the envelope of the 70-120 Hz band-pass filtered signal) was higher during speech perception, was more densely concentrated in classical language processing areas, and showed a frontal-to-temporal gradient in lag values that was not observed during perception of musical stimuli. Our results highlight a complex interaction between cortical region and frequency band that shapes temporal dynamics during the processing of naturalistic music and speech signals.
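As a rough illustration of the two signals described above, the sketch below band-pass filters one synthetic ECoG channel into the slow band and into the envelope of the 70-120 Hz band, then cross-correlates each with a stimulus envelope. Filter order, sampling rate, and the random data are assumptions rather than the study's actual pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert, correlate, correlation_lags

fs = 1000.0
rng = np.random.default_rng(1)
ecog = rng.standard_normal(int(30 * fs))        # placeholder ECoG channel
stim_env = rng.standard_normal(int(30 * fs))    # placeholder acoustic envelope

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo, hi], btype="bandpass", fs=fs)
    return filtfilt(b, a, x)

sfb = bandpass(ecog, 1.0, 8.0, fs)                        # slow-frequency-band signal
hfb = np.abs(hilbert(bandpass(ecog, 70.0, 120.0, fs)))    # high-frequency-band envelope

def peak_lag(brain, stim, fs):
    """Lag (s) at which the normalized cross-correlation is largest in magnitude."""
    brain = (brain - brain.mean()) / brain.std()
    stim = (stim - stim.mean()) / stim.std()
    r = correlate(brain, stim, mode="full") / len(brain)
    lags = correlation_lags(len(brain), len(stim), mode="full") / fs
    return lags[np.argmax(np.abs(r))]

print("SFB peak lag (s):", peak_lag(sfb, stim_env, fs))
print("HFB peak lag (s):", peak_lag(hfb, stim_env, fs))
```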
Affiliation(s)
- Sergio Osorio
  - Department of Neurology, Harvard Medical School, Massachusetts General Hospital, Boston, Massachusetts, United States of America
  - Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Boston, Massachusetts, United States of America

4. Wang Y, Wu D, Ding N, Zou J, Lu Y, Ma Y, Zhang X, Yu W, Wang K. Linear phase property of speech envelope tracking response in Heschl's gyrus and superior temporal gyrus. Cortex 2025; 186:1-10. PMID: 40138746. DOI: 10.1016/j.cortex.2025.02.015.
Abstract
Understanding how the brain tracks speech during listening remains a challenge. The phase resetting hypothesis proposes that the envelope-tracking response is generated by resetting the phase of intrinsic nonlinear neural oscillations, whereas the evoked response hypothesis proposes that the envelope-tracking response is the linear superposition of transient responses evoked by a sequence of acoustic events in speech. Recent studies have demonstrated a linear phase property of the envelope-tracking response, supporting the evoked response hypothesis. However, the cortical regions aligning with the evoked response hypothesis remain unclear. To address this question, we directly recorded from the cortex using stereo-electroencephalography (SEEG) in nineteen epilepsy patients as they listened to natural speech, and we investigated whether the phase lag between the speech envelope and neural activity changes linearly across frequency. We found that the linear phase property of low-frequency (LF) (0.5-40 Hz) envelope tracking was widely observed in Heschl's gyrus (HG) and superior temporal gyrus (STG), with additional sparser distribution in the insula, postcentral gyrus, and precentral gyrus. Furthermore, the latency of LF envelope-tracking responses derived from the phase-frequency curve exhibited an increasing gradient along HG and in the posterior-to-anterior direction in STG. Our findings suggest that the auditory cortex can track the speech envelope in line with the evoked response hypothesis.
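The "linear phase property" and the latency estimate mentioned above follow from a simple relation: a pure response delay tau produces a stimulus-response phase lag phi(f) = -2*pi*f*tau, so tau is recoverable from the slope of the phase-frequency curve. A toy sketch with made-up numbers (not the study's data):

```python
import numpy as np

freqs = np.arange(0.5, 40.5, 0.5)            # 0.5-40 Hz, as in the abstract
true_latency = 0.080                          # assumed 80-ms response latency
rng = np.random.default_rng(2)
phase = -2 * np.pi * freqs * true_latency     # ideal linear phase lag (radians)
phase += rng.normal(0, 0.1, phase.size)       # add measurement noise

# Latency is minus the slope of the (unwrapped) phase-frequency curve over 2*pi
slope, _ = np.polyfit(freqs, np.unwrap(phase), 1)
print(f"recovered latency: {-slope / (2 * np.pi) * 1000:.1f} ms")
```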
Affiliation(s)
- Yaoyao Wang
  - Research Center for Intelligent Computing Infrastructure Innovation, Zhejiang Lab, Hangzhou, 311121, China
- Dengchang Wu
  - Epilepsy Center, Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310009, China
- Nai Ding
  - Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, 310027, China
- Jiajie Zou
  - Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, 310027, China
- Yuhan Lu
  - Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, 310027, China
- Yuehui Ma
  - Epilepsy Center, Department of Neurosurgery, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310009, China
- Xing Zhang
  - Epilepsy Center, Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310009, China
- Wenyuan Yu
  - Research Center for Life Sciences Computing, Zhejiang Lab, Hangzhou, 311121, China; Mental Health Education, Consultation and Research Center, Shenzhen Polytechnic University, Shenzhen, 518055, China
- Kang Wang
  - Epilepsy Center, Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310009, China

5. Preisig BC, Meyer M. Predictive coding and dimension-selective attention enhance the lateralization of spoken language processing. Neurosci Biobehav Rev 2025; 172:106111. PMID: 40118260. DOI: 10.1016/j.neubiorev.2025.106111.
Abstract
Hemispheric lateralization in speech and language processing exemplifies functional brain specialization. Seminal work in patients with left hemisphere damage highlighted the left-hemispheric dominance in language functions. However, speech processing is not confined to the left hemisphere. Hence, some researchers associate lateralization with auditory processing asymmetries: slow temporal and fine spectral acoustic information is preferentially processed in right auditory regions, while faster temporal information is primarily handled by left auditory regions. Other scholars posit that lateralization relates more to linguistic processing, particularly for speech and speech-like stimuli. We argue that these seemingly distinct accounts are interdependent. Linguistic analysis of speech relies on top-down processes, such as predictive coding and dimension-selective auditory attention, which enhance lateralized processing by engaging left-lateralized sensorimotor networks. Our review highlights that lateralization is weaker for simple sounds, stronger for speech-like sounds, and strongest for meaningful speech. Evidence shows that predictive speech processing and selective attention enhance lateralization. We illustrate that these top-down processes rely on left-lateralized sensorimotor networks and provide insights into the role of these networks in speech processing.
Affiliation(s)
- Basil C Preisig
  - The Institute for the Interdisciplinary Study of Language Evolution, Evolutionary Neuroscience of Language, University of Zurich, Switzerland; Zurich Center for Linguistics, University of Zurich, Switzerland; Neuroscience Center Zurich, University of Zurich and Eidgenössische Technische Hochschule Zurich, Switzerland
- Martin Meyer
  - The Institute for the Interdisciplinary Study of Language Evolution, Evolutionary Neuroscience of Language, University of Zurich, Switzerland; Zurich Center for Linguistics, University of Zurich, Switzerland; Neuroscience Center Zurich, University of Zurich and Eidgenössische Technische Hochschule Zurich, Switzerland

6. Keitel A, Pelofi C, Guan X, Watson E, Wight L, Allen S, Mencke I, Keitel C, Rimmele J. Cortical and behavioral tracking of rhythm in music: Effects of pitch predictability, enjoyment, and expertise. Ann N Y Acad Sci 2025; 1546:120-135. PMID: 40101105. PMCID: PMC11998481. DOI: 10.1111/nyas.15315.
Abstract
The cortical tracking of stimulus features is a crucial neural requisite of how we process continuous music. We here tested whether cortical tracking of the beat, typically related to rhythm processing, is modulated by pitch predictability and other top-down factors. Participants listened to tonal (high pitch predictability) and atonal (low pitch predictability) music while undergoing electroencephalography. We analyzed their cortical tracking of the acoustic envelope. Cortical envelope tracking was stronger while listening to atonal music, potentially reflecting listeners' violated pitch expectations and increased attention allocation. Envelope tracking was also stronger with more expertise and enjoyment. Furthermore, we showed cortical tracking of pitch surprisal (using IDyOM), which suggests that listeners' expectations match those computed by the IDyOM model, with higher surprisal for atonal music. Behaviorally, we measured participants' ability to finger-tap to the beat of tonal and atonal sequences in two experiments. Finger-tapping performance was better in the tonal condition, indicating a positive effect of pitch predictability on behavioral rhythm processing. Cortical envelope tracking predicted tapping performance for tonal music, as did pitch-surprisal tracking for atonal music, indicating that high and low predictability might impose different processing regimes. Taken together, our results show various ways that top-down factors impact musical rhythm processing.
Affiliation(s)
- Anne Keitel
  - Department of Psychology, University of Dundee, Dundee, UK
- Claire Pelofi
  - Department of Psychology, New York University, New York, New York, USA
  - Max Planck NYU Center for Language, Music, and Emotion, New York, New York, USA
- Xinyi Guan
  - Max Planck NYU Center for Language, Music, and Emotion, New York, New York, USA
  - Digital and Cognitive Musicology Lab, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Emily Watson
  - Department of Psychology, University of Dundee, Dundee, UK
- Lucy Wight
  - Department of Psychology, University of Dundee, Dundee, UK
  - School of Psychology, Aston University, Birmingham, UK
- Sarah Allen
  - Department of Psychology, University of Dundee, Dundee, UK
- Iris Mencke
  - Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
  - Department of Music, Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany
- Johanna Rimmele
  - Max Planck NYU Center for Language, Music, and Emotion, New York, New York, USA
  - Department of Cognitive Neuropsychology, Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany

7. Fernández-Merino L, Lizarazu M, Molinaro N, Kalashnikova M. Temporal Structure of Music Improves the Cortical Encoding of Speech. Hum Brain Mapp 2025; 46:e70199. PMID: 40129256. PMCID: PMC11933723. DOI: 10.1002/hbm.70199.
Abstract
Long- and short-term musical training has been proposed to improve the efficiency of cortical tracking of speech, which refers to the synchronization of brain oscillations and the acoustic temporal structure of external stimuli. Here, we study how musical sequences with different rhythm structures can guide the temporal dynamics of auditory oscillations synchronized with the speech envelope. For this purpose, we investigated the effects of prior exposure to rhythmically structured musical sequences on cortical tracking of speech in Basque-Spanish bilingual adults (Experiment 1; N = 33, 22 female, mean age = 25 years). We presented participants with sentences in Basque and Spanish preceded by musical sequences that differed in their rhythmical structure. The rhythmical structure of the musical sequences was created to (1) reflect and match the syllabic structure of the sentences, (2) reflect a regular rhythm but not match the syllabic structure of the sentences, and (3) follow an irregular rhythm. Participants' brain responses were recorded using electroencephalography, and speech-brain coherence in the delta and theta bands was calculated. Results showed stronger speech-brain coherence in the delta band in the first condition, but only for Spanish stimuli. A follow-up experiment including a subset of the initial sample (Experiment 2; N = 20) was conducted to investigate whether language-specific stimulus properties influenced the Basque results. Similar to Experiment 1, we found stronger speech-brain coherence in the delta and theta bands when the sentences were preceded by musical sequences that matched their syllabic structure. These results suggest that not only is regularity in music crucial for influencing cortical tracking of speech, but so is adjusting this regularity to optimally reflect the rhythmic characteristics of listeners' native language(s). Despite finding some language-specific differences across frequencies, we showed that the rhythm inherent in musical signals guides brain oscillations by adapting the temporal dynamics of oscillatory activity to the rhythmic scaffolding of the musical signal.
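Speech-brain coherence of the kind computed above can be sketched as magnitude-squared coherence between an EEG channel and the speech envelope, averaged within the delta and theta bands. The parameters and random data below are illustrative assumptions, not the study's analysis.

```python
import numpy as np
from scipy.signal import coherence

fs = 250.0
rng = np.random.default_rng(3)
eeg = rng.standard_normal(int(120 * fs))         # placeholder EEG channel
speech_env = rng.standard_normal(int(120 * fs))  # placeholder speech envelope

# Long (8-s) windows are needed to resolve coherence at delta frequencies
freqs, coh = coherence(eeg, speech_env, fs=fs, nperseg=int(8 * fs))

delta = coh[(freqs >= 0.5) & (freqs < 4.0)].mean()   # delta band
theta = coh[(freqs >= 4.0) & (freqs < 8.0)].mean()   # theta band
print(f"delta coherence: {delta:.3f}  theta coherence: {theta:.3f}")
```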
Affiliation(s)
- Laura Fernández-Merino
  - Basque Center on Cognition, Brain and Language, San Sebastian, Spain
  - University of the Basque Country (Universidad del País Vasco/Euskal Herriko Unibertsitatea), San Sebastian, Spain
- Mikel Lizarazu
  - Basque Center on Cognition, Brain and Language, San Sebastian, Spain
- Nicola Molinaro
  - Basque Center on Cognition, Brain and Language, San Sebastian, Spain
  - Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Marina Kalashnikova
  - Basque Center on Cognition, Brain and Language, San Sebastian, Spain
  - Ikerbasque, Basque Foundation for Science, Bilbao, Spain

8. Orf M, Hannemann R, Obleser J. Does Amplitude Compression Help or Hinder Attentional Neural Speech Tracking? J Neurosci 2025; 45:e0238242024. PMID: 39843232. PMCID: PMC11905343. DOI: 10.1523/jneurosci.0238-24.2024.
Abstract
Amplitude compression is an indispensable feature of contemporary audio production and especially relevant in modern hearing aids. The cortical fate of amplitude-compressed speech signals is not well studied, however, and may yield undesired side effects: We hypothesize that compressing the amplitude envelope of continuous speech reduces neural tracking. Yet, leveraging such a "compression side effect" on unwanted, distracting sounds could potentially support attentive listening if effectively reducing their neural tracking. In this study, we examined 24 young normal hearing (NH) individuals, 19 older hearing-impaired (HI) individuals, and 12 older normal hearing individuals. Participants were instructed to focus on one of two competing talkers while ignoring the other. Envelope compression (1:8 ratio, loudness-matched) was applied to one or both streams containing short speech repeats. Electroencephalography allowed us to quantify the cortical response function and degree of speech tracking. With compression applied to the attended target stream, HI participants showed reduced behavioral accuracy, and compressed speech yielded generally lowered metrics of neural tracking. Importantly, we found that compressing the ignored stream resulted in a stronger neural representation of the uncompressed target speech. Our results imply that intelligent compression algorithms, with variable compression ratios applied to separated sources, could help individuals with hearing loss suppress distraction in complex multitalker environments.
Affiliation(s)
- Martin Orf
  - Department of Psychology, University of Lübeck, Lübeck 23562, Germany
  - Center of Brain, Behavior and Metabolism (CBBM), University of Lübeck, Lübeck 23562, Germany
- Ronny Hannemann
  - WS Audiology, Erlangen 91058, Germany
  - WS Audiology, Lynge 3540, Denmark
- Jonas Obleser
  - Department of Psychology, University of Lübeck, Lübeck 23562, Germany
  - Center of Brain, Behavior and Metabolism (CBBM), University of Lübeck, Lübeck 23562, Germany

9. Gnanateja GN, Rupp K, Llanos F, Hect J, German JS, Teichert T, Abel TJ, Chandrasekaran B. Cortical processing of discrete prosodic patterns in continuous speech. Nat Commun 2025; 16:1947. PMID: 40032850. PMCID: PMC11876672. DOI: 10.1038/s41467-025-56779-w.
Abstract
Prosody has a vital function in speech, structuring a speaker's intended message for the listener. The superior temporal gyrus (STG) is considered a critical hub for prosody, but the role of earlier auditory regions like Heschl's gyrus (HG), associated with pitch processing, remains unclear. Using intracerebral recordings in humans and non-human primate models, we investigated prosody processing in narrative speech, focusing on pitch accents, abstract phonological units that signal word prominence and communicative intent. In humans, HG encoded pitch accents as abstract representations beyond spectrotemporal features, distinct from segmental speech processing, and outperformed STG in disambiguating pitch accents. Multivariate models confirmed HG's unique representation of pitch accent categories. In the non-human primate, pitch accents were not abstractly encoded, despite robust spectrotemporal processing, highlighting the role of experience in shaping abstract representations. These findings emphasize a key role for HG in early prosodic abstraction and advance our understanding of human speech processing.
Affiliation(s)
- G Nike Gnanateja
  - Speech Processing and Auditory Neuroscience Lab, Department of Communication Sciences and Disorder, University of Wisconsin-Madison, Madison, WI, USA
- Kyle Rupp
  - Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- Fernando Llanos
  - UT Austin Neurolinguistics Lab, Department of Linguistics, The University of Texas at Austin, Austin, TX, USA
- Jasmine Hect
  - Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- James S German
  - Aix-Marseille University, CNRS, LPL, Aix-en-Provence, France
- Tobias Teichert
  - Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Taylor J Abel
  - Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
  - Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA
- Bharath Chandrasekaran
  - Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA
  - Roxelyn and Richard Pepper Department of Communication Sciences & Disorders, Northwestern University, Evanston, IL, USA
  - Knowles Hearing Center, Evanston, IL, 60208, USA

10. Balconi M, Acconito C, Angioletti L. A preliminary EEG study on persuasive communication towards groupness. Sci Rep 2025; 15:6242. PMID: 39979540. PMCID: PMC11842712. DOI: 10.1038/s41598-025-90301-y.
Abstract
Social neuroscience has acknowledged the role of persuasion but has examined either the Persuader's or the Receiver's neural mechanisms. This study explored electrophysiological (EEG) correlates of the Persuader and the Receiver during a naturalistic persuasive interaction, in which the Persuader aimed to convince the Receiver that adopting a group decision-making orientation was the best solution to manage a group dynamic. EEG data in the delta (0.5-3.5 Hz), theta (4-7.5 Hz), alpha (8-12.5 Hz), beta (13-30 Hz), and gamma (30.5-50 Hz) frequency bands were collected from 14 Persuaders and 14 Receivers. Findings indicated that the strategic efforts of Persuaders to enhance groupness are linked to activation in specific EEG bands (delta, theta, and alpha) that distinguish them from Receivers. These activations were concentrated in frontal areas in Persuaders (especially the right frontal region for the theta band), contrasting with the more temporal and posterior activations observed in Receivers, whose frontal areas were generally less activated. The study concludes that, under the same behavioral conditions in terms of group orientation, persuasive interaction shows specific EEG markers of the Persuader's role, characterized by greater attentional effort during the interaction.
Affiliation(s)
- Michela Balconi
  - International research center for Cognitive Applied Neuroscience (IrcCAN), Università Cattolica del Sacro Cuore, Milan, Italy
  - Research Unit in Affective and Social Neuroscience, Department of Psychology, Università Cattolica del Sacro Cuore, Milan, Italy
- Carlotta Acconito
  - International research center for Cognitive Applied Neuroscience (IrcCAN), Università Cattolica del Sacro Cuore, Milan, Italy
  - Research Unit in Affective and Social Neuroscience, Department of Psychology, Università Cattolica del Sacro Cuore, Milan, Italy
- Laura Angioletti
  - International research center for Cognitive Applied Neuroscience (IrcCAN), Università Cattolica del Sacro Cuore, Milan, Italy
  - Research Unit in Affective and Social Neuroscience, Department of Psychology, Università Cattolica del Sacro Cuore, Milan, Italy

11. Xu Y, Tan X, Luo M, Xie Q, Yang F, Zhan CA. Reliable quantification of neural entrainment to rhythmic auditory stimulation: simulation and experimental validation. J Neural Eng 2025; 22:016026. PMID: 39870044. DOI: 10.1088/1741-2552/adaeec.
Abstract
Objective. Entrainment has been considered a potential mechanism underlying the facilitatory effect of rhythmic neural stimulation on neurorehabilitation. The inconsistent effects of brain stimulation on neurorehabilitation found in the literature may be caused by variability in neural entrainment. To dissect the underlying mechanisms and optimize brain stimulation for improved effectiveness, it is critical to reliably assess the occurrence and strength of neural entrainment. However, the factors influencing entrainment assessment are not yet fully understood. This study aims to investigate whether and how the relevant factors (i.e., data length, frequency bandwidth, signal-to-noise ratio (SNR), center frequency, and the constant component of the stimulus-response phase difference) influence the assessment reliability of neural entrainment. Approach. We simulated data for 28 scenarios to answer the above questions. We also recorded experimental data to verify the findings from our simulation study. Main results. A minimal data length is required to achieve reliable neural entrainment assessment, and this requirement critically depends on the bandwidth and SNR, but is independent of the center frequency and the constant component of the stimulus-response phase difference. Furthermore, changing the bandwidth is accompanied by a change in SNR. Significance. The present study has revealed how data length, bandwidth, and SNR critically affect the assessment reliability of neural entrainment. The findings provide a foundation for parameter setting in experiment design and data analysis in neural entrainment studies. While this study is within the context of rhythmic auditory stimulation, the conclusions may be applicable to neural entrainment to other rhythmic stimulations.
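One common way to quantify entrainment strength, consistent with the data-length point above, is the phase-locking value between band-limited neural activity and the stimulus rhythm. The sketch below uses synthetic data with an assumed stimulation rate and filter settings (not the authors' simulation) to show how the estimate stabilizes as more data are included.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 500.0
stim_freq = 2.0                                   # assumed stimulation rate (Hz)
t = np.arange(0, 120, 1 / fs)                     # 120 s of data
rng = np.random.default_rng(7)
neural = np.sin(2 * np.pi * stim_freq * t + 0.5) + 2 * rng.standard_normal(t.size)

# Band-pass around the stimulation rate, then take the instantaneous phase
b, a = butter(4, [1.5, 2.5], btype="bandpass", fs=fs)
neural_phase = np.angle(hilbert(filtfilt(b, a, neural)))
phase_diff = neural_phase - 2 * np.pi * stim_freq * t   # stimulus phase is 2*pi*f*t

for seconds in (10, 30, 120):                     # PLV estimated from growing data lengths
    n = int(seconds * fs)
    plv = np.abs(np.mean(np.exp(1j * phase_diff[:n])))
    print(f"{seconds:>4d} s of data -> PLV = {plv:.2f}")
```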
Affiliation(s)
- Yiwen Xu
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, People's Republic of China
- Xiaodan Tan
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, People's Republic of China
- Minmin Luo
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, People's Republic of China
- Qiuyou Xie
  - Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou 510280, People's Republic of China
- Feng Yang
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, People's Republic of China
- Chang'an A Zhan
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou 510280, People's Republic of China

12. Liang M, Gerwien J, Gutschalk A. A listening advantage for native speech is reflected by attention-related activity in auditory cortex. Commun Biol 2025; 8:180. PMID: 39910341. PMCID: PMC11799217. DOI: 10.1038/s42003-025-07601-2.
Abstract
The listening advantage for native speech is well known, but the neural basis of the effect remains unknown. Here we test the hypothesis that attentional enhancement in auditory cortex is stronger for native speech, using magnetoencephalography. Chinese and German speech stimuli were recorded by a bilingual speaker and combined into a two-stream, cocktail-party scene, with consistent and inconsistent language combinations. A group of native speakers of Chinese and a group of native speakers of German performed a detection task in the cued target stream. Results show that attention enhances negative-going activity in the temporal response function deconvoluted from the speech envelope. This activity is stronger when the target stream is in the native compared to the non-native language, and for inconsistent compared to consistent language stimuli. We interpret the findings to show that the stronger activity for native speech could be related to better top-down prediction of the native speech streams.
Affiliation(s)
- Meng Liang
  - Department of Neurology, University of Heidelberg, Im Neuenheimer Feld 400, 69120, Heidelberg, Germany
- Johannes Gerwien
  - Institute of German as a Foreign Language Philology, University of Heidelberg, Plöck 55, 69117, Heidelberg, Germany
- Alexander Gutschalk
  - Department of Neurology, University of Heidelberg, Im Neuenheimer Feld 400, 69120, Heidelberg, Germany

13. Henke L, Meyer L. Chunk Duration Limits the Learning of Multiword Chunks: Behavioral and Electroencephalography Evidence from Statistical Learning. J Cogn Neurosci 2025; 37:167-184. PMID: 39382964. DOI: 10.1162/jocn_a_02257.
Abstract
Language comprehension involves the grouping of words into larger multiword chunks. This is required to recode information into sparser representations to mitigate memory limitations and counteract forgetting. It has been suggested that electrophysiological processing time windows constrain the formation of these units. Specifically, the period of rhythmic neural activity (i.e., low-frequency neural oscillations) may set an upper limit of 2-3 sec. Here, we assess whether learning of new multiword chunks is also affected by this neural limit. We applied an auditory statistical learning paradigm of an artificial language while manipulating the duration of to-be-learned chunks. Participants listened to isochronous sequences of disyllabic pseudowords from which they could learn hidden three-word chunks based on transitional probabilities. We presented chunks of 1.95, 2.55, and 3.15 sec that were created by varying the pause interval between pseudowords. In a first behavioral experiment, we tested learning using an implicit target detection task. We found better learning for chunks of 2.55 sec as compared to longer durations in line with an upper limit of the proposed time constraint. In a second experiment, we recorded participants' electroencephalogram during the exposure phase to use frequency tagging as a neural index of statistical learning. Extending the behavioral findings, results show a significant decline in neural tracking for chunks exceeding 3 sec as compared to both shorter durations. Overall, we suggest that language learning is constrained by endogenous time constraints, possibly reflecting electrophysiological processing windows.
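As a quick aid to the frequency-tagging logic above, each chunk duration corresponds to a chunk repetition rate of 1/duration, which is where a neural tracking peak would be expected (an assumption of the standard frequency-tagging approach, not a detail stated in the abstract):

```python
# Chunk durations from the abstract and their implied tagging frequencies
for duration in (1.95, 2.55, 3.15):   # seconds
    print(f"{duration:.2f} s chunk -> expected peak at {1 / duration:.3f} Hz")
# 1.95 s -> 0.513 Hz, 2.55 s -> 0.392 Hz, 3.15 s -> 0.317 Hz
```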
Affiliation(s)
- Lena Henke
  - Max Planck Institute for Human Cognitive and Brain Sciences
- Lars Meyer
  - Max Planck Institute for Human Cognitive and Brain Sciences
  - University Hospital Münster

14. Czepiel AM, Fink LK, Scharinger M, Seibert C, Wald-Fuhrmann M, Kotz SA. Audio-visual concert performances synchronize audience's heart rates. Ann N Y Acad Sci 2025; 1543:117-132. PMID: 39752187. PMCID: PMC11776452. DOI: 10.1111/nyas.15279.
Abstract
People enjoy engaging with music. Live music concerts provide an excellent option to investigate real-world music experiences and, at the same time, use neurophysiological synchrony to assess dynamic engagement. In the current study, we assessed engagement in a live concert setting using synchrony of cardiorespiratory measures, comparing inter-subject and stimulus-response measures of correlation and phase coherence. As engagement might be enhanced in a concert setting by seeing musicians perform, we presented audiences with audio-only (AO) and audio-visual (AV) piano performances. Only correlation-based synchrony measures were above chance level. In comparing time-averaged synchrony across conditions, AV performances evoked a higher inter-subject correlation of heart rate (ISC-HR). However, synchrony averaged across music pieces did not correspond to self-reported engagement. On the other hand, time-resolved analyses show that synchronized deceleration-acceleration heart rate (HR) patterns, typical of an "orienting response" (an index of directed attention), occurred within music pieces at salient events such as section boundaries. That is, seeing musicians perform heightened audience engagement at structurally important moments in Western classical music. Overall, we could show that multisensory information shapes dynamic engagement. By comparing different synchrony measures, we further highlight the advantages of time series analysis, specifically ISC-HR, as a robust measure of holistic musical listening experiences in naturalistic concert settings.
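The inter-subject correlation of heart rate (ISC-HR) used above can be sketched, under one common leave-one-out definition (assumed here, not necessarily the authors' exact formula), as the correlation of each listener's heart-rate time series with the average of all other listeners, averaged across listeners:

```python
import numpy as np

rng = np.random.default_rng(4)
heart_rate = rng.standard_normal((20, 600))   # placeholder: 20 listeners x 600 time points

def isc(data):
    """Leave-one-out inter-subject correlation, averaged across subjects."""
    scores = []
    for i in range(data.shape[0]):
        others = np.delete(data, i, axis=0).mean(axis=0)
        scores.append(np.corrcoef(data[i], others)[0, 1])
    return float(np.mean(scores))

print(f"ISC-HR on placeholder data: {isc(heart_rate):.3f}")
```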
Affiliation(s)
- Anna M. Czepiel
  - Department of Psychology, University of Toronto Mississauga, Mississauga, Ontario, Canada
  - Department of Music, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
  - Department of Neuropsychology and Psychopharmacology, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Lauren K. Fink
  - Department of Music, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
  - Department of Psychology, Neuroscience & Behaviour, McMaster University, Hamilton, Ontario, Canada
- Mathias Scharinger
  - Research Group Phonetics, Department of German Linguistics, University of Marburg, Marburg, Germany
  - Department of Language and Literature, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Christoph Seibert
  - Institute for Music Informatics and Musicology, University of Music Karlsruhe, Karlsruhe, Germany
- Melanie Wald-Fuhrmann
  - Department of Music, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Sonja A. Kotz
  - Department of Neuropsychology and Psychopharmacology, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
  - Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany

15. Thomas T, Martin CD, Caffarra S. The impact of speaker accent on discourse processing: A frequency investigation. Brain Lang 2025; 260:105509. PMID: 39657290. DOI: 10.1016/j.bandl.2024.105509.
Abstract
Previous studies indicate differences in native and foreign speech processing (Lev-Ari, 2018), with mixed evidence for differences between dialectal and foreign accent processing (Adank, Evans, Stuart-Smith, & Scott, 2009; Floccia et al., 2006, 2009; Girard, Floccia, & Goslin, 2008). Two theories have been proposed: the Perceptual Distance Hypothesis suggests that dialectal accent processing is an attenuated version of foreign accent processing (Clarke & Garrett, 2004), while the Different Processes Hypothesis argues that foreign and dialectal accents are processed via distinct mechanisms (Floccia, Butler, Girard, & Goslin, 2009). A recent single-word ERP study suggested flexibility in these mechanisms (Thomas, Martin, & Caffarra, 2022). The present study deepens this investigation by examining differences in native, dialectal, and foreign accent processing across frequency bands during extended speech. Electroencephalographic data were recorded from 30 participants who listened to dialogues of approximately six minutes spoken in native, dialectal, and foreign accents. Power spectral density estimation (1-35 Hz) was performed. Linear mixed models were fitted in frequency windows of particular relevance to discourse processing. Frequency bands associated with phoneme [gamma], syllable [theta], and prosody [delta] processing were considered, along with those associated with general cognitive mechanisms [alpha and beta]. Results show power differences in the gamma frequency range: in higher frequency ranges, foreign accent processing is differentiated from native and dialectal accent processing in power amplitude, whereas in low frequencies we do not see any accent-related power amplitude modulations. This suggests that there may be a difference in phoneme processing between native accent types and foreign accents, while we speculate that top-down mechanisms during discourse processing may mitigate the effects observed with short units of speech.
Affiliation(s)
- Trisha Thomas
  - Basque Center on Cognition, Brain and Language, San Sebastian, Spain; Harvard University, 50 Church st, Cambridge, MA 02138, USA
- Clara D Martin
  - Basque Center on Cognition, Brain and Language, San Sebastian, Spain; Basque Foundation for Science (Ikerbasque), Spain
- Sendy Caffarra
  - Basque Center on Cognition, Brain and Language, San Sebastian, Spain; University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305 5101, USA; Stanford University Graduate School of Education, 485 Lasuen Mall, Stanford, CA 94305, USA; University of Modena and Reggio Emilia, Via Campi 287, 41125 Modena, Italy

16. Giroud J, Trébuchon A, Mercier M, Davis MH, Morillon B. The human auditory cortex concurrently tracks syllabic and phonemic timescales via acoustic spectral flux. Sci Adv 2024; 10:eado8915. PMID: 39705351. DOI: 10.1126/sciadv.ado8915.
Abstract
Dynamical theories of speech processing propose that the auditory cortex parses acoustic information in parallel at the syllabic and phonemic timescales. We developed a paradigm to independently manipulate both linguistic timescales and acquired intracranial recordings from 11 patients with epilepsy as they listened to French sentences. Our results indicate that (i) syllabic and phonemic timescales are both reflected in the acoustic spectral flux; (ii) during comprehension, the auditory cortex tracks the syllabic timescale in the theta range, while neural activity in the alpha-beta range phase locks to the phonemic timescale; (iii) these neural dynamics occur simultaneously and share a joint spatial location; (iv) the spectral flux embeds two timescales, in the theta and low-beta ranges, across 17 natural languages. These findings help us understand how the human brain extracts acoustic information from the continuous speech signal at multiple timescales simultaneously, a prerequisite for subsequent linguistic processing.
Affiliation(s)
- Jérémy Giroud
  - MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Agnès Trébuchon
  - Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France
  - APHM, Clinical Neurophysiology, Timone Hospital, Marseille, France
- Manuel Mercier
  - Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France
- Matthew H Davis
  - MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Benjamin Morillon
  - Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France

17. Karunathilake ID, Brodbeck C, Bhattasali S, Resnik P, Simon JZ. Neural Dynamics of the Processing of Speech Features: Evidence for a Progression of Features from Acoustic to Sentential Processing. bioRxiv [Preprint] 2024:2024.02.02.578603. PMID: 38352332. PMCID: PMC10862830. DOI: 10.1101/2024.02.02.578603.
Abstract
When we listen to speech, our brain's neurophysiological responses "track" its acoustic features, but it is less well understood how these auditory responses are enhanced by linguistic content. Here, we recorded magnetoencephalography (MEG) responses while subjects listened to four types of continuous-speech-like passages: speech-envelope modulated noise, English-like non-words, scrambled words, and a narrative passage. Temporal response function (TRF) analysis provides strong neural evidence for the emergent features of speech processing in cortex, from acoustics to higher-level linguistics, as incremental steps in neural speech processing. Critically, we show a stepwise hierarchical progression toward higher-order features over time, reflected in both bottom-up (early) and top-down (late) processing stages. Linguistically driven top-down mechanisms take the form of late N400-like responses, suggesting a central role of predictive coding mechanisms at multiple levels. As expected, the neural processing of lower-level acoustic features is bilateral or right lateralized, with left lateralization emerging only for lexical-semantic features. Finally, our results identify potential neural markers of linguistic-level processing: late responses derived from TRF components modulated by linguistic content, suggesting that these markers are indicative of speech comprehension rather than mere speech perception.
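For orientation, temporal response function (TRF) estimation of the kind referred to above is commonly implemented as ridge regression of the neural signal onto time-lagged copies of a stimulus feature. The sketch below uses synthetic data and an arbitrary ridge parameter; it is not the authors' pipeline.

```python
import numpy as np

fs = 100.0
rng = np.random.default_rng(5)
stim = rng.standard_normal(int(60 * fs))                    # stimulus feature (e.g., envelope)
kernel = np.hanning(20)                                     # toy 200-ms "brain" response
neural = np.convolve(stim, kernel)[:stim.size] + rng.standard_normal(stim.size)

lags = np.arange(0, int(0.5 * fs))                          # model lags of 0-500 ms
X = np.column_stack([np.roll(stim, k) for k in lags])       # lagged design matrix
X[:lags.max(), :] = 0                                       # crude edge handling

lam = 1.0                                                   # ridge parameter (assumed)
trf = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ neural)
print("TRF peak at lag (ms):", 1000 * lags[np.argmax(np.abs(trf))] / fs)
```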
Affiliation(s)
- Christian Brodbeck
  - Department of Computing and Software, McMaster University, Hamilton, ON, Canada
- Shohini Bhattasali
  - Department of Language Studies, University of Toronto, Scarborough, Canada
- Philip Resnik
  - Department of Linguistics, and Institute for Advanced Computer Studies, University of Maryland, College Park, MD, 20742
- Jonathan Z. Simon
  - Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, 20742
  - Department of Biology, University of Maryland, College Park, MD, USA
  - Institute for Systems Research, University of Maryland, College Park, MD, 20742

18. Cervantes Constantino F, Caputi Á. Cortical tracking of speakers' spectral changes predicts selective listening. Cereb Cortex 2024; 34:bhae472. PMID: 39656649. DOI: 10.1093/cercor/bhae472.
Abstract
A social scene is particularly informative when people are distinguishable. To understand somebody amid "cocktail party" chatter, we automatically index their voice. This ability is underpinned by parallel processing of vocal spectral contours from speech sounds, but it has not yet been established how this occurs in the brain's cortex. We investigated single-trial neural tracking of slow frequency modulations in speech using electroencephalography. Participants briefly listened to unfamiliar single speakers and, in addition, performed a cocktail party comprehension task. Quantified through stimulus reconstruction methods, robust tracking was found in neural responses to slow (delta-theta range) modulations of frequency contours in the fourth and fifth formant bands, equivalent to the 3.5-5 kHz audible range. The spectral spacing between neighboring instantaneous frequency contours (ΔF), which also yields indexical information about the vocal tract, was similarly decodable. Moreover, EEG evidence of listeners' spectral tracking abilities predicted their chances of succeeding at selective listening when faced with two-speaker speech mixtures. In summary, the results indicate that the communicating brain can rely on locking of cortical rhythms to major changes led by upper resonances of the vocal tract. Their corresponding articulatory mechanics hence continuously issue a fundamental credential for listeners to target in real time.
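Stimulus reconstruction, as used above, maps multichannel EEG at several post-stimulus lags back onto the stimulus feature with a regularized decoder and scores performance as the correlation between the reconstructed and actual feature. The sketch below is a simplified, assumed version with synthetic data, not the authors' model.

```python
import numpy as np

fs = 100.0
rng = np.random.default_rng(6)
feature = rng.standard_normal(int(60 * fs))      # e.g., a slow frequency contour
# Simulate 8 EEG channels, each a noisy, delayed mixture of the feature
eeg = np.stack([np.convolve(feature, rng.standard_normal(15))[:feature.size]
                + rng.standard_normal(feature.size) for _ in range(8)])

lags = np.arange(0, int(0.3 * fs))               # use 0-300 ms of post-stimulus EEG
X = np.column_stack([np.roll(ch, -k) for ch in eeg for k in lags])

half = feature.size // 2                          # simple train/test split
lam = 10.0                                        # ridge parameter (assumed)
w = np.linalg.solve(X[:half].T @ X[:half] + lam * np.eye(X.shape[1]),
                    X[:half].T @ feature[:half])
reconstructed = X[half:] @ w
r = np.corrcoef(reconstructed, feature[half:])[0, 1]
print(f"reconstruction accuracy r = {r:.3f}")
```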
Affiliation(s)
- Francisco Cervantes Constantino
  - Instituto de Investigaciones Biológicas Clemente Estable, Department of Integrative and Computational Neurosciences, Av. Italia 3318, Montevideo, 11.600, Uruguay
  - Facultad de Psicología, Universidad de la República
- Ángel Caputi
  - Instituto de Investigaciones Biológicas Clemente Estable, Department of Integrative and Computational Neurosciences, Av. Italia 3318, Montevideo, 11.600, Uruguay

19. Naeije G, Niesen M, Vander Ghinst M, Bourguignon M. Simultaneous EEG recording of cortical tracking of speech and movement kinematics. Neuroscience 2024; 561:1-10. PMID: 39395635. DOI: 10.1016/j.neuroscience.2024.10.013.
Abstract
RATIONALE Cortical activity is coupled with streams of sensory stimulation. The coupling with the temporal envelope of heard speech is known as the cortical tracking of speech (CTS), and that with movement kinematics is known as the corticokinematic coupling (CKC). Simultaneous measurement of both couplings is desirable in clinical settings, but it is unknown whether the inherent dual-tasking condition has an impact on CTS or CKC. AIM We aim to determine whether and how CTS and CKC levels are affected when recorded simultaneously. METHODS Twenty-three healthy young adults underwent 64-channel EEG recordings while listening to stories and while performing repetitive finger-tapping movements in 3 conditions: separately (audio- or tapping-only) or simultaneously (audio-tapping). CTS and CKC values were estimated using coherence analysis between each EEG signal and speech temporal envelope (CTS) or finger acceleration (CKC). CTS was also estimated as the reconstruction accuracy of a decoding model. RESULTS Across recordings, CTS assessed with reconstruction accuracy was significant in 85 % of the subjects at phrasal frequency (0.5 Hz) and in 68 % at syllabic frequencies (4-8 Hz), and CKC was significant in over 85 % of the subjects at movement frequency and its first harmonic. Comparing CTS and CKC values evaluated in separate recordings to those in simultaneous recordings revealed no significant difference and moderate-to-high levels of correlation. CONCLUSION Despite the subtle behavioral effects, CTS and CKC are not evidently altered by the dual-task setting inherent to recording them simultaneously and can be evaluated simultaneously using EEG in clinical settings.
Affiliation(s)
- Gilles Naeije
  - Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Centre de Référence Neuromusculaire, Department of Neurology, HUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Maxime Niesen
  - Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Service d'ORL et de chirurgie cervico-faciale, HUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Marc Vander Ghinst
  - Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Service d'ORL et de chirurgie cervico-faciale, HUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Mathieu Bourguignon
  - Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Laboratory of Neurophysiology and Movement Biomechanics, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium

20. Robson H, Thomasson H, Upton E, Leff AP, Davis MH. The impact of speech rhythm and rate on comprehension in aphasia. Cortex 2024; 180:126-146. PMID: 39427491. DOI: 10.1016/j.cortex.2024.09.006.
Abstract
BACKGROUND Speech comprehension impairment in post-stroke aphasia is influenced by speech acoustics. This study investigated the impact of speech rhythm (syllabic isochrony) and rate on comprehension in people with aphasia (PWA). Rhythmical speech was hypothesised to support comprehension in PWA by reducing temporal variation, leading to enhanced speech tracking and more appropriate sampling of the speech stream. Speech rate was hypothesised to influence comprehension through auditory and linguistic processing time. METHODS One group of PWA (n = 19) and two groups of control participants (n = 10 and n = 18) performed a sentence-verification task. Sentences were presented in two rhythm conditions (natural vs isochronous) and two rate conditions (typical, 3.6 Hz vs slow, 2.6 Hz) in a 2 × 2 factorial design. PWA and one group of controls performed the experiment with clear speech. The second group of controls performed the experiment with perceptually degraded speech. RESULTS D-prime analyses measured capacity to detect incongruent endings. Linear mixed effects models investigated the impact of group, rhythm, rate and clarity on d-prime scores. Control participants were negatively affected by isochronous rhythm in comparison to natural rhythm, likely due to alteration in linguistic cues. This negative impact remained or was exacerbated in control participants presented with degraded speech. In comparison, PWA were less affected by isochronous rhythm, despite producing d-prime scores matched to the degraded speech control group. Speech rate affected all groups, but only in interactions with rhythm, indicating that slow-rate isochronous speech was more comprehensible than typical-rate isochronous speech. CONCLUSIONS The comprehension network in PWA interacts differently with speech rhythm. Rhythmical speech may support acoustic speech tracking by enhancing predictability and ameliorate the detrimental impact of atypical rhythm on linguistic cues. Alternatively, reduced temporal prediction in aphasia may limit the impact of deviation from natural temporal structure. Reduction of speech rate below the typical range may not benefit comprehension in PWA.
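The d-prime scores analyzed above follow the standard signal-detection formula, the difference between the z-transformed hit rate and false-alarm rate; a worked example with made-up counts:

```python
from scipy.stats import norm

# Made-up response counts for one participant (illustrative only)
hits, misses = 42, 8                        # incongruent endings detected / missed
false_alarms, correct_rejections = 10, 40   # congruent endings flagged / accepted

hit_rate = hits / (hits + misses)
fa_rate = false_alarms / (false_alarms + correct_rejections)
d_prime = norm.ppf(hit_rate) - norm.ppf(fa_rate)
print(f"d' = {d_prime:.2f}")                # about 1.84 for these counts
```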
Collapse
Affiliation(s)
- Holly Robson
- Language and Cognition, Psychology and Language Sciences, University College London, London, UK.
| | - Harriet Thomasson
- Language and Cognition, Psychology and Language Sciences, University College London, London, UK
| | - Emily Upton
- Language and Cognition, Psychology and Language Sciences, University College London, London, UK
| | - Alexander P Leff
- UCL Queen Square Institute of Neurology, University College London, London, UK
| | - Matthew H Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
| |
Collapse
|
21
|
Luo C, Ding N. Cortical encoding of hierarchical linguistic information when syllabic rhythms are obscured by echoes. Neuroimage 2024; 300:120875. [PMID: 39341475 DOI: 10.1016/j.neuroimage.2024.120875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2024] [Revised: 09/24/2024] [Accepted: 09/26/2024] [Indexed: 10/01/2024] Open
Abstract
In speech perception, low-frequency cortical activity tracks hierarchical linguistic units (e.g., syllables, phrases, and sentences) on top of acoustic features (e.g., speech envelope). Since the fluctuation of speech envelope typically corresponds to the syllabic boundaries, one common interpretation is that the acoustic envelope underlies the extraction of discrete syllables from continuous speech for subsequent linguistic processing. However, it remains unclear whether and how cortical activity encodes linguistic information when the speech envelope does not provide acoustic correlates of syllables. To address the issue, we introduced a frequency-tagging speech stream where the syllabic rhythm was obscured by echoic envelopes and investigated neural encoding of hierarchical linguistic information using electroencephalography (EEG). When listeners attended to the echoic speech, cortical activity showed reliable tracking of syllable, phrase, and sentence levels, among which the higher-level linguistic units elicited more robust neural responses. When attention was diverted from the echoic speech, reliable neural tracking of the syllable level was also observed in contrast to deteriorated neural tracking of the phrase and sentence levels. Further analyses revealed that the envelope aligned with the syllabic rhythm could be recovered from the echoic speech through a neural adaptation model, and the reconstructed envelope yielded higher predictive power for the neural tracking responses than either the original echoic envelope or anechoic envelope. Taken together, these results suggest that neural adaptation and attentional modulation jointly contribute to neural encoding of linguistic information in distorted speech where the syllabic rhythm is obscured by echoes.
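As a rough illustration of the frequency-tagging logic described above, the sketch below averages synthetic EEG epochs in the time domain and looks for spectral peaks at assumed sentence, phrase, and syllable rates (1, 2, and 4 Hz). The rates, the synthetic data, and the peak-to-neighbour measure are illustrative assumptions, not the study's parameters.

```python
# Minimal frequency-tagging sketch: time-domain averaging followed by an FFT,
# with response strength expressed as peak power relative to neighbouring bins.
import numpy as np

fs = 250                                  # sampling rate (Hz)
n_trials, n_samples = 40, fs * 10
rng = np.random.default_rng(0)

t = np.arange(n_samples) / fs
signal = (0.2 * np.sin(2 * np.pi * 1 * t)    # sentence rate
          + 0.3 * np.sin(2 * np.pi * 2 * t)  # phrase rate
          + 0.5 * np.sin(2 * np.pi * 4 * t)) # syllable rate
epochs = signal + rng.normal(scale=2.0, size=(n_trials, n_samples))

evoked = epochs.mean(axis=0)              # averaging keeps phase-locked activity
spectrum = np.abs(np.fft.rfft(evoked)) / n_samples
freqs = np.fft.rfftfreq(n_samples, d=1 / fs)

for f_tag in (1.0, 2.0, 4.0):
    idx = np.argmin(np.abs(freqs - f_tag))
    neighbours = np.r_[spectrum[idx - 5:idx - 1], spectrum[idx + 2:idx + 6]]
    print(f"{f_tag:.1f} Hz: peak / neighbour ratio = {spectrum[idx] / neighbours.mean():.2f}")
```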
Collapse
Affiliation(s)
- Cheng Luo
- Zhejiang Lab, Hangzhou 311121, China.
| | - Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China; The State Key Lab of Brain-Machine Intelligence; The MOE Frontier Science Center for Brain Science & Brain-machine Integration, Zhejiang University, Hangzhou 310027, China
| |
Collapse
|
22
|
Déaux EC, Piette T, Gaunet F, Legou T, Arnal L, Giraud AL. Dog-human vocal interactions match dogs' sensory-motor tuning. PLoS Biol 2024; 22:e3002789. [PMID: 39352912 PMCID: PMC11444399 DOI: 10.1371/journal.pbio.3002789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2023] [Accepted: 08/06/2024] [Indexed: 10/04/2024] Open
Abstract
Within species, vocal and auditory systems presumably coevolved to converge on a critical temporal acoustic structure that can be best produced and perceived. While dogs cannot produce articulated sounds, they respond to speech, raising the question as to whether this heterospecific receptive ability could be shaped by exposure to speech or remains bounded by their own sensorimotor capacity. Using acoustic analyses of dog vocalisations, we show that their main production rhythm is slower than the dominant (syllabic) speech rate, and that human dog-directed speech falls halfway in between. Comparative exploration of neural (electroencephalography) and behavioural responses to speech reveals that comprehension in dogs relies on a slower speech rhythm tracking (delta) than humans' (theta), even though dogs are equally sensitive to speech content and prosody. Thus, dogs' audio-motor tuning differs from humans', and we hypothesise that humans may adjust their speech rate to this shared temporal channel as a means to improve communication efficacy.
Collapse
Affiliation(s)
- Eloïse C. Déaux
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Théophane Piette
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Florence Gaunet
- Aix-Marseille University and CNRS, Laboratoire de Psychologie Cognitive (UMR 7290), Marseille, France
| | - Thierry Legou
- Aix Marseille University and CNRS, Laboratoire Parole et Langage (UMR 6057), Aix-en-Provence, France
| | - Luc Arnal
- Université Paris Cité, Institut Pasteur, AP-HP, Inserm, Fondation Pour l’Audition, Institut de l’Audition, IHU reConnect, F-75012 Paris, France
| | - Anne-Lise Giraud
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Université Paris Cité, Institut Pasteur, AP-HP, Inserm, Fondation Pour l’Audition, Institut de l’Audition, IHU reConnect, F-75012 Paris, France
| |
Collapse
|
23
|
Kasten FH, Busson Q, Zoefel B. Opposing neural processing modes alternate rhythmically during sustained auditory attention. Commun Biol 2024; 7:1125. [PMID: 39266696 PMCID: PMC11393317 DOI: 10.1038/s42003-024-06834-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2023] [Accepted: 09/03/2024] [Indexed: 09/14/2024] Open
Abstract
During continuous tasks, humans show spontaneous fluctuations in performance, putatively caused by varying attentional resources allocated to process external information. If neural resources are used to process other, presumably "internal" information, sensory input can be missed, which may explain the apparent dichotomy of "internal" versus "external" attention. In the current study, we extract presumed neural signatures of these attentional modes in human electroencephalography (EEG): neural entrainment and α-oscillations (~10 Hz), linked to the processing and suppression of sensory information, respectively. We test whether they exhibit structured fluctuations over time, while listeners attend to an ecologically relevant stimulus, like speech, and complete a task that requires full and continuous attention. Results show an antagonistic relation between neural entrainment to speech and spontaneous α-oscillations in two distinct brain networks: one specialized in the processing of external information, the other reminiscent of the dorsal attention network. These opposing neural modes undergo slow, periodic fluctuations around ~0.07 Hz and are related to the detection of auditory targets. Our study might have tapped into a general attentional mechanism that is conserved across species and has important implications for situations in which sustained attention to sensory information is critical.
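One way such slow fluctuations of α-power can be quantified is sketched below: extract the 8-12 Hz power envelope and take the spectrum of that envelope to look for peaks near ~0.07 Hz. The synthetic signal embeds the modulation by construction, and the band edges and window lengths are assumptions rather than the study's settings.

```python
# Sketch: spectrum of the alpha-band power envelope to reveal infra-slow fluctuations.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert, welch

fs = 250
t = np.arange(fs * 600) / fs                      # 10 minutes of toy data
rng = np.random.default_rng(6)
slow_mod = 1 + 0.5 * np.sin(2 * np.pi * 0.07 * t) # built-in 0.07 Hz modulation
eeg = slow_mod * np.sin(2 * np.pi * 10 * t) + rng.normal(scale=1.0, size=t.size)

b, a = butter(3, [8 / (fs / 2), 12 / (fs / 2)], btype="band")
alpha_power = np.abs(hilbert(filtfilt(b, a, eeg))) ** 2

f, pxx = welch(alpha_power, fs=fs, nperseg=fs * 60)  # 60 s windows for infra-slow resolution
band = (f > 0.02) & (f < 0.5)
peak_f = f[band][np.argmax(pxx[band])]
print(f"alpha-power fluctuation peak near {peak_f:.3f} Hz")
```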
Collapse
Affiliation(s)
- Florian H Kasten
- Department for Cognitive, Affective, Behavioral Neuroscience with Focus Neurostimulation, Institute of Psychology, University of Trier, Trier, Germany.
- Centre de Recherche Cerveau & Cognition, CNRS, Toulouse, France.
- Université Toulouse III Paul Sabatier, Toulouse, France.
| | | | - Benedikt Zoefel
- Centre de Recherche Cerveau & Cognition, CNRS, Toulouse, France.
- Université Toulouse III Paul Sabatier, Toulouse, France.
| |
Collapse
|
24
|
Çetinçelik M, Jordan-Barros A, Rowland CF, Snijders TM. The effect of visual speech cues on neural tracking of speech in 10-month-old infants. Eur J Neurosci 2024; 60:5381-5399. [PMID: 39188179 DOI: 10.1111/ejn.16492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2024] [Revised: 07/04/2024] [Accepted: 07/20/2024] [Indexed: 08/28/2024]
Abstract
While infants' sensitivity to visual speech cues and the benefit of these cues have been well-established by behavioural studies, there is little evidence on the effect of visual speech cues on infants' neural processing of continuous auditory speech. In this study, we investigated whether visual speech cues, such as the movements of the lips, jaw, and larynx, facilitate infants' neural speech tracking. Ten-month-old Dutch-learning infants watched videos of a speaker reciting passages in infant-directed speech while electroencephalography (EEG) was recorded. In the videos, either the full face of the speaker was displayed or the speaker's mouth and jaw were masked with a block, obstructing the visual speech cues. To assess neural tracking, speech-brain coherence (SBC) was calculated, focusing particularly on the stress and syllabic rates (1-1.75 and 2.5-3.5 Hz, respectively, in our stimuli). First, overall SBC was compared to surrogate data; then, differences in SBC between the two conditions were tested at the frequencies of interest. Our results indicated that infants show significant tracking at both stress and syllabic rates. However, no differences were identified between the two conditions, meaning that infants' neural tracking was not modulated further by the presence of visual speech cues. Furthermore, we demonstrated that infants' neural tracking of low-frequency information is related to their subsequent vocabulary development at 18 months. Overall, this study provides evidence that infants' neural tracking of speech is not necessarily impaired when visual speech cues are not fully visible and that neural tracking may be a potential mechanism in successful language acquisition.
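A simplified sketch of a speech-brain coherence analysis of this kind: coherence between a neural channel and the speech envelope, compared against circularly shifted surrogates, with band averages taken at the stress (1-1.75 Hz) and syllabic (2.5-3.5 Hz) rates mentioned above. The toy data, window lengths, and surrogate count are assumptions, not the study's pipeline.

```python
# Sketch of speech-brain coherence (SBC) with a circular-shift surrogate test.
import numpy as np
from scipy.signal import coherence

fs = 100
n = fs * 120
rng = np.random.default_rng(1)
envelope = rng.normal(size=n)
eeg = 0.3 * envelope + rng.normal(size=n)          # toy EEG partially tracking the envelope

f, sbc = coherence(eeg, envelope, fs=fs, nperseg=fs * 4)

n_surr = 200                                       # surrogate distribution via circular shifts
surr = np.empty((n_surr, len(f)))
for i in range(n_surr):
    shifted = np.roll(envelope, rng.integers(fs * 5, n - fs * 5))
    surr[i] = coherence(eeg, shifted, fs=fs, nperseg=fs * 4)[1]

for lo, hi, label in [(1.0, 1.75, "stress"), (2.5, 3.5, "syllable")]:
    band = (f >= lo) & (f <= hi)
    obs = sbc[band].mean()
    p = (surr[:, band].mean(axis=1) >= obs).mean()
    print(f"{label} rate: SBC = {obs:.3f}, surrogate p = {p:.3f}")
```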
Collapse
Affiliation(s)
- Melis Çetinçelik
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Department of Experimental Psychology, Utrecht University, Utrecht, The Netherlands
- Cognitive Neuropsychology Department, Tilburg University, Tilburg, The Netherlands
| | - Antonia Jordan-Barros
- Centre for Brain and Cognitive Development, Department of Psychological Science, Birkbeck, University of London, London, UK
- Experimental Psychology, University College London, London, UK
| | - Caroline F Rowland
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Tineke M Snijders
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Cognitive Neuropsychology Department, Tilburg University, Tilburg, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| |
Collapse
|
25
|
Zoefel B, Abbasi O, Gross J, Kotz SA. Entrainment echoes in the cerebellum. Proc Natl Acad Sci U S A 2024; 121:e2411167121. [PMID: 39136991 PMCID: PMC11348099 DOI: 10.1073/pnas.2411167121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2024] [Accepted: 07/05/2024] [Indexed: 08/29/2024] Open
Abstract
Evidence accumulates that the cerebellum's role in the brain is not restricted to motor functions. Rather, cerebellar activity seems to be crucial for a variety of tasks that rely on precise event timing and prediction. Due to its complex structure and importance in communication, human speech requires a particularly precise and predictive coordination of neural processes to be successfully comprehended. Recent studies proposed that the cerebellum is indeed a major contributor to speech processing, but how this contribution is achieved mechanistically remains poorly understood. The current study aimed to reveal a mechanism underlying cortico-cerebellar coordination and demonstrate its speech-specificity. In a reanalysis of magnetoencephalography data, we found that activity in the cerebellum aligned to rhythmic sequences of noise-vocoded speech, irrespective of its intelligibility. We then tested whether these "entrained" responses persist, and how they interact with other brain regions, when a rhythmic stimulus stopped and temporal predictions had to be updated. We found that only intelligible speech produced sustained rhythmic responses in the cerebellum. During this "entrainment echo," but not during rhythmic speech itself, cerebellar activity was coupled with that in the left inferior frontal gyrus, and specifically at rates corresponding to the preceding stimulus rhythm. This finding represents evidence for specific cerebellum-driven temporal predictions in speech processing and their relay to cortical regions.
Collapse
Affiliation(s)
- Benedikt Zoefel
- Centre de Recherche Cerveau et Cognition, CNRS, Toulouse 31100, France
- Université Paul Sabatier Toulouse III, Toulouse 31400, France
| | - Omid Abbasi
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster 48149, Germany
| | - Joachim Gross
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster 48149, Germany
- Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster 48149, Germany
| | - Sonja A. Kotz
- Department of Neuropsychology and Psychopharmacology, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht 6229, the Netherlands
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany
| |
Collapse
|
26
|
Issa MF, Khan I, Ruzzoli M, Molinaro N, Lizarazu M. On the speech envelope in the cortical tracking of speech. Neuroimage 2024; 297:120675. [PMID: 38885886 DOI: 10.1016/j.neuroimage.2024.120675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2024] [Revised: 06/05/2024] [Accepted: 06/06/2024] [Indexed: 06/20/2024] Open
Abstract
The synchronization between the speech envelope and neural activity in auditory regions, referred to as cortical tracking of speech (CTS), plays a key role in speech processing. The method selected for extracting the envelope is a crucial step in CTS measurement, and the absence of a consensus on best practices among the various methods can influence analysis outcomes and interpretation. Here, we systematically compare five standard envelope extraction methods (the absolute value of the Hilbert transform (absHilbert), gammatone filterbanks, a heuristic approach, the Bark scale, and vocalic energy), analyzing their impact on the CTS. We present performance metrics for each method based on the recording of brain activity from participants listening to speech in clear and noisy conditions, utilizing intracranial EEG, MEG and EEG data. As expected, we observed significant CTS in temporal brain regions below 10 Hz across all datasets, regardless of the extraction method. In general, the gammatone filterbanks approach consistently demonstrated superior performance compared to other methods. Results from our study can guide scientists in the field to make informed decisions about the optimal analysis to extract the CTS, contributing to advancing the understanding of the neuronal mechanisms implicated in CTS.
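For concreteness, a minimal sketch of the absHilbert method listed above: take the absolute value of the analytic signal, low-pass it, and downsample to the neural sampling rate. The cut-off frequency and output rate below are illustrative choices, not those used in the study.

```python
# Sketch of absHilbert envelope extraction for CTS analyses.
import numpy as np
from scipy.signal import hilbert, butter, filtfilt, resample

def abs_hilbert_envelope(audio, fs_audio, fs_out=100, lp_cutoff=10.0):
    envelope = np.abs(hilbert(audio))                     # broadband amplitude envelope
    b, a = butter(3, lp_cutoff / (fs_audio / 2), btype="low")
    envelope = filtfilt(b, a, envelope)                   # keep slow (< ~10 Hz) modulations
    n_out = int(len(audio) * fs_out / fs_audio)
    return resample(envelope, n_out)                      # match the EEG/MEG sampling rate

# Toy usage with a synthetic amplitude-modulated tone
fs = 16000
t = np.arange(fs * 2) / fs
audio = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 440 * t)
env = abs_hilbert_envelope(audio, fs)
print(env.shape)   # (200,) samples at 100 Hz
```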
Collapse
Affiliation(s)
- Mohamed F Issa
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain; Department of Scientific Computing, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt.
| | - Izhar Khan
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain
| | - Manuela Ruzzoli
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain
| | - Nicola Molinaro
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain
| | - Mikel Lizarazu
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain
| |
Collapse
|
27
|
Chalas N, Meyer L, Lo CW, Park H, Kluger DS, Abbasi O, Kayser C, Nitsch R, Gross J. Dissociating prosodic from syntactic delta activity during natural speech comprehension. Curr Biol 2024; 34:3537-3549.e5. [PMID: 39047734 DOI: 10.1016/j.cub.2024.06.072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2024] [Revised: 06/24/2024] [Accepted: 06/27/2024] [Indexed: 07/27/2024]
Abstract
Decoding human speech requires the brain to segment the incoming acoustic signal into meaningful linguistic units, ranging from syllables and words to phrases. Integrating these linguistic constituents into a coherent percept sets the root of compositional meaning and hence understanding. Important cues for segmentation in natural speech are prosodic cues, such as pauses, but their interplay with higher-level linguistic processing is still unknown. Here, we dissociate the neural tracking of prosodic pauses from the segmentation of multi-word chunks using magnetoencephalography (MEG). We find that manipulating the regularity of pauses disrupts slow speech-brain tracking bilaterally in auditory areas (below 2 Hz) and in turn increases left-lateralized coherence of higher-frequency auditory activity at speech onsets (around 25-45 Hz). Critically, we also find that multi-word chunks, defined as short, coherent bundles of inter-word dependencies, are processed through the rhythmic fluctuations of low-frequency activity (below 2 Hz) bilaterally and independently of prosodic cues. Importantly, low-frequency alignment at chunk onsets increases the accuracy of an encoding model in bilateral auditory and frontal areas while controlling for the effect of acoustics. Our findings provide novel insights into the neural basis of speech perception, demonstrating that both acoustic features (prosodic cues) and abstract linguistic processing at the multi-word timescale are underpinned independently by low-frequency electrophysiological brain activity in the delta frequency range.
Collapse
Affiliation(s)
- Nikos Chalas
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany; Institute for Translational Neuroscience, University of Münster, Münster, Germany.
| | - Lars Meyer
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Chia-Wen Lo
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Hyojin Park
- Centre for Human Brain Health (CHBH), School of Psychology, University of Birmingham, Birmingham, UK
| | - Daniel S Kluger
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
| | - Omid Abbasi
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
| | - Christoph Kayser
- Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
| | - Robert Nitsch
- Institute for Translational Neuroscience, University of Münster, Münster, Germany
| | - Joachim Gross
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
| |
Collapse
|
28
|
Cometa A, Battaglini C, Artoni F, Greco M, Frank R, Repetto C, Bottoni F, Cappa SF, Micera S, Ricciardi E, Moro A. Brain and grammar: revealing electrophysiological basic structures with competing statistical models. Cereb Cortex 2024; 34:bhae317. [PMID: 39098819 DOI: 10.1093/cercor/bhae317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2024] [Revised: 07/08/2024] [Accepted: 07/24/2024] [Indexed: 08/06/2024] Open
Abstract
Acoustic, lexical, and syntactic information are processed simultaneously in the brain, requiring complex strategies to distinguish their electrophysiological activity. Capitalizing on previous work that factors out acoustic information, we concentrated on the lexical and syntactic contributions to language processing by testing competing statistical models. We exploited electroencephalographic recordings and compared different surprisal models selectively involving lexical information, part of speech, or syntactic structures in various combinations. Electroencephalographic responses were recorded in 32 participants while they listened to affirmative active declarative sentences. We compared the activation corresponding to basic syntactic structures, such as noun phrases vs. verb phrases. Lexical and syntactic processing activates different frequency bands, partially different time windows, and different networks. Moreover, surprisal models based only on the part-of-speech inventory do not explain the electrophysiological data well, whereas those including syntactic information do. By disentangling acoustic, lexical, and syntactic information, we demonstrated differential brain sensitivity to syntactic information. These results confirm and extend previous measures obtained with intracranial recordings, supporting our hypothesis that syntactic structures are crucial in neural language processing. This study provides a detailed understanding of how the brain processes syntactic information, highlighting the importance of syntactic surprisal in shaping neural responses during language comprehension.
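A toy sketch of the lexical surprisal quantity that such models assign to each word, surprisal(w) = -log2 P(w | context), here estimated from a tiny add-one-smoothed bigram model. The corpus, sentence, and smoothing scheme are purely illustrative and stand in for the richer statistical models compared in the study.

```python
# Sketch of word-by-word lexical surprisal from a smoothed bigram model.
from collections import Counter
import math

corpus = "the dog chased the cat the cat chased the mouse".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus[:-1], corpus[1:]))
vocab = len(unigrams)

def surprisal(prev, word):
    # add-one smoothing so unseen bigrams still get a finite probability
    p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab)
    return -math.log2(p)

sentence = "the dog chased the mouse".split()
for prev, word in zip(sentence[:-1], sentence[1:]):
    print(f"{word:>6}: {surprisal(prev, word):.2f} bits")
```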
Collapse
Affiliation(s)
- Andrea Cometa
- MoMiLab, IMT School for Advanced Studies Lucca, Piazza S.Francesco, 19, Lucca 55100, Italy
- The BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio 34, Pontedera 56025, Italy
- Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
| | - Chiara Battaglini
- Neurolinguistics and Experimental Pragmatics (NEP) Lab, University School for Advanced Studies IUSS Pavia, Piazza della Vittoria 15, Pavia 27100, Italy
| | - Fiorenzo Artoni
- Department of Clinical Neurosciences, Faculty of Medicine, University of Geneva, 1, rue Michel-Servet, Genéve 1211, Switzerland
| | - Matteo Greco
- Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
| | - Robert Frank
- Department of Linguistics, Yale University, 370 Temple St, New Haven, CT 06511, United States
| | - Claudia Repetto
- Department of Psychology, Università Cattolica del Sacro Cuore, Largo A. Gemelli 1, Milan 20123, Italy
| | - Franco Bottoni
- Istituto Clinico Humanitas, IRCCS, Via Alessandro Manzoni 56, Rozzano 20089, Italy
| | - Stefano F Cappa
- Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
- Dementia Research Center, IRCCS Mondino Foundation National Institute of Neurology, Via Mondino 2, Pavia 27100, Italy
| | - Silvestro Micera
- The BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio 34, Pontedera 56025, Italy
- Bertarelli Foundation Chair in Translational NeuroEngineering, Center for Neuroprosthetics and School of Engineering, Ecole Polytechnique Federale de Lausanne, Campus Biotech, Chemin des Mines 9, Geneva, GE CH 1202, Switzerland
| | - Emiliano Ricciardi
- MoMiLab, IMT School for Advanced Studies Lucca, Piazza S.Francesco, 19, Lucca 55100, Italy
| | - Andrea Moro
- Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
| |
Collapse
|
29
|
Iverson P, Song J. Neural Tracking of Speech Acoustics in Noise Is Coupled with Lexical Predictability as Estimated by Large Language Models. eNeuro 2024; 11:ENEURO.0507-23.2024. [PMID: 39095091 PMCID: PMC11335968 DOI: 10.1523/eneuro.0507-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2023] [Revised: 07/15/2024] [Accepted: 07/15/2024] [Indexed: 08/04/2024] Open
Abstract
Adults heard recordings of two spatially separated speakers reading newspaper and magazine articles. They were asked to listen to one of them and ignore the other, and EEG was recorded to assess their neural processing. Machine learning extracted neural sources that tracked the target and distractor speakers at three levels: the acoustic envelope of speech (delta- and theta-band modulations), lexical frequency for individual words, and the contextual predictability of individual words estimated by GPT-4 and earlier lexical models. To provide a broader view of speech perception, half of the subjects completed a simultaneous visual task, and the listeners included both native and non-native English speakers. Distinct neural components were extracted for these levels of auditory and lexical processing, demonstrating that native English speakers had greater target-distractor separation compared with non-native English speakers on most measures, and that lexical processing was reduced by the visual task. Moreover, there was a novel interaction of lexical predictability and frequency with auditory processing; acoustic tracking was stronger for lexically harder words, suggesting that people listened harder to the acoustics when needed for lexical selection. This demonstrates that speech perception is not simply a feedforward process from acoustic processing to the lexicon. Rather, the adaptable context-sensitive processing long known to occur at a lexical level has broader consequences for perception, coupling with the acoustic tracking of individual speakers in noise.
Collapse
Affiliation(s)
- Paul Iverson
- Department of Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PF, United Kingdom
| | - Jieun Song
- School of Digital Humanities and Computational Social Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
| |
Collapse
|
30
|
Teng X, Larrouy-Maestri P, Poeppel D. Segmenting and Predicting Musical Phrase Structure Exploits Neural Gain Modulation and Phase Precession. J Neurosci 2024; 44:e1331232024. [PMID: 38926087 PMCID: PMC11270514 DOI: 10.1523/jneurosci.1331-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2023] [Revised: 05/29/2024] [Accepted: 06/11/2024] [Indexed: 06/28/2024] Open
Abstract
Music, like spoken language, is often characterized by hierarchically organized structure. Previous experiments have shown neural tracking of notes and beats, but little work touches on the more abstract question: how does the brain establish high-level musical structures in real time? We presented Bach chorales to participants (20 females and 9 males) undergoing electroencephalogram (EEG) recording to investigate how the brain tracks musical phrases. We removed the main temporal cues to phrasal structures, so that listeners could only rely on harmonic information to parse a continuous musical stream. Phrasal structures were disrupted by locally or globally reversing the harmonic progression, so that our observations on the original music could be controlled and compared. We first replicated the findings on neural tracking of musical notes and beats, substantiating the positive correlation between musical training and neural tracking. Critically, we discovered a neural signature in the frequency range ∼0.1 Hz (modulations of EEG power) that reliably tracks musical phrasal structure. Next, we developed an approach to quantify the phrasal phase precession of the EEG power, revealing that phrase tracking is indeed an operation of active segmentation involving predictive processes. We demonstrate that the brain establishes complex musical structures online over long timescales (>5 s) and actively segments continuous music streams in a manner comparable to language processing. These two neural signatures, phrase tracking and phrasal phase precession, provide new conceptual and technical tools to study the processes underpinning high-level structure building using noninvasive recording techniques.
Collapse
Affiliation(s)
- Xiangbin Teng
- Department of Psychology, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
| | - Pauline Larrouy-Maestri
- Music Department, Max-Planck-Institute for Empirical Aesthetics, Frankfurt 60322, Germany
- Center for Language, Music, and Emotion (CLaME), New York, New York 10003
| | - David Poeppel
- Center for Language, Music, and Emotion (CLaME), New York, New York 10003
- Department of Psychology, New York University, New York, New York 10003
- Ernst Struengmann Institute for Neuroscience, Frankfurt 60528, Germany
- Music and Audio Research Laboratory (MARL), New York, New York 11201
| |
Collapse
|
31
|
Pérez-Navarro J, Klimovich-Gray A, Lizarazu M, Piazza G, Molinaro N, Lallier M. Early language experience modulates the tradeoff between acoustic-temporal and lexico-semantic cortical tracking of speech. iScience 2024; 27:110247. [PMID: 39006483 PMCID: PMC11246002 DOI: 10.1016/j.isci.2024.110247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2023] [Revised: 03/14/2024] [Accepted: 06/07/2024] [Indexed: 07/16/2024] Open
Abstract
Cortical tracking of speech is relevant for the development of speech perception skills. However, no study to date has explored whether and how cortical tracking of speech is shaped by accumulated language experience, the central question of this study. In 35 six-year-old bilingual children with considerably greater experience in one of their two languages, we collected electroencephalography data while they listened to continuous speech in both languages. Cortical tracking of speech was assessed at acoustic-temporal and lexico-semantic levels. Children showed more robust acoustic-temporal tracking in the least experienced language, and more sensitive cortical tracking of semantic information in the most experienced language. Additionally, and only for the most experienced language, acoustic-temporal tracking was specifically linked to phonological abilities, and lexico-semantic tracking to vocabulary knowledge. Our results indicate that accumulated linguistic experience is a relevant maturational factor for the cortical tracking of speech at different levels during early language acquisition.
Collapse
Affiliation(s)
- Jose Pérez-Navarro
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
| | | | - Mikel Lizarazu
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
| | - Giorgio Piazza
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
| | - Nicola Molinaro
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
- Ikerbasque, Basque Foundation for Science, 48009 Bilbao, Spain
| | - Marie Lallier
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
| |
Collapse
|
32
|
Xiao Q, Zheng X, Wen Y, Yuan Z, Chen Z, Lan Y, Li S, Huang X, Zhong H, Xu C, Zhan C, Pan J, Xie Q. Individualized music induces theta-gamma phase-amplitude coupling in patients with disorders of consciousness. Front Neurosci 2024; 18:1395627. [PMID: 39010944 PMCID: PMC11248187 DOI: 10.3389/fnins.2024.1395627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2024] [Accepted: 06/18/2024] [Indexed: 07/17/2024] Open
Abstract
Objective This study aimed to determine whether patients with disorders of consciousness (DoC) could experience neural entrainment to individualized music, and explored the cross-modal influences of music on these patients through phase-amplitude coupling (PAC). Furthermore, the study assessed the efficacy of individualized, preferred music (PM) versus relaxing music (RM) in influencing patient outcomes, and examined the role of cross-modal influences in determining these outcomes. Methods Thirty-two patients with DoC [17 with vegetative state/unresponsive wakefulness syndrome (VS/UWS) and 15 with minimally conscious state (MCS)], alongside 16 healthy controls (HCs), were recruited for this study. Neural activities in the frontal-parietal network were recorded using scalp electroencephalography (EEG) during baseline (BL), RM and PM. Cerebral-acoustic coherence (CACoh) was computed to assess participants' ability to track the music; meanwhile, PAC was used to evaluate the cross-modal influences of music. Three months post-intervention, the outcomes of patients with DoC were followed up using the Coma Recovery Scale-Revised (CRS-R). Results HCs and patients with MCS showed higher CACoh than VS/UWS patients at the musical pulse frequency (p = 0.016, p = 0.045; p < 0.001, p = 0.048, for RM and PM, respectively, following Bonferroni correction). Only theta-gamma PAC demonstrated a significant interaction effect between groups and music conditions (F (2,44) = 2.685, p = 0.036). For HCs, the theta-gamma PAC in the frontal-parietal network was stronger in the PM condition than in the RM (p = 0.016) and BL conditions (p < 0.001). For patients with MCS, the theta-gamma PAC was stronger in PM than in BL (p = 0.040), while no difference was observed among the three music conditions in patients with VS/UWS. Additionally, we found that MCS patients who showed improved outcomes after 3 months exhibited evident neural responses to preferred music (p = 0.019). Furthermore, the ratio of theta-gamma coupling changes in PM relative to BL could predict clinical outcomes in MCS patients (r = 0.992, p < 0.001). Conclusion Individualized music may serve as a potential therapeutic method for patients with DoC through cross-modal influences, which rely on enhanced theta-gamma PAC within the consciousness-related network.
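A hedged sketch of the theta-gamma phase-amplitude coupling measure referred to above, using the mean-vector-length approach: theta phase (4-8 Hz) and gamma amplitude (30-45 Hz) are combined into a single coupling value. The band edges, filter order, and synthetic signal are assumptions rather than the study's exact settings.

```python
# Sketch of theta-gamma PAC via the mean-vector-length (MVL) index.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, fs, lo, hi):
    b, a = butter(3, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def pac_mvl(x, fs, phase_band=(4, 8), amp_band=(30, 45)):
    phase = np.angle(hilbert(bandpass(x, fs, *phase_band)))  # theta phase
    amp = np.abs(hilbert(bandpass(x, fs, *amp_band)))        # gamma amplitude
    return np.abs(np.mean(amp * np.exp(1j * phase)))         # mean vector length

# Toy signal in which gamma bursts ride on the theta peaks (coupling present)
fs = 500
t = np.arange(fs * 20) / fs
theta = np.sin(2 * np.pi * 6 * t)
gamma = 0.3 * (1 + theta) * np.sin(2 * np.pi * 40 * t)
eeg = theta + gamma + 0.5 * np.random.default_rng(2).normal(size=t.size)
print(f"theta-gamma MVL: {pac_mvl(eeg, fs):.4f}")
```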
Collapse
Affiliation(s)
- Qiuyi Xiao
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
| | - Xiaochun Zheng
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
| | - Yun Wen
- Music and Reflection Incorporated, Guangzhou, Guangdong, China
| | - Zhanxing Yuan
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
| | - Zerong Chen
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
| | - Yue Lan
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
| | - Shuiyan Li
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
| | - Xiyan Huang
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
| | - Haili Zhong
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
| | - Chengwei Xu
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
| | - Chang'an Zhan
- Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, Guangdong, China
| | - Jiahui Pan
- School of Software, South China Normal University, Guangzhou, Guangdong, China
| | - Qiuyou Xie
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
- Department of Hyperbaric Oxygen, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Rehabilitation Sciences, Southern Medical University, Guangzhou, Guangdong, China
| |
Collapse
|
33
|
Baus C, Millan I, Chen XJ, Blanco-Elorrieta E. Exploring the Interplay Between Language Comprehension and Cortical Tracking: The Bilingual Test Case. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2024; 5:484-496. [PMID: 38911463 PMCID: PMC11192516 DOI: 10.1162/nol_a_00141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Accepted: 03/04/2024] [Indexed: 06/25/2024]
Abstract
Cortical tracking, the synchronization of brain activity to linguistic rhythms, is a well-established phenomenon. However, its nature has been heavily contested: Is it purely epiphenomenal or does it play a fundamental role in speech comprehension? Previous research has used intelligibility manipulations to examine this topic. Here, we instead varied listeners' language comprehension skills while keeping the auditory stimulus constant. To do so, we tested 22 native English speakers and 22 Spanish/Catalan bilinguals learning English as a second language (SL) in an EEG cortical entrainment experiment and correlated the responses with the magnitude of the N400 component of a semantic comprehension task. As expected, native listeners effectively tracked sentential, phrasal, and syllabic linguistic structures. In contrast, SL listeners exhibited limitations in tracking sentential structures but successfully tracked phrasal and syllabic rhythms. Importantly, the amplitude of the neural entrainment correlated with the amplitude of the detection of semantic incongruities in SL listeners, showing a direct connection between tracking and the ability to understand speech. Together, these findings shed light on the interplay between language comprehension and cortical tracking and identify neural entrainment as a fundamental principle of speech comprehension.
Collapse
Affiliation(s)
- Cristina Baus
- Department of Cognition, Development and Educational Psychology, University of Barcelona, Barcelona, Spain
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain
| | | | | | - Esti Blanco-Elorrieta
- Department of Psychology, New York University, New York, NY, USA
- Department of Neural Science, New York University, New York, NY, USA
| |
Collapse
|
34
|
Kries J, De Clercq P, Gillis M, Vanthornhout J, Lemmens R, Francart T, Vandermosten M. Exploring neural tracking of acoustic and linguistic speech representations in individuals with post-stroke aphasia. Hum Brain Mapp 2024; 45:e26676. [PMID: 38798131 PMCID: PMC11128780 DOI: 10.1002/hbm.26676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2023] [Revised: 03/04/2024] [Accepted: 03/21/2024] [Indexed: 05/29/2024] Open
Abstract
Aphasia is a communication disorder that affects processing of language at different levels (e.g., acoustic, phonological, semantic). Recording brain activity via electroencephalography (EEG) while people listen to a continuous story allows analysis of brain responses to acoustic and linguistic properties of speech. When the neural activity aligns with these speech properties, it is referred to as neural tracking. Even though measuring neural tracking of speech may present an interesting approach to studying aphasia in an ecologically valid way, it has not yet been investigated in individuals with stroke-induced aphasia. Here, we explored processing of acoustic and linguistic speech representations in individuals with aphasia in the chronic phase after stroke and age-matched healthy controls. We found decreased neural tracking of acoustic speech representations (envelope and envelope onsets) in individuals with aphasia. In addition, word surprisal displayed decreased amplitudes in individuals with aphasia around 195 ms over frontal electrodes, although this effect was not corrected for multiple comparisons. These results show that there is potential to capture language processing impairments in individuals with aphasia by measuring neural tracking of continuous speech. However, more research is needed to validate these results. Nonetheless, this exploratory study shows that neural tracking of naturalistic, continuous speech presents a powerful approach to studying aphasia.
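Neural tracking of this kind is often quantified with a lagged linear (temporal response function-style) model. Below is a minimal sketch of a forward encoding model, ridge regression from the speech envelope to EEG, scored by the correlation between predicted and recorded signals; the lag range, regularisation value, and toy data are assumptions, not the study's actual pipeline.

```python
# Sketch of a forward encoding (TRF-style) model: envelope -> EEG via lagged ridge regression.
import numpy as np
from numpy.linalg import solve

def lag_matrix(stim, lags):
    X = np.zeros((len(stim), len(lags)))
    for j, lag in enumerate(lags):
        X[lag:, j] = stim[:len(stim) - lag]   # column j holds stim delayed by `lag` samples
    return X

fs = 64
rng = np.random.default_rng(3)
envelope = rng.normal(size=fs * 300)
true_trf = np.exp(-np.arange(25) / 6.0)       # toy impulse response
eeg = np.convolve(envelope, true_trf, mode="full")[:len(envelope)]
eeg += rng.normal(scale=2.0, size=eeg.size)

lags = np.arange(0, int(0.4 * fs))            # 0-400 ms of lags
X = lag_matrix(envelope, lags)
split = len(envelope) // 2
X_tr, X_te, y_tr, y_te = X[:split], X[split:], eeg[:split], eeg[split:]

lam = 1e2                                      # ridge regularisation (assumed)
w = solve(X_tr.T @ X_tr + lam * np.eye(X_tr.shape[1]), X_tr.T @ y_tr)
pred = X_te @ w
print(f"prediction accuracy r = {np.corrcoef(pred, y_te)[0, 1]:.3f}")
```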
Collapse
Affiliation(s)
- Jill Kries
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
- Department of Psychology, Stanford University, Stanford, California, USA
| | - Pieter De Clercq
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
| | - Marlies Gillis
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
| | - Jonas Vanthornhout
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
| | - Robin Lemmens
- Experimental Neurology, Department of Neurosciences, KU Leuven, Leuven, Belgium
- Laboratory of Neurobiology, VIB-KU Leuven Center for Brain and Disease Research, Leuven, Belgium
- Department of Neurology, University Hospitals Leuven, Leuven, Belgium
| | - Tom Francart
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
| | - Maaike Vandermosten
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
| |
Collapse
|
35
|
Nora A, Rinkinen O, Renvall H, Service E, Arkkila E, Smolander S, Laasonen M, Salmelin R. Impaired Cortical Tracking of Speech in Children with Developmental Language Disorder. J Neurosci 2024; 44:e2048232024. [PMID: 38589232 PMCID: PMC11140678 DOI: 10.1523/jneurosci.2048-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2023] [Revised: 03/25/2024] [Accepted: 03/26/2024] [Indexed: 04/10/2024] Open
Abstract
In developmental language disorder (DLD), learning to comprehend and express oneself with spoken language is impaired, but the reason for this remains unknown. Using millisecond-scale magnetoencephalography recordings combined with machine learning models, we investigated whether the possible neural basis of this disruption lies in poor cortical tracking of speech. The stimuli were common spoken Finnish words (e.g., dog, car, hammer) and sounds with corresponding meanings (e.g., dog bark, car engine, hammering). In both children with DLD (10 boys and 7 girls) and typically developing (TD) control children (14 boys and 3 girls), aged 10-15 years, the cortical activation to spoken words was best modeled as time-locked to the unfolding speech input at ∼100 ms latency between sound and cortical activation. Amplitude envelope (amplitude changes) and spectrogram (detailed time-varying spectral content) of the spoken words, but not other sounds, were very successfully decoded based on time-locked brain responses in bilateral temporal areas; based on the cortical responses, the models could tell at ∼75-85% accuracy which of the two sounds had been presented to the participant. However, the cortical representation of the amplitude envelope information was poorer in children with DLD compared with TD children at longer latencies (at ∼200-300 ms lag). We interpret this effect as reflecting poorer retention of acoustic-phonetic information in short-term memory. This impaired tracking could potentially affect the processing and learning of words as well as continuous speech. The present results offer an explanation for the problems in language comprehension and acquisition in DLD.
Collapse
Affiliation(s)
- Anni Nora
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland
- Aalto NeuroImaging (ANI), Aalto University, Espoo FI-00076, Finland
| | - Oona Rinkinen
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland
- Aalto NeuroImaging (ANI), Aalto University, Espoo FI-00076, Finland
| | - Hanna Renvall
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland
- Aalto NeuroImaging (ANI), Aalto University, Espoo FI-00076, Finland
- BioMag Laboratory, HUS Diagnostic Center, Helsinki University Hospital, Helsinki FI-00029, Finland
| | - Elisabet Service
- Department of Linguistics and Languages, Centre for Advanced Research in Experimental and Applied Linguistics (ARiEAL), McMaster University, Hamilton, Ontario L8S 4L8, Canada
- Department of Psychology and Logopedics, University of Helsinki, Helsinki FI-00014, Finland
| | - Eva Arkkila
- Department of Otorhinolaryngology and Phoniatrics, Head and Neck Center, Helsinki University Hospital and University of Helsinki, Helsinki FI-00014, Finland
| | - Sini Smolander
- Department of Otorhinolaryngology and Phoniatrics, Head and Neck Center, Helsinki University Hospital and University of Helsinki, Helsinki FI-00014, Finland
- Research Unit of Logopedics, University of Oulu, Oulu FI-90014, Finland
- Department of Logopedics, University of Eastern Finland, Joensuu FI-80101, Finland
| | - Marja Laasonen
- Department of Otorhinolaryngology and Phoniatrics, Head and Neck Center, Helsinki University Hospital and University of Helsinki, Helsinki FI-00014, Finland
- Department of Logopedics, University of Eastern Finland, Joensuu FI-80101, Finland
| | - Riitta Salmelin
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland
- Aalto NeuroImaging (ANI), Aalto University, Espoo FI-00076, Finland
| |
Collapse
|
36
|
Aldag N, Nogueira W. Psychoacoustic and electroencephalographic responses to changes in amplitude modulation depth and frequency in relation to speech recognition in cochlear implantees. Sci Rep 2024; 14:8181. [PMID: 38589483 PMCID: PMC11002021 DOI: 10.1038/s41598-024-58225-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2023] [Accepted: 03/26/2024] [Indexed: 04/10/2024] Open
Abstract
Temporal envelope modulations (TEMs) are one of the most important features that cochlear implant (CI) users rely on to understand speech. Electroencephalographic assessment of TEM encoding could help clinicians to predict speech recognition more objectively, even in patients unable to provide active feedback. The acoustic change complex (ACC) and the auditory steady-state response (ASSR) evoked by low-frequency amplitude-modulated pulse trains can be used to assess TEM encoding with electrical stimulation of individual CI electrodes. In this study, we focused on amplitude modulation detection (AMD) and amplitude modulation frequency discrimination (AMFD) with stimulation of a basal versus an apical electrode. In twelve adult CI users, we (a) assessed behavioral AMFD thresholds and (b) recorded cortical auditory evoked potentials (CAEPs), AMD-ACC, AMFD-ACC, and ASSR in a combined 3-stimulus paradigm. We found that the electrophysiological responses were significantly higher for apical than for basal stimulation. Peak amplitudes of AMFD-ACC were small and (therefore) did not correlate with speech-in-noise recognition. We found significant correlations between speech-in-noise recognition and (a) behavioral AMFD thresholds and (b) AMD-ACC peak amplitudes. AMD and AMFD hold potential to develop a clinically applicable tool for assessing TEM encoding to predict speech recognition in CI users.
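To make the stimulus manipulation concrete, here is a small sketch that generates sinusoidally amplitude-modulated signals of the kind used for AMD and AMFD testing, with an acoustic carrier standing in for the electrical pulse train; the carrier frequency, modulation rates, and modulation depth are illustrative assumptions.

```python
# Sketch of sinusoidally amplitude-modulated stimuli for AMD/AMFD paradigms.
import numpy as np

def am_signal(fs, dur, fc=1000.0, fm=40.0, depth=1.0):
    t = np.arange(int(fs * dur)) / fs
    envelope = 1.0 + depth * np.sin(2 * np.pi * fm * t)   # modulation depth m in [0, 1]
    return envelope * np.sin(2 * np.pi * fc * t)

fs = 16000
ref = am_signal(fs, 0.5, fm=40.0, depth=0.5)   # reference: 40 Hz AM, m = 0.5
dev = am_signal(fs, 0.5, fm=46.0, depth=0.5)   # deviant: higher AM rate (AMFD contrast)
unmod = am_signal(fs, 0.5, depth=0.0)          # unmodulated carrier (AMD contrast)
print(ref.shape, dev.shape, unmod.shape)
```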
Collapse
Affiliation(s)
- Nina Aldag
- Department of Otolaryngology, Hannover Medical School and Cluster of Excellence 'Hearing4all', Hanover, Germany
| | - Waldo Nogueira
- Department of Otolaryngology, Hannover Medical School and Cluster of Excellence 'Hearing4all', Hanover, Germany.
| |
Collapse
|
37
|
Riddle J, Schooler JW. Hierarchical consciousness: the Nested Observer Windows model. Neurosci Conscious 2024; 2024:niae010. [PMID: 38504828 PMCID: PMC10949963 DOI: 10.1093/nc/niae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2023] [Revised: 01/31/2024] [Accepted: 02/26/2024] [Indexed: 03/21/2024] Open
Abstract
Foremost in our experience is the intuition that we possess a unified conscious experience. However, many observations run counter to this intuition: we experience paralyzing indecision when faced with two appealing behavioral choices, we simultaneously hold contradictory beliefs, and the content of our thought is often characterized by an internal debate. Here, we propose the Nested Observer Windows (NOW) Model, a framework for hierarchical consciousness wherein information processed across many spatiotemporal scales of the brain feeds into subjective experience. The model likens the mind to a hierarchy of nested mosaic tiles-where an image is composed of mosaic tiles, and each of these tiles is itself an image composed of mosaic tiles. Unitary consciousness exists at the apex of this nested hierarchy where perceptual constructs become fully integrated and complex behaviors are initiated via abstract commands. We define an observer window as a spatially and temporally constrained system within which information is integrated, e.g. in functional brain regions and neurons. Three principles from the signal analysis of electrical activity describe the nested hierarchy and generate testable predictions. First, nested observer windows disseminate information across spatiotemporal scales with cross-frequency coupling. Second, observer windows are characterized by a high degree of internal synchrony (with zero phase lag). Third, observer windows at the same spatiotemporal level share information with each other through coherence (with non-zero phase lag). The theoretical framework of the NOW Model accounts for a wide range of subjective experiences and a novel approach for integrating prominent theories of consciousness.
Collapse
Affiliation(s)
- Justin Riddle
- Department of Psychology, Florida State University, 1107 W Call St, Tallahassee, FL 32304, USA
| | - Jonathan W Schooler
- Department of Psychological & Brain Sciences, University of California, Santa Barbara, Psychological & Brain Sciences, Santa Barbara, CA 93106, USA
| |
Collapse
|
38
|
Corsini A, Tomassini A, Pastore A, Delis I, Fadiga L, D'Ausilio A. Speech perception difficulty modulates theta-band encoding of articulatory synergies. J Neurophysiol 2024; 131:480-491. [PMID: 38323331 DOI: 10.1152/jn.00388.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2023] [Revised: 01/04/2024] [Accepted: 01/25/2024] [Indexed: 02/08/2024] Open
Abstract
The human brain tracks available speech acoustics and extrapolates missing information such as the speaker's articulatory patterns. However, the extent to which articulatory reconstruction supports speech perception remains unclear. This study explores the relationship between articulatory reconstruction and task difficulty. Participants listened to sentences and performed a speech-rhyming task. Real kinematic data of the speaker's vocal tract were recorded via electromagnetic articulography (EMA) and aligned to corresponding acoustic outputs. We extracted articulatory synergies from the EMA data with principal component analysis (PCA) and employed partial information decomposition (PID) to separate the electroencephalographic (EEG) encoding of acoustic and articulatory features into unique, redundant, and synergistic atoms of information. We median-split sentences into easy (ES) and hard (HS) based on participants' performance and found that greater task difficulty involved greater encoding of unique articulatory information in the theta band. We conclude that fine-grained articulatory reconstruction plays a complementary role in the encoding of speech acoustics, lending further support to the claim that motor processes support speech perception.NEW & NOTEWORTHY Top-down processes originating from the motor system contribute to speech perception through the reconstruction of the speaker's articulatory movement. This study investigates the role of such articulatory simulation under variable task difficulty. We show that more challenging listening tasks lead to increased encoding of articulatory kinematics in the theta band and suggest that, in such situations, fine-grained articulatory reconstruction complements acoustic encoding.
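A brief sketch of the synergy-extraction step: principal component analysis applied to multichannel articulatory trajectories, with each component's time course serving as a synergy. The synthetic data below stand in for EMA recordings, and the channel and component counts are assumptions.

```python
# Sketch of extracting articulatory synergies from EMA-like kinematics with PCA.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
n_samples, n_channels = 5000, 12           # e.g., positions of several articulator sensors
latent = rng.normal(size=(n_samples, 3))   # three underlying coordinated movements
mixing = rng.normal(size=(3, n_channels))
ema = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_channels))

pca = PCA(n_components=5)
synergy_scores = pca.fit_transform(ema)    # time courses of the candidate synergies
print(np.round(pca.explained_variance_ratio_, 3))
```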
Collapse
Affiliation(s)
- Alessandro Corsini
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
| | - Alice Tomassini
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
| | - Aldo Pastore
- Laboratorio NEST, Scuola Normale Superiore, Pisa, Italy
| | - Ioannis Delis
- School of Biomedical Sciences, University of Leeds, Leeds, United Kingdom
| | - Luciano Fadiga
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
| | - Alessandro D'Ausilio
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
| |
Collapse
|
39
|
Momtaz S, Bidelman GM. Effects of Stimulus Rate and Periodicity on Auditory Cortical Entrainment to Continuous Sounds. eNeuro 2024; 11:ENEURO.0027-23.2024. [PMID: 38253583 PMCID: PMC10913036 DOI: 10.1523/eneuro.0027-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2023] [Revised: 01/14/2024] [Accepted: 01/16/2024] [Indexed: 01/24/2024] Open
Abstract
The neural mechanisms underlying exogenous coding of, and neural entrainment to, repetitive auditory stimuli have seen a recent surge of interest. However, few studies have characterized how parametric changes in stimulus presentation alter entrained responses. We examined the degree to which the brain entrains to repeated speech (i.e., /ba/) and nonspeech (i.e., click) sounds using phase-locking value (PLV) analysis applied to multichannel human electroencephalogram (EEG) data. Passive cortico-acoustic tracking was investigated in N = 24 normal young adults utilizing EEG source analyses that isolated neural activity stemming from both auditory temporal cortices. We parametrically manipulated the rate and periodicity of repetitive, continuous speech and click stimuli to investigate how speed and jitter in ongoing sound streams affect oscillatory entrainment. Neuronal synchronization to speech was enhanced at 4.5 Hz (the putative universal rate of speech) and showed a pattern distinct from that of clicks, particularly at higher rates. PLV to speech decreased with increasing jitter but remained superior to clicks. Surprisingly, PLV entrainment to clicks was invariant to periodicity manipulations. Our findings provide evidence that the brain's neural entrainment to complex sounds is enhanced and more sensitive when processing speech-like stimuli, even at the syllable level, relative to nonspeech sounds. The fact that this specialization is apparent even under passive listening suggests a priority of the auditory system for synchronizing to behaviorally relevant signals.
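A minimal sketch of a phase-locking value computation between a band-passed signal and a reference at the stimulus rate, assuming a 4.5 Hz presentation rate; the simulated single-channel signals simplify the study's multichannel, source-level analysis.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 500.0
t = np.arange(0, 20, 1 / fs)
rate = 4.5                                   # putative universal speech rate (Hz)
stim = np.cos(2 * np.pi * rate * t)          # reference train at the stimulus rate
eeg = np.cos(2 * np.pi * rate * t + 0.8) + np.random.randn(t.size)

b, a = butter(4, [rate - 1.0, rate + 1.0], btype="band", fs=fs)
eeg_band = filtfilt(b, a, eeg)

phase_diff = np.angle(hilbert(eeg_band)) - np.angle(hilbert(stim))
plv = np.abs(np.mean(np.exp(1j * phase_diff)))
print(f"PLV at {rate} Hz: {plv:.2f}")        # near 1 = strong locking, near 0 = none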
Collapse
Affiliation(s)
- Sara Momtaz
- School of Communication Sciences & Disorders, University of Memphis, Memphis, Tennessee 38152
- Boys Town National Research Hospital, Boys Town, Nebraska 68131
| | - Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, Indiana 47408
- Program in Neuroscience, Indiana University, Bloomington, Indiana 47405
| |
Collapse
|
40
|
Zoefel B, Kösem A. Neural tracking of continuous acoustics: properties, speech-specificity and open questions. Eur J Neurosci 2024; 59:394-414. [PMID: 38151889 DOI: 10.1111/ejn.16221] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2023] [Revised: 11/17/2023] [Accepted: 11/22/2023] [Indexed: 12/29/2023]
Abstract
Human speech is a particularly relevant acoustic stimulus for our species, due to its role in information transmission during communication. Speech is inherently a dynamic signal, and a recent line of research has focused on neural activity following the temporal structure of speech. We review findings that characterise neural dynamics in the processing of continuous acoustics and that allow us to compare these dynamics with temporal aspects of human speech. We highlight properties and constraints shared by neural and speech dynamics, suggesting that auditory neural systems are optimised to process human speech. We then discuss the speech-specificity of neural dynamics and their potential mechanistic origins and summarise open questions in the field.
Collapse
Affiliation(s)
- Benedikt Zoefel
- Centre de Recherche Cerveau et Cognition (CerCo), CNRS UMR 5549, Toulouse, France
- Université de Toulouse III Paul Sabatier, Toulouse, France
| | - Anne Kösem
- Lyon Neuroscience Research Center (CRNL), INSERM U1028, Bron, France
| |
Collapse
|
41
|
Dikker S, Brito NH, Dumas G. It takes a village: A multi-brain approach to studying multigenerational family communication. Dev Cogn Neurosci 2024; 65:101330. [PMID: 38091864 PMCID: PMC10716709 DOI: 10.1016/j.dcn.2023.101330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2023] [Revised: 08/27/2023] [Accepted: 12/07/2023] [Indexed: 12/17/2023] Open
Abstract
Grandparents play a critical role in child rearing across the globe. Yet, there is a shortage of neurobiological research examining the relationship between grandparents and their grandchildren. We employ multi-brain neurocomputational models to simulate how changes in neurophysiological processes in both development and healthy aging affect multigenerational inter-brain coupling - a neural marker that has been linked to a range of socio-emotional and cognitive outcomes. The simulations suggest that grandparent-child interactions may be paired with higher inter-brain coupling than parent-child interactions, raising the possibility that the former may be more advantageous under certain conditions. Critically, this enhancement of inter-brain coupling for grandparent-child interactions is more pronounced in tri-generational interactions that also include a parent, which may speak to findings that grandparent involvement in childrearing is most beneficial if the parent is also an active household member. Together, these findings underscore that a better understanding of the neurobiological basis of cross-generational interactions is vital, and that such knowledge can be helpful in guiding interventions that consider the whole family. We advocate for a community neuroscience approach in developmental social neuroscience to capture the diversity of child-caregiver relationships in real-world settings.
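As a loose illustration of the modelling approach, a toy Kuramoto-style simulation of two mutually coupled phase oscillators, where coupling strength stands in for interpersonal coordination; the frequencies, coupling values, and phase-locking readout are arbitrary demo choices, not the authors' neurocomputational model.

import numpy as np

def phase_locking(f1, f2, coupling, dur=30.0, dt=0.001):
    """Integrate two mutually coupled phase oscillators; return their phase locking."""
    n = int(dur / dt)
    th1, th2 = np.random.uniform(0, 2 * np.pi, 2)
    diffs = np.empty(n)
    for i in range(n):
        diffs[i] = th1 - th2
        th1 += dt * (2 * np.pi * f1 + coupling * np.sin(th2 - th1))
        th2 += dt * (2 * np.pi * f2 + coupling * np.sin(th1 - th2))
    return np.abs(np.mean(np.exp(1j * diffs)))

# Two "brains" with different intrinsic rhythms, without and with coupling.
print("uncoupled:", round(phase_locking(6.0, 10.0, 0.0), 2))
print("coupled:  ", round(phase_locking(6.0, 10.0, 30.0), 2))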
Collapse
|
42
|
Gao J, Chen H, Fang M, Ding N. Original speech and its echo are segregated and separately processed in the human brain. PLoS Biol 2024; 22:e3002498. [PMID: 38358954 PMCID: PMC10868781 DOI: 10.1371/journal.pbio.3002498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2023] [Accepted: 01/15/2024] [Indexed: 02/17/2024] Open
Abstract
Speech recognition crucially relies on slow temporal modulations (<16 Hz) in speech. Recent studies, however, have demonstrated that long-delay echoes, which are common during online conferencing, can eliminate crucial temporal modulations in speech but do not affect speech intelligibility. Here, we investigated the underlying neural mechanisms. MEG experiments demonstrated that cortical activity can effectively track the temporal modulations eliminated by an echo, which cannot be fully explained by basic neural adaptation mechanisms. Furthermore, cortical responses to echoic speech can be better explained by a model that segregates speech from its echo than by a model that encodes echoic speech as a whole. The speech segregation effect was observed even when attention was diverted but disappeared when segregation cues, i.e., the speech fine structure, were removed. These results strongly suggest that, through mechanisms such as stream segregation, the auditory system can build an echo-insensitive representation of the speech envelope, which can support reliable speech recognition.
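A toy numerical illustration of the acoustic starting point: approximating the echoic envelope as the original envelope plus a delayed copy shows how an echo can cancel a slow modulation. The 3 Hz modulation and half-period echo delay are arbitrary values chosen to show the effect, not the study's stimuli.

import numpy as np

fs = 120                                      # envelope sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)
env = 1 + 0.5 * np.sin(2 * np.pi * 3 * t)     # toy speech envelope, 3 Hz modulation

delay = fs // (2 * 3)                         # echo delay = half a modulation period
echoic = env + np.concatenate([np.zeros(delay), env[:-delay]])

def modulation_power(x, f):
    spec = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(x.size, 1 / fs)
    return spec[np.argmin(np.abs(freqs - f))]

print("3 Hz modulation, original:", round(modulation_power(env, 3), 1))
print("3 Hz modulation, echoic:", round(modulation_power(echoic, 3), 1))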
Collapse
Affiliation(s)
- Jiaxin Gao
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
| | - Honghua Chen
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
| | - Mingxuan Fang
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
| | - Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Nanhu Brain-computer Interface Institute, Hangzhou, China
- The State key Lab of Brain-Machine Intelligence; The MOE Frontier Science Center for Brain Science & Brain-machine Integration, Zhejiang University, Hangzhou, China
| |
Collapse
|
43
|
Silva Pereira S, Özer EE, Sebastian-Galles N. Complexity of STG signals and linguistic rhythm: a methodological study for EEG data. Cereb Cortex 2024; 34:bhad549. [PMID: 38236741 DOI: 10.1093/cercor/bhad549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Revised: 12/29/2023] [Accepted: 12/30/2023] [Indexed: 02/06/2024] Open
Abstract
The superior temporal and Heschl's gyri of the human brain play a fundamental role in speech processing. Neurons synchronize their activity to the amplitude envelope of the speech signal to extract acoustic and linguistic features, a process known as neural tracking/entrainment. Electroencephalography has been extensively used in language-related research due to its high temporal resolution and reduced cost, but it does not allow for precise source localization. Motivated by the lack of a unified methodology for the interpretation of source-reconstructed signals, we propose a method based on modularity and signal complexity. The procedure was tested on data from an experiment in which we investigated the impact of native language on the tracking of linguistic rhythms in two groups: English natives and Spanish natives. In the experiment, we found no effect of native language but an effect of language rhythm. Here, we compare source-projected signals in the auditory areas of both hemispheres for the different conditions using nonparametric permutation tests, modularity, and a dynamical complexity measure. We found increasing values of complexity for decreased regularity in the stimuli, allowing us to conclude that languages with less complex rhythms are easier for the auditory cortex to track.
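As one concrete example of a signal-complexity measure of the kind used here, a minimal Lempel-Ziv-style (LZ78 dictionary) complexity of a median-binarized signal; the paper's specific dynamical complexity measure and preprocessing may differ.

import numpy as np

def lz_complexity(bits: str) -> int:
    """Count phrases in a simple LZ78-style dictionary parsing of a binary string."""
    phrases, word, count = set(), "", 0
    for ch in bits:
        word += ch
        if word not in phrases:
            phrases.add(word)
            count += 1
            word = ""
    return count

rng = np.random.default_rng(0)
signal = rng.standard_normal(4000)
binarized = "".join("1" if s > np.median(signal) else "0" for s in signal)

print("irregular signal:", lz_complexity(binarized))   # higher complexity
print("regular signal:  ", lz_complexity("10" * 2000)) # much lower complexity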
Collapse
Affiliation(s)
- Silvana Silva Pereira
- Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
| | - Ege Ekin Özer
- Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
| | - Nuria Sebastian-Galles
- Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
| |
Collapse
|
44
|
Mongold SJ, Georgiev C, Legrand T, Bourguignon M. Afferents to Action: Cortical Proprioceptive Processing Assessed with Corticokinematic Coherence Specifically Relates to Gross Motor Skills. eNeuro 2024; 11:ENEURO.0384-23.2023. [PMID: 38164580 PMCID: PMC10849019 DOI: 10.1523/eneuro.0384-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2023] [Revised: 11/29/2023] [Accepted: 12/08/2023] [Indexed: 01/03/2024] Open
Abstract
Voluntary motor control is thought to be predicated on the ability to efficiently integrate and process somatosensory afferent information. However, current approaches in the field of motor control have not factored in objective markers of how the brain tracks incoming somatosensory information. Here, we asked whether motor performance relates to such markers obtained with an analysis of the coupling between peripheral kinematics and cortical oscillations during continuous movements, best known as corticokinematic coherence (CKC). Motor performance was evaluated by measuring both gross and fine motor skills using the Box and Blocks Test (BBT) and the Purdue Pegboard Test (PPT), respectively, and with a biomechanics measure of coordination. A total of 61 participants completed the BBT, while equipped with electroencephalography and electromyography, and also completed the PPT. We evaluated CKC, from the signals collected during the BBT, as the coherence between movement rhythmicity and brain activity, and coordination as the cross-correlation between muscle activity signals. CKC at the first harmonic of the movement rate was positively associated with BBT scores (r = 0.41, p = 0.001) and on its own showed no relationship with PPT scores (r = 0.07, p = 0.60); when considered together with BBT scores, however, participants with lower PPT scores had higher CKC than expected based on their BBT score. Coordination was not associated with motor performance or CKC (p > 0.05). These findings demonstrate that cortical somatosensory processing in the form of strengthened brain-peripheral coupling is specifically associated with better gross motor skills and thus may be considered as a valuable addition to classical tests of proprioception acuity.
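A hedged sketch of a CKC-style readout: magnitude-squared coherence between a single simulated "EEG" channel and a kinematic trace, evaluated at the movement rate and its first harmonic. The movement rate, segment length, and signal model are illustrative, not the study's pipeline.

import numpy as np
from scipy.signal import coherence

fs = 1000.0
t = np.arange(0, 120, 1 / fs)
move_rate = 1.0                                        # ~1 Hz repetitive movement
kinematics = np.sin(2 * np.pi * move_rate * t) + 0.5 * np.sin(2 * np.pi * 2 * move_rate * t)
eeg = 0.1 * kinematics + np.random.randn(t.size)       # weak coupling buried in noise

f, cxy = coherence(eeg, kinematics, fs=fs, nperseg=int(4 * fs))
for target in (move_rate, 2 * move_rate):
    idx = np.argmin(np.abs(f - target))
    print(f"coherence at {target:.0f} Hz: {cxy[idx]:.2f}")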
Collapse
Affiliation(s)
- Scott J Mongold
- Université libre de Bruxelles (ULB), UNI-ULB Neuroscience Institute, Laboratory of Neurophysiology and Movement Biomechanics, 1070 Brussels, Belgium
| | - Christian Georgiev
- Université libre de Bruxelles (ULB), UNI-ULB Neuroscience Institute, Laboratory of Neurophysiology and Movement Biomechanics, 1070 Brussels, Belgium
| | - Thomas Legrand
- Université libre de Bruxelles (ULB), UNI-ULB Neuroscience Institute, Laboratory of Neurophysiology and Movement Biomechanics, 1070 Brussels, Belgium
- University College Dublin (UCD), School of Electrical and Electronic Engineering, D04 V1W8 Dublin, Ireland
| | - Mathieu Bourguignon
- Université libre de Bruxelles (ULB), UNI-ULB Neuroscience Institute, Laboratory of Neurophysiology and Movement Biomechanics, 1070 Brussels, Belgium
- Université libre de Bruxelles (ULB), UNI - ULB Neurosciences Institute, Laboratoire de Neuroanatomie et de Neuroimagerie translationnelles (LN2T), 1070 Brussels, Belgium
- BCBL, Basque Center on Cognition, Brain and Language, 20009 San Sebastian, Spain
| |
Collapse
|
45
|
MacIntyre AD, Carlyon RP, Goehring T. Neural Decoding of the Speech Envelope: Effects of Intelligibility and Spectral Degradation. Trends Hear 2024; 28:23312165241266316. [PMID: 39183533 PMCID: PMC11345737 DOI: 10.1177/23312165241266316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2024] [Revised: 05/23/2024] [Accepted: 06/16/2024] [Indexed: 08/27/2024] Open
Abstract
During continuous speech perception, endogenous neural activity becomes time-locked to acoustic stimulus features, such as the speech amplitude envelope. This speech-brain coupling can be decoded using non-invasive brain imaging techniques, including electroencephalography (EEG). Neural decoding may prove clinically useful as an objective measure of stimulus encoding by the brain, for example during cochlear implant listening, wherein the speech signal is severely spectrally degraded. Yet, interplay between acoustic and linguistic factors may lead to top-down modulation of perception, thereby complicating audiological applications. To address this ambiguity, we assess neural decoding of the speech envelope under spectral degradation with EEG in acoustically hearing listeners (n = 38; 18-35 years old) using vocoded speech. We dissociate sensory encoding from higher-order processing by employing intelligible (English) and non-intelligible (Dutch) stimuli, with auditory attention sustained using a repeated-phrase detection task. Subject-specific and group decoders were trained to reconstruct the speech envelope from held-out EEG data, with decoder significance determined via random permutation testing. Whereas speech envelope reconstruction did not vary by spectral resolution, intelligible speech was associated with better decoding accuracy in general. Results were similar across subject-specific and group analyses, with less consistent effects of spectral degradation in group decoding. Permutation tests revealed possible differences in decoder statistical significance by experimental condition. In general, while robust neural decoding was observed at the individual and group level, variability within participants would most likely prevent the clinical use of such a measure to differentiate levels of spectral degradation and intelligibility on an individual basis.
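A minimal sketch of the backward-modelling logic (time-lagged EEG regressed onto the speech envelope with ridge regression, scored by Pearson correlation); the simulated data, lag range, and regularisation value are assumptions for illustration, and dedicated toolboxes implement this with proper cross-validation and permutation testing.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
fs = 64                                       # Hz, after downsampling
n_samp, n_chan = 3 * 60 * fs, 32              # three minutes of 32-channel "EEG"
envelope = rng.standard_normal(n_samp)        # stand-in for the speech envelope
eeg = 0.5 * envelope[:, None] + rng.standard_normal((n_samp, n_chan))

lags = range(0, int(0.25 * fs))               # use EEG from 0-250 ms after the stimulus
X = np.column_stack([np.roll(eeg, -lag, axis=0) for lag in lags])  # wrap-around ignored for brevity

split = int(0.8 * n_samp)                     # simple train/test split (no cross-validation)
decoder = Ridge(alpha=1e3).fit(X[:split], envelope[:split])
r = np.corrcoef(decoder.predict(X[split:]), envelope[split:])[0, 1]
print(f"envelope reconstruction accuracy (Pearson r): {r:.2f}")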
Collapse
Affiliation(s)
| | - Robert P. Carlyon
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
| | - Tobias Goehring
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
| |
Collapse
|
46
|
Batterink LJ, Mulgrew J, Gibbings A. Rhythmically Modulating Neural Entrainment during Exposure to Regularities Influences Statistical Learning. J Cogn Neurosci 2024; 36:107-127. [PMID: 37902580 DOI: 10.1162/jocn_a_02079] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/31/2023]
Abstract
The ability to discover regularities in the environment, such as syllable patterns in speech, is known as statistical learning. Previous studies have shown that statistical learning is accompanied by neural entrainment, in which neural activity temporally aligns with repeating patterns over time. However, it is unclear whether these rhythmic neural dynamics play a functional role in statistical learning or whether they largely reflect the downstream consequences of learning, such as the enhanced perception of learned words in speech. To better understand this issue, we manipulated participants' neural entrainment during statistical learning using continuous rhythmic visual stimulation. Participants were exposed to a speech stream of repeating nonsense words while viewing either (1) a visual stimulus with a "congruent" rhythm that aligned with the word structure, (2) a visual stimulus with an incongruent rhythm, or (3) a static visual stimulus. Statistical learning was subsequently measured using both an explicit and implicit test. Participants in the congruent condition showed a significant increase in neural entrainment over auditory regions at the relevant word frequency, over and above effects of passive volume conduction, indicating that visual stimulation successfully altered neural entrainment within relevant neural substrates. Critically, during the subsequent implicit test, participants in the congruent condition showed an enhanced ability to predict upcoming syllables and stronger neural phase synchronization to component words, suggesting that they had gained greater sensitivity to the statistical structure of the speech stream relative to the incongruent and static groups. This learning benefit could not be attributed to strategic processes, as participants were largely unaware of the contingencies between the visual stimulation and embedded words. These results indicate that manipulating neural entrainment during exposure to regularities influences statistical learning outcomes, suggesting that neural entrainment may functionally contribute to statistical learning. Our findings encourage future studies using non-invasive brain stimulation methods to further understand the role of entrainment in statistical learning.
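A small sketch of the frequency-tagging logic behind this entrainment measure, assuming syllables at roughly 3.3 Hz grouped into trisyllabic words (word rate ~1.1 Hz); the simulated "EEG" and the simple amplitude-spectrum readout are illustrative, whereas the study used phase-based measures over auditory electrodes.

import numpy as np

fs = 250.0
t = np.arange(0, 120, 1 / fs)
syll_rate, word_rate = 3.3, 1.1
eeg = (np.sin(2 * np.pi * syll_rate * t)            # response to every syllable
       + 0.4 * np.sin(2 * np.pi * word_rate * t)    # emerges if words are learned
       + np.random.randn(t.size))

spec = np.abs(np.fft.rfft(eeg)) / t.size
freqs = np.fft.rfftfreq(t.size, 1 / fs)
for label, f in [("syllable rate", syll_rate), ("word rate", word_rate)]:
    print(f"{label} ({f} Hz): {spec[np.argmin(np.abs(freqs - f))]:.3f}")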
Collapse
|
47
|
Ahmed F, Nidiffer AR, Lalor EC. The effect of gaze on EEG measures of multisensory integration in a cocktail party scenario. Front Hum Neurosci 2023; 17:1283206. [PMID: 38162285 PMCID: PMC10754997 DOI: 10.3389/fnhum.2023.1283206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2023] [Accepted: 11/20/2023] [Indexed: 01/03/2024] Open
Abstract
Seeing the speaker's face greatly improves our speech comprehension in noisy environments. This is due to the brain's ability to combine the auditory and the visual information around us, a process known as multisensory integration. Selective attention also strongly influences what we comprehend in scenarios with multiple speakers, an effect known as the cocktail-party phenomenon. However, the interaction between attention and multisensory integration is not fully understood, especially when it comes to natural, continuous speech. In a recent electroencephalography (EEG) study, we explored this issue and showed that multisensory integration is enhanced when an audiovisual speaker is attended compared to when that speaker is unattended. Here, we extend that work to investigate how this interaction varies depending on a person's gaze behavior, which affects the quality of the visual information they have access to. To do so, we recorded EEG from 31 healthy adults as they performed selective attention tasks in several paradigms involving two concurrently presented audiovisual speakers. We then modeled how the recorded EEG related to the audio speech (envelope) of the presented speakers. Crucially, we compared two classes of model: one that assumed underlying multisensory integration (AV) and another that assumed two independent unisensory audio and visual processes (A+V). This comparison revealed evidence of strong attentional effects on multisensory integration when participants were looking directly at the face of an audiovisual speaker. This effect was not apparent when the speaker's face was in the peripheral vision of the participants. Overall, our findings suggest a strong influence of attention on multisensory integration when high-fidelity visual (articulatory) speech information is available. More generally, this suggests that the interplay between attention and multisensory integration during natural audiovisual speech is dynamic and adaptable based on the specific task and environment.
Collapse
Affiliation(s)
| | | | - Edmund C. Lalor
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY, United States
| |
Collapse
|
48
|
Karunathilake IMD, Kulasingham JP, Simon JZ. Neural tracking measures of speech intelligibility: Manipulating intelligibility while keeping acoustics unchanged. Proc Natl Acad Sci U S A 2023; 120:e2309166120. [PMID: 38032934 PMCID: PMC10710032 DOI: 10.1073/pnas.2309166120] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2023] [Accepted: 10/21/2023] [Indexed: 12/02/2023] Open
Abstract
Neural speech tracking has advanced our understanding of how our brains rapidly map an acoustic speech signal onto linguistic representations and ultimately meaning. It remains unclear, however, how speech intelligibility is related to the corresponding neural responses. Many studies addressing this question vary the level of intelligibility by manipulating the acoustic waveform, but this makes it difficult to cleanly disentangle the effects of intelligibility from underlying acoustical confounds. Here, using magnetoencephalography recordings, we study neural measures of speech intelligibility by manipulating intelligibility while keeping the acoustics strictly unchanged. Acoustically identical degraded speech stimuli (three-band noise-vocoded, ~20 s duration) are presented twice, but the second presentation is preceded by the original (nondegraded) version of the speech. This intermediate priming, which generates a "pop-out" percept, substantially improves the intelligibility of the second degraded speech passage. We investigate how intelligibility and acoustical structure affect acoustic and linguistic neural representations using multivariate temporal response functions (mTRFs). As expected, behavioral results confirm that perceived speech clarity is improved by priming. mTRF analysis reveals that auditory (speech envelope and envelope onset) neural representations are not affected by priming but only by the acoustics of the stimuli (bottom-up driven). Critically, our findings suggest that segmentation of sounds into words emerges with better speech intelligibility, and most strongly at the later (~400 ms latency) word processing stage, in prefrontal cortex, in line with engagement of top-down mechanisms associated with priming. Taken together, our results show that word representations may provide some objective measures of speech comprehension.
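A rough sketch of three-band noise vocoding of the kind used to degrade the stimuli, assuming illustrative band edges and filter settings rather than the study's exact parameters; a slowly amplitude-modulated tone stands in for speech.

import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, band_edges=(100, 600, 1800, 6000)):
    rng = np.random.default_rng(0)
    out = np.zeros_like(speech)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        env = np.abs(hilbert(band))                 # band envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(speech.size))
        out += env * carrier                        # envelope-modulated noise band
    return out

fs = 16000
t = np.arange(0, 2, 1 / fs)
speech = np.sin(2 * np.pi * 220 * t) * (1 + np.sin(2 * np.pi * 3 * t))  # stand-in for speech
vocoded = noise_vocode(speech, fs)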
Collapse
Affiliation(s)
| | | | - Jonathan Z. Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742
- Department of Biology, University of Maryland, College Park, MD 20742
- Institute for Systems Research, University of Maryland, College Park, MD 20742
| |
Collapse
|
49
|
Çetinçelik M, Rowland CF, Snijders TM. Ten-month-old infants' neural tracking of naturalistic speech is not facilitated by the speaker's eye gaze. Dev Cogn Neurosci 2023; 64:101297. [PMID: 37778275 PMCID: PMC10543766 DOI: 10.1016/j.dcn.2023.101297] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2023] [Revised: 08/21/2023] [Accepted: 09/08/2023] [Indexed: 10/03/2023] Open
Abstract
Eye gaze is a powerful ostensive cue in infant-caregiver interactions, with demonstrable effects on language acquisition. While the link between gaze following and later vocabulary is well-established, the effects of eye gaze on other aspects of language, such as speech processing, are less clear. In this EEG study, we examined the effects of the speaker's eye gaze on ten-month-old infants' neural tracking of naturalistic audiovisual speech, a marker for successful speech processing. Infants watched videos of a speaker telling stories, addressing the infant with direct or averted eye gaze. We assessed infants' speech-brain coherence at stress (1-1.75 Hz) and syllable (2.5-3.5 Hz) rates, tested for differences in attention by comparing looking times and EEG theta power in the two conditions, and investigated whether neural tracking predicts later vocabulary. Our results showed that infants' brains tracked the speech rhythm both at the stress and syllable rates, and that infants' neural tracking at the syllable rate predicted later vocabulary. However, speech-brain coherence did not significantly differ between direct and averted gaze conditions and infants did not show greater attention to direct gaze. Overall, our results suggest significant neural tracking at ten months, related to vocabulary development, but not modulated by speaker's gaze.
Collapse
Affiliation(s)
- Melis Çetinçelik
- Department of Experimental Psychology, Utrecht University, Utrecht, the Netherlands; Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands.
| | - Caroline F Rowland
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
| | - Tineke M Snijders
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands; Cognitive Neuropsychology Department, Tilburg University, Tilburg, the Netherlands
| |
Collapse
|
50
|
Mai G, Wang WSY. Distinct roles of delta- and theta-band neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing. Hum Brain Mapp 2023; 44:6149-6172. [PMID: 37818940 PMCID: PMC10619373 DOI: 10.1002/hbm.26503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2023] [Revised: 08/17/2023] [Accepted: 09/13/2023] [Indexed: 10/13/2023] Open
Abstract
The brain tracks and encodes multi-level speech features during spoken language processing. It is evident that this speech tracking is dominant at low frequencies (<8 Hz), including the delta and theta bands. Recent research has demonstrated distinctions between delta- and theta-band tracking but has not elucidated how they differentially encode speech across linguistic levels. Here, we hypothesised that delta-band tracking encodes prediction errors (enhanced processing of unexpected features) while theta-band tracking encodes neural sharpening (enhanced processing of expected features) when people perceive speech with different linguistic contents. EEG responses were recorded when normal-hearing participants attended to continuous auditory stimuli that contained different phonological/morphological and semantic contents: (1) real-words, (2) pseudo-words and (3) time-reversed speech. We employed multivariate temporal response functions to measure EEG reconstruction accuracies in response to acoustic (spectrogram), phonetic and phonemic features with a partialling procedure that singles out the unique contributions of individual features. We found higher delta-band accuracies for pseudo-words than real-words and time-reversed speech, especially during encoding of phonetic features. Notably, individual time-lag analyses showed that significantly higher accuracies for pseudo-words than real-words started at early processing stages for phonetic encoding (<100 ms post-feature) and later stages for acoustic and phonemic encoding (>200 and 400 ms post-feature, respectively). Theta-band accuracies, on the other hand, were higher when stimuli had richer linguistic content (real-words > pseudo-words > time-reversed speech). Such effects also started at early stages (<100 ms post-feature) during encoding of all individual features or when all features were combined. We argue that these results indicate that delta-band tracking may play a role in predictive coding leading to greater tracking of pseudo-words due to the presence of unexpected/unpredicted semantic information, while theta-band tracking encodes sharpened signals caused by more expected phonological/morphological and semantic contents. Early presence of these effects reflects rapid computations of sharpening and prediction errors. Moreover, by measuring changes in EEG alpha power, we did not find evidence that the observed effects can be solely explained by attentional demands or listening effort. Finally, we used directed information analyses to illustrate feedforward and feedback information transfers between prediction errors and sharpening across linguistic levels, showcasing how our results fit with the hierarchical Predictive Coding framework. Together, we suggest the distinct roles of delta and theta neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing.
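A simplified sketch of the partialling logic in a forward-model direction: a feature's unique contribution is estimated as the prediction accuracy of a full model minus that of a reduced model omitting it. The simulated features, single "EEG" channel, and ridge settings are assumptions for illustration, not the study's multivariate TRF analysis.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n = 5000
acoustic = rng.standard_normal(n)
phonetic = 0.6 * acoustic + 0.8 * rng.standard_normal(n)   # partly redundant with acoustics
eeg = acoustic + 0.5 * phonetic + rng.standard_normal(n)   # simulated single "EEG" channel

def accuracy(features):
    X = np.column_stack(features)
    half = n // 2
    model = Ridge(alpha=1.0).fit(X[:half], eeg[:half])
    return np.corrcoef(model.predict(X[half:]), eeg[half:])[0, 1]

full = accuracy([acoustic, phonetic])
reduced = accuracy([acoustic])                              # model without the phonetic feature
print(f"unique contribution of the phonetic feature: {full - reduced:.3f}")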
Collapse
Affiliation(s)
- Guangting Mai
- Hearing Theme, National Institute for Health Research Nottingham Biomedical Research Centre, Nottingham, UK
- Academic Unit of Mental Health and Clinical Neurosciences, School of Medicine, The University of Nottingham, Nottingham, UK
- Division of Psychology and Language Sciences, Faculty of Brain Sciences, University College London, London, UK
| | - William S-Y Wang
- Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University, Hung Hom, Hong Kong
- Language Engineering Laboratory, The Chinese University of Hong Kong, Hong Kong, China
| |
Collapse
|