1
Wang S, Liu Y, Kou N, Chen Y, Liu T, Wang Y, Wang S. Impact of age-related hearing loss on decompensation of left DLPFC during speech perception in noise: a combined EEG-fNIRS study. GeroScience 2025; 47:2119-2134. [PMID: 39446223 PMCID: PMC11979022 DOI: 10.1007/s11357-024-01393-9]
Abstract
Understanding speech-in-noise is a significant challenge for individuals with age-related hearing loss (ARHL). Evidence suggests that increased activity in the frontal cortex compensates for impaired speech perception in healthy aging older adults. However, whether older adults with ARHL retain this compensatory function, and which neural regulatory mechanisms underlie such compensation, remain largely unclear. Here, using a synchronized EEG-fNIRS test, we investigated theta-band neural oscillations and the accompanying hemodynamic changes in the frontal cortex during a speech recognition task in noise. The study included healthy older adults (n = 26, aged 65.4 ± 2.8 years), older adults with mild hearing loss (n = 26, aged 66.3 ± 3.8 years), and older adults with moderate to severe hearing loss (n = 26, aged 67.5 ± 3.7 years). Relative to healthy older adults, older adults with ARHL exhibited lower activation and weaker theta-band oscillations in the left dorsolateral prefrontal cortex (DLPFC) under noisy conditions, and this reduced activity correlated with high-frequency hearing loss. Connectivity within the frontoparietal network was also significantly reduced, which might weaken the top-down articulatory prediction function and thereby impair speech recognition performance in older adults with ARHL. The results suggest that healthy aging older adults recruit compensatory attentional resources through a top-down auditory-motor integration mechanism, whereas older adults with ARHL show decompensation of the left DLPFC and of the frontoparietal integration network during speech recognition in noise.
Affiliation(s)
- Songjian Wang, Yi Liu, Nuonan Kou, Younuo Chen, Tong Liu, Yuan Wang, Shuo Wang
- Beijing Institute of Otolaryngology, Otolaryngology-Head and Neck Surgery, Key Laboratory of Otolaryngology Head and Neck Surgery (Capital Medical University), Ministry of Education, Beijing Tongren Hospital, Dongcheng District, Capital Medical University, 17 Chongnei Hougou Hutong, Beijing, 100005, China (all authors)
2
Mégevand P, Thézé R, Mehta AD. Naturalistic Audiovisual Illusions Reveal the Cortical Sites Involved in the Multisensory Processing of Speech. Eur J Neurosci 2025; 61:e70043. [PMID: 40029551 DOI: 10.1111/ejn.70043]
Abstract
Audiovisual speech illusions are a spectacular illustration of the effect of visual cues on the perception of speech. Because they allow perception to be dissociated from the physical characteristics of the sensory inputs, these illusions are useful for investigating the cerebral processing of audiovisual speech. However, the meaningless, monosyllabic utterances typically used to induce illusions are far removed from natural communication through speech. We developed naturalistic speech stimuli that embed mismatched auditory and visual cues within grammatically correct sentences to induce illusory perceptions in a controlled fashion. Using intracranial EEG, we confirmed that the cortical processing of audiovisual speech recruits an ensemble of areas, from auditory and visual cortices to multisensory and associative regions. Importantly, we were able to resolve which cortical areas are driven more by the auditory or the visual content of the speech stimulus and which by the eventual perceptual report. Our results suggest that higher-order sensory and associative areas, rather than early sensory cortices, are key loci for illusory perception. Naturalistic audiovisual speech illusions represent a powerful tool for dissecting the specific roles of individual cortical areas in the processing of audiovisual speech.
Affiliation(s)
- Pierre Mégevand
- Department of Clinical Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Division of Neurology, Geneva University Hospitals, Geneva, Switzerland
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Raphaël Thézé
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Ashesh D Mehta
- Department of Neurosurgery, Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, USA
- The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, New York, USA
3
López-Madrona VJ, Trébuchon A, Bénar CG, Schön D, Morillon B. Different sustained and induced alpha oscillations emerge in the human auditory cortex during sound processing. Commun Biol 2024; 7:1570. [PMID: 39592826 PMCID: PMC11599602 DOI: 10.1038/s42003-024-07297-w]
Abstract
Alpha oscillations in the auditory cortex have been associated with attention and the suppression of irrelevant information. However, their anatomical organization and interaction with other neural processes remain unclear. Do alpha oscillations function as a local mechanism within most neural sources, regulating their internal excitation/inhibition balance, or do they belong to separate inhibitory sources that gate information across the auditory network? To address this question, we acquired intracerebral electrophysiological recordings from epilepsy patients during rest and while they listened to tones. Using independent component analysis, we disentangled the different neural sources and labeled them as "oscillatory" if they presented strong alpha oscillations at rest and/or "evoked" if they displayed a significant evoked response to the stimulation. Our results show that (1) sources are condition-specific and segregated in the auditory cortex, (2) both source types show a high-gamma response followed by an induced alpha suppression, and (3) only oscillatory sources present a sustained alpha suppression throughout the stimulation period. We hypothesize that there are two different alpha oscillations in the auditory cortex: an induced bottom-up response indicating selective engagement of the primary cortex to process the stimuli, and a sustained suppression reflecting a general disinhibited state of the network for processing sensory information.
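As a rough illustration of the component-labeling logic summarized in this abstract, the sketch below (not the authors' code; the sampling rate, data shapes, alpha-to-flank ratio, and 3-SD evoked criterion are all assumptions) runs ICA on simulated recordings and tags each component as "oscillatory" and/or "evoked".

```python
# Minimal sketch: label ICA components of iEEG data as "oscillatory" (alpha peak at
# rest) and/or "evoked" (reliable response to tones). All values are illustrative.
import numpy as np
from scipy.signal import welch
from sklearn.decomposition import FastICA

fs = 1000                                   # sampling rate (Hz), assumed
rest = np.random.randn(20000, 32)           # rest recording: samples x channels (placeholder)
epochs = np.random.randn(100, 600, 32)      # tone trials x samples (-0.1 to 0.5 s) x channels

ica = FastICA(n_components=15, random_state=0)
ica.fit(rest)                                # unmix on resting data
rest_src = ica.transform(rest)               # component time courses at rest
ep_src = np.stack([ica.transform(e) for e in epochs])  # trials x samples x components

labels = {}
for c in range(rest_src.shape[1]):
    # "Oscillatory": prominent 8-12 Hz power relative to neighboring bands at rest
    f, pxx = welch(rest_src[:, c], fs=fs, nperseg=2048)
    alpha = pxx[(f >= 8) & (f <= 12)].mean()
    flank = pxx[((f >= 4) & (f < 8)) | ((f > 12) & (f <= 20))].mean()
    oscillatory = alpha > 1.5 * flank        # arbitrary ratio criterion

    # "Evoked": post-stimulus average deviates from pre-stimulus baseline variability
    erp = ep_src[:, :, c].mean(axis=0)
    base_sd = erp[:100].std()                # first 100 ms = baseline, assumed
    evoked = np.abs(erp[100:]).max() > 3 * base_sd

    labels[c] = ("oscillatory" if oscillatory else "") + ("+evoked" if evoked else "")
print(labels)
```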
Affiliation(s)
- Víctor J López-Madrona
- Institute of Language, Communication, and the Brain, Aix-Marseille Univ, Marseille, France
- Aix-Marseille Univ, INSERM, INS, Inst Neurosci Syst, Marseille, France
- Agnès Trébuchon
- APHM, Timone Hospital, Epileptology and Cerebral Rhythmology, Marseille, 13005, France
- APHM, Timone Hospital, Functional and Stereotactic Neurosurgery, Marseille, 13005, France
- Christian G Bénar
- Aix-Marseille Univ, INSERM, INS, Inst Neurosci Syst, Marseille, France
- Daniele Schön
- Aix-Marseille Univ, INSERM, INS, Inst Neurosci Syst, Marseille, France
- Benjamin Morillon
- Aix-Marseille Univ, INSERM, INS, Inst Neurosci Syst, Marseille, France
4
Karthik G, Cao CZ, Demidenko MI, Jahn A, Stacey WC, Wasade VS, Brang D. Auditory cortex encodes lipreading information through spatially distributed activity. Curr Biol 2024; 34:4021-4032.e5. [PMID: 39153482 PMCID: PMC11387126 DOI: 10.1016/j.cub.2024.07.073]
Abstract
Watching a speaker's face improves speech perception accuracy. This benefit is enabled, in part, by implicit lipreading abilities present in the general population. While it is established that lipreading can alter the perception of a heard word, it is unknown how these visual signals are represented in the auditory system or how they interact with auditory speech representations. One influential, but untested, hypothesis is that visual speech modulates the population-coded representations of phonetic and phonemic features in the auditory system. This model is largely supported by data showing that silent lipreading evokes activity in the auditory cortex, but these activations could alternatively reflect general effects of arousal or attention or the encoding of non-linguistic features such as visual timing information. This gap limits our understanding of how vision supports speech perception. To test the hypothesis that the auditory system encodes visual speech information, we acquired functional magnetic resonance imaging (fMRI) data from healthy adults and intracranial recordings from electrodes implanted in patients with epilepsy during auditory and visual speech perception tasks. Across both datasets, linear classifiers successfully decoded the identity of silently lipread words using the spatial pattern of auditory cortex responses. Examining the time course of classification using intracranial recordings, lipread words were classified at earlier time points relative to heard words, suggesting a predictive mechanism for facilitating speech. These results support a model in which the auditory system combines the joint neural distributions evoked by heard and lipread words to generate a more precise estimate of what was said.
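The decoding analysis described above can be illustrated with a short, hedged sketch (not the authors' code; trial counts, site numbers, and word classes are placeholders): a cross-validated linear classifier applied to the spatial pattern of auditory-cortex responses, with chance level given by the number of word classes.

```python
# Rough sketch: classify word identity from the spatial pattern of auditory-cortex
# responses with a cross-validated linear classifier. Data are simulated placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_trials, n_sites, n_words = 200, 60, 4      # trials, auditory-cortex sites/voxels, word classes
X = rng.normal(size=(n_trials, n_sites))     # one response amplitude per site per trial
y = rng.integers(0, n_words, size=n_trials)  # silently lipread word labels

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
acc = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print(f"decoding accuracy: {acc.mean():.3f} (chance = {1 / n_words:.3f})")
```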
Affiliation(s)
- Ganesan Karthik
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA
- Cody Zhewei Cao
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA
- Andrew Jahn
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA
- William C Stacey
- Department of Neurology, University of Michigan, Ann Arbor, MI 48109, USA
- Vibhangini S Wasade
- Henry Ford Hospital, Detroit, MI 48202, USA; Department of Neurology, Wayne State University School of Medicine, Detroit, MI 48201, USA
- David Brang
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA
5
Senkowski D, Engel AK. Multi-timescale neural dynamics for multisensory integration. Nat Rev Neurosci 2024; 25:625-642. [PMID: 39090214 DOI: 10.1038/s41583-024-00845-7]
Abstract
Carrying out any everyday task, be it driving in traffic, conversing with friends or playing basketball, requires rapid selection, integration and segregation of stimuli from different sensory modalities. At present, even the most advanced artificial intelligence-based systems are unable to replicate the multisensory processes that the human brain routinely performs, but how neural circuits in the brain carry out these processes is still not well understood. In this Perspective, we discuss recent findings that shed fresh light on the oscillatory neural mechanisms that mediate multisensory integration (MI), including power modulations, phase resetting, phase-amplitude coupling and dynamic functional connectivity. We then consider studies that also suggest multi-timescale dynamics in intrinsic ongoing neural activity and during stimulus-driven bottom-up and cognitive top-down neural network processing in the context of MI. We propose a new concept of MI that emphasizes the critical role of neural dynamics at multiple timescales within and across brain networks, enabling the simultaneous integration, segregation, hierarchical structuring and selection of information in different time windows. To highlight predictions from our multi-timescale concept of MI, real-world scenarios in which multi-timescale processes may coordinate MI in a flexible and adaptive manner are considered.
Affiliation(s)
- Daniel Senkowski
- Department of Psychiatry and Neurosciences, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Andreas K Engel
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
6
Çetinçelik M, Jordan-Barros A, Rowland CF, Snijders TM. The effect of visual speech cues on neural tracking of speech in 10-month-old infants. Eur J Neurosci 2024; 60:5381-5399. [PMID: 39188179 DOI: 10.1111/ejn.16492]
Abstract
While infants' sensitivity to visual speech cues and the benefit of these cues have been well-established by behavioural studies, there is little evidence on the effect of visual speech cues on infants' neural processing of continuous auditory speech. In this study, we investigated whether visual speech cues, such as the movements of the lips, jaw, and larynx, facilitate infants' neural speech tracking. Ten-month-old Dutch-learning infants watched videos of a speaker reciting passages in infant-directed speech while electroencephalography (EEG) was recorded. In the videos, either the full face of the speaker was displayed or the speaker's mouth and jaw were masked with a block, obstructing the visual speech cues. To assess neural tracking, speech-brain coherence (SBC) was calculated, focusing particularly on the stress and syllabic rates (1-1.75 and 2.5-3.5 Hz respectively in our stimuli). First, overall, SBC was compared to surrogate data, and then, differences in SBC in the two conditions were tested at the frequencies of interest. Our results indicated that infants show significant tracking at both stress and syllabic rates. However, no differences were identified between the two conditions, meaning that infants' neural tracking was not modulated further by the presence of visual speech cues. Furthermore, we demonstrated that infants' neural tracking of low-frequency information is related to their subsequent vocabulary development at 18 months. Overall, this study provides evidence that infants' neural tracking of speech is not necessarily impaired when visual speech cues are not fully visible and that neural tracking may be a potential mechanism in successful language acquisition.
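As a minimal illustration of the speech-brain coherence (SBC) measure used in this study, the following sketch (not the authors' pipeline; the sampling rate, window length, and simulated signals are assumptions) computes coherence between a speech amplitude envelope and one EEG channel and averages it within the stress- and syllable-rate bands quoted above.

```python
# Illustrative sketch of a speech-brain coherence (SBC) computation.
import numpy as np
from scipy.signal import coherence, hilbert

fs = 250                                     # common sampling rate (Hz), assumed
t = np.arange(0, 120, 1 / fs)                # two minutes of data (placeholder)
speech = np.random.randn(t.size)             # speech audio resampled to fs (placeholder)
eeg = np.random.randn(t.size)                # one EEG channel, same length (placeholder)

envelope = np.abs(hilbert(speech))           # amplitude envelope of the speech signal
f, cxy = coherence(envelope, eeg, fs=fs, nperseg=fs * 4)

stress = cxy[(f >= 1.0) & (f <= 1.75)].mean()     # stress-rate band from the abstract
syllable = cxy[(f >= 2.5) & (f <= 3.5)].mean()    # syllabic-rate band from the abstract
print(f"SBC stress-rate: {stress:.3f}, syllable-rate: {syllable:.3f}")
# Significance would be assessed against surrogate data, e.g., by recomputing the
# coherence after circularly shifting the envelope relative to the EEG many times.
```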
Affiliation(s)
- Melis Çetinçelik
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Department of Experimental Psychology, Utrecht University, Utrecht, The Netherlands
- Cognitive Neuropsychology Department, Tilburg University, Tilburg, The Netherlands
- Antonia Jordan-Barros
- Centre for Brain and Cognitive Development, Department of Psychological Science, Birkbeck, University of London, London, UK
- Experimental Psychology, University College London, London, UK
- Caroline F Rowland
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Tineke M Snijders
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Cognitive Neuropsychology Department, Tilburg University, Tilburg, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
7
Castellani N, Federici A, Fantoni M, Ricciardi E, Garbarini F, Bottari D. Brain Encoding of Naturalistic, Continuous, and Unpredictable Tactile Events. eNeuro 2024; 11:ENEURO.0238-24.2024. [PMID: 39266328 PMCID: PMC11429829 DOI: 10.1523/eneuro.0238-24.2024]
Abstract
Studies employing EEG to measure somatosensory responses have typically been optimized to compute event-related potentials in response to discrete events. However, tactile interactions involve continuous processing of nonstationary inputs that change in location, duration, and intensity. To fill this gap, this study aims to demonstrate the possibility of measuring the neural tracking of continuous and unpredictable tactile information. Twenty-seven young adults (15 females) were continuously and passively stimulated with a random series of gentle brushes on single fingers of each hand, which were covered from view. Thus, tactile stimulation was unique for each participant and stimulated finger. An encoding model measured the degree of synchronization between brain activity and the continuous tactile input, generating a temporal response function (TRF). Brain topographies associated with the encoding of each finger's stimulation showed a contralateral response at central sensors starting at 50 ms and peaking at ∼140 ms of lag, followed by a bilateral response at ∼240 ms. A series of analyses highlighted that a reliable tactile TRF emerged after just 3 min of stimulation. Strikingly, topographical patterns of the TRF allowed discrimination of digit lateralization across hands and digit representation within each hand. Our results demonstrate for the first time the possibility of using EEG to measure the neural tracking of naturalistic, continuous, and unpredictable stimulation in the somatosensory domain. Crucially, this approach allows the study of brain activity following individualized, idiosyncratic tactile events to the fingers.
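The TRF encoding model mentioned above can be sketched as a time-lagged ridge regression; the example below is illustrative only (not the authors' implementation), and the lag range, regularization strength, and simulated data are assumptions.

```python
# Minimal sketch of a temporal response function (TRF) encoding model: time-lagged
# ridge regression from a continuous tactile input to one EEG channel.
import numpy as np

fs = 100                                     # sampling rate (Hz), assumed
n = 3 * 60 * fs                              # three minutes of stimulation
stim = np.abs(np.random.randn(n))            # continuous tactile input (e.g., brush velocity)
eeg = np.random.randn(n)                     # one EEG channel (placeholder)

lags = np.arange(0, int(0.4 * fs))           # 0-400 ms lags
X = np.zeros((n, lags.size))
for i, lag in enumerate(lags):               # build the lagged design matrix
    X[lag:, i] = stim[: n - lag]

lam = 1e2                                    # ridge parameter (would be cross-validated)
trf = np.linalg.solve(X.T @ X + lam * np.eye(lags.size), X.T @ eeg)

pred = X @ trf                               # predicted EEG; model fit = Pearson r
r = np.corrcoef(pred, eeg)[0, 1]
print(f"TRF peak lag: {lags[np.argmax(np.abs(trf))] / fs * 1000:.0f} ms, fit r = {r:.3f}")
```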
Affiliation(s)
- Nicolò Castellani
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca 55100, Italy
- Manibus Lab, University of Turin, Turin 10124, Italy
- Marta Fantoni
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca 55100, Italy
- Davide Bottari
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca 55100, Italy
8
Monney J, Dallaire SE, Stoutah L, Fanda L, Mégevand P. Voxeloc: A time-saving graphical user interface for localizing and visualizing stereo-EEG electrodes. J Neurosci Methods 2024; 407:110154. [PMID: 38697518 DOI: 10.1016/j.jneumeth.2024.110154]
Abstract
BACKGROUND: Thanks to its unrivalled spatial and temporal resolution and signal-to-noise ratio, intracranial EEG (iEEG) is becoming a valuable tool in neuroscience research. To attribute functional properties to cortical tissue, it is paramount to determine precisely the localization of each electrode with respect to a patient's brain anatomy. Several software packages or pipelines offer the possibility of localizing iEEG electrodes manually or semi-automatically. However, their reliability and ease of use may leave something to be desired.
NEW METHOD: Voxeloc (voxel electrode locator) is a Matlab-based graphical user interface to localize and visualize stereo-EEG electrodes. Voxeloc adopts a semi-automated approach to determine the coordinates of each electrode contact: the user only needs to indicate the deep-most contact of each electrode shaft and another point more proximally.
RESULTS: With deliberately streamlined functionality and an intuitive graphical user interface, the main advantages of Voxeloc are ease of use and inter-user reliability. Additionally, oblique slices along the shaft of each electrode can be generated to facilitate the precise localization of each contact. Voxeloc is open-source software and is compatible with the open iEEG-BIDS (Brain Imaging Data Structure) format.
COMPARISON WITH EXISTING METHODS: Localizing patients' full iEEG implants was faster with Voxeloc than with two comparable software packages, and inter-user agreement was better.
CONCLUSIONS: Voxeloc offers an easy-to-use and reliable tool to localize and visualize stereo-EEG electrodes. This will contribute to democratizing neuroscience research using iEEG.
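The semi-automated placement step described above (marking the deep-most contact and one more proximal point per shaft) amounts to linear interpolation along the shaft direction. The sketch below is a hypothetical illustration of that geometry, not Voxeloc's actual code; the coordinates, contact count, and inter-contact spacing are assumptions.

```python
# Sketch: place stereo-EEG contacts along a shaft given the deep-most contact,
# one proximal point on the same shaft, and the inter-contact spacing.
import numpy as np

def place_contacts(deep_tip, proximal_point, n_contacts, spacing_mm):
    """Return n_contacts coordinates starting at the deep tip and stepping toward
    the proximal point in increments of spacing_mm (all in scanner/mm space)."""
    deep_tip = np.asarray(deep_tip, dtype=float)
    direction = np.asarray(proximal_point, dtype=float) - deep_tip
    direction /= np.linalg.norm(direction)            # unit vector along the shaft
    return np.array([deep_tip + k * spacing_mm * direction for k in range(n_contacts)])

contacts = place_contacts(deep_tip=[10.0, -42.0, 5.0],
                          proximal_point=[55.0, -40.0, 30.0],
                          n_contacts=12, spacing_mm=3.5)
print(contacts.round(1))
```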
Affiliation(s)
- Jonathan Monney
- Clinical Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Shannon E Dallaire
- Clinical Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Dalhousie University, Halifax, Canada
- Lydia Stoutah
- Clinical Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Université Paris-Saclay, Paris, France
- Lora Fanda
- Clinical Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Pierre Mégevand
- Clinical Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Neurology division, Geneva University Hospitals, Geneva, Switzerland
9
Ten Oever S, Titone L, te Rietmolen N, Martin AE. Phase-dependent word perception emerges from region-specific sensitivity to the statistics of language. Proc Natl Acad Sci U S A 2024; 121:e2320489121. [PMID: 38805278 PMCID: PMC11161766 DOI: 10.1073/pnas.2320489121]
Abstract
Neural oscillations reflect fluctuations in excitability, which biases the percept of ambiguous sensory input. Why this bias occurs is still not fully understood. We hypothesized that neural populations representing likely events are more sensitive, and thereby become active on earlier oscillatory phases, when the ensemble itself is less excitable. Perception of ambiguous input presented during less-excitable phases should therefore be biased toward frequent or predictable stimuli that have lower activation thresholds. Here, we show such a frequency bias in spoken word recognition using psychophysics, magnetoencephalography (MEG), and computational modelling. With MEG, we found a double dissociation, where the phase of oscillations in the superior temporal gyrus and medial temporal gyrus biased word-identification behavior based on phoneme and lexical frequencies, respectively. This finding was reproduced in a computational model. These results demonstrate that oscillations provide a temporal ordering of neural activity based on the sensitivity of separable neural populations.
Affiliation(s)
- Sanne Ten Oever
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen 6525 XD, The Netherlands
- Language and Computation in Neural Systems group, Donders Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen 6525 EN, The Netherlands
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht 6229 EV, The Netherlands
- Lorenzo Titone
- Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig D-04303, Germany
- Noémie te Rietmolen
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen 6525 XD, The Netherlands
- Language and Computation in Neural Systems group, Donders Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen 6525 EN, The Netherlands
- Andrea E. Martin
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen 6525 XD, The Netherlands
- Language and Computation in Neural Systems group, Donders Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen 6525 EN, The Netherlands
10
Zou T, Li L, Huang X, Deng C, Wang X, Gao Q, Chen H, Li R. Dynamic causal modeling analysis reveals the modulation of motor cortex and integration in superior temporal gyrus during multisensory speech perception. Cogn Neurodyn 2024; 18:931-946. [PMID: 38826672 PMCID: PMC11143173 DOI: 10.1007/s11571-023-09945-z]
Abstract
The processing of speech information from various sensory modalities is crucial for human communication. Both the left posterior superior temporal gyrus (pSTG) and the motor cortex are importantly involved in multisensory speech perception. However, the dynamic integration from primary sensory regions to the pSTG and the motor cortex remains unclear. Here, we implemented a behavioral experiment using the classical McGurk effect paradigm and acquired task functional magnetic resonance imaging (fMRI) data during synchronized audiovisual syllabic perception from 63 normal adults. We conducted dynamic causal modeling (DCM) analysis to explore the cross-modal interactions among the left pSTG, left precentral gyrus (PrG), left middle superior temporal gyrus (mSTG), and left fusiform gyrus (FuG). Bayesian model selection favored a winning model that included modulations of connections to PrG (mSTG → PrG, FuG → PrG), from PrG (PrG → mSTG, PrG → FuG), and to pSTG (mSTG → pSTG, FuG → pSTG). Moreover, the coupling strength of these connections correlated with behavioral McGurk susceptibility. In addition, significant differences were found in the coupling strength of these connections between strong and weak McGurk perceivers. Strong perceivers modulated less inhibitory visual influence and allowed less excitatory auditory information to flow into PrG, but integrated more audiovisual information in pSTG. Taken together, our findings show that the PrG and pSTG interact dynamically with primary cortices during audiovisual speech, and support the view that the motor cortex plays a specific functional role in modulating the gain and salience of the auditory and visual modalities.
Affiliation(s)
- Ting Zou, Liyuan Li, Xinju Huang, Chijun Deng, Xuyang Wang, Qing Gao, Huafu Chen, Rong Li
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China (all authors)
11
Yu Y, Lado A, Zhang Y, Magnotti JF, Beauchamp MS. Synthetic faces generated with the facial action coding system or deep neural networks improve speech-in-noise perception, but not as much as real faces. Front Neurosci 2024; 18:1379988. [PMID: 38784097 PMCID: PMC11111898 DOI: 10.3389/fnins.2024.1379988]
Abstract
The prevalence of synthetic talking faces in both commercial and academic environments is increasing as the technology to generate them grows more powerful and available. While it has long been known that seeing the face of the talker improves human perception of speech-in-noise, recent studies have shown that synthetic talking faces generated by deep neural networks (DNNs) can also improve human perception of speech-in-noise. However, in previous studies the benefit provided by DNN synthetic faces was only about half that of real human talkers. We sought to determine whether synthetic talking faces generated by an alternative method would provide a greater perceptual benefit. The facial action coding system (FACS) is a comprehensive system for measuring visually discernible facial movements. Because the action units that comprise FACS are linked to specific muscle groups, synthetic talking faces generated with FACS might have greater verisimilitude than DNN synthetic faces, which do not reference an explicit model of the facial musculature. We tested the ability of human observers to identify speech-in-noise accompanied by a blank screen, the real face of the talker, or synthetic talking faces generated either by DNN or by FACS. We replicated previous findings of a large benefit of seeing the face of a real talker for speech-in-noise perception and a smaller benefit for DNN synthetic faces. FACS faces also improved perception, but only to the same degree as DNN faces. Analysis at the phoneme level showed that the performance of DNN and FACS faces was particularly poor for phonemes that involve interactions between the teeth and lips, such as /f/, /v/, and /th/. Inspection of single video frames revealed that the characteristic visual features for these phonemes were weak or absent in the synthetic faces. Modeling the real vs. synthetic difference showed that increasing the realism of a few phonemes could substantially increase the overall perceptual benefit of synthetic faces.
Affiliation(s)
- Yingjia Yu
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Anastasia Lado
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Yue Zhang
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
- John F. Magnotti
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Michael S. Beauchamp
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
12
Ten Oever S, Martin AE. Interdependence of "What" and "When" in the Brain. J Cogn Neurosci 2024; 36:167-186. [PMID: 37847823 DOI: 10.1162/jocn_a_02067]
Abstract
From a brain's-eye view, when a stimulus occurs and what it is are interrelated aspects of interpreting the perceptual world. Yet in practice, the putative perceptual inferences about sensory content and timing are often dichotomized and not investigated as an integrated process. We argue here that neural temporal dynamics can influence what is perceived and, in turn, that stimulus content can influence the time at which perception is achieved. This computational principle results from the highly interdependent relationship of what and when in the environment. Both brain processes and perceptual events display strong temporal variability that is not always modeled; we argue that understanding, and minimally modeling, this temporal variability is key for theories of how the brain generates unified and consistent neural representations, and that we ignore temporal variability in our analysis practice at the peril of both data interpretation and theory-building. Here, we review what and when interactions in the brain, demonstrate via simulations how temporal variability can result in misguided interpretations and conclusions, and outline how to integrate and synthesize what and when in theories and models of brain computation.
Affiliation(s)
- Sanne Ten Oever
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
- Maastricht University, The Netherlands
- Andrea E Martin
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
13
Jiang Z, An X, Liu S, Yin E, Yan Y, Ming D. Neural oscillations reflect the individual differences in the temporal perception of audiovisual speech. Cereb Cortex 2023; 33:10575-10583. [PMID: 37727958 DOI: 10.1093/cercor/bhad304]
Abstract
Multisensory integration occurs within a limited time interval between multimodal stimuli. Multisensory temporal perception varies widely among individuals and involves processes of perceptual synchrony and temporal sensitivity. Previous studies explored the neural mechanisms of such individual differences for beep-flash stimuli, but not for speech. In this study, 28 subjects (16 male) performed an audiovisual speech /ba/ simultaneity judgment task while their electroencephalography was recorded. We examined the relationship between prestimulus neural oscillations (i.e., the pre-pronunciation movement-related oscillations) and temporal perception. Perceptual synchrony was quantified using the Point of Subjective Simultaneity and temporal sensitivity using the Temporal Binding Window. Our results revealed dissociated neural mechanisms for individual differences in the Temporal Binding Window and the Point of Subjective Simultaneity. Frontocentral delta power, reflecting top-down attentional control, was positively related to the magnitude of individual auditory-leading Temporal Binding Windows (LTBWs), whereas parieto-occipital theta power, indexing bottom-up visual temporal attention specific to speech, was negatively associated with the magnitude of individual visual-leading Temporal Binding Windows (RTBWs). In addition, increased left frontal and bilateral temporoparietal-occipital alpha power, reflecting general attentional states, was associated with increased Points of Subjective Simultaneity. Strengthening attentional abilities might improve the audiovisual temporal perception of speech and further influence speech integration.
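A brain-behavior analysis of the kind reported above can be sketched as follows (illustrative only, not the authors' code; the channel choice, band limits, window length, and simulated data are assumptions): prestimulus delta power is computed per subject and related to the individual auditory-leading temporal binding window.

```python
# Sketch: relate prestimulus frontocentral delta power to each subject's
# auditory-leading temporal binding window (LTBW). All data are placeholders.
import numpy as np
from scipy.signal import welch
from scipy.stats import spearmanr

fs = 500                                             # sampling rate (Hz), assumed
rng = np.random.default_rng(1)
n_subj, n_trials, n_samp = 28, 120, int(0.8 * fs)    # 800 ms prestimulus window per trial

delta_power = np.zeros(n_subj)
for s in range(n_subj):
    prestim = rng.normal(size=(n_trials, n_samp))    # frontocentral channel (placeholder)
    f, pxx = welch(prestim, fs=fs, nperseg=n_samp, axis=-1)
    band = (f >= 1) & (f <= 4)                       # delta band
    delta_power[s] = np.log(pxx[:, band].mean())     # log mean prestimulus delta power

ltbw = rng.normal(200, 20, size=n_subj)              # auditory-leading TBW (ms), placeholder
rho, p = spearmanr(delta_power, ltbw)
print(f"delta power vs LTBW: rho = {rho:.2f}, p = {p:.3f}")
```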
Affiliation(s)
- Zeliang Jiang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
- Xingwei An
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
- Shuang Liu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
- Erwei Yin
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China; Defense Innovation Institute, Academy of Military Sciences (AMS), 100071 Beijing, China; Tianjin Artificial Intelligence Innovation Center (TAIIC), 300457 Tianjin, China
- Ye Yan
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China; Defense Innovation Institute, Academy of Military Sciences (AMS), 100071 Beijing, China; Tianjin Artificial Intelligence Innovation Center (TAIIC), 300457 Tianjin, China
- Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
14
Chalas N, Omigie D, Poeppel D, van Wassenhove V. Hierarchically nested networks optimize the analysis of audiovisual speech. iScience 2023; 26:106257. [PMID: 36909667 PMCID: PMC9993032 DOI: 10.1016/j.isci.2023.106257]
Abstract
In conversational settings, seeing the speaker's face elicits internal predictions about the upcoming acoustic utterance. Understanding how the listener's cortical dynamics tune to the temporal statistics of audiovisual (AV) speech is thus essential. Using magnetoencephalography, we explored how large-scale frequency-specific dynamics of human brain activity adapt to AV speech delays. First, we show that the amplitude of phase-locked responses parametrically decreases with natural AV speech synchrony, a pattern that is consistent with predictive coding. Second, we show that the temporal statistics of AV speech affect large-scale oscillatory networks at multiple spatial and temporal resolutions. We demonstrate a spatial nestedness of oscillatory networks during the processing of AV speech: these oscillatory hierarchies are such that high-frequency activity (beta, gamma) is contingent on the phase response of low-frequency (delta, theta) networks. Our findings suggest that the endogenous temporal multiplexing of speech processing confers adaptability within the temporal regimes that are essential for speech comprehension.
Affiliation(s)
- Nikos Chalas (corresponding author)
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, 48149 Münster, Germany
- CEA, DRF/Joliot, NeuroSpin, INSERM, Cognitive Neuroimaging Unit; CNRS; Université Paris-Saclay, 91191 Gif/Yvette, France
- School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
- Diana Omigie
- Department of Psychology, Goldsmiths University London, London, UK
- David Poeppel
- Department of Psychology, New York University, New York, NY 10003, USA
- Ernst Struengmann Institute for Neuroscience, 60528 Frankfurt am Main, Germany
- Virginie van Wassenhove (corresponding author)
- CEA, DRF/Joliot, NeuroSpin, INSERM, Cognitive Neuroimaging Unit; CNRS; Université Paris-Saclay, 91191 Gif/Yvette, France
15
Jiang Z, An X, Liu S, Wang L, Yin E, Yan Y, Ming D. The effect of prestimulus low-frequency neural oscillations on the temporal perception of audiovisual speech. Front Neurosci 2023; 17:1067632. [PMID: 36816126 PMCID: PMC9935937 DOI: 10.3389/fnins.2023.1067632]
Abstract
Objective: Perceptual integration and segregation are modulated by the phase of ongoing neural oscillations whose period is longer than the temporal binding window (TBW). Studies have shown that abstract beep-flash stimuli, with a TBW of about 100 ms, are modulated by the alpha-band phase. We therefore hypothesized that the temporal perception of speech, with a TBW of several hundred milliseconds, might be affected by the delta-theta phase.
Methods: We conducted a speech-stimulus-based audiovisual simultaneity judgment (SJ) experiment. Twenty human participants (12 females) took part in this study while 62-channel EEG was recorded.
Results: Behavioral results showed that visual-leading TBWs are broader than auditory-leading ones [273.37 ± 24.24 ms vs. 198.05 ± 19.28 ms (mean ± sem)]. We used Phase Opposition Sum (POS) to quantify differences in mean phase angle and phase concentration between synchronous and asynchronous responses. The POS results indicated that the delta-theta phase differed significantly between synchronous and asynchronous responses in the A50V condition (50% synchronous responses at auditory-leading SOAs). In the V50A condition (50% synchronous responses at visual-leading SOAs), however, we only found a delta-band effect. In neither condition did a post hoc Rayleigh test find consistency of phases across subjects for either perceptual response (all ps > 0.05). This suggests that the phase might not reflect neuronal excitability, which would require the phases for a given perceptual response to concentrate around the same angle across subjects rather than being uniformly distributed. However, a V-test showed that the phase difference between synchronous and asynchronous responses across subjects exhibited significant phase opposition (all ps < 0.05), which is compatible with the POS result.
Conclusion: These results indicate that speech temporal perception depends on the alignment of stimulus onset with an optimal phase of neural oscillations whose period might be longer than the TBW. The role of the oscillatory phase might be to encode temporal information, which varies across subjects, rather than to reflect neuronal excitability. Given the rich temporal structure of spoken-language stimuli, the conclusion that phase encodes temporal information is plausible and valuable for future research.
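The Phase Opposition Sum used in the Methods can be sketched compactly; the example below is illustrative only (not the authors' code), with simulated phase angles and an assumed label-permutation test for significance.

```python
# Minimal sketch of a Phase Opposition Sum (POS) computation. ITC is the inter-trial
# coherence (resultant length of unit phase vectors); POS = ITC_sync + ITC_async - 2*ITC_all.
import numpy as np

def itc(phases):
    """Inter-trial coherence of an array of phase angles (radians)."""
    return np.abs(np.mean(np.exp(1j * phases)))

def pos(phases, is_sync):
    """Phase Opposition Sum between synchronous and asynchronous trials."""
    return itc(phases[is_sync]) + itc(phases[~is_sync]) - 2 * itc(phases)

rng = np.random.default_rng(0)
# Placeholder prestimulus delta-band phases for 200 trials of one subject/channel:
is_sync = rng.random(200) < 0.5
phases = np.where(is_sync, rng.vonmises(0.0, 2.0, 200), rng.vonmises(np.pi, 2.0, 200))

observed = pos(phases, is_sync)
null = np.array([pos(phases, rng.permutation(is_sync)) for _ in range(1000)])
p_value = (np.sum(null >= observed) + 1) / (null.size + 1)
print(f"POS = {observed:.3f}, permutation p = {p_value:.3f}")
```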
Affiliation(s)
- Zeliang Jiang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Xingwei An
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Shuang Liu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Lu Wang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Erwei Yin
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China; Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China; Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Ye Yan
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China; Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China; Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
16
Ronconi L, Vitale A, Federici A, Mazzoni N, Battaglini L, Molteni M, Casartelli L. Neural dynamics driving audio-visual integration in autism. Cereb Cortex 2023; 33:543-556. [PMID: 35266994 DOI: 10.1093/cercor/bhac083]
Abstract
Audio-visual (AV) integration plays a crucial role in supporting social functions and communication in autism spectrum disorder (ASD). However, behavioral findings remain mixed and, importantly, little is known about the underlying neurophysiological bases. Studies in neurotypical adults indicate that oscillatory brain activity in different frequencies subserves AV integration, pointing to a central role of (i) individual alpha frequency (IAF), which would determine the width of the cross-modal binding window; (ii) pre-/peri-stimulus theta oscillations, which would reflect the expectation of AV co-occurrence; (iii) post-stimulus oscillatory phase reset, which would temporally align the different unisensory signals. Here, we investigate the neural correlates of AV integration in children with ASD and typically developing (TD) peers, measuring electroencephalography during resting state and in an AV integration paradigm. As for neurotypical adults, AV integration dynamics in TD children could be predicted by the IAF measured at rest and by a modulation of anticipatory theta oscillations at single-trial level. Conversely, in ASD participants, AV integration/segregation was driven exclusively by the neural processing of the auditory stimulus and the consequent auditory-induced phase reset in visual regions, suggesting that a disproportionate elaboration of the auditory input could be the main factor characterizing atypical AV integration in autism.
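One ingredient of the analysis above, estimating the individual alpha frequency (IAF) from resting-state EEG, can be sketched as a simple spectral-peak search; the example below is illustrative only, with an assumed posterior channel, recording length, and 7-13 Hz search range.

```python
# Rough sketch: estimate the individual alpha frequency (IAF) from resting-state EEG.
import numpy as np
from scipy.signal import welch

fs = 250                                      # sampling rate (Hz), assumed
rest = np.random.randn(5 * 60 * fs)           # 5 min of posterior-channel resting EEG (placeholder)

f, pxx = welch(rest, fs=fs, nperseg=fs * 4)   # ~0.25 Hz frequency resolution
band = (f >= 7) & (f <= 13)                   # search window for the alpha peak
iaf = f[band][np.argmax(pxx[band])]           # frequency of the largest alpha-band peak
print(f"estimated IAF: {iaf:.2f} Hz")
# The IAF period (1000/iaf, in ms) could then be related to each participant's
# cross-modal binding window, e.g., with a rank correlation across participants.
```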
Affiliation(s)
- Luca Ronconi
- School of Psychology, Vita-Salute San Raffaele University, 20132 Milan, Italy; Division of Neuroscience, IRCCS San Raffaele Scientific Institute, 20132 Milan, Italy
- Andrea Vitale
- Theoretical and Cognitive Neuroscience Unit, Child Psychopathology Department, Scientific Institute IRCCS Eugenio Medea, 23842 Bosisio Parini, Italy
- Alessandra Federici
- Theoretical and Cognitive Neuroscience Unit, Child Psychopathology Department, Scientific Institute IRCCS Eugenio Medea, 23842 Bosisio Parini, Italy; Sensory Experience Dependent (SEED) group, IMT School for Advanced Studies Lucca, 55100 Lucca, Italy
- Noemi Mazzoni
- Theoretical and Cognitive Neuroscience Unit, Child Psychopathology Department, Scientific Institute IRCCS Eugenio Medea, 23842 Bosisio Parini, Italy; Laboratory for Autism and Neurodevelopmental Disorders, Center for Neuroscience and Cognitive Systems, Istituto Italiano di Tecnologia, 38068 Rovereto, Italy; Department of Psychology and Cognitive Science, University of Trento, 38068 Rovereto, Italy
- Luca Battaglini
- Department of General Psychology, University of Padova, 35131 Padova, Italy; Department of Physics and Astronomy "Galileo Galilei", University of Padova, 35131 Padova, Italy
- Massimo Molteni
- Child Psychopathology Department, Scientific Institute IRCCS Eugenio Medea, 23842 Bosisio Parini, Italy
- Luca Casartelli
- Theoretical and Cognitive Neuroscience Unit, Child Psychopathology Department, Scientific Institute IRCCS Eugenio Medea, 23842 Bosisio Parini, Italy
17
Aller M, Økland HS, MacGregor LJ, Blank H, Davis MH. Differential Auditory and Visual Phase-Locking Are Observed during Audio-Visual Benefit and Silent Lip-Reading for Speech Perception. J Neurosci 2022; 42:6108-6120. [PMID: 35760528 PMCID: PMC9351641 DOI: 10.1523/jneurosci.2476-21.2022]
Abstract
Speech perception in noisy environments is enhanced by seeing facial movements of communication partners. However, the neural mechanisms by which audio and visual speech are combined are not fully understood. We explored MEG phase-locking to auditory and visual signals in MEG recordings from 14 human participants (6 females, 8 males) who reported words from single spoken sentences. We manipulated the acoustic clarity and visual speech signals such that critical speech information was present in auditory, visual, or both modalities. MEG coherence analysis revealed that both auditory and visual speech envelopes (auditory amplitude modulations and lip aperture changes) were phase-locked to 2-6 Hz brain responses in auditory and visual cortex, consistent with entrainment to syllable-rate components. Partial coherence analysis was used to separate neural responses to correlated audio-visual signals and showed non-zero phase-locking to the auditory envelope in occipital cortex during audio-visual (AV) speech. Furthermore, phase-locking to auditory signals in visual cortex was enhanced for AV speech compared with audio-only speech that was matched for intelligibility. Conversely, auditory regions of the superior temporal gyrus did not show above-chance partial coherence with visual speech signals during AV conditions but did show partial coherence in visual-only conditions. Hence, visual speech enabled stronger phase-locking to auditory signals in visual areas, whereas phase-locking of visual speech in auditory regions only occurred during silent lip-reading. Differences in these cross-modal interactions between auditory and visual speech signals are interpreted in line with cross-modal predictive mechanisms during speech perception.
SIGNIFICANCE STATEMENT: Verbal communication in noisy environments is challenging, especially for hearing-impaired individuals. Seeing facial movements of communication partners improves speech perception when auditory signals are degraded or absent. The neural mechanisms supporting lip-reading or audio-visual benefit are not fully understood. Using MEG recordings and partial coherence analysis, we show that speech information is used differently in brain regions that respond to auditory and visual speech. While visual areas use visual speech to improve phase-locking to auditory speech signals, auditory areas do not show phase-locking to visual speech unless auditory speech is absent and visual speech is used to substitute for missing auditory signals. These findings highlight brain processes that combine visual and auditory signals to support speech understanding.
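The partial coherence analysis described above can be illustrated with a hedged sketch (not the authors' MEG pipeline): complex coherency between a brain channel and the auditory envelope is partialized with respect to the correlated lip-aperture signal. The signals, sampling rate, and band edges below are assumptions.

```python
# Sketch: coherence between one MEG channel and the auditory speech envelope after
# removing the part explained by the correlated visual (lip aperture) signal.
import numpy as np
from scipy.signal import csd, welch

fs, nper = 200, 512
n = 60 * fs
rng = np.random.default_rng(0)
shared = rng.normal(size=n)
audio_env = shared + 0.5 * rng.normal(size=n)      # auditory speech envelope (placeholder)
lip = shared + 0.5 * rng.normal(size=n)            # correlated lip-aperture signal (placeholder)
meg = 0.8 * audio_env + rng.normal(size=n)         # one MEG channel (placeholder)

def coherency(x, y):
    f, sxy = csd(x, y, fs=fs, nperseg=nper)        # complex cross-spectral density
    _, sxx = welch(x, fs=fs, nperseg=nper)
    _, syy = welch(y, fs=fs, nperseg=nper)
    return f, sxy / np.sqrt(sxx * syy)

f, k_ma = coherency(meg, audio_env)                # MEG-audio coherency
_, k_ml = coherency(meg, lip)                      # MEG-lip coherency
_, k_la = coherency(lip, audio_env)                # lip-audio coherency

# Partial coherency of MEG with audio, controlling for lip movements:
k_partial = (k_ma - k_ml * k_la) / np.sqrt((1 - np.abs(k_ml) ** 2) * (1 - np.abs(k_la) ** 2))
band = (f >= 2) & (f <= 6)                         # syllable-rate band used in the study
print(f"2-6 Hz coherence: {np.abs(k_ma)[band].mean():.3f}, "
      f"partial: {np.abs(k_partial)[band].mean():.3f}")
```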
Affiliation(s)
- Máté Aller
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Heidi Solberg Økland
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Lucy J MacGregor
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Helen Blank
- University Medical Center Hamburg-Eppendorf, Hamburg, 20246, Germany
- Matthew H Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
18
Motor Circuit and Superior Temporal Sulcus Activities Linked to Individual Differences in Multisensory Speech Perception. Brain Topogr 2021; 34:779-792. [PMID: 34480635 DOI: 10.1007/s10548-021-00869-7]
Abstract
Integrating multimodal information into a unified perception is a fundamental human capacity. The McGurk effect is a remarkable multisensory illusion in which incongruent auditory and visual syllables give rise to a percept that differs from both. However, not all listeners perceive the McGurk illusion to the same degree. The neural basis for individual differences in the modulation of multisensory integration and syllabic perception remains largely unclear. To probe the possible involvement of specific neural circuits in individual differences in multisensory speech perception, we first implemented a behavioral experiment to examine McGurk susceptibility. Then, functional magnetic resonance imaging was performed in 63 participants to measure brain activity in response to non-McGurk audiovisual syllables. We revealed significant individual variability in McGurk illusion perception. Moreover, we found significant differential activations of the auditory and visual regions and the left superior temporal sulcus (STS), as well as multiple motor areas, between strong and weak McGurk perceivers. Importantly, the individual engagement of the STS and motor areas could specifically predict behavioral McGurk susceptibility, in contrast to the sensory regions. These findings suggest that distinct multimodal integration in the STS, together with coordinated phonemic modulatory processes in motor circuits, may serve as a neural substrate for interindividual differences in multisensory speech perception.
Collapse
|
19
|
Zulfiqar I, Moerel M, Lage-Castellanos A, Formisano E, De Weerd P. Audiovisual Interactions Among Near-Threshold Oscillating Stimuli in the Far Periphery Are Phase-Dependent. Front Hum Neurosci 2021; 15:642341. [PMID: 34526884 PMCID: PMC8435850 DOI: 10.3389/fnhum.2021.642341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2020] [Accepted: 07/22/2021] [Indexed: 11/30/2022] Open
Abstract
Recent studies have highlighted the possible contribution of direct connectivity between early sensory cortices to audiovisual integration. Anatomical connections between the early auditory and visual cortices are concentrated in visual sites representing the peripheral field of view. Here, we aimed to engage these early sensory interactive pathways with simple, far-peripheral audiovisual stimuli (auditory noise and visual gratings). Using a modulation detection task in one modality performed at an 84%-correct threshold level, we investigated multisensory interactions by simultaneously presenting weak stimuli from the other modality in which the temporal modulation was barely detectable (at 55% and 65% correct detection performance). Furthermore, we manipulated the temporal congruence between the cross-sensory streams. We found evidence for an influence of barely detectable visual stimuli on response times for auditory stimuli, but not for the reverse effect. These visual-to-auditory influences occurred only for specific phase differences (at onset) between the modulated audiovisual stimuli. We discuss our findings in light of a possible role of direct interactions between early visual and auditory areas, along with contributions from higher-order association cortex. In sum, our results extend the behavioral evidence of audio-visual processing to the far periphery and suggest, within this specific experimental setting, an asymmetry between the auditory influence on visual processing and the visual influence on auditory processing.
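The phase-dependence reported above can be tested by grouping auditory-detection reaction times by the audiovisual onset phase difference and comparing conditions. The sketch below illustrates this on simulated data; the phase offsets, trial counts, and the choice of a one-way ANOVA are illustrative assumptions rather than the study's exact design.

```python
"""Sketch of a phase-dependence analysis: compare auditory reaction times
across onset phase differences of a barely detectable visual stream.
All numbers are simulated for illustration."""
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
phase_diffs = np.array([0, 90, 180, 270])     # assumed onset phase offsets (deg)
n_trials = 80

rts = {}
for phi in phase_diffs:
    # Assume a small RT benefit only when the two streams start in phase
    benefit = 0.03 if phi == 0 else 0.0
    rts[phi] = 0.55 - benefit + 0.05 * rng.standard_normal(n_trials)

# One-way ANOVA across phase conditions, analogous to asking whether the
# visual influence on auditory RTs is phase-dependent
f_val, p_val = stats.f_oneway(*[rts[phi] for phi in phase_diffs])
for phi in phase_diffs:
    print("phase %3d deg: mean RT = %.3f s" % (phi, rts[phi].mean()))
print("ANOVA across phases: F = %.2f, p = %.3f" % (f_val, p_val))
```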
Collapse
Affiliation(s)
- Isma Zulfiqar
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Maastricht Centre for Systems Biology, Maastricht University, Maastricht, Netherlands
| | - Michelle Moerel
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Maastricht Centre for Systems Biology, Maastricht University, Maastricht, Netherlands
- Maastricht Brain Imaging Centre (MBIC), Maastricht, Netherlands
| | - Agustin Lage-Castellanos
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
| | - Elia Formisano
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Maastricht Centre for Systems Biology, Maastricht University, Maastricht, Netherlands
- Maastricht Brain Imaging Centre (MBIC), Maastricht, Netherlands
| | - Peter De Weerd
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Maastricht Centre for Systems Biology, Maastricht University, Maastricht, Netherlands
| |
Collapse
|
20
|
Sheybani L, Mégevand P, Spinelli L, Bénar CG, Momjian S, Seeck M, Quairiaux C, Kleinschmidt A, Vulliémoz S. Slow oscillations open susceptible time windows for epileptic discharges. Epilepsia 2021; 62:2357-2371. [PMID: 34338315 PMCID: PMC9290693 DOI: 10.1111/epi.17020] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2021] [Revised: 07/13/2021] [Accepted: 07/14/2021] [Indexed: 12/15/2022]
Abstract
Objective: In patients with epilepsy, interictal epileptic discharges are a diagnostic hallmark of the disease and represent abnormal, so-called "irritative" activity that disrupts normal cognitive functions. Despite their clinical relevance, their mechanisms of generation remain poorly understood; brain activity is assumed to switch to these epileptic transients abruptly, unpredictably, and supposedly at random. We aimed to study the period preceding these epileptic discharges to identify potential proepileptogenic mechanisms supporting their expression. Methods: We used multisite intracortical recordings from patients who underwent intracranial monitoring for refractory epilepsy, the majority of whom had a mesial temporal lobe seizure onset zone. Our objective was to evaluate the existence of proepileptogenic windows before interictal epileptic discharges. We tested whether the amplitude and phase synchronization of slow oscillations (0.5-4 Hz and 4-7 Hz) increase before epileptic discharges and whether the latter are phase-locked to slow oscillations. We then tested whether the phase-locking of neuronal activity (assessed by high-gamma activity, 60-160 Hz) to slow oscillations increases before epileptic discharges, to provide a potential mechanism linking slow oscillations to interictal activities. Results: Changes in widespread slow oscillations anticipate upcoming epileptic discharges. The network extends beyond the irritative zone, but the increase in amplitude and phase synchronization is largely specific to the irritative zone. In contrast, epileptic discharges are phase-locked to widespread slow oscillations, and the degree of phase-locking tends to be higher outside the irritative zone. Finally, within the irritative zone only, we observed increased coupling between slow oscillations and neuronal discharges before epileptic discharges. Significance: Our results show that epileptic discharges occur during vulnerable time windows set up by a specific phase of slow oscillations. The specificity of these permissive windows is further reinforced by the increased coupling of neuronal activity to slow oscillations. These findings contribute to our understanding of epilepsy as a distributed oscillopathy and open avenues for future neuromodulation strategies aimed at disrupting proepileptic mechanisms.
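A central step in this kind of analysis is quantifying how strongly discrete discharge events are phase-locked to the 0.5-4 Hz slow oscillation. The sketch below illustrates one standard way to do this (band-pass filtering, Hilbert phase, mean resultant length with an approximate Rayleigh test) on simulated intracranial data; the sampling rate, filter order, and event generation are assumptions, not the study's parameters.

```python
"""Sketch of event-to-oscillation phase-locking: estimate how strongly
discrete epileptic discharges lock to the 0.5-4 Hz slow oscillation.
Signal and event times are simulated; all parameters are assumptions."""
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 500.0                                    # assumed intracranial sampling rate (Hz)
rng = np.random.default_rng(3)
t = np.arange(0, 300, 1 / fs)                 # 5 minutes of simulated iEEG
slow = np.sin(2 * np.pi * 1.0 * t)            # 1 Hz slow oscillation
ieeg = slow + 0.5 * rng.standard_normal(t.size)

# Simulated discharge times, biased toward the peak of the slow wave
candidate = rng.choice(t.size, size=2000, replace=False)
near_peak = candidate[slow[candidate] > 0.3]
events = np.sort(rng.choice(near_peak, size=200, replace=False))

# Band-pass 0.5-4 Hz and extract the instantaneous phase
b, a = butter(4, [0.5, 4.0], btype="bandpass", fs=fs)
phase = np.angle(hilbert(filtfilt(b, a, ieeg)))

# Phase-locking: mean resultant length of discharge phases + Rayleigh test
event_phases = phase[events]
n = event_phases.size
R = np.abs(np.mean(np.exp(1j * event_phases)))
rayleigh_p = np.exp(-n * R ** 2)              # large-n approximation
print("mean resultant length R = %.2f, Rayleigh p ~ %.4f" % (R, rayleigh_p))
```

The mean resultant length R is close to 1 when events cluster at a preferred phase and close to 0 when they are uniformly distributed over the cycle; comparing R before versus outside the pre-discharge window is one way to operationalize the "susceptible time windows" discussed above.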
Collapse
Affiliation(s)
- Laurent Sheybani
- EEG and Epilepsy Unit / Neurology, Department of Clinical Neuroscience, University Hospitals and Faculty of Medicine of University of Geneva, Geneva, Switzerland
| | - Pierre Mégevand
- EEG and Epilepsy Unit / Neurology, Department of Clinical Neuroscience, University Hospitals and Faculty of Medicine of University of Geneva, Geneva, Switzerland.,Department of Basic Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Laurent Spinelli
- EEG and Epilepsy Unit / Neurology, Department of Clinical Neuroscience, University Hospitals and Faculty of Medicine of University of Geneva, Geneva, Switzerland
| | - Christian G Bénar
- Aix-Marseille University, National Institute of Health and Medical Research, Institute of Systems Neurosciences, Marseille, France
| | - Shahan Momjian
- Neurosurgery, Department of Clinical Neuroscience, University Hospitals and Faculty of Medicine of University of Geneva, Geneva, Switzerland
| | - Margitta Seeck
- EEG and Epilepsy Unit / Neurology, Department of Clinical Neuroscience, University Hospitals and Faculty of Medicine of University of Geneva, Geneva, Switzerland
| | - Charles Quairiaux
- Functional Brain Mapping Laboratory, Department of Basic Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Andreas Kleinschmidt
- EEG and Epilepsy Unit / Neurology, Department of Clinical Neuroscience, University Hospitals and Faculty of Medicine of University of Geneva, Geneva, Switzerland
| | - Serge Vulliémoz
- EEG and Epilepsy Unit / Neurology, Department of Clinical Neuroscience, University Hospitals and Faculty of Medicine of University of Geneva, Geneva, Switzerland
| |
Collapse
|
21
|
Ten Oever S, Martin AE. An oscillating computational model can track pseudo-rhythmic speech by using linguistic predictions. eLife 2021; 10:e68066. [PMID: 34338196 PMCID: PMC8328513 DOI: 10.7554/eLife.68066] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2021] [Accepted: 07/16/2021] [Indexed: 11/19/2022] Open
Abstract
Neuronal oscillations putatively track speech in order to optimize sensory processing. However, it is unclear how isochronous brain oscillations can track pseudo-rhythmic speech input. Here we propose that oscillations can track pseudo-rhythmic speech once one takes into account that speech timing depends on content-based predictions flowing from internal language models. We show that the temporal dynamics of speech depend on the predictability of words in a sentence. A computational model including oscillations, feedback, and inhibition is able to track pseudo-rhythmic speech input. As the model processes the input, it generates temporal phase codes, which are a candidate mechanism for carrying information forward in time. The model is optimally sensitive to the natural temporal dynamics of speech and can explain empirical data on temporal speech illusions. Our results suggest that speech tracking does not have to rely on the acoustics alone but could also exploit ongoing interactions between oscillations and constraints flowing from internal language models.
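The core mechanism proposed here can be illustrated with a toy example: an oscillation modulates excitability so that more strongly predicted (pre-activated) words reach threshold at an earlier oscillatory phase, yielding a temporal phase code. The sketch below is a simplified illustration of that principle, not a reimplementation of the published model; the frequency, drive, and threshold values are arbitrary assumptions.

```python
"""Toy sketch of oscillatory phase coding of word predictability: a
phase-dependent excitability term lets strongly predicted inputs cross
threshold earlier in the cycle. Illustration only, not the published model."""
import numpy as np

freq = 4.0                                    # assumed syllable-rate oscillation (Hz)
dt = 0.001
t = np.arange(0, 1 / freq, dt)                # one oscillatory cycle
excitability = np.sin(2 * np.pi * freq * t)   # phase-dependent gain

def crossing_phase(prediction_strength, drive=0.6, threshold=0.9):
    """Phase (radians) at which input plus prediction first exceeds threshold."""
    total = drive + 0.5 * prediction_strength + excitability
    idx = np.argmax(total >= threshold)       # first sample above threshold
    return 2 * np.pi * freq * t[idx]

for p in (0.1, 0.5, 0.9):                     # weakly to strongly predicted words
    print("predictability %.1f -> threshold crossed at phase %.2f rad"
          % (p, crossing_phase(p)))
```

In this sketch the more predictable inputs cross threshold at progressively earlier phases, which is the sense in which the phase of firing can carry information about linguistic predictions forward in time.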
Collapse
Affiliation(s)
- Sanne Ten Oever
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands.,Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands.,Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
| | - Andrea E Martin
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands.,Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
| |
Collapse
|