1. Sato M. Audiovisual speech asynchrony asymmetrically modulates neural binding. Neuropsychologia 2024; 198:108866. [PMID: 38518889] [DOI: 10.1016/j.neuropsychologia.2024.108866]
Abstract
Previous psychophysical and neurophysiological studies in young healthy adults have provided evidence that audiovisual speech integration occurs with a large degree of temporal tolerance around true simultaneity. To further determine whether audiovisual speech asynchrony modulates auditory cortical processing and neural binding in young healthy adults, N1/P2 auditory evoked responses were compared using an additive model during a syllable categorization task, with or without an audiovisual asynchrony ranging from 240 ms visual lead to 240 ms auditory lead. Consistent with previous psychophysical findings, the observed results converge in favor of an asymmetric temporal integration window. Three main findings emerged: 1) predictive temporal and phonetic cues from pre-phonatory visual movements before the acoustic onset appeared essential for neural binding to occur; 2) audiovisual synchrony, with visual pre-phonatory movements predictive of the onset of the acoustic signal, was a prerequisite for N1 latency facilitation; and 3) P2 amplitude suppression and latency facilitation occurred even when visual pre-phonatory movements predicted only the identity of the upcoming syllable, not the acoustic onset. Taken together, these findings help clarify how audiovisual speech integration operates in part through two stages of visually based temporal and phonetic predictions.
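The additive-model logic above compares the audiovisual (AV) evoked response against the sum of the unisensory auditory (A) and visual (V) responses; sub-additivity (AV < A + V) around the N1/P2 is taken as a signature of neural binding. A minimal sketch of that comparison on simulated data follows; the sampling rate, N1 window, and all amplitudes are illustrative assumptions, not values from the study.

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(0)
fs = 1000                                  # sampling rate in Hz (assumed)
t = np.arange(-0.1, 0.4, 1 / fs)           # epoch: -100 to 400 ms
n_subj = 20

def erp(peak_ms, amp):
    """Gaussian-shaped deflection standing in for an evoked component."""
    return amp * np.exp(-((t - peak_ms / 1000) ** 2) / (2 * 0.015 ** 2))

# Simulated trial-averaged ERPs (n_subjects x n_samples), purely illustrative.
A  = erp(100, -4.0) + rng.normal(0, 0.2, (n_subj, t.size))   # auditory-only
V  = erp(150, -1.0) + rng.normal(0, 0.2, (n_subj, t.size))   # visual-only
AV = erp(110, -4.2) + rng.normal(0, 0.2, (n_subj, t.size))   # audiovisual

# Additive model: test AV against A + V in the N1 window (70-150 ms, assumed).
win = (t >= 0.07) & (t <= 0.15)
n1_av  = AV[:, win].min(axis=1)            # N1 amplitude = most negative point
n1_sum = (A + V)[:, win].min(axis=1)

tval, pval = ttest_rel(n1_av, n1_sum)
print(f"N1 sub-additivity: t({n_subj - 1}) = {tval:.2f}, p = {pval:.3f}")
```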
Affiliation(s)
- Marc Sato
- Laboratoire Parole et Langage, Centre National de la Recherche Scientifique, Aix-Marseille Université, Aix-en-Provence, France.
2. Zou T, Li L, Huang X, Deng C, Wang X, Gao Q, Chen H, Li R. Dynamic causal modeling analysis reveals the modulation of motor cortex and integration in superior temporal gyrus during multisensory speech perception. Cogn Neurodyn 2024; 18:931-946. [PMID: 38826672] [PMCID: PMC11143173] [DOI: 10.1007/s11571-023-09945-z]
Abstract
The processing of speech information from various sensory modalities is crucial for human communication. Both the left posterior superior temporal gyrus (pSTG) and the motor cortex are importantly involved in multisensory speech perception. However, the dynamic integration of primary sensory regions with the pSTG and the motor cortex remains unclear. Here, we implemented a behavioral experiment using the classical McGurk effect paradigm and acquired task functional magnetic resonance imaging (fMRI) data during synchronized audiovisual syllabic perception from 63 normal adults. We conducted dynamic causal modeling (DCM) analysis to explore the cross-modal interactions among the left pSTG, left precentral gyrus (PrG), left middle superior temporal gyrus (mSTG), and left fusiform gyrus (FuG). Bayesian model selection favored a winning model that included modulations of connections to PrG (mSTG → PrG, FuG → PrG), from PrG (PrG → mSTG, PrG → FuG), and to pSTG (mSTG → pSTG, FuG → pSTG). Moreover, the coupling strength of these connections correlated with behavioral McGurk susceptibility. In addition, significant differences in the coupling strength of these connections were found between strong and weak McGurk perceivers. Strong perceivers modulated less inhibitory visual influence, allowed less excitatory auditory information to flow into PrG, but integrated more audiovisual information in pSTG. Taken together, our findings show that the PrG and pSTG interact dynamically with primary cortices during audiovisual speech and support the idea that the motor cortex plays a specific functional role in modulating the gain and salience of the auditory and visual modalities. Supplementary Information: The online version contains supplementary material available at 10.1007/s11571-023-09945-z.
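Dynamic causal modeling rests on a bilinear neuronal state equation, dx/dt = (A + sum_j u_j B(j)) x + C u, where A is fixed (endogenous) connectivity, B(j) the modulation of connections by experimental input j, and C the driving inputs. The toy forward simulation below illustrates that equation for the four-region network named above; all connection values, the region ordering, and the input timing are assumptions for illustration only.

```python
import numpy as np

regions = ["mSTG", "FuG", "PrG", "pSTG"]        # order assumed for illustration
n = len(regions)

A = -0.5 * np.eye(n)                            # self-decay (fixed connectivity)
A[2, 0] = 0.3   # mSTG -> PrG
A[2, 1] = 0.2   # FuG  -> PrG
A[3, 0] = 0.4   # mSTG -> pSTG
A[3, 1] = 0.3   # FuG  -> pSTG
A[0, 2] = 0.2   # PrG  -> mSTG (feedback)
A[1, 2] = 0.2   # PrG  -> FuG  (feedback)

B = np.zeros((n, n))                            # modulation by audiovisual input
B[2, 0] = -0.1                                  # e.g., dampened mSTG -> PrG flow

C = np.zeros((n, 1)); C[0, 0] = C[1, 0] = 1.0   # stimulus drives mSTG and FuG

def simulate(T=2.0, dt=1e-3):
    """Euler integration of dx/dt = (A + u*B) x + C u for a boxcar input u."""
    steps = int(T / dt)
    x = np.zeros(n)
    out = np.empty((steps, n))
    for k in range(steps):
        u = 1.0 if 0.1 <= k * dt <= 0.6 else 0.0
        x = x + dt * ((A + u * B) @ x + C[:, 0] * u)
        out[k] = x
    return out

states = simulate()
print({r: round(float(states[:, i].max()), 3) for i, r in enumerate(regions)})
```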
Affiliation(s)
- Ting Zou, Liyuan Li, Xinju Huang, Chijun Deng, Xuyang Wang, Qing Gao, Huafu Chen, Rong Li
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China
3. Kausel L, Michon M, Soto-Icaza P, Aboitiz F. A multimodal interface for speech perception: the role of the left superior temporal sulcus in social cognition and autism. Cereb Cortex 2024; 34:84-93. [PMID: 38696598] [DOI: 10.1093/cercor/bhae066]
Abstract
Multimodal integration is crucial for human interaction, in particular for social communication, which relies on integrating information from various sensory modalities. Recently, a third visual pathway specialized in social perception was proposed; it includes the right superior temporal sulcus (STS), which plays a key role in processing socially relevant cues and in high-level social perception. Importantly, the left STS has also recently been proposed to contribute to the audiovisual integration of speech processing. In this article, we propose that brain areas along the right STS that support multimodal integration for social perception and cognition can be considered homologs to those in the left, language-dominant hemisphere, which sustain the multimodal integration of speech and semantic concepts fundamental for social communication. Emphasizing the significance of the left STS in multimodal integration and in associated processes such as multimodal attention to socially relevant stimuli, we underscore its potential relevance for understanding neurodevelopmental conditions characterized by challenges in social communication, such as autism spectrum disorder (ASD). Further research into this left lateral processing stream holds promise for enhancing our understanding of social communication in both typical development and ASD, which may lead to more effective interventions that improve the quality of life of individuals with atypical neurodevelopment.
Affiliation(s)
- Leonie Kausel
- Centro de Estudios en Neurociencia Humana y Neuropsicología (CENHN), Facultad de Psicología, Universidad Diego Portales, Chile, Vergara 275, 8370076 Santiago, Chile
- Maëva Michon
- Praxiling Laboratory, Joint Research Unit (UMR 5267), Centre National de la Recherche Scientifique (CNRS), Université Paul Valéry, Montpellier, France, Route de Mende, 34199 Montpellier cedex 5, France
- Centro Interdisciplinario de Neurociencia, Pontificia Universidad Católica de Chile, Chile, Marcoleta 391, 2do piso, 8330024 Santiago, Chile
- Laboratorio de Neurociencia Cognitiva y Evolutiva, Facultad de Medicina, Pontificia Universidad Católica de Chile, Chile, Marcoleta 391, 2do piso, 8330024 Santiago, Chile
- Patricia Soto-Icaza
- Centro de Investigación en Complejidad Social (CICS), Facultad de Gobierno, Universidad del Desarrollo, Chile, Av. Las Condes 12461, edificio 3, piso 3, 7590943, Las Condes Santiago, Chile
- Francisco Aboitiz
- Centro Interdisciplinario de Neurociencia, Pontificia Universidad Católica de Chile, Chile, Marcoleta 391, 2do piso, 8330024 Santiago, Chile
- Laboratorio de Neurociencia Cognitiva y Evolutiva, Facultad de Medicina, Pontificia Universidad Católica de Chile, Chile, Marcoleta 391, 2do piso, 8330024 Santiago, Chile
4. Sato M. Competing influence of visual speech on auditory neural adaptation. Brain Lang 2023; 247:105359. [PMID: 37951157] [DOI: 10.1016/j.bandl.2023.105359]
Abstract
Visual information from a speaker's face enhances auditory neural processing and speech recognition. To determine whether auditory memory can be influenced by visual speech, the degree of auditory neural adaptation to an auditory syllable preceded by an auditory, visual, or audiovisual syllable was examined using EEG. Consistent with previous findings and with additional adaptation of auditory neurons tuned to acoustic features, stronger adaptation of N1, P2, and N2 auditory evoked responses was observed when the auditory syllable was preceded by an auditory rather than a visual syllable. However, adaptation was weaker when the auditory syllable was preceded by an audiovisual rather than an auditory syllable (although still stronger than after a visual syllable), and N1 and P2 latencies were then longer. These results further demonstrate that visual speech acts on auditory memory but suggest competing visual influences in the case of audiovisual stimulation.
Affiliation(s)
- Marc Sato
- Laboratoire Parole et Langage, UMR 7309, CNRS & Aix-Marseille Université, 5 avenue Pasteur, Aix-en-Provence, France.
5. Liu Y, Wang Z, Wei T, Zhou S, Yin Y, Mi Y, Liu X, Tang Y. Alterations of Audiovisual Integration in Alzheimer's Disease. Neurosci Bull 2023; 39:1859-1872. [PMID: 37812301] [PMCID: PMC10661680] [DOI: 10.1007/s12264-023-01125-7]
Abstract
Audiovisual integration is a vital information process involved in cognition and is closely correlated with aging and Alzheimer's disease (AD). In this review, we evaluated the altered audiovisual integrative behavioral symptoms in AD. We further analyzed the bidirectional relationships between AD pathologies and audiovisual integration alterations and suggest possible mechanisms of audiovisual integration alterations underlying AD, including the imbalance between energy demand and supply, activity-dependent degeneration, disrupted brain networks, and cognitive resource overloading. Then, based on clinical characteristics, including electrophysiological and imaging data related to audiovisual integration, we emphasize the value of audiovisual integration alterations as potential biomarkers for the early diagnosis and progression of AD. We also highlight that treatments targeting audiovisual integration have contributed to widespread pathological improvements in AD animal models and to cognitive improvements in AD patients. Moreover, investigation of audiovisual integration alterations in AD provides new insights into sensory information processing.
Affiliation(s)
- Yufei Liu, Zhibin Wang, Tao Wei, Shaojiong Zhou, Yunsi Yin, Yingxin Mi, Xiaoduo Liu, Yi Tang
- Department of Neurology and Innovation Center for Neurological Disorders, Xuanwu Hospital, Capital Medical University, National Center for Neurological Disorders, Beijing 100053, China
6. Nidiffer AR, Cao CZ, O'Sullivan A, Lalor EC. A representation of abstract linguistic categories in the visual system underlies successful lipreading. Neuroimage 2023; 282:120391. [PMID: 37757989] [DOI: 10.1016/j.neuroimage.2023.120391]
Abstract
There is considerable debate over how visual speech is processed in the absence of sound and whether neural activity supporting lipreading occurs in visual brain areas. Much of the ambiguity stems from a lack of behavioral grounding and neurophysiological analyses that cannot disentangle high-level linguistic and phonetic/energetic contributions from visual speech. To address this, we recorded EEG from human observers as they watched silent videos, half of which were novel and half of which were previously rehearsed with the accompanying audio. We modeled how the EEG responses to novel and rehearsed silent speech reflected the processing of low-level visual features (motion, lip movements) and a higher-level categorical representation of linguistic units, known as visemes. The ability of these visemes to account for the EEG - beyond the motion and lip movements - was significantly enhanced for rehearsed videos in a way that correlated with participants' trial-by-trial ability to lipread that speech. Source localization of viseme processing showed clear contributions from visual cortex, with no strong evidence for the involvement of auditory areas. We interpret this as support for the idea that the visual system produces its own specialized representation of speech that is (1) well-described by categorical linguistic features, (2) dissociable from lip movements, and (3) predictive of lipreading ability. We also suggest a reinterpretation of previous findings of auditory cortical activation during silent speech that is consistent with hierarchical accounts of visual and audiovisual speech perception.
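The key analysis step is a model comparison: a time-lagged encoding model built from low-level features (motion, lip movements) versus one that additionally includes categorical viseme features, with the accuracy gain indexing viseme processing. Below is a schematic ridge-regression version of that comparison; the feature matrices, lag range, regularization, and in-sample scoring (real analyses cross-validate) are all simplifying assumptions.

```python
import numpy as np

def lagged(X, lags):
    """Stack time-lagged copies of a (time x features) matrix."""
    T, F = X.shape
    out = np.zeros((T, F * len(lags)))
    for i, L in enumerate(lags):
        out[L:, i * F:(i + 1) * F] = X[:T - L]
    return out

def ridge_r(X, y, lam=1e2):
    """Fit ridge regression; return predicted-vs-actual correlation (in-sample)."""
    Xl = np.c_[np.ones(len(X)), X]
    w = np.linalg.solve(Xl.T @ Xl + lam * np.eye(Xl.shape[1]), Xl.T @ y)
    return np.corrcoef(Xl @ w, y)[0, 1]

rng = np.random.default_rng(1)
T = 5000                                     # samples of one EEG channel
motion_lip = rng.normal(size=(T, 3))         # low-level features (assumed)
visemes = (rng.random((T, 10)) < 0.02).astype(float)   # categorical onsets
eeg = lagged(visemes, [5]) @ rng.normal(size=10) + rng.normal(size=T)

lags = range(0, 30)                          # 0 to ~230 ms at 128 Hz (assumed)
r_base = ridge_r(lagged(motion_lip, lags), eeg)
r_full = ridge_r(np.c_[lagged(motion_lip, lags), lagged(visemes, lags)], eeg)
print(f"baseline r = {r_base:.3f}, +visemes r = {r_full:.3f}")
```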
Affiliation(s)
- Aaron R Nidiffer
- Department of Biomedical Engineering, Department of Neuroscience, Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA
- Cody Zhewei Cao
- Department of Psychology, University of Michigan, Ann Arbor, MI, USA
- Aisling O'Sullivan
- School of Engineering, Trinity College Institute of Neuroscience, Trinity Centre for Biomedical Engineering, Trinity College, Dublin, Ireland
- Edmund C Lalor
- Department of Biomedical Engineering, Department of Neuroscience, Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA; School of Engineering, Trinity College Institute of Neuroscience, Trinity Centre for Biomedical Engineering, Trinity College, Dublin, Ireland.
7. Jeschke L, Mathias B, von Kriegstein K. Inhibitory TMS over Visual Area V5/MT Disrupts Visual Speech Recognition. J Neurosci 2023; 43:7690-7699. [PMID: 37848284] [PMCID: PMC10634547] [DOI: 10.1523/jneurosci.0975-23.2023]
Abstract
During face-to-face communication, the perception and recognition of facial movements can facilitate individuals' understanding of what is said. Facial movements are a form of complex biological motion. Separate neural pathways are thought to process (1) simple, nonbiological motion, with an obligatory waypoint in the motion-sensitive visual middle temporal area (V5/MT), and (2) complex biological motion. Here, we present findings that challenge this dichotomy. Neuronavigated offline transcranial magnetic stimulation (TMS) over V5/MT in 24 participants (17 females and 7 males) led to increased response times in the recognition of simple, nonbiological motion as well as in visual speech recognition, compared with TMS over the vertex, an active control region. TMS of area V5/MT also reduced the practice effects on response times that are typically observed in both visual speech and motion recognition tasks over time. Our findings provide the first indication that area V5/MT causally influences the recognition of visual speech. SIGNIFICANCE STATEMENT: In everyday face-to-face communication, speech comprehension is often facilitated by viewing a speaker's facial movements. Several brain areas contribute to the recognition of visual speech. One area of interest is the motion-sensitive visual middle temporal area (V5/MT), which has been associated with the perception of simple, nonbiological motion such as moving dots, as well as of more complex, biological motion such as visual speech. Here, we demonstrate using noninvasive brain stimulation that area V5/MT is causally relevant for recognizing visual speech. This finding provides new insights into the neural mechanisms that support the perception of human communication signals, which will help guide future research in typically developed individuals and in populations with communication difficulties.
Affiliation(s)
- Lisa Jeschke
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, 01069 Dresden, Germany
- Brian Mathias
- School of Psychology, University of Aberdeen, Aberdeen AB24 3FX, United Kingdom
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, 01069 Dresden, Germany
8. Zhang Y, Rennig J, Magnotti JF, Beauchamp MS. Multivariate fMRI responses in superior temporal cortex predict visual contributions to, and individual differences in, the intelligibility of noisy speech. Neuroimage 2023; 278:120271. [PMID: 37442310] [PMCID: PMC10460966] [DOI: 10.1016/j.neuroimage.2023.120271]
Abstract
Humans have the unique ability to decode the rapid stream of language elements that constitute speech, even when it is contaminated by noise. Two reliable observations about noisy speech perception are that seeing the face of the talker improves intelligibility and that individuals differ in their ability to perceive noisy speech. We introduce a multivariate BOLD fMRI measure that explains both observations. In two independent fMRI studies, clear and noisy speech was presented in visual, auditory and audiovisual formats to thirty-seven participants who rated intelligibility. An event-related design was used to sort noisy speech trials by their intelligibility. Individual-differences multidimensional scaling was applied to fMRI response patterns in superior temporal cortex, and the dissimilarity between responses to clear speech and noisy (but intelligible) speech was measured. Neural dissimilarity was less for audiovisual speech than for auditory-only speech, corresponding to the greater intelligibility of noisy audiovisual speech. Dissimilarity was less in participants with better noisy speech perception, corresponding to individual differences. These relationships held for both single-word and entire-sentence stimuli, suggesting that they were driven by intelligibility rather than by the specific stimuli tested. A neural measure of perceptual intelligibility may aid in the development of strategies for helping those with impaired speech perception.
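The measure at the heart of this study is the dissimilarity between multivoxel response patterns for clear versus noisy-but-intelligible speech, with smaller dissimilarity tracking better intelligibility. A simplified stand-in using correlation distance is sketched below (the study used individual-differences multidimensional scaling; the simulated patterns and noise levels here are assumptions).

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
n_vox = 500

# Simulated mean response patterns in superior temporal cortex (assumed).
clear    = rng.normal(size=n_vox)
noisy_av = clear + rng.normal(scale=0.5, size=n_vox)   # audiovisual noisy
noisy_a  = clear + rng.normal(scale=1.0, size=n_vox)   # auditory-only noisy

def dissimilarity(p, q):
    """Correlation distance between two response patterns."""
    return 1.0 - pearsonr(p, q)[0]

d_av = dissimilarity(clear, noisy_av)
d_a  = dissimilarity(clear, noisy_a)
print(f"clear vs noisy AV: {d_av:.3f} | clear vs noisy A: {d_a:.3f}")
# Expected pattern from the study: d_av < d_a, mirroring the intelligibility
# benefit of seeing the talker's face.
```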
Affiliation(s)
- Yue Zhang
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States; Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
- Johannes Rennig
- Division of Neuropsychology, Center of Neurology, Hertie-Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany
- John F Magnotti
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Michael S Beauchamp
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
9. Banks MI, Krause BM, Berger DG, Campbell DI, Boes AD, Bruss JE, Kovach CK, Kawasaki H, Steinschneider M, Nourski KV. Functional geometry of auditory cortical resting state networks derived from intracranial electrophysiology. PLoS Biol 2023; 21:e3002239. [PMID: 37651504] [PMCID: PMC10499207] [DOI: 10.1371/journal.pbio.3002239]
Abstract
Understanding central auditory processing critically depends on defining underlying auditory cortical networks and their relationship to the rest of the brain. We addressed these questions using resting state functional connectivity derived from human intracranial electroencephalography. Mapping recording sites into a low-dimensional space where proximity represents functional similarity revealed a hierarchical organization. At a fine scale, a group of auditory cortical regions excluded several higher-order auditory areas and segregated maximally from the prefrontal cortex. At a mesoscale, the proximity of limbic structures to the auditory cortex suggested a limbic stream that parallels the classically described ventral and dorsal auditory processing streams. The identities of global hubs in anterior temporal and cingulate cortex depended on frequency band, consistent with diverse roles in semantic and cognitive processing. At a macroscale, the observed hemispheric asymmetries were not specific to speech and language networks. This approach can be applied to multivariate brain data with respect to development, behavior, and disorders.
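The mapping described above embeds a site-by-site functional-similarity matrix into a low-dimensional space where proximity means functional similarity. A compact sketch using classical multidimensional scaling on a correlation-derived distance follows; the embedding method and the simulated connectivity are assumptions, and the study's actual pipeline may differ.

```python
import numpy as np

rng = np.random.default_rng(3)
n_sites = 40

# Simulated site-by-site resting-state connectivity (symmetric, assumed).
W = rng.random((n_sites, n_sites))
W = (W + W.T) / 2
np.fill_diagonal(W, 1.0)

# Convert similarity to distance, then classical MDS (Torgerson scaling).
D = 1.0 - W
J = np.eye(n_sites) - np.ones((n_sites, n_sites)) / n_sites
G = -0.5 * J @ (D ** 2) @ J                # double-centered squared distances
vals, vecs = np.linalg.eigh(G)
order = np.argsort(vals)[::-1]
coords = vecs[:, order[:2]] * np.sqrt(np.maximum(vals[order[:2]], 0))

print("2-D functional coordinates of the first 3 sites:\n", coords[:3])
# Nearby points = functionally similar sites; hierarchical structure can then
# be read off at fine, meso, and macro scales, as in the study.
```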
Affiliation(s)
- Matthew I. Banks
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Neuroscience, University of Wisconsin, Madison, Wisconsin, United States of America
- Bryan M. Krause
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- D. Graham Berger
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- Declan I. Campbell
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- Aaron D. Boes
- Department of Neurology, The University of Iowa, Iowa City, Iowa, United States of America
- Joel E. Bruss
- Department of Neurology, The University of Iowa, Iowa City, Iowa, United States of America
- Christopher K. Kovach
- Department of Neurosurgery, The University of Iowa, Iowa City, Iowa, United States of America
- Hiroto Kawasaki
- Department of Neurosurgery, The University of Iowa, Iowa City, Iowa, United States of America
- Mitchell Steinschneider
- Department of Neurology, Albert Einstein College of Medicine, New York, New York, United States of America
- Department of Neuroscience, Albert Einstein College of Medicine, New York, New York, United States of America
- Kirill V. Nourski
- Department of Neurosurgery, The University of Iowa, Iowa City, Iowa, United States of America
- Iowa Neuroscience Institute, The University of Iowa, Iowa City, Iowa, United States of America
10. Bernstein LE, Auer ET, Eberhardt SP. Modality-Specific Perceptual Learning of Vocoded Auditory versus Lipread Speech: Different Effects of Prior Information. Brain Sci 2023; 13:1008. [PMID: 37508940] [PMCID: PMC10377548] [DOI: 10.3390/brainsci13071008]
Abstract
Traditionally, speech perception training paradigms have not adequately taken into account the possibility that there may be modality-specific requirements for perceptual learning with auditory-only (AO) versus visual-only (VO) speech stimuli. The study reported here investigated the hypothesis that there are modality-specific differences in how prior information is used by normal-hearing participants during vocoded versus VO speech training. Two experiments, one with vocoded AO speech (Experiment 1) and one with VO (lipread) speech (Experiment 2), investigated the effects of giving trainees different types of prior information on each trial during training. Training comprised four ~20-min sessions, during which participants learned to label novel visual images using novel spoken words. Participants were assigned to different types of prior information during training: Word Group trainees saw a printed version of each training word (e.g., "tethon"), and Consonant Group trainees saw only its consonants (e.g., "t_th_n"). Additional groups received no prior information (i.e., Experiment 1, AO Group; Experiment 2, VO Group) or a spoken version of the stimulus in a different modality from the training stimuli (Experiment 1, Lipread Group; Experiment 2, Vocoder Group). That is, in each experiment, there was a group that received prior information in the modality of the training stimuli of the other experiment. In both experiments, the Word Groups had difficulty retaining the novel words they attempted to learn during training. However, when the training stimuli were vocoded, the Word Group improved their phoneme identification. When the training stimuli were visual speech, the Consonant Group improved their phoneme identification and their open-set sentence lipreading. The results are considered in light of theoretical accounts of perceptual learning in relation to perceptual modality.
Affiliation(s)
- Lynne E Bernstein
- Speech, Language, and Hearing Sciences Department, George Washington University, Washington, DC 20052, USA
- Edward T Auer
- Speech, Language, and Hearing Sciences Department, George Washington University, Washington, DC 20052, USA
- Silvio P Eberhardt
- Speech, Language, and Hearing Sciences Department, George Washington University, Washington, DC 20052, USA
11. Saalasti S, Alho J, Lahnakoski JM, Bacha-Trams M, Glerean E, Jääskeläinen IP, Hasson U, Sams M. Lipreading a naturalistic narrative in a female population: Neural characteristics shared with listening and reading. Brain Behav 2023; 13:e2869. [PMID: 36579557] [PMCID: PMC9927859] [DOI: 10.1002/brb3.2869]
Abstract
INTRODUCTION: Few of us are skilled lipreaders, while most struggle with the task. The neural substrates that enable comprehension of connected natural speech via lipreading are not yet well understood.
METHODS: We used a data-driven approach to identify brain areas underlying the lipreading of an 8-min narrative with participants whose lipreading skills varied extensively (range 6-100%, mean = 50.7%). The participants also listened to and read the same narrative. The similarity between individual participants' brain activity during the whole narrative, within and between conditions, was estimated by a voxel-wise comparison of the blood oxygenation level dependent (BOLD) signal time courses.
RESULTS: Inter-subject correlation (ISC) of the time courses revealed that lipreading, listening to, and reading the narrative were largely supported by the same brain areas in the temporal, parietal and frontal cortices, precuneus, and cerebellum. Additionally, listening to and reading connected naturalistic speech activated higher-level linguistic processing in the parietal and frontal cortices more consistently than lipreading did, probably paralleling the limited understanding obtained via lipreading. Importantly, a higher lipreading test score and a higher subjective estimate of comprehension of the lipread narrative were associated with activity in the superior and middle temporal cortex.
CONCLUSIONS: Our new data illustrate that findings from prior studies using well-controlled repetitive speech stimuli and stimulus-driven data analyses are also valid for naturalistic connected speech. Our results may suggest an efficient use of brain areas dealing with phonological processing in skilled lipreaders.
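Inter-subject correlation as used here is the voxel-wise Pearson correlation of BOLD time courses between participants experiencing the same narrative, summarized over subject pairs. A minimal sketch with simulated data (shapes and noise levels are illustrative assumptions):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)
n_subj, n_time, n_vox = 10, 300, 50

# Each subject's BOLD = shared stimulus-driven component + idiosyncratic noise.
shared = rng.normal(size=(n_time, n_vox))
bold = shared + rng.normal(scale=2.0, size=(n_subj, n_time, n_vox))

def isc(data):
    """Mean pairwise Pearson correlation per voxel across subjects."""
    z = (data - data.mean(1, keepdims=True)) / data.std(1, keepdims=True)
    pair_r = [(z[i] * z[j]).mean(0)            # r = mean product of z-scores
              for i, j in combinations(range(len(z)), 2)]
    return np.mean(pair_r, axis=0)             # one ISC value per voxel

print("median voxel ISC:", round(float(np.median(isc(bold))), 3))
```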
Affiliation(s)
- Satu Saalasti
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Advanced Magnetic Imaging (AMI) Centre, Aalto NeuroImaging, School of Science, Aalto University, Espoo, Finland
- Jussi Alho
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Juha M Lahnakoski
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Independent Max Planck Research Group for Social Neuroscience, Max Planck Institute of Psychiatry, Munich, Germany
- Institute of Neuroscience and Medicine, Brain & Behaviour (INM-7), Research Center Jülich, Jülich, Germany
- Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Mareike Bacha-Trams
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Enrico Glerean
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Department of Psychology and the Neuroscience Institute, Princeton University, Princeton, USA
- Iiro P Jääskeläinen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Uri Hasson
- Department of Psychology and the Neuroscience Institute, Princeton University, Princeton, USA
- Mikko Sams
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Aalto Studios - MAGICS, Aalto University, Espoo, Finland
12. Evidence of visual crossmodal reorganization positively relates to speech outcomes in cochlear implant users. Sci Rep 2022; 12:17749. [PMID: 36273017] [PMCID: PMC9587996] [DOI: 10.1038/s41598-022-22117-z]
Abstract
Deaf individuals who use a cochlear implant (CI) have remarkably different outcomes for auditory speech communication ability. One factor assumed to affect CI outcomes is visual crossmodal plasticity in auditory cortex, whereby deprived auditory regions begin to support non-auditory functions such as vision. Some previous research has viewed crossmodal plasticity as harmful to speech outcomes for CI users if it interferes with sound processing, while other work has demonstrated that plasticity related to visual language may be beneficial for speech recovery. To clarify this, we used electroencephalography (EEG) to measure brain responses to a partial face speaking a silent single-syllable word (visual language) in 15 CI users and 13 age-matched typical-hearing controls. We used source analysis of the EEG activity to measure crossmodal visual responses in auditory cortex and then compared them to CI users' speech-in-noise listening ability. CI users' brain response to the onset of the video stimulus (face) was larger than that of controls in left auditory cortex, consistent with crossmodal activation after deafness. CI users also produced a mixture of alpha (8-12 Hz) synchronization and desynchronization in auditory cortex while watching lip movements, whereas controls showed desynchronization. CI users with higher speech scores had stronger crossmodal responses in auditory cortex to the onset of the video, but those with lower speech scores showed increases in alpha power in auditory areas during lip movement. Therefore, evidence of crossmodal reorganization in CI users does not necessarily predict poor speech outcomes, and differences in crossmodal activation during lipreading may instead relate to the strategies CI users adopt in audiovisual speech communication.
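Alpha synchronization versus desynchronization is conventionally quantified as the percent change in 8-12 Hz power during the period of interest relative to a pre-stimulus baseline. The bare-bones version below operates on one simulated source-level signal; the sampling rate, filter order, and window boundaries are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 250                                     # sampling rate in Hz (assumed)
rng = np.random.default_rng(5)
t = np.arange(-1.0, 3.0, 1 / fs)             # epoch around video onset
signal = rng.normal(size=t.size) + 0.5 * np.sin(2 * np.pi * 10 * t)

# Band-pass 8-12 Hz and take the analytic amplitude envelope.
b, a = butter(4, [8, 12], btype="bandpass", fs=fs)
alpha_env = np.abs(hilbert(filtfilt(b, a, signal)))
power = alpha_env ** 2

base = power[(t >= -0.8) & (t <= -0.2)].mean()   # pre-stimulus baseline
task = power[(t >= 0.5) & (t <= 2.5)].mean()     # lip-movement window (assumed)
erd_ers = 100 * (task - base) / base
print(f"alpha power change: {erd_ers:+.1f}% "
      "(negative = desynchronization, positive = synchronization)")
```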
13. Suess N, Hauswald A, Reisinger P, Rösch S, Keitel A, Weisz N. Cortical tracking of formant modulations derived from silently presented lip movements and its decline with age. Cereb Cortex 2022; 32:4818-4833. [PMID: 35062025] [PMCID: PMC9627034] [DOI: 10.1093/cercor/bhab518]
Abstract
The integration of visual and auditory cues is crucial for successful processing of speech, especially under adverse conditions. Recent reports have shown that when participants watch muted videos of speakers, the phonological information about the acoustic speech envelope, which is associated with but independent from the speakers' lip movements, is tracked by the visual cortex. However, the speech signal also carries richer acoustic details, for example, about the fundamental frequency and the resonant frequencies, whose visuo-phonological transformation could aid speech processing. Here, we investigated the neural basis of the visuo-phonological transformation of these more fine-grained acoustic details and assessed how it changes as a function of age. We recorded whole-head magnetoencephalographic (MEG) data while participants watched silent normal (i.e., natural) and reversed videos of a speaker and paid attention to their lip movements. We found that the visual cortex is able to track the unheard natural modulations of resonant frequencies (or formants) and the pitch (or fundamental frequency) linked to lip movements. Importantly, only the processing of natural unheard formants decreases significantly with age, in the visual and also in the cingulate cortex. This is not the case for the processing of the unheard speech envelope, the fundamental frequency, or the purely visual information carried by lip movements. These results show that unheard spectral fine details (along with the unheard acoustic envelope) are transformed from a mere visual to a phonological representation. Aging especially affects the ability to derive spectral dynamics at formant frequencies. As listening in noisy environments should capitalize on the ability to track spectral fine details, our results provide a novel focus on compensatory processes in such challenging situations.
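Cortical tracking of the unheard formant time course can be operationalized as spectral coherence between the MEG signal and the formant modulation extracted from the muted audio. A stripped-down estimate is sketched below; the signals are simulated, and the frequency band and coherence parameters are assumptions rather than the study's exact settings.

```python
import numpy as np
from scipy.signal import coherence

fs = 100                                    # down-sampled rate in Hz (assumed)
rng = np.random.default_rng(6)
n = 60 * fs                                 # one minute of data

formant = rng.normal(size=n)                # formant modulation time course
meg = np.roll(formant, 10) + 3 * rng.normal(size=n)   # delayed, noisy copy

f, coh = coherence(formant, meg, fs=fs, nperseg=512)
band = (f >= 1) & (f <= 7)                  # delta/theta range of speech rhythms
print(f"mean 1-7 Hz coherence: {coh[band].mean():.3f}")
# Comparing natural vs. reversed videos (and younger vs. older groups) on such
# a tracking measure is the logic behind the age-related decline reported above.
```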
Affiliation(s)
- Nina Suess
- Department of Psychology, Centre for Cognitive Neuroscience, University of Salzburg, Salzburg 5020, Austria
- Anne Hauswald
- Department of Psychology, Centre for Cognitive Neuroscience, University of Salzburg, Salzburg 5020, Austria
- Patrick Reisinger
- Department of Psychology, Centre for Cognitive Neuroscience, University of Salzburg, Salzburg 5020, Austria
- Sebastian Rösch
- Department of Otorhinolaryngology, Head and Neck Surgery, Paracelsus Medical University Salzburg, University Hospital Salzburg, Salzburg 5020, Austria
- Anne Keitel
- School of Social Sciences, University of Dundee, Dundee DD1 4HN, UK
- Nathan Weisz
- Department of Psychology, Centre for Cognitive Neuroscience, University of Salzburg, Salzburg 5020, Austria
- Department of Psychology, Neuroscience Institute, Christian Doppler University Hospital, Paracelsus Medical University, Salzburg 5020, Austria
14. Ross LA, Molholm S, Butler JS, Del Bene VA, Foxe JJ. Neural correlates of multisensory enhancement in audiovisual narrative speech perception: a fMRI investigation. Neuroimage 2022; 263:119598. [PMID: 36049699] [DOI: 10.1016/j.neuroimage.2022.119598]
Abstract
This fMRI study investigated the effect of seeing the articulatory movements of a speaker while listening to a naturalistic narrative stimulus. Its goal was to identify regions of the language network showing multisensory enhancement under synchronous audiovisual conditions. We expected this enhancement to emerge in regions known to underlie the integration of auditory and visual information, such as the posterior superior temporal gyrus, as well as in parts of the broader language network, including the semantic system. To this end, we presented 53 participants with a continuous narration of a story in auditory-alone, visual-alone, and synchronous and asynchronous audiovisual speech conditions while recording brain activity using BOLD fMRI. We found multisensory enhancement in an extensive network of regions underlying multisensory integration and in parts of the semantic network, as well as in extralinguistic regions not usually associated with multisensory integration, namely the primary visual cortex and the bilateral amygdala. The analysis also revealed involvement of thalamic regions along the visual and auditory pathways more commonly associated with early sensory processing. We conclude that under natural listening conditions, multisensory enhancement not only involves sites of multisensory integration but also many regions of the wider semantic network, and includes regions associated with extralinguistic sensory, perceptual, and cognitive processing.
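Multisensory enhancement in designs like this means a larger response to synchronous audiovisual speech than the unisensory conditions predict; one common voxel-wise test is AV > max(A, V). The sketch below applies that criterion to simulated beta estimates (the criterion choice, threshold, and data are assumptions; criteria differ across studies).

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(7)
n_subj, n_vox = 53, 200                     # 53 participants, as in the study

beta_a  = rng.normal(1.0, 0.5, (n_subj, n_vox))   # auditory-only betas
beta_v  = rng.normal(0.5, 0.5, (n_subj, n_vox))   # visual-only betas
beta_av = rng.normal(1.4, 0.5, (n_subj, n_vox))   # synchronous audiovisual

# Max criterion: AV exceeds the stronger of the two unisensory responses.
max_uni = np.maximum(beta_a, beta_v)
tvals, pvals = ttest_rel(beta_av, max_uni, axis=0)
enhanced = (tvals > 0) & (pvals < 0.001)          # uncorrected, for illustration
print(f"{enhanced.sum()} of {n_vox} simulated voxels show enhancement")
```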
Affiliation(s)
- Lars A Ross
- The Frederick J. and Marion A. Schindler Cognitive Neurophysiology Laboratory, The Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; Department of Imaging Sciences, University of Rochester Medical Center, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA.
- Sophie Molholm
- The Frederick J. and Marion A. Schindler Cognitive Neurophysiology Laboratory, The Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA
- John S Butler
- The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA; School of Mathematical Sciences, Technological University Dublin, Kevin Street Campus, Dublin, Ireland
- Victor A Del Bene
- The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA; University of Alabama at Birmingham, Heersink School of Medicine, Department of Neurology, Birmingham, Alabama, 35233, USA
- John J Foxe
- The Frederick J. and Marion A. Schindler Cognitive Neurophysiology Laboratory, The Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA.
15. Fullerton AM, Vickers DA, Luke R, Billing AN, McAlpine D, Hernandez-Perez H, Peelle JE, Monaghan JJM, McMahon CM. Cross-modal functional connectivity supports speech understanding in cochlear implant users. Cereb Cortex 2022; 33:3350-3371. [PMID: 35989307] [PMCID: PMC10068270] [DOI: 10.1093/cercor/bhac277]
Abstract
Sensory deprivation can lead to cross-modal cortical changes, whereby sensory brain regions deprived of input may be recruited to perform atypical function. Enhanced cross-modal responses to visual stimuli observed in auditory cortex of postlingually deaf cochlear implant (CI) users are hypothesized to reflect increased activation of cortical language regions, but it is unclear if this cross-modal activity is "adaptive" or "mal-adaptive" for speech understanding. To determine if increased activation of language regions is correlated with better speech understanding in CI users, we assessed task-related activation and functional connectivity of auditory and visual cortices to auditory and visual speech and non-speech stimuli in CI users (n = 14) and normal-hearing listeners (n = 17) and used functional near-infrared spectroscopy to measure hemodynamic responses. We used visually presented speech and non-speech to investigate neural processes related to linguistic content and observed that CI users show beneficial cross-modal effects. Specifically, an increase in connectivity between the left auditory and visual cortices-presumed primary sites of cortical language processing-was positively correlated with CI users' abilities to understand speech in background noise. Cross-modal activity in auditory cortex of postlingually deaf CI users may reflect adaptive activity of a distributed, multimodal speech network, recruited to enhance speech understanding.
Affiliation(s)
- Amanda M Fullerton
- Department of Linguistics and Macquarie University Hearing, Australian Hearing Hub, Macquarie University, Sydney 2109, Australia
- Deborah A Vickers
- Cambridge Hearing Group, Sound Lab, Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, United Kingdom
- Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PF, United Kingdom
- Robert Luke
- Department of Linguistics and Macquarie University Hearing, Australian Hearing Hub, Macquarie University, Sydney 2109, Australia
- Addison N Billing
- Institute of Cognitive Neuroscience, University College London, London WC1N 3AZ, United Kingdom
- DOT-HUB, Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, United Kingdom
- David McAlpine
- Department of Linguistics and Macquarie University Hearing, Australian Hearing Hub, Macquarie University, Sydney 2109, Australia
- Heivet Hernandez-Perez
- Department of Linguistics and Macquarie University Hearing, Australian Hearing Hub, Macquarie University, Sydney 2109, Australia
- Jonathan E Peelle
- Department of Otolaryngology, Washington University in St. Louis, St. Louis, MO 63110, United States
- Jessica J M Monaghan
- National Acoustic Laboratories, Australian Hearing Hub, Sydney 2109, Australia
- Department of Linguistics and Macquarie University Hearing, Australian Hearing Hub, Macquarie University, Sydney 2109, Australia
- Catherine M McMahon
- Department of Linguistics and Macquarie University Hearing, Australian Hearing Hub, Macquarie University, Sydney 2109, Australia
- HEAR Centre, Macquarie University, Sydney 2109, Australia
16. Zhang WJ, Li DN, Lian TH, Guo P, Zhang YN, Li JH, Guan HY, He MY, Zhang WJ, Zhang WJ, Luo DM, Wang XM, Zhang W. Clinical Features and Potential Mechanisms Relating Neuropathological Biomarkers and Blood-Brain Barrier in Patients With Alzheimer's Disease and Hearing Loss. Front Aging Neurosci 2022; 14:911028. [PMID: 35783139] [PMCID: PMC9245454] [DOI: 10.3389/fnagi.2022.911028]
Abstract
Background: The aim of this study was to explore clinical features and potential mechanisms relating neuropathological biomarkers and the blood-brain barrier (BBB) in Alzheimer's disease (AD) with hearing loss (HL).
Materials and Methods: A total of 65 patients with AD were recruited, and auditory function was assessed by pure tone audiometry (PTA) threshold. Patients were divided into AD with HL (AD-HL) and AD without HL (AD-nHL) groups based on the World Health Organization standard. Clinical symptoms were assessed with multiple rating scales. Levels of the neuropathological biomarkers β-amyloid 1-42 (Aβ1-42) and multiple phosphorylated tau (P-tau) species, and of the BBB factors matrix metalloproteinases (MMPs), receptor for advanced glycation end products, glial fibrillary acidic protein, and low-density lipoprotein receptor-related protein 1, were measured.
Results: (1) Compared with the AD-nHL group, the AD-HL group had significantly impaired overall cognitive function, impaired cognitive domains of memory, language, attention, and execution, and reduced activities of daily living (ADL), as reflected by rating scale scores (P < 0.05). PTA threshold was significantly correlated with impairments of overall cognitive function, the memory and language domains, and ADL in patients with AD (P < 0.05). (2) P-tau (S199) level was significantly increased in CSF from the AD-HL group (P < 0.05) and was significantly and positively correlated with PTA threshold in patients with AD. (3) MMP-3 level was significantly elevated in CSF from the AD-HL group (P < 0.05) and was significantly and positively correlated with PTA threshold in patients with AD (P < 0.05). (4) In the AD-HL group, P-tau (S199) level was significantly and positively correlated with the levels of MMP-2 and MMP-3 in CSF (P < 0.05).
Conclusion: Patients with AD-HL have severely compromised overall cognitive function, multiple cognitive domains, and ADL. The potential mechanisms of AD-HL involve elevation of the AD neuropathological biomarker P-tau (S199) and the BBB factor MMP-3, and close correlations between P-tau (S199) and MMP-2/MMP-3 in CSF. These findings strongly suggest the significance of early evaluation of HL for delaying AD progression and indicate new directions for drug development that inhibit neuropathological biomarkers of AD and protect the BBB.
Affiliation(s)
- Wei-jiao Zhang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Dan-ning Li
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Teng-hong Lian
- Center for Cognitive Neurology, Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Peng Guo
- Center for Cognitive Neurology, Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Ya-nan Zhang
- Department of Blood Transfusion, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Jing-hui Li
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Hui-ying Guan
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Ming-yue He
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Wen-jing Zhang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Wei-jia Zhang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Dong-mei Luo
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Xiao-min Wang
- Department of Physiology, Capital Medical University, Beijing, China
- Wei Zhang
- Center for Cognitive Neurology, Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Center of Parkinson’s Disease, Beijing Institute for Brain Disorders, Beijing, China
- Beijing Key Laboratory on Parkinson’s Disease, Beijing, China
- *Correspondence: Wei Zhang
17. Johansson C, Folgerø PO. Is Reduced Visual Processing the Price of Language? Brain Sci 2022; 12:771. [PMID: 35741656] [PMCID: PMC9221435] [DOI: 10.3390/brainsci12060771]
Abstract
We suggest a later timeline for full language capabilities in Homo sapiens, placing the emergence of language over 200,000 years after the emergence of our species. The late Paleolithic period saw several significant changes. Homo sapiens became more gracile and gradually lost significant brain volume. Detailed realistic cave paintings disappeared completely, and iconic/symbolic ones appeared at other sites. This may indicate a shift in perceptual abilities, away from an accurate perception of the present. Language in modern humans interacts with vision. One example is the McGurk effect. Studies show that artistic abilities may improve when language-related brain areas are damaged or temporarily knocked out. Language relies on many pre-existing non-linguistic functions. We suggest that an overwhelming flow of perceptual information, vision in particular, was an obstacle to language, as is sometimes implied in autism with relative language impairment. We systematically review the recent research literature investigating the relationship between language and perception. We see homologues of language-relevant brain functions predating language. Recent findings show brain lateralization for communicative gestures in other primates without language, supporting the idea that a language-ready brain may be overwhelmed by raw perception, thus blocking overt language from evolving. We find support in converging evidence for a change in neural organization away from raw perception, pushing the emergence of language closer to the present. A recent origin of language makes it possible to investigate the genetic origins of language.
18. Bernstein LE, Jordan N, Auer ET, Eberhardt SP. Lipreading: A Review of Its Continuing Importance for Speech Recognition With an Acquired Hearing Loss and Possibilities for Effective Training. Am J Audiol 2022; 31:453-469. [PMID: 35316072] [DOI: 10.1044/2021_aja-21-00112]
Abstract
PURPOSE: The goal of this review article is to reinvigorate interest in lipreading and lipreading training for adults with acquired hearing loss. Most adults benefit from being able to see the talker when speech is degraded; however, the effect size is related to their lipreading ability, which is typically poor in adults who have experienced normal hearing through most of their lives. Lipreading training has been viewed as a possible avenue for rehabilitation of adults with an acquired hearing loss, but most training approaches have not been particularly successful. Here, we describe lipreading and theoretically motivated approaches to its training, as well as examples of successful training paradigms. We discuss some extensions to auditory-only (AO) and audiovisual (AV) speech recognition.
METHOD: Visual speech perception and word recognition are described. Traditional and contemporary views of training and perceptual learning are outlined. We focus on the roles of external and internal feedback and the training task in perceptual learning, and we describe results of lipreading training experiments.
RESULTS: Lipreading is commonly characterized as limited to viseme perception. However, evidence demonstrates subvisemic perception of visual phonetic information. Lipreading words also relies on lexical constraints, not unlike auditory spoken word recognition. Lipreading has been shown to be difficult to improve through training, but under specific feedback and task conditions, training can be successful, and learning can generalize to untrained materials, including AV sentence stimuli in noise. The results on lipreading have implications for AO and AV training and for the use of acoustically processed speech in face-to-face communication.
CONCLUSIONS: Given its importance for speech recognition with a hearing loss, we suggest that the research and clinical communities integrate lipreading in their efforts to improve speech recognition in adults with acquired hearing loss.
Collapse
Affiliation(s)
- Lynne E. Bernstein
- Department of Speech, Language & Hearing Sciences, George Washington University, Washington, DC
| | - Nicole Jordan
- Department of Speech, Language & Hearing Sciences, George Washington University, Washington, DC
| | - Edward T. Auer
- Department of Speech, Language & Hearing Sciences, George Washington University, Washington, DC
| | - Silvio P. Eberhardt
- Department of Speech, Language & Hearing Sciences, George Washington University, Washington, DC
| |
Collapse
|
19
|
Zhang L, Du Y. Lip movements enhance speech representations and effective connectivity in auditory dorsal stream. Neuroimage 2022; 257:119311. [PMID: 35589000 DOI: 10.1016/j.neuroimage.2022.119311] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2022] [Revised: 05/09/2022] [Accepted: 05/11/2022] [Indexed: 11/25/2022] Open
Abstract
Viewing a speaker's lip movements facilitates speech perception, especially under adverse listening conditions, but the neural mechanisms of this perceptual benefit at the phonemic and feature levels remain unclear. This fMRI study addressed this question by quantifying regional multivariate representation and network organization underlying audiovisual speech-in-noise perception. Behaviorally, valid lip movements improved recognition of place of articulation to aid phoneme identification. Meanwhile, lip movements enhanced neural representations of phonemes in left auditory dorsal stream regions, including frontal speech motor areas and supramarginal gyrus (SMG). Moreover, neural representations of place of articulation and voicing features were promoted differentially by lip movements in these regions, with voicing enhanced in Broca's area and place of articulation better encoded in left ventral premotor cortex and SMG. Next, dynamic causal modeling (DCM) analysis showed that such local changes were accompanied by strengthened effective connectivity along the dorsal stream. Moreover, the neurite orientation dispersion of the left arcuate fasciculus, the structural backbone of the auditory dorsal stream, predicted the visual enhancements of neural representations and effective connectivity. Our findings provide the novel insight for speech science that lip movements promote both local phonemic and feature encoding and network connectivity in the dorsal pathway, and that this functional enhancement is mediated by the microstructural architecture of the circuit.
Collapse
Affiliation(s)
- Lei Zhang
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China 100101; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China 100049
| | - Yi Du
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China 100101; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China 100049; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, China 200031; Chinese Institute for Brain Research, Beijing, China 102206.
| |
Collapse
|
20
|
Jessica Tan SH, Kalashnikova M, Di Liberto GM, Crosse MJ, Burnham D. Seeing a Talking Face Matters: The Relationship between Cortical Tracking of Continuous Auditory-Visual Speech and Gaze Behaviour in Infants, Children and Adults. Neuroimage 2022; 256:119217. [PMID: 35436614 DOI: 10.1016/j.neuroimage.2022.119217] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2021] [Revised: 04/09/2022] [Accepted: 04/14/2022] [Indexed: 11/24/2022] Open
Abstract
An auditory-visual speech benefit, the benefit that visual speech cues bring to auditory speech perception, is experienced from early on in infancy and continues to be experienced to an increasing degree with age. While there is both behavioural and neurophysiological evidence for children and adults, only behavioural evidence exists for infants, as no neurophysiological study has provided a comprehensive examination of the auditory-visual speech benefit in infants. It is also surprising that most studies on the auditory-visual speech benefit do not concurrently report looking behaviour, especially since the auditory-visual speech benefit rests on the assumption that listeners attend to a speaker's talking face and that there are meaningful individual differences in looking behaviour. To address these gaps, we simultaneously recorded electroencephalographic (EEG) and eye-tracking data of 5-month-olds, 4-year-olds and adults as they were presented with a speaker in auditory-only (AO), visual-only (VO), and auditory-visual (AV) modes. Cortical tracking analyses that involved forward encoding models of the speech envelope revealed that there was an auditory-visual speech benefit [i.e., AV > (A+V)], evident in 5-month-olds and adults but not 4-year-olds. Examination of cortical tracking accuracy in relation to looking behaviour showed that infants' relative attention to the speaker's mouth (vs. eyes) was positively correlated with cortical tracking accuracy of VO speech, whereas adults' attention to the display overall was negatively correlated with cortical tracking accuracy of VO speech. This study provides the first neurophysiological evidence of the auditory-visual speech benefit in infants, and our results suggest ways in which current models of speech processing can be fine-tuned.
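The cortical tracking measure above rests on forward encoding (temporal response function) models. As a rough illustration only — synthetic data, arbitrary parameters, not the authors' pipeline — a ridge-regression TRF and its tracking-accuracy score can be sketched in Python; the AV benefit test then compares such accuracies across AV and summed A+V responses.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
fs, n = 64, 4096                          # sampling rate (Hz) and samples (toy values)
env = rng.standard_normal(n)              # stand-in for a speech envelope
kernel = rng.standard_normal(16)          # unknown "neural" response to recover
eeg = np.convolve(env, kernel, mode="same") + rng.standard_normal(n)

def lag_matrix(x, n_lags):
    # stack time-lagged copies of the stimulus (0..n_lags-1 samples) as predictors
    return np.column_stack([np.roll(x, k) for k in range(n_lags)])

X = lag_matrix(env, int(0.25 * fs))       # lags spanning ~250 ms
half = n // 2                             # train on the first half, test on the second
trf = Ridge(alpha=1.0).fit(X[:half], eeg[:half])
r = np.corrcoef(trf.predict(X[half:]), eeg[half:])[0, 1]
print(f"envelope-tracking accuracy r = {r:.2f}")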
Collapse
Affiliation(s)
- S H Jessica Tan
- The MARCS Institute of Brain, Behaviour and Development, Western Sydney University.
| | - Marina Kalashnikova
- The Basque Center on Cognition, Brain and Language; IKERBASQUE, Basque Foundation for Science
| | | | - Michael J Crosse
- Trinity Center for Biomedical Engineering, Department of Mechanical, Manufacturing & Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
| | - Denis Burnham
- The MARCS Institute of Brain, Behaviour and Development, Western Sydney University
| |
Collapse
|
21
|
Michon M, Zamorano-Abramson J, Aboitiz F. Faces and Voices Processing in Human and Primate Brains: Rhythmic and Multimodal Mechanisms Underlying the Evolution and Development of Speech. Front Psychol 2022; 13:829083. [PMID: 35432052 PMCID: PMC9007199 DOI: 10.3389/fpsyg.2022.829083] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2021] [Accepted: 03/07/2022] [Indexed: 11/24/2022] Open
Abstract
While influential works since the 1970s have widely assumed that imitation is an innate skill in both human and non-human primate neonates, recent empirical studies and meta-analyses have challenged this view, indicating other forms of reward-based learning as relevant factors in the development of social behavior. The visual input translation into matching motor output that underlies imitation abilities instead seems to develop along with social interactions and sensorimotor experience during infancy and childhood. Recently, a new visual stream has been identified in both human and non-human primate brains, updating the dual visual stream model. This third pathway is thought to be specialized for dynamic aspects of social perception, such as eye gaze and facial expression, and crucially for audio-visual integration of speech. Here, we review empirical studies addressing an understudied but crucial aspect of speech and communication, namely the processing of visual orofacial cues (i.e., the perception of a speaker's lips and tongue movements) and its integration with vocal auditory cues. Throughout this review, we offer new insights from our understanding of speech as the product of evolution and development of a rhythmic and multimodal organization of sensorimotor brain networks, supporting volitional motor control of the upper vocal tract and audio-visual voices-faces integration.
Collapse
Affiliation(s)
- Maëva Michon
- Laboratory for Cognitive and Evolutionary Neuroscience, Department of Psychiatry, Faculty of Medicine, Interdisciplinary Center for Neuroscience, Pontificia Universidad Católica de Chile, Santiago, Chile
- Centro de Estudios en Neurociencia Humana y Neuropsicología, Facultad de Psicología, Universidad Diego Portales, Santiago, Chile
| | - José Zamorano-Abramson
- Centro de Investigación en Complejidad Social, Facultad de Gobierno, Universidad del Desarrollo, Santiago, Chile
| | - Francisco Aboitiz
- Laboratory for Cognitive and Evolutionary Neuroscience, Department of Psychiatry, Faculty of Medicine, Interdisciplinary Center for Neuroscience, Pontificia Universidad Católica de Chile, Santiago, Chile
| |
Collapse
|
22
|
Bernstein LE, Auer ET, Eberhardt SP. During Lipreading Training With Sentence Stimuli, Feedback Controls Learning and Generalization to Audiovisual Speech in Noise. Am J Audiol 2022; 31:57-77. [PMID: 34965362 DOI: 10.1044/2021_aja-21-00034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022] Open
Abstract
PURPOSE This study investigated the effects of external feedback on perceptual learning of visual speech during lipreading training with sentence stimuli. The goal was to improve visual-only (VO) speech recognition and increase accuracy of audiovisual (AV) speech recognition in noise. The rationale was that spoken word recognition depends on the accuracy of sublexical (phonemic/phonetic) speech perception; effective feedback during training must support sublexical perceptual learning. METHOD Normal-hearing (NH) adults were assigned to one of three types of feedback: Sentence feedback was the entire sentence printed after responding to the stimulus. Word feedback was the correct response words and perceptually near but incorrect response words. Consonant feedback was correct response words and consonants in incorrect but perceptually near response words. Six training sessions were given. Pre- and posttraining testing included an untrained control group. Test stimuli were disyllable nonsense words for forced-choice consonant identification, and isolated words and sentences for open-set identification. Words and sentences were VO, AV, and audio-only (AO) with the audio in speech-shaped noise. RESULTS Lipreading accuracy increased during training. Pre- and posttraining tests of consonant identification showed no improvement beyond test-retest increases obtained by untrained controls. Isolated word recognition with a talker not seen during training showed that the control group improved more than the sentence group. Tests of untrained sentences showed that the consonant group significantly improved in all of the stimulus conditions (VO, AO, and AV). Its mean words-correct scores increased by 9.2 percentage points for VO, 3.4 percentage points for AO, and 9.8 percentage points for AV stimuli. CONCLUSIONS Consonant feedback during training with sentence stimuli significantly increased perceptual learning. The training generalized to untrained VO, AO, and AV sentence stimuli. Lipreading training has potential to significantly improve adults' face-to-face communication in noisy settings in which the talker can be seen.
Collapse
Affiliation(s)
- Lynne E. Bernstein
- Department of Speech, Language, and Hearing Sciences, George Washington University, DC
| | - Edward T. Auer
- Department of Speech, Language, and Hearing Sciences, George Washington University, DC
| | - Silvio P. Eberhardt
- Department of Speech, Language, and Hearing Sciences, George Washington University, DC
| |
Collapse
|
23
|
Peelle JE, Spehar B, Jones MS, McConkey S, Myerson J, Hale S, Sommers MS, Tye-Murray N. Increased Connectivity among Sensory and Motor Regions during Visual and Audiovisual Speech Perception. J Neurosci 2022; 42:435-442. [PMID: 34815317 PMCID: PMC8802926 DOI: 10.1523/jneurosci.0114-21.2021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2021] [Revised: 10/29/2021] [Accepted: 11/08/2021] [Indexed: 11/21/2022] Open
Abstract
In everyday conversation, we usually process the talker's face as well as the sound of the talker's voice. Access to visual speech information is particularly useful when the auditory signal is degraded. Here, we used fMRI to monitor brain activity while adult humans (n = 60) were presented with visual-only, auditory-only, and audiovisual words. The audiovisual words were presented in quiet and in several signal-to-noise ratios. As expected, audiovisual speech perception recruited both auditory and visual cortex, with some evidence for increased recruitment of premotor cortex in some conditions (including in substantial background noise). We then investigated neural connectivity using psychophysiological interaction analysis with seed regions in both primary auditory cortex and primary visual cortex. Connectivity between auditory and visual cortices was stronger in audiovisual conditions than in unimodal conditions, including a wide network of regions in posterior temporal cortex and prefrontal cortex. In addition to whole-brain analyses, we also conducted a region-of-interest analysis on the left posterior superior temporal sulcus (pSTS), implicated in many previous studies of audiovisual speech perception. We found evidence for both activity and effective connectivity in pSTS for visual-only and audiovisual speech, although these were not significant in whole-brain analyses. Together, our results suggest a prominent role for cross-region synchronization in understanding both visual-only and audiovisual speech that complements activity in integrative brain regions like pSTS.SIGNIFICANCE STATEMENT In everyday conversation, we usually process the talker's face as well as the sound of the talker's voice. Access to visual speech information is particularly useful when the auditory signal is hard to understand (e.g., background noise). Prior work has suggested that specialized regions of the brain may play a critical role in integrating information from visual and auditory speech. Here, we show a complementary mechanism relying on synchronized brain activity among sensory and motor regions may also play a critical role. These findings encourage reconceptualizing audiovisual integration in the context of coordinated network activity.
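The connectivity result above comes from psychophysiological interaction (PPI) analysis, which amounts to adding a seed-by-task interaction regressor to a voxelwise GLM. A minimal sketch under simplifying assumptions follows (synthetic time series; real pipelines deconvolve the seed signal to the neural level and convolve with a hemodynamic response, both omitted here for brevity).

import numpy as np

rng = np.random.default_rng(1)
n = 200                                            # volumes
seed = rng.standard_normal(n)                      # seed-region (e.g., A1) time series
task = (np.arange(n) % 40 < 20).astype(float)      # boxcar: AV blocks on/off
ppi = (seed - seed.mean()) * (task - task.mean())  # mean-centered interaction term
target = 0.5 * ppi + rng.standard_normal(n)        # synthetic target voxel

# GLM with seed, task, interaction, and intercept; the PPI beta indexes
# condition-dependent coupling between seed and target
design = np.column_stack([seed, task, ppi, np.ones(n)])
beta, *_ = np.linalg.lstsq(design, target, rcond=None)
print(f"PPI beta (condition-dependent coupling) = {beta[2]:.2f}")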
Collapse
Affiliation(s)
- Jonathan E Peelle
- Department of Otolaryngology, Washington University in St. Louis, St. Louis, Missouri 63110
| | - Brent Spehar
- Department of Otolaryngology, Washington University in St. Louis, St. Louis, Missouri 63110
| | - Michael S Jones
- Department of Otolaryngology, Washington University in St. Louis, St. Louis, Missouri 63110
| | - Sarah McConkey
- Department of Otolaryngology, Washington University in St. Louis, St. Louis, Missouri 63110
| | - Joel Myerson
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130
| | - Sandra Hale
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130
| | - Mitchell S Sommers
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130
| | - Nancy Tye-Murray
- Department of Otolaryngology, Washington University in St. Louis, St. Louis, Missouri 63110
| |
Collapse
|
24
|
Benetti S, Collignon O. Cross-modal integration and plasticity in the superior temporal cortex. HANDBOOK OF CLINICAL NEUROLOGY 2022; 187:127-143. [PMID: 35964967 DOI: 10.1016/b978-0-12-823493-8.00026-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
In congenitally deaf people, temporal regions typically believed to be primarily auditory enhance their response to nonauditory information. The neural mechanisms and functional principles underlying this phenomenon, as well as its impact on auditory recovery after sensory restoration, remain debated. In this chapter, we demonstrate that the cross-modal recruitment of temporal regions by visual inputs in congenitally deaf people follows organizational principles known to be present in the hearing brain. We propose that the functional and structural mechanisms allowing optimal convergence of multisensory information in the temporal cortex of hearing people also provide the neural scaffolding for feeding visual or tactile information into the deafened temporal areas. Innate in nature, such anatomo-functional links between the auditory and other sensory systems would represent the common substrate of both early multisensory integration and the expression of selective cross-modal plasticity in the superior temporal cortex.
Collapse
Affiliation(s)
- Stefania Benetti
- Center for Mind/Brain Sciences - CIMeC, University of Trento, Trento, Italy
| | - Olivier Collignon
- Center for Mind/Brain Sciences - CIMeC, University of Trento, Trento, Italy; Institute for Research in Psychology and Neuroscience, Faculty of Psychology and Educational Science, UC Louvain, Louvain-la-Neuve, Belgium.
| |
Collapse
|
25
|
Rennig J, Beauchamp MS. Intelligibility of audiovisual sentences drives multivoxel response patterns in human superior temporal cortex. Neuroimage 2021; 247:118796. [PMID: 34906712 PMCID: PMC8819942 DOI: 10.1016/j.neuroimage.2021.118796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2021] [Revised: 11/18/2021] [Accepted: 12/08/2021] [Indexed: 11/18/2022] Open
Abstract
Regions of the human posterior superior temporal gyrus and sulcus (pSTG/S) respond to the visual mouth movements that constitute visual speech and the auditory vocalizations that constitute auditory speech, and neural responses in pSTG/S may underlie the perceptual benefit of visual speech for the comprehension of noisy auditory speech. We examined this possibility through the lens of multivoxel pattern responses in pSTG/S. BOLD fMRI data was collected from 22 participants presented with speech consisting of English sentences presented in five different formats: visual-only; auditory with and without added auditory noise; and audiovisual with and without auditory noise. Participants reported the intelligibility of each sentence with a button press and trials were sorted post-hoc into those that were more or less intelligible. Response patterns were measured in regions of the pSTG/S identified with an independent localizer. Noisy audiovisual sentences with very similar physical properties evoked very different response patterns depending on their intelligibility. When a noisy audiovisual sentence was reported as intelligible, the pattern was nearly identical to that elicited by clear audiovisual sentences. In contrast, an unintelligible noisy audiovisual sentence evoked a pattern like that of visual-only sentences. This effect was less pronounced for noisy auditory-only sentences, which evoked similar response patterns regardless of intelligibility. The successful integration of visual and auditory speech produces a characteristic neural signature in pSTG/S, highlighting the importance of this region in generating the perceptual benefit of visual speech.
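The pattern comparison at the heart of this result can be schematized as correlating a trial's multivoxel response with condition templates. A toy sketch with made-up patterns (illustration of the logic only, not the study's analysis):

import numpy as np

rng = np.random.default_rng(2)
clear_av = rng.standard_normal(300)       # template pattern: clear AV sentences
visual_only = rng.standard_normal(300)    # template pattern: visual-only sentences
# a noisy-AV trial reported as intelligible should resemble the clear-AV template
trial = clear_av + 0.5 * rng.standard_normal(300)

for name, template in [("clear AV", clear_av), ("visual-only", visual_only)]:
    r = np.corrcoef(trial, template)[0, 1]
    print(f"pattern correlation with {name} template: r = {r:.2f}")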
Collapse
Affiliation(s)
- Johannes Rennig
- Division of Neuropsychology, Center of Neurology, Hertie-Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany
| | - Michael S Beauchamp
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Richards Medical Research Building, A607, 3700 Hamilton Walk, Philadelphia, PA 19104-6016, United States.
| |
Collapse
|
26
|
Kaganovich N, Christ S. Event-related potentials evidence for long-term audiovisual representations of phonemes in adults. Eur J Neurosci 2021; 54:7860-7875. [PMID: 34750895 DOI: 10.1111/ejn.15519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2021] [Revised: 11/03/2021] [Accepted: 11/05/2021] [Indexed: 10/19/2022]
Abstract
The presence of long-term auditory representations for phonemes has been well-established. However, since speech perception is typically audiovisual, we hypothesized that long-term phoneme representations may also contain information on speakers' mouth shape during articulation. We used an audiovisual oddball paradigm in which, on each trial, participants saw a face and heard one of two vowels. One vowel occurred frequently (standard), while another occurred rarely (deviant). In one condition (neutral), the face had a closed, non-articulating mouth. In the other condition (audiovisual violation), the mouth shape matched the frequent vowel. Although in both conditions stimuli were audiovisual, we hypothesized that identical auditory changes would be perceived differently by participants. Namely, in the neutral condition, deviants violated only the audiovisual pattern specific to each block. By contrast, in the audiovisual violation condition, deviants additionally violated long-term representations for how a speaker's mouth looks during articulation. We compared the amplitude of mismatch negativity (MMN) and P3 components elicited by deviants in the two conditions. The MMN extended posteriorly over temporal and occipital sites even though deviants contained no visual changes, suggesting that deviants were perceived as interruptions in audiovisual, rather than auditory only, sequences. As predicted, deviants elicited larger MMN and P3 in the audiovisual violation compared to the neutral condition. The results suggest that long-term representations of phonemes are indeed audiovisual.
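The MMN itself is computed as a deviant-minus-standard difference wave, with amplitude summarized over a post-stimulus window. A bare-bones sketch (synthetic ERPs; the window and sampling rate are arbitrary choices, not the paper's):

import numpy as np

rng = np.random.default_rng(3)
fs = 500                                         # Hz
t = np.arange(-0.1, 0.6, 1 / fs)                 # epoch: -100 to 600 ms
standard = rng.standard_normal((100, t.size)).mean(axis=0)   # averaged standard ERP
# simulate a deviant with an added negativity peaking ~180 ms post-stimulus
deviant = standard - 2.0 * np.exp(-((t - 0.18) ** 2) / 0.002)

mmn = deviant - standard                         # difference wave isolates the deviance response
win = (t >= 0.1) & (t <= 0.25)                   # example MMN window, 100-250 ms
print(f"mean MMN amplitude in window: {mmn[win].mean():.2f} (a.u.)")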
Collapse
Affiliation(s)
- Natalya Kaganovich
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana, USA.,Department of Psychological Sciences, Purdue University, West Lafayette, Indiana, USA
| | - Sharon Christ
- Department of Human Development and Family Studies, Purdue University, West Lafayette, Indiana, USA.,Department of Statistics, Purdue University, West Lafayette, Indiana, USA
| |
Collapse
|
27
|
Ito T, Ohashi H, Gracco VL. Somatosensory contribution to audio-visual speech processing. Cortex 2021; 143:195-204. [PMID: 34450567 DOI: 10.1016/j.cortex.2021.07.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2021] [Revised: 07/20/2021] [Accepted: 07/28/2021] [Indexed: 10/20/2022]
Abstract
Recent studies have demonstrated that the auditory speech perception of a listener can be modulated by somatosensory input applied to the facial skin, suggesting that perception is an embodied process. However, speech perception is a multisensory process involving both the auditory and visual modalities. It is unknown whether and to what extent somatosensory stimulation to the facial skin modulates audio-visual speech perception. If speech perception is an embodied process, then somatosensory stimulation applied to the perceiver should influence audio-visual speech processing. Using the McGurk effect (the perceptual illusion that occurs when a sound is paired with the visual representation of a different sound, resulting in the perception of a third sound), we tested the prediction using a simple behavioral paradigm and at the neural level using event-related potentials (ERPs) and their cortical sources. We recorded ERPs from 64 scalp sites in response to congruent and incongruent audio-visual speech randomly presented with and without somatosensory stimulation associated with facial skin deformation. Subjects judged whether the production was /ba/ or not under all stimulus conditions. In the congruent audio-visual condition, subjects identified the sound as /ba/, but not in the incongruent condition, consistent with the McGurk effect. Concurrent somatosensory stimulation improved participants' ability to correctly identify the production as /ba/ relative to the non-somatosensory condition in both congruent and incongruent conditions. ERPs in response to the somatosensory stimulation in the incongruent condition reliably diverged 220 msec after stimulation onset. Cortical sources were estimated around the left anterior temporal gyrus, the right middle temporal gyrus, the right posterior superior temporal lobe and the right occipital region. The results demonstrate a clear multisensory convergence of somatosensory and audio-visual processing at both the behavioral and neural levels, consistent with the perspective that speech perception is a self-referenced, sensorimotor process.
Collapse
Affiliation(s)
- Takayuki Ito
- University Grenoble-Alpes, CNRS, Grenoble-INP, GIPSA-Lab, Saint Martin D'heres Cedex, France; Haskins Laboratories, New Haven, CT, USA.
| | | | - Vincent L Gracco
- Haskins Laboratories, New Haven, CT, USA; McGill University, Montréal, QC, Canada
| |
Collapse
|
28
|
Karthik G, Plass J, Beltz AM, Liu Z, Grabowecky M, Suzuki S, Stacey WC, Wasade VS, Towle VL, Tao JX, Wu S, Issa NP, Brang D. Visual speech differentially modulates beta, theta, and high gamma bands in auditory cortex. Eur J Neurosci 2021; 54:7301-7317. [PMID: 34587350 DOI: 10.1111/ejn.15482] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Revised: 08/20/2021] [Accepted: 08/28/2021] [Indexed: 12/13/2022]
Abstract
Speech perception is a central component of social communication. Although principally an auditory process, accurate speech perception in everyday settings is supported by meaningful information extracted from visual cues. Visual speech modulates activity in cortical areas subserving auditory speech perception including the superior temporal gyrus (STG). However, it is unknown whether visual modulation of auditory processing is a unitary phenomenon or, rather, consists of multiple functionally distinct processes. To explore this question, we examined neural responses to audiovisual speech measured from intracranially implanted electrodes in 21 patients with epilepsy. We found that visual speech modulated auditory processes in the STG in multiple ways, eliciting temporally and spatially distinct patterns of activity that differed across frequency bands. In the theta band, visual speech suppressed the auditory response from before auditory speech onset to after auditory speech onset (-93 to 500 ms) most strongly in the posterior STG. In the beta band, suppression was seen in the anterior STG from -311 to -195 ms before auditory speech onset and in the middle STG from -195 to 235 ms after speech onset. In high gamma, visual speech enhanced the auditory response from -45 to 24 ms only in the posterior STG. We interpret the visual-induced changes prior to speech onset as reflecting crossmodal prediction of speech signals. In contrast, modulations after sound onset may reflect a decrease in sustained feedforward auditory activity. These results are consistent with models that posit multiple distinct mechanisms supporting audiovisual speech perception.
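Band-limited effects of this kind are conventionally quantified by band-pass filtering and taking the Hilbert amplitude envelope per band. A hedged sketch (synthetic signal; conventional theta/beta/high-gamma band edges, not necessarily those used in the study):

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

rng = np.random.default_rng(4)
fs, n = 1000, 5000
lfp = rng.standard_normal(n)                     # stand-in for an STG recording

def band_envelope(x, lo, hi):
    # zero-phase band-pass, then amplitude envelope via the Hilbert transform
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.abs(hilbert(filtfilt(b, a, x)))

bands = {"theta": (4, 8), "beta": (13, 30), "high gamma": (70, 150)}
for name, (lo, hi) in bands.items():
    env = band_envelope(lfp, lo, hi)
    print(f"{name}: mean envelope {env.mean():.3f}")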
Collapse
Affiliation(s)
- G Karthik
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
| | - John Plass
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
| | - Adriene M Beltz
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
| | - Zhongming Liu
- Department of Biomedical Engineering and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, USA
| | - Marcia Grabowecky
- Department of Psychology, Northwestern University, Evanston, Illinois, USA
| | - Satoru Suzuki
- Department of Psychology, Northwestern University, Evanston, Illinois, USA
| | - William C Stacey
- Department of Neurology and Department of Biomedical Engineering, University of Michigan, Ann Arbor, Michigan, USA
| | - Vibhangini S Wasade
- Department of Neurology, Henry Ford Hospital, Detroit, Michigan, USA.,Department of Neurology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Vernon L Towle
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
| | - James X Tao
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
| | - Shasha Wu
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
| | - Naoum P Issa
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
| | - David Brang
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
| |
Collapse
|
29
|
O'Sullivan AE, Crosse MJ, Liberto GMD, de Cheveigné A, Lalor EC. Neurophysiological Indices of Audiovisual Speech Processing Reveal a Hierarchy of Multisensory Integration Effects. J Neurosci 2021; 41:4991-5003. [PMID: 33824190 PMCID: PMC8197638 DOI: 10.1523/jneurosci.0906-20.2021] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2020] [Revised: 03/16/2021] [Accepted: 03/22/2021] [Indexed: 12/27/2022] Open
Abstract
Seeing a speaker's face benefits speech comprehension, especially in challenging listening conditions. This perceptual benefit is thought to stem from the neural integration of visual and auditory speech at multiple stages of processing, whereby movement of a speaker's face provides temporal cues to auditory cortex, and articulatory information from the speaker's mouth can aid recognizing specific linguistic units (e.g., phonemes, syllables). However, it remains unclear how the integration of these cues varies as a function of listening conditions. Here, we sought to provide insight on these questions by examining EEG responses in humans (males and females) to natural audiovisual (AV), audio, and visual speech in quiet and in noise. We represented our speech stimuli in terms of their spectrograms and their phonetic features and then quantified the strength of the encoding of those features in the EEG using canonical correlation analysis (CCA). The encoding of both spectrotemporal and phonetic features was shown to be more robust in AV speech responses than what would have been expected from the summation of the audio and visual speech responses, suggesting that multisensory integration occurs at both spectrotemporal and phonetic stages of speech processing. We also found evidence to suggest that the integration effects may change with listening conditions; however, this was an exploratory analysis and future work will be required to examine this effect using a within-subject design. These findings demonstrate that integration of audio and visual speech occurs at multiple stages along the speech processing hierarchy.SIGNIFICANCE STATEMENT During conversation, visual cues impact our perception of speech. Integration of auditory and visual speech is thought to occur at multiple stages of speech processing and vary flexibly depending on the listening conditions. Here, we examine audiovisual (AV) integration at two stages of speech processing using the speech spectrogram and a phonetic representation, and test how AV integration adapts to degraded listening conditions. We find significant integration at both of these stages regardless of listening conditions. These findings reveal neural indices of multisensory interactions at different stages of processing and provide support for the multistage integration framework.
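Canonical correlation analysis, the tool used above to quantify encoding strength, finds paired projections of stimulus features and EEG that are maximally correlated. A toy sketch using scikit-learn's CCA (synthetic data; an illustration of the method, not the authors' implementation):

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(5)
n = 1000
latent = rng.standard_normal((n, 2))              # shared stimulus-response structure
stim = latent @ rng.standard_normal((2, 8)) + 0.5 * rng.standard_normal((n, 8))
eeg = latent @ rng.standard_normal((2, 32)) + 0.5 * rng.standard_normal((n, 32))

cca = CCA(n_components=2).fit(stim, eeg)
u, v = cca.transform(stim, eeg)                   # paired canonical variates
for k in range(2):
    r = np.corrcoef(u[:, k], v[:, k])[0, 1]
    print(f"canonical correlation {k + 1}: r = {r:.2f}")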
Collapse
Affiliation(s)
- Aisling E O'Sullivan
- School of Engineering, Trinity Centre for Biomedical Engineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
| | - Michael J Crosse
- X, The Moonshot Factory, Mountain View, CA and Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
| | - Giovanni M Di Liberto
- Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, Paris Sciences et Lettres University, Centre National de la Recherche Scientifique, Paris 75005, France
| | - Alain de Cheveigné
- Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, Paris Sciences et Lettres University, Centre National de la Recherche Scientifique, Paris 75005, France
- University College London Ear Institute, University College London, London WC1X 8EE, United Kingdom
| | - Edmund C Lalor
- School of Engineering, Trinity Centre for Biomedical Engineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
- Department of Biomedical Engineering and Department of Neuroscience, University of Rochester, Rochester, New York 14627
| |
Collapse
|
30
|
Johnston A, Brown BB, Elson R. Synchronous facial action binds dynamic facial features. Sci Rep 2021; 11:7191. [PMID: 33785856 PMCID: PMC8010062 DOI: 10.1038/s41598-021-86725-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2020] [Accepted: 02/22/2021] [Indexed: 11/09/2022] Open
Abstract
We asked how dynamic facial features are perceptually grouped. To address this question, we varied the timing of mouth movements relative to eyebrow movements, while measuring the detectability of a small temporal misalignment between a pair of oscillating eyebrows (an eyebrow wave). We found eyebrow wave detection performance was worse for synchronous movements of the eyebrows and mouth. Subsequently, we found this effect was specific to stimuli presented to the right visual field, implicating the involvement of left lateralised visual speech areas. Adaptation has been used as a tool in low-level vision to establish the presence of separable visual channels. Adaptation to moving eyebrows and mouths with various relative timings reduced eyebrow wave detection but only when the adapting mouth and eyebrows moved asynchronously. Inverting the face led to a greater reduction in detection after adaptation particularly for asynchronous facial motion at test. We conclude that synchronous motion binds dynamic facial features whereas asynchronous motion releases them, allowing adaptation to impair eyebrow wave detection.
Collapse
Affiliation(s)
- Alan Johnston
- School of Psychology, University Park, The University of Nottingham, Nottingham, NG7 2RD, UK.
| | - Ben B Brown
- School of Psychology, University Park, The University of Nottingham, Nottingham, NG7 2RD, UK
| | - Ryan Elson
- School of Psychology, University Park, The University of Nottingham, Nottingham, NG7 2RD, UK
| |
Collapse
|
31
|
Vannuscorps G, Andres M, Carneiro SP, Rombaux E, Caramazza A. Typically Efficient Lipreading without Motor Simulation. J Cogn Neurosci 2021; 33:611-621. [PMID: 33416443 DOI: 10.1162/jocn_a_01666] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]
Abstract
All it takes is a face-to-face conversation in a noisy environment to realize that viewing a speaker's lip movements contributes to speech comprehension. What are the processes underlying the perception and interpretation of visual speech? Brain areas that control speech production are also recruited during lipreading. This finding raises the possibility that lipreading may be supported, at least to some extent, by a covert unconscious imitation of the observed speech movements in the observer's own speech motor system-a motor simulation. However, whether, and if so to what extent, motor simulation contributes to visual speech interpretation remains unclear. In two experiments, we found that several participants with congenital facial paralysis were as good at lipreading as the control population and performed these tasks in a way that is qualitatively similar to the controls despite severely reduced or even completely absent lip motor representations. Although it remains an open question whether this conclusion generalizes to other experimental conditions and to typically developed participants, these findings considerably narrow the space of hypotheses for a role of motor simulation in lipreading. Beyond its theoretical significance in the field of speech perception, this finding also calls for a re-examination of the more general hypothesis that motor simulation underlies action perception and interpretation developed in the frameworks of motor simulation and mirror neuron hypotheses.
Collapse
|
32
|
The interrelationship between the face and vocal tract configuration during audiovisual speech. Proc Natl Acad Sci U S A 2020; 117:32791-32798. [PMID: 33293422 PMCID: PMC7768679 DOI: 10.1073/pnas.2006192117] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/05/2022] Open
Abstract
Speech perception is improved when we are able to see the person who is speaking, but how visual speech cues are used to improve speech perception is currently unclear. Brain imaging has revealed that regions responsible for motor control are active during the perception of speech, opening up the possibility that visual cues are mapped onto an internal representation of the vocal tract. Here, we show that there is sufficient information in the configuration of the face to recover the vocal tract configuration and that the key areas responsible for driving the correspondence vary in accordance with the articulation required to form the acoustic signal at the appropriate point in a sentence. It is well established that speech perception is improved when we are able to see the speaker talking along with hearing their voice, especially when the speech is noisy. While we have a good understanding of where speech integration occurs in the brain, it is unclear how visual and auditory cues are combined to improve speech perception. One suggestion is that integration can occur as both visual and auditory cues arise from a common generator: the vocal tract. Here, we investigate whether facial and vocal tract movements are linked during speech production by comparing videos of the face and fast magnetic resonance (MR) image sequences of the vocal tract. The joint variation in the face and vocal tract was extracted using an application of principal components analysis (PCA), and we demonstrate that MR image sequences can be reconstructed with high fidelity using only the facial video and PCA. Reconstruction fidelity was significantly higher when images from the two sequences corresponded in time, and including implicit temporal information by combining contiguous frames also led to a significant increase in fidelity. A “Bubbles” technique was used to identify which areas of the face were important for recovering information about the vocal tract, and vice versa, on a frame-by-frame basis. Our data reveal that there is sufficient information in the face to recover vocal tract shape during speech. In addition, the facial and vocal tract regions that are important for reconstruction are those that are used to generate the acoustic speech signal.
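The reconstruction logic — recovering vocal-tract frames from facial video through a joint principal components model — can be sketched as follows (synthetic features; an illustration of the principle, not the published pipeline). PCA is fit on concatenated face and vocal-tract features, component scores are then estimated from the face block alone, and the vocal-tract block is read off the reconstruction.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
n, d_face, d_vt = 400, 50, 40
latent = rng.standard_normal((n, 5))                    # shared articulation factors
face = latent @ rng.standard_normal((5, d_face))        # facial-video features
vt = latent @ rng.standard_normal((5, d_vt))            # vocal-tract (MR) features

pca = PCA(n_components=5).fit(np.hstack([face, vt]))    # joint PCA over both blocks
W_face = pca.components_[:, :d_face]                    # face block of the loadings
# estimate component scores from the face alone, then reconstruct the vocal-tract block
scores, *_ = np.linalg.lstsq(W_face.T, (face - pca.mean_[:d_face]).T, rcond=None)
vt_hat = scores.T @ pca.components_[:, d_face:] + pca.mean_[d_face:]
r = np.corrcoef(vt_hat.ravel(), vt.ravel())[0, 1]
print(f"reconstruction fidelity r = {r:.2f}")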
Collapse
|
33
|
Audio-visual integration in noise: Influence of auditory and visual stimulus degradation on eye movements and perception of the McGurk effect. Atten Percept Psychophys 2020; 82:3544-3557. [PMID: 32533526 PMCID: PMC7788022 DOI: 10.3758/s13414-020-02042-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Seeing a talker’s face can aid audiovisual (AV) integration when speech is presented in noise. However, few studies have simultaneously manipulated auditory and visual degradation. We aimed to establish how degrading the auditory and visual signal affected AV integration. Where people look on the face in this context is also of interest; Buchan, Paré and Munhall (Brain Research, 1242, 162–171, 2008) found fixations on the mouth increased in the presence of auditory noise whilst Wilson, Alsius, Paré and Munhall (Journal of Speech, Language, and Hearing Research, 59(4), 601–615, 2016) found mouth fixations decreased with decreasing visual resolution. In Condition 1, participants listened to clear speech, and in Condition 2, participants listened to vocoded speech designed to simulate the information provided by a cochlear implant. Speech was presented in three levels of auditory noise and three levels of visual blurring. Adding noise to the auditory signal increased McGurk responses, while blurring the visual signal decreased McGurk responses. Participants fixated the mouth more on trials when the McGurk effect was perceived. Adding auditory noise led to people fixating the mouth more, while visual degradation led to people fixating the mouth less. Combined, the results suggest that modality preference and where people look during AV integration of incongruent syllables vary according to the quality of information available.
Collapse
|
34
|
Thézé R, Giraud AL, Mégevand P. The phase of cortical oscillations determines the perceptual fate of visual cues in naturalistic audiovisual speech. SCIENCE ADVANCES 2020; 6:6/45/eabc6348. [PMID: 33148648 PMCID: PMC7673697 DOI: 10.1126/sciadv.abc6348] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/05/2020] [Accepted: 09/17/2020] [Indexed: 06/11/2023]
Abstract
When we see our interlocutor, our brain seamlessly extracts visual cues from their face and processes them along with the sound of their voice, making speech an intrinsically multimodal signal. Visual cues are especially important in noisy environments, when the auditory signal is less reliable. Neuronal oscillations might be involved in the cortical processing of audiovisual speech by selecting which sensory channel contributes more to perception. To test this, we designed computer-generated naturalistic audiovisual speech stimuli where one mismatched phoneme-viseme pair in a key word of sentences created bistable perception. Neurophysiological recordings (high-density scalp and intracranial electroencephalography) revealed that the precise phase angle of theta-band oscillations in posterior temporal and occipital cortex of the right hemisphere was crucial to select whether the auditory or the visual speech cue drove perception. We demonstrate that the phase of cortical oscillations acts as an instrument for sensory selection in audiovisual speech processing.
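A standard way to test whether oscillatory phase predicts a binary percept is to extract band-limited phase with the Hilbert transform and regress the report on the sine and cosine of that phase. A minimal sketch with simulated trials (a conceptual stand-in, not the paper's intracranial pipeline):

import numpy as np
from scipy.signal import butter, filtfilt, hilbert
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
fs, n_trials, n_samp = 250, 300, 250
trials = rng.standard_normal((n_trials, n_samp))        # synthetic single-trial EEG

b, a = butter(4, [4 / (fs / 2), 8 / (fs / 2)], btype="band")      # theta band
theta = filtfilt(b, a, trials, axis=1)
phase = np.angle(hilbert(theta))[:, n_samp // 2]        # phase at the key-word onset
# simulate percepts whose probability depends on that phase angle
p_auditory = 1 / (1 + np.exp(-2 * np.cos(phase)))
percept = (rng.random(n_trials) < p_auditory).astype(int)

X = np.column_stack([np.cos(phase), np.sin(phase)])     # circular predictor
acc = LogisticRegression().fit(X, percept).score(X, percept)
print(f"phase predicts percept with accuracy {acc:.2f}")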
Collapse
Affiliation(s)
- Raphaël Thézé
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1202 Geneva, Switzerland
| | - Anne-Lise Giraud
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1202 Geneva, Switzerland
| | - Pierre Mégevand
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1202 Geneva, Switzerland.
- Division of Neurology, Department of Clinical Neurosciences, Geneva University Hospitals, 1205 Geneva, Switzerland
| |
Collapse
|
35
|
Michon M, Boncompte G, López V. Electrophysiological Dynamics of Visual Speech Processing and the Role of Orofacial Effectors for Cross-Modal Predictions. Front Hum Neurosci 2020; 14:538619. [PMID: 33192386 PMCID: PMC7653187 DOI: 10.3389/fnhum.2020.538619] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2020] [Accepted: 09/29/2020] [Indexed: 11/13/2022] Open
Abstract
The human brain generates predictions about future events. During face-to-face conversations, visemic information is used to predict upcoming auditory input. Recent studies suggest that the speech motor system plays a role in these cross-modal predictions, however, usually only audio-visual paradigms are employed. Here we tested whether speech sounds can be predicted on the basis of visemic information only, and to what extent interfering with orofacial articulatory effectors can affect these predictions. We registered EEG and employed N400 as an index of such predictions. Our results show that N400's amplitude was strongly modulated by visemic salience, consistent with cross-modal speech predictions. Additionally, N400 ceased to be evoked when syllables' visemes were presented backwards, suggesting that predictions occur only when the observed viseme matches an existing articuleme in the observer's speech motor system (i.e., the articulatory neural sequence required to produce a particular phoneme/viseme). Importantly, we found that interfering with the motor articulatory system strongly disrupted cross-modal predictions. We also observed a late P1000 that was evoked only for syllable-related visual stimuli, but whose amplitude was not modulated by interfering with the motor system. The present study provides further evidence of the importance of the speech production system for speech sound predictions based on visemic information at the pre-lexical level. The implications of these results are discussed in the context of a hypothesized trimodal repertoire for speech, in which speech perception is conceived as a highly interactive process that involves not only your ears but also your eyes, lips and tongue.
Collapse
Affiliation(s)
- Maëva Michon
- Laboratorio de Neurociencia Cognitiva y Evolutiva, Escuela de Medicina, Pontificia Universidad Católica de Chile, Santiago, Chile
- Laboratorio de Neurociencia Cognitiva y Social, Facultad de Psicología, Universidad Diego Portales, Santiago, Chile
| | - Gonzalo Boncompte
- Laboratorio de Neurodinámicas de la Cognición, Escuela de Medicina, Pontificia Universidad Católica de Chile, Santiago, Chile
| | - Vladimir López
- Laboratorio de Psicología Experimental, Escuela de Psicología, Pontificia Universidad Católica de Chile, Santiago, Chile
| |
Collapse
|
36
|
Audio-visual combination of syllables involves time-sensitive dynamics following from fusion failure. Sci Rep 2020; 10:18009. [PMID: 33093570 PMCID: PMC7583249 DOI: 10.1038/s41598-020-75201-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2020] [Accepted: 10/05/2020] [Indexed: 11/08/2022] Open
Abstract
In face-to-face communication, audio-visual (AV) stimuli can be fused, combined or perceived as mismatching. While the left superior temporal sulcus (STS) is presumably the locus of AV integration, the process leading to combination is unknown. Based on previous modelling work, we hypothesize that combination results from a complex dynamic originating in a failure to integrate AV inputs, followed by a reconstruction of the most plausible AV sequence. In two different behavioural tasks and one MEG experiment, we observed that combination is more time demanding than fusion. Using time-/source-resolved human MEG analyses with linear and dynamic causal models, we show that both fusion and combination involve early detection of AV incongruence in the STS, whereas combination is further associated with enhanced activity of AV asynchrony-sensitive regions (auditory and inferior frontal cortices). Based on neural signal decoding, we finally show that only combination can be decoded from the IFG activity and that combination is decoded later than fusion in the STS. These results indicate that the AV speech integration outcome primarily depends on whether the STS converges or not onto an existing multimodal syllable representation, and that combination results from subsequent temporal processing, presumably the off-line re-ordering of incongruent AV stimuli.
Collapse
|
37
|
Reduced resting state functional connectivity with increasing age-related hearing loss and McGurk susceptibility. Sci Rep 2020; 10:16987. [PMID: 33046800 PMCID: PMC7550565 DOI: 10.1038/s41598-020-74012-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2020] [Accepted: 09/15/2020] [Indexed: 11/21/2022] Open
Abstract
Age-related hearing loss has been related to a compensatory increase in audio-visual integration and neural reorganization including alterations in functional resting state connectivity. How these two changes are linked in elderly listeners is unclear. The current study explored modulatory effects of hearing thresholds and audio-visual integration on resting state functional connectivity. We analysed a large set of resting state data of 65 elderly participants with a widely varying degree of untreated hearing loss. Audio-visual integration, as gauged with the McGurk effect, increased with progressing hearing thresholds. On the neural level, McGurk illusions were negatively related to functional coupling between motor and auditory regions. Similarly, connectivity of the dorsal attention network to sensorimotor and primary motor cortices was reduced with increasing hearing loss. The same effect was obtained for connectivity between the salience network and visual cortex. Our findings suggest that with progressing untreated age-related hearing loss, functional coupling at rest declines, affecting connectivity of brain networks and areas associated with attentional, visual, sensorimotor and motor processes. Especially connectivity reductions between auditory and motor areas were related to stronger audio-visual integration found with increasing hearing loss.
Collapse
|
38
|
Dash D, Wisler A, Ferrari P, Davenport EM, Maldjian J, Wang J. MEG Sensor Selection for Neural Speech Decoding. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2020; 8:182320-182337. [PMID: 33204579 PMCID: PMC7668411 DOI: 10.1109/access.2020.3028831] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Direct decoding of speech from the brain is a faster alternative to current electroencephalography (EEG) speller-based brain-computer interfaces (BCI) in providing communication assistance to locked-in patients. Magnetoencephalography (MEG) has recently shown great potential as a non-invasive neuroimaging modality for neural speech decoding, owing in part to its spatial selectivity over other high-temporal resolution devices. Standard MEG systems have a large number of cryogenically cooled channels/sensors (200-300) encapsulated within a fixed liquid helium dewar, precluding their use as wearable BCI devices. Fortunately, recently developed optically pumped magnetometers (OPM) do not require cryogens, and have the potential to be wearable and movable, making them more suitable for BCI applications. This design is also modular, allowing for customized montages that include only the sensors necessary for a particular task. As the number of sensors heavily influences the cost, size, and weight of MEG systems, minimizing the number of sensors is critical for designing practical MEG-based BCIs in the future. In this study, we sought to identify an optimal set of MEG channels to decode imagined and spoken phrases from the MEG signals. Using a forward selection algorithm with a support vector machine classifier, we found that nine optimally located MEG gradiometers provided higher decoding accuracy compared to using all channels. Additionally, the forward selection algorithm achieved similar performance to dimensionality reduction using a stacked-sparse-autoencoder. Analysis of spatial dynamics of speech decoding suggested that both left and right hemisphere sensors contribute to speech decoding. Sensors approximately located near Broca's area were found to be commonly contributing among the higher-ranked sensors across all subjects.
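Greedy forward sensor selection of the kind described — at each step, add the channel that most improves cross-validated SVM accuracy — fits in a few lines. A sketch with synthetic features (the nine-sensor target mirrors the abstract; everything else is a simplification):

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
n_trials, n_chan = 200, 30
y = rng.integers(0, 2, n_trials)                  # two phrases to decode
X = rng.standard_normal((n_trials, n_chan))       # per-channel features
X[:, :5] += y[:, None] * 1.5                      # five genuinely informative channels

selected, remaining = [], list(range(n_chan))
while len(selected) < 9:                          # target set size from the study
    # evaluate each candidate channel added to the current set
    scores = [cross_val_score(SVC(), X[:, selected + [c]], y, cv=5).mean()
              for c in remaining]
    best = remaining[int(np.argmax(scores))]
    selected.append(best)
    remaining.remove(best)
print("selected channels:", selected)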
Collapse
Affiliation(s)
- Debadatta Dash
- Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
| | - Alan Wisler
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX 78712, USA
| | - Paul Ferrari
- MEG Laboratory, Dell Children's Medical Center, Austin, TX 78723, USA
- Department of Psychology, The University of Texas at Austin, Austin, TX 78712, USA
| | | | - Joseph Maldjian
- Department of Radiology, University of Texas at Southwestern, Dallas, TX 75390, USA
| | - Jun Wang
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX 78712, USA
| |
Collapse
|
39
|
Abstract
Visual information from the face of an interlocutor complements auditory information from their voice, enhancing intelligibility. However, there are large individual differences in the ability to comprehend noisy audiovisual speech. Another axis of individual variability is the extent to which humans fixate the mouth or the eyes of a viewed face. We speculated that across a lifetime of face viewing, individuals who prefer to fixate the mouth of a viewed face might accumulate stronger associations between visual and auditory speech, resulting in improved comprehension of noisy audiovisual speech. To test this idea, we assessed interindividual variability in two tasks. Participants (n = 102) varied greatly in their ability to understand noisy audiovisual sentences (accuracy from 2-58%) and in the time they spent fixating the mouth of a talker enunciating clear audiovisual syllables (3-98% of total time). These two variables were positively correlated: a 10% increase in time spent fixating the mouth equated to a 5.6% increase in multisensory gain. This finding demonstrates an unexpected link, mediated by histories of visual exposure, between two fundamental human abilities: processing faces and understanding speech.
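The reported relationship (a 10-point increase in mouth-looking time corresponding to a 5.6-point increase in multisensory gain, i.e., a slope near 0.56) is an ordinary least-squares fit across participants. Schematically, with simulated participants matched to the ranges in the abstract:

import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(9)
mouth_pct = rng.uniform(3, 98, 102)               # % of time fixating the mouth
gain = 0.56 * mouth_pct + rng.normal(0, 8, 102)   # slope ~0.56, as reported

res = linregress(mouth_pct, gain)
print(f"slope = {res.slope:.2f}, r = {res.rvalue:.2f}")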
Collapse
|
40
|
Keitel A, Gross J, Kayser C. Shared and modality-specific brain regions that mediate auditory and visual word comprehension. eLife 2020; 9:e56972. [PMID: 32831168 PMCID: PMC7470824 DOI: 10.7554/elife.56972] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2020] [Accepted: 08/18/2020] [Indexed: 12/22/2022] Open
Abstract
Visual speech carried by lip movements is an integral part of communication. Yet, it remains unclear to what extent visual and acoustic speech comprehension are mediated by the same brain regions. Using multivariate classification of full-brain MEG data, we first probed where the brain represents acoustically and visually conveyed word identities. We then tested where these sensory-driven representations are predictive of participants' trial-wise comprehension. The comprehension-relevant representations of auditory and visual speech converged only in anterior angular and inferior frontal regions and were spatially dissociated from those representations that best reflected the sensory-driven word identity. These results provide a neural explanation for the behavioural dissociation of acoustic and visual speech comprehension and suggest that cerebral representations encoding word identities may be more modality-specific than often upheld.
Collapse
Affiliation(s)
- Anne Keitel
- Psychology, University of Dundee, Dundee, United Kingdom
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
| | - Joachim Gross
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
| | - Christoph Kayser
- Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, Bielefeld, Germany
| |
Collapse
|
41
|
Destoky F, Bertels J, Niesen M, Wens V, Vander Ghinst M, Leybaert J, Lallier M, Ince RAA, Gross J, De Tiège X, Bourguignon M. Cortical tracking of speech in noise accounts for reading strategies in children. PLoS Biol 2020; 18:e3000840. [PMID: 32845876 PMCID: PMC7478533 DOI: 10.1371/journal.pbio.3000840] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2020] [Revised: 09/08/2020] [Accepted: 08/12/2020] [Indexed: 11/29/2022] Open
Abstract
Humans' propensity to acquire literacy relates to several factors, including the ability to understand speech in noise (SiN). Still, the nature of the relation between reading and SiN perception abilities remains poorly understood. Here, we dissect the interplay between (1) reading abilities, (2) classical behavioral predictors of reading (phonological awareness, phonological memory, and rapid automatized naming), and (3) electrophysiological markers of SiN perception in 99 elementary school children (26 with dyslexia). We demonstrate that, in typical readers, cortical representation of the phrasal content of SiN relates to the degree of development of the lexical (but not sublexical) reading strategy. In contrast, classical behavioral predictors of reading abilities and the ability to benefit from visual speech to represent the syllabic content of SiN account for global reading performance (i.e., speed and accuracy of lexical and sublexical reading). In individuals with dyslexia, we found preserved integration of visual speech information to optimize processing of syntactic information but not to sustain acoustic/phonemic processing. Finally, within children with dyslexia, measures of cortical representation of the phrasal content of SiN were negatively related to reading speed and positively related to the compromise between reading precision and reading speed, potentially owing to compensatory attentional mechanisms. These results clarify the nature of the relation between SiN perception and reading abilities in typical child readers and children with dyslexia and identify novel electrophysiological markers of emergent literacy.
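Cortical tracking of this kind is typically quantified as coherence between the speech temporal envelope and the neural recording. A minimal sketch with synthetic signals follows; the sampling rate, band limits, and signal construction are illustrative assumptions, not the paper's parameters:

```python
import numpy as np
from scipy.signal import coherence

fs = 200                          # Hz; hypothetical common sampling rate
t = np.arange(0, 120, 1 / fs)     # two minutes of signal
rng = np.random.default_rng(2)

# Synthetic speech envelope with a slow phrasal and a faster syllabic rhythm
envelope = np.sin(2 * np.pi * 0.5 * t) + 0.5 * np.sin(2 * np.pi * 5.0 * t)

# Synthetic "EEG": a delayed, attenuated copy of the envelope buried in noise
eeg = 0.3 * np.roll(envelope, int(0.1 * fs)) + rng.normal(0.0, 1.0, t.size)

f, coh = coherence(envelope, eeg, fs=fs, nperseg=4 * fs)
print(f"phrasal-band coherence {coh[(f > 0.2) & (f < 1.5)].mean():.3f}, "
      f"syllabic-band {coh[(f > 4) & (f < 8)].mean():.3f}")
```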
Collapse
Affiliation(s)
- Florian Destoky
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
| | - Julie Bertels
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- Consciousness, Cognition and Computation group, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
| | - Maxime Niesen
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- Service d'ORL et de chirurgie cervico-faciale, ULB-Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
| | - Vincent Wens
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
| | - Marc Vander Ghinst
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
| | - Jacqueline Leybaert
- Laboratoire Cognition Langage et Développement, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
| | - Marie Lallier
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain
| | - Robin A. A. Ince
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
| | - Joachim Gross
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
- Institute for Biomagnetism and Biosignal analysis, University of Muenster, Muenster, Germany
| | - Xavier De Tiège
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
| | - Mathieu Bourguignon
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- Laboratoire Cognition Langage et Développement, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain
| |
Collapse
|
42
|
Ullas S, Hausfeld L, Cutler A, Eisner F, Formisano E. Neural Correlates of Phonetic Adaptation as Induced by Lexical and Audiovisual Context. J Cogn Neurosci 2020; 32:2145-2158. [PMID: 32662723 DOI: 10.1162/jocn_a_01608] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
When speech perception is difficult, one way listeners adjust is by reconfiguring phoneme category boundaries, drawing on contextual information. Both lexical knowledge and lipreading cues are used in this way, but it remains unknown whether these two differing forms of perceptual learning are similar at a neural level. This study compared phoneme boundary adjustments driven by lexical or audiovisual cues, using ultra-high-field 7-T fMRI. During imaging, participants heard exposure stimuli and test stimuli. Exposure stimuli for lexical retuning were audio recordings of words, and those for audiovisual recalibration were audio-video recordings of lip movements during utterances of pseudowords. Test stimuli were ambiguous phonetic strings presented without context, and listeners reported what phoneme they heard. Reports reflected phoneme biases in preceding exposure blocks (e.g., more reported /p/ after /p/-biased exposure). Analysis of corresponding brain responses indicated that both forms of cue use were associated with a network of activity across the temporal cortex, plus parietal, insula, and motor areas. Audiovisual recalibration also elicited significant occipital cortex activity despite the lack of visual stimuli. Activity levels in several ROIs also covaried with strength of audiovisual recalibration, with greater activity accompanying larger recalibration shifts. Similar activation patterns appeared for lexical retuning, but here, no significant ROIs were identified. Audiovisual and lexical forms of perceptual learning thus induce largely similar brain response patterns. However, audiovisual recalibration involves additional visual cortex contributions, suggesting that previously acquired visual information (on lip movements) is retrieved and deployed to disambiguate auditory perception.
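The behavioral recalibration effect described here reduces to a simple per-participant index: the shift in the proportion of /p/ reports between exposure biases. A minimal synthetic sketch (subject counts, trial counts, and response rates are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n_subjects, n_test = 20, 40        # hypothetical counts, not from the study

# Simulated binary reports (1 = heard /p/) on ambiguous test stimuli,
# following /p/-biased vs alternative-biased exposure blocks
p_after_p = rng.binomial(1, 0.65, (n_subjects, n_test)).mean(axis=1)
p_after_alt = rng.binomial(1, 0.45, (n_subjects, n_test)).mean(axis=1)

# Recalibration strength: exposure-induced shift in /p/ reports
shift = p_after_p - p_after_alt
print(f"mean recalibration shift {shift.mean():.2f} "
      f"(positive = retuned toward the exposure bias)")
```

An index of this kind is what the brain responses were covaried against, with larger activity accompanying larger shifts.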
Collapse
Affiliation(s)
- Shruti Ullas
- Maastricht University
- Maastricht Brain Imaging Centre
| | - Lars Hausfeld
- Maastricht University
- Maastricht Brain Imaging Centre
| | | | | | - Elia Formisano
- Maastricht University
- Maastricht Brain Imaging Centre
- Maastricht Centre for Systems Biology
| |
Collapse
|
43
|
Maffei V, Indovina I, Mazzarella E, Giusti MA, Macaluso E, Lacquaniti F, Viviani P. Sensitivity of occipito-temporal cortex, premotor and Broca's areas to visible speech gestures in a familiar language. PLoS One 2020; 15:e0234695. [PMID: 32559213 PMCID: PMC7304574 DOI: 10.1371/journal.pone.0234695] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2019] [Accepted: 06/01/2020] [Indexed: 11/18/2022] Open
Abstract
When looking at a speaking person, the analysis of facial kinematics contributes to language discrimination and to the decoding of the time flow of visual speech. To disentangle these two factors, we investigated behavioural and fMRI responses to familiar and unfamiliar languages when observing speech gestures with natural or reversed kinematics. Twenty Italian volunteers viewed silent video-clips of speech shown as recorded (Forward, biological motion) or reversed in time (Backward, non-biological motion), in Italian (familiar language) or Arabic (unfamiliar language). fMRI revealed that language (Italian/Arabic) and time-rendering (Forward/Backward) modulated distinct areas in the ventral occipito-temporal cortex, suggesting that visual speech analysis begins in this region, earlier than previously thought. Left premotor ventral (superior subdivision) and dorsal areas were preferentially activated with the familiar language independently of time-rendering, challenging the view that the role of these regions in speech processing is purely articulatory. The left premotor ventral region in the frontal operculum, thought to include part of Broca's area, responded to the natural familiar language, consistent with the hypothesis of motor simulation of speech gestures.
Collapse
Affiliation(s)
- Vincenzo Maffei
- Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy
- Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
- Data Lake & BI, DOT - Technology, Poste Italiane, Rome, Italy
| | - Iole Indovina
- Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy
- Departmental Faculty of Medicine and Surgery, Saint Camillus International University of Health and Medical Sciences, Rome, Italy
| | | | - Maria Assunta Giusti
- Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
| | - Emiliano Macaluso
- ImpAct Team, Lyon Neuroscience Research Center, Lyon, France
- Laboratory of Neuroimaging, IRCCS Santa Lucia Foundation, Rome, Italy
| | - Francesco Lacquaniti
- Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy
- Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
| | - Paolo Viviani
- Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy
- Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
| |
Collapse
|
44
|
Proverbio AM, Camporeale E, Brusa A. Multimodal Recognition of Emotions in Music and Facial Expressions. Front Hum Neurosci 2020; 14:32. [PMID: 32116613 PMCID: PMC7027335 DOI: 10.3389/fnhum.2020.00032] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2019] [Accepted: 01/23/2020] [Indexed: 01/24/2023] Open
Abstract
The aim of the study was to investigate the neural processing of congruent vs. incongruent affective audiovisual information (facial expressions and music) by means of ERPs (Event Related Potentials) recordings. Stimuli were 200 infant faces displaying Happiness, Relaxation, Sadness, Distress and 32 piano musical pieces conveying the same emotional states (as specifically assessed). Music and faces were presented simultaneously, and paired so that in half cases they were emotionally congruent or incongruent. Twenty subjects were told to pay attention and respond to infrequent targets (adult neutral faces) while their EEG was recorded from 128 channels. The face-related N170 (160-180 ms) component was the earliest response affected by the emotional content of faces (particularly by distress), while visual P300 (250-450 ms) and auditory N400 (350-550 ms) responses were specifically modulated by the emotional content of both facial expressions and musical pieces. Face/music emotional incongruence elicited a wide N400 negativity indicating the detection of a mismatch in the expressed emotion. A swLORETA inverse solution applied to the N400 (difference wave: Incongruent - Congruent) showed the crucial role of Inferior and Superior Temporal Gyri in the multimodal representation of emotional information extracted from faces and music. Furthermore, the prefrontal cortex (superior and medial, BA 10) was also strongly active, possibly supporting working memory. The data hint at a common system for representing emotional information derived from social cognition and music processing, including the uncus and cuneus.
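The difference-wave logic behind the N400 analysis can be illustrated in a few lines. The sketch below simulates one channel's grand averages and measures the mean amplitude in the study's 350-550 ms auditory N400 window; the sampling rate and effect shape are assumptions, not the study's data:

```python
import numpy as np

fs = 500                                    # Hz (hypothetical sampling rate)
times = np.arange(-0.1, 0.8, 1 / fs)        # epoch, seconds relative to onset
rng = np.random.default_rng(4)

# Simulated grand-average ERPs (microvolts) at one channel; the incongruent
# pairing adds a negativity peaking near 450 ms, an N400-like effect
congruent = rng.normal(0.0, 0.2, times.size)
incongruent = congruent - 2.0 * np.exp(-((times - 0.45) ** 2) / (2 * 0.05**2))

diff = incongruent - congruent              # difference wave: Incong. - Cong.
window = (times >= 0.35) & (times <= 0.55)  # the study's auditory N400 window
print(f"mean N400 amplitude {diff[window].mean():.2f} microvolts")
```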
Collapse
|
45
|
Lip-Reading Enables the Brain to Synthesize Auditory Features of Unknown Silent Speech. J Neurosci 2019; 40:1053-1065. [PMID: 31889007 DOI: 10.1523/jneurosci.1101-19.2019] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2019] [Revised: 11/28/2019] [Accepted: 12/04/2019] [Indexed: 11/21/2022] Open
Abstract
Lip-reading is crucial for understanding speech in challenging conditions. But how the brain extracts meaning from silent visual speech is still under debate. Lip-reading in silence activates the auditory cortices, but it is not known whether such activation reflects immediate synthesis of the corresponding auditory stimulus or imagery of unrelated sounds. To disentangle these possibilities, we used magnetoencephalography to evaluate how cortical activity in 28 healthy adult humans (17 females) entrained to the auditory speech envelope and lip movements (mouth opening) when listening to a spoken story without visual input (audio-only), and when seeing a silent video of a speaker articulating another story (video-only). In video-only, auditory cortical activity entrained to the absent auditory signal at frequencies <1 Hz more than to the seen lip movements. This entrainment process was characterized by an auditory-speech-to-brain delay of ∼70 ms in the left hemisphere, compared with ∼20 ms in audio-only. Entrainment to mouth opening was found in the right angular gyrus at <1 Hz, and in early visual cortices at 1-8 Hz. These findings demonstrate that the brain can use a silent lip-read signal to synthesize a coarse-grained auditory speech representation in early auditory cortices. Our data indicate the following underlying oscillatory mechanism: seeing lip movements first modulates neuronal activity in early visual cortices at frequencies that match articulatory lip movements; the right angular gyrus then extracts slower features of lip movements, mapping them onto the corresponding speech sound features; this information is fed to auditory cortices, most likely facilitating speech parsing. SIGNIFICANCE STATEMENT: Lip-reading consists of decoding speech based on visual information derived from observation of a speaker's articulatory facial gestures. Lip-reading is known to improve auditory speech understanding, especially when speech is degraded. Interestingly, lip-reading in silence still activates the auditory cortices, even when participants do not know what the absent auditory signal should be. However, it was uncertain what such activation reflected. Here, using magnetoencephalographic recordings, we demonstrate that it reflects fast synthesis of the auditory stimulus rather than mental imagery of unrelated speech or non-speech sounds. Our results also shed light on the oscillatory dynamics underlying lip-reading.
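The delay estimates reported here come from entrainment analyses; a crude stand-in for that logic is to locate the peak of the cross-correlation between a slow envelope and a delayed cortical signal. A minimal synthetic sketch (the 70 ms delay is the value reported for video-only; everything else is an assumption):

```python
import numpy as np
from scipy.ndimage import uniform_filter1d
from scipy.signal import correlate, correlation_lags

fs = 100                             # Hz (hypothetical sampling rate)
rng = np.random.default_rng(5)
n = 60 * fs                          # one minute of signal

# Envelope-like slow signal obtained by smoothing white noise over ~1 s
envelope = uniform_filter1d(rng.normal(size=n), size=fs)

# Synthetic cortical signal: the envelope delayed by 70 ms, plus noise
brain = np.roll(envelope, int(0.07 * fs)) + rng.normal(0.0, 0.05, n)

# The lag of the cross-correlation peak estimates the speech-to-brain delay
xc = correlate(brain - brain.mean(), envelope - envelope.mean(), mode="full")
lags_s = correlation_lags(n, n, mode="full") / fs
print(f"estimated delay: {1000 * lags_s[np.argmax(xc)]:.0f} ms")
```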
Collapse
|
46
|
Visual inputs decrease brain activity in frontal areas during silent lipreading. PLoS One 2019; 14:e0223782. [PMID: 31600311 PMCID: PMC6786756 DOI: 10.1371/journal.pone.0223782] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2019] [Accepted: 09/27/2019] [Indexed: 11/19/2022] Open
Abstract
Aim: To analyze the modulation of brain activity within the areas involved in lipreading when an additional visual stimulus is included.
Methods: The experiment consisted of two fMRI runs (lipreading_only and lipreading+picture), each with two conditions (oral speech sentences condition [OSS] and oral speech syllables condition [OSSY]).
Results: During lipreading-only, higher activity in the left middle temporal gyrus (MTG) was identified for OSS than for OSSY; during lipreading+picture, apart from the left MTG, higher activity was also present in the supplementary motor area (SMA), the left precentral gyrus (PreCG) and the left inferior frontal gyrus (IFG). The comparison between the two runs revealed higher activity for lipreading-only in the SMA and the left IFG.
Conclusion: The presence of a visual reference during a lipreading task leads to a decrease in activity in frontal areas.
Collapse
|
47
|
Karas PJ, Magnotti JF, Metzger BA, Zhu LL, Smith KB, Yoshor D, Beauchamp MS. The visual speech head start improves perception and reduces superior temporal cortex responses to auditory speech. eLife 2019; 8:e48116. [PMID: 31393261 PMCID: PMC6687434 DOI: 10.7554/elife.48116] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2019] [Accepted: 07/17/2019] [Indexed: 12/30/2022] Open
Abstract
Visual information about speech content from the talker's mouth is often available before auditory information from the talker's voice. Here we examined perceptual and neural responses to words with and without this visual head start. For both types of words, perception was enhanced by viewing the talker's face, but the enhancement was significantly greater for words with a head start. Neural responses were measured from electrodes implanted over auditory association cortex in the posterior superior temporal gyrus (pSTG) of epileptic patients. The presence of visual speech suppressed responses to auditory speech, more so for words with a visual head start. We suggest that the head start inhibits representations of incompatible auditory phonemes, increasing perceptual accuracy and decreasing total neural responses. Together with previous work showing visual cortex modulation (Ozker et al., 2018b), these results from pSTG demonstrate that multisensory interactions are a powerful modulator of activity throughout the speech perception network.
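The core measure here, suppression of auditory responses by visual speech, can be expressed as a simple percent-reduction index. A minimal sketch on synthetic trial data (response magnitudes and trial counts are hypothetical, not the paper's electrode recordings):

```python
import numpy as np

rng = np.random.default_rng(6)
n_trials = 50                        # hypothetical trial count

# Hypothetical response magnitudes (a.u.) from a pSTG electrode
auditory_only = rng.normal(1.00, 0.15, n_trials)
av_head_start = rng.normal(0.70, 0.15, n_trials)     # mouth moves before voice
av_no_head_start = rng.normal(0.90, 0.15, n_trials)

def suppression(av, a):
    """Percent reduction of the auditory response when the face is visible."""
    return 100.0 * (a.mean() - av.mean()) / a.mean()

print(f"suppression with head start {suppression(av_head_start, auditory_only):.0f}%"
      f" vs without {suppression(av_no_head_start, auditory_only):.0f}%")
```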
Collapse
Affiliation(s)
- Patrick J Karas
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
| | - John F Magnotti
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
| | - Brian A Metzger
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
| | - Lin L Zhu
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
| | - Kristen B Smith
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
| | - Daniel Yoshor
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
| | | |
Collapse
|
48
|
Kolozsvári OB, Xu W, Leppänen PHT, Hämäläinen JA. Top-Down Predictions of Familiarity and Congruency in Audio-Visual Speech Perception at Neural Level. Front Hum Neurosci 2019; 13:243. [PMID: 31354459 PMCID: PMC6639789 DOI: 10.3389/fnhum.2019.00243] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2019] [Accepted: 06/28/2019] [Indexed: 11/13/2022] Open
Abstract
During speech perception, listeners rely on multimodal input and make use of both auditory and visual information. When presented with speech, for example syllables, the differences in brain responses to distinct stimuli are not, however, caused merely by the acoustic or visual features of the stimuli. The congruency of the auditory and visual information and the familiarity of a syllable, that is, whether it appears in the listener's native language or not, also modulate brain responses. We investigated how the congruency and familiarity of the presented stimuli affect brain responses to audio-visual (AV) speech in 12 adult Finnish native speakers and 12 adult Chinese native speakers. They watched videos of a Chinese speaker pronouncing syllables (/pa/, /pha/, /ta/, /tha/, /fa/) during a magnetoencephalography (MEG) measurement; only /pa/ and /ta/ are part of Finnish phonology, whereas all five syllables are part of Chinese phonology. The stimuli were presented in audio-visual (congruent or incongruent), audio only, or visual only conditions. The brain responses were examined in five time-windows: 75-125, 150-200, 200-300, 300-400, and 400-600 ms. We found significant differences for the congruency comparison in the fourth time-window (300-400 ms) in both sensor and source level analysis. Larger responses were observed for the incongruent stimuli than for the congruent stimuli. For the familiarity comparisons, no significant differences were found. The results are in line with earlier studies reporting on the modulation of brain responses for audio-visual congruency around 250-500 ms. This suggests a much stronger process for the general detection of a mismatch between predictions based on lip movements and the auditory signal than for the top-down modulation of brain responses based on phonological information.
Collapse
Affiliation(s)
- Orsolya B Kolozsvári
- Department of Psychology, University of Jyväskylä, Jyväskylä, Finland
- Jyväskylä Centre for Interdisciplinary Brain Research (CIBR), University of Jyväskylä, Jyväskylä, Finland
| | - Weiyong Xu
- Department of Psychology, University of Jyväskylä, Jyväskylä, Finland
- Jyväskylä Centre for Interdisciplinary Brain Research (CIBR), University of Jyväskylä, Jyväskylä, Finland
| | - Paavo H T Leppänen
- Department of Psychology, University of Jyväskylä, Jyväskylä, Finland
- Jyväskylä Centre for Interdisciplinary Brain Research (CIBR), University of Jyväskylä, Jyväskylä, Finland
| | - Jarmo A Hämäläinen
- Department of Psychology, University of Jyväskylä, Jyväskylä, Finland
- Jyväskylä Centre for Interdisciplinary Brain Research (CIBR), University of Jyväskylä, Jyväskylä, Finland
| |
Collapse
|
49
|
Bidelman GM, Sigley L, Lewis GA. Acoustic noise and vision differentially warp the auditory categorization of speech. J Acoust Soc Am 2019; 146:60. [PMID: 31370660 PMCID: PMC6786888 DOI: 10.1121/1.5114822] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/18/2019] [Revised: 06/05/2019] [Accepted: 06/07/2019] [Indexed: 06/10/2023]
Abstract
Speech perception requires grouping acoustic information into meaningful linguistic-phonetic units via categorical perception (CP). Beyond shrinking observers' perceptual space, CP might aid degraded speech perception if categories are more resistant to noise than surface acoustic features. Combining audiovisual (AV) cues also enhances speech recognition, particularly in noisy environments. This study investigated the degree to which visual cues from a talker (i.e., mouth movements) aid speech categorization amidst noise interference by measuring participants' identification of clear and noisy speech (0 dB signal-to-noise ratio) presented in auditory-only or combined AV modalities (i.e., A, A+noise, AV, AV+noise conditions). Auditory noise expectedly weakened (i.e., shallower identification slopes) and slowed speech categorization. Interestingly, additional viseme cues largely counteracted noise-related decrements in performance and stabilized classification speeds in both clear and noise conditions suggesting more precise acoustic-phonetic representations with multisensory information. Results are parsimoniously described under a signal detection theory framework and by a reduction (visual cues) and increase (noise) in the precision of perceptual object representation, which were not due to lapses of attention or guessing. Collectively, findings show that (i) mapping sounds to categories aids speech perception in "cocktail party" environments; (ii) visual cues help lattice formation of auditory-phonetic categories to enhance and refine speech identification.
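The identification slope, the abstract's index of categorization strength, is usually the steepness parameter of a logistic psychometric function fitted to responses along a stimulus continuum. A minimal sketch with made-up identification proportions (the continuum, values, and starting guesses are illustrative, not the study's data):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Psychometric function: P(category A) along the stimulus continuum."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

steps = np.linspace(0.0, 1.0, 7)   # morph continuum between two endpoints
clear = np.array([0.02, 0.05, 0.15, 0.50, 0.85, 0.95, 0.98])
noisy = np.array([0.10, 0.18, 0.30, 0.50, 0.70, 0.82, 0.90])  # shallower

(x0_c, k_c), _ = curve_fit(logistic, steps, clear, p0=[0.5, 10.0])
(x0_n, k_n), _ = curve_fit(logistic, steps, noisy, p0=[0.5, 10.0])
print(f"slope k: clear {k_c:.1f} vs noise {k_n:.1f} "
      f"(lower = weaker categorization)")
```

A lower fitted slope in noise captures the "shallower identification functions" the abstract describes, while the counteracting effect of visemes would show up as the slope recovering in the AV+noise condition.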
Collapse
Affiliation(s)
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA
| | - Lauren Sigley
- School of Communication Sciences & Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA
| | - Gwyneth A Lewis
- School of Communication Sciences & Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA
| |
Collapse
|
50
|
Imafuku M, Kanakogi Y, Butler D, Myowa M. Demystifying infant vocal imitation: The roles of mouth looking and speaker's gaze. Dev Sci 2019; 22:e12825. [PMID: 30980494 DOI: 10.1111/desc.12825] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2017] [Revised: 01/08/2019] [Accepted: 03/01/2019] [Indexed: 12/20/2022]
Abstract
Vocal imitation plays a fundamental role in human language acquisition from infancy. Little is known, however, about how infants imitate others' sounds. We focused on three factors: (a) whether infants receive information from upright faces, (b) the infant's observation of the speaker's mouth and (c) the speaker directing their gaze towards the infant. We recorded the eye movements of 6-month-olds who participated in experiments watching videos of a speaker producing vowel sounds. We found that an infant's tendency to vocally imitate such videos increased as a function of (a) seeing upright rather than inverted faces, (b) their increased looking towards the speaker's mouth and (c) whether the speaker directed their gaze towards, rather than away from, the infant. These latter findings are consistent with theories of motor resonance and natural pedagogy respectively. New light has been shed on the cues and underlying mechanisms linking infant speech perception and production.
Collapse
Affiliation(s)
- Masahiro Imafuku
- Graduate School of Education, Kyoto University, Kyoto, Japan
- Faculty of Education, Musashino University, Tokyo, Japan
| | | | - David Butler
- Graduate School of Education, Kyoto University, Kyoto, Japan
- The Institute for Social Neuroscience Psychology, Heidelberg, Victoria, Australia
| | - Masako Myowa
- Graduate School of Education, Kyoto University, Kyoto, Japan
| |
Collapse
|