1
Wojciechowski J, Beck J, Cygan H, Pankowska A, Wolak T. Neural mechanisms of lipreading in the Polish-speaking population: effects of linguistic complexity and sex differences. Sci Rep 2025; 15:13253. PMID: 40247080; PMCID: PMC12006354; DOI: 10.1038/s41598-025-98026-8.
Abstract
Lipreading, the ability to understand speech by observing lips and facial movements, is a vital communication skill that enhances speech comprehension in diverse contexts, such as noisy environments. This study examines the neural mechanisms underlying lipreading in the Polish-speaking population, focusing on the complexity of linguistic material and potential sex differences in lipreading ability. A cohort of 51 participants (26 females) underwent a behavioral lipreading test and an fMRI-based speech comprehension task that used visual-only and audiovisual stimuli and manipulated the lexicality and grammar of the linguistic materials. Results indicated that males and females did not differ significantly in objective lipreading skills, though females rated their subjective abilities higher. Neuroimaging revealed increased activation in regions associated with speech processing, such as the superior temporal cortex, when participants engaged in visual-only lipreading compared to the audiovisual condition. The lexicality of the visual-only material engaged distinct neural pathways, highlighting the role of motor areas in visual speech comprehension. These findings contribute to understanding the neurocognitive processes in lipreading, suggesting that visual speech perception is a multimodal process involving extensive brain regions typically associated with auditory processing. The study underscores the potential of lipreading training in rehabilitating individuals with hearing loss and informs the development of assistive technologies.
Affiliation(s)
- Jakub Wojciechowski
- Bioimaging Research Center, Institute of Physiology and Pathology of Hearing, 10 Mochnackiego St, Warsaw, 02-042, Poland
- Nencki Institute of Experimental Biology, Polish Academy of Sciences, 3 Pasteur St, Warsaw, 02-093, Poland
- Joanna Beck
- Bioimaging Research Center, Institute of Physiology and Pathology of Hearing, 10 Mochnackiego St, Warsaw, 02-042, Poland.
- Medical Faculty, Lazarski University, Warsaw, 02-662, Poland.
- Hanna Cygan
- Bioimaging Research Center, Institute of Physiology and Pathology of Hearing, 10 Mochnackiego St, Warsaw, 02-042, Poland
- Agnieszka Pankowska
- Rehabilitation Clinic, Institute of Physiology and Pathology of Hearing, 10 Mochnackiego St, Warsaw, 02-042, Poland
- Tomasz Wolak
- Bioimaging Research Center, Institute of Physiology and Pathology of Hearing, 10 Mochnackiego St, Warsaw, 02-042, Poland
2
Bedford O, Noly-Gandon A, Ara A, Wiesman AI, Albouy P, Baillet S, Penhune V, Zatorre RJ. Human Auditory-Motor Networks Show Frequency-Specific Phase-Based Coupling in Resting-State MEG. Hum Brain Mapp 2025; 46:e70045. PMID: 39757971; DOI: 10.1002/hbm.70045.
Abstract
Perception and production of music and speech rely on auditory-motor coupling, a mechanism which has been linked to temporally precise oscillatory coupling between auditory and motor regions of the human brain, particularly in the beta frequency band. Recently, brain imaging studies using magnetoencephalography (MEG) have also shown that accurate auditory temporal predictions specifically depend on phase coherence between auditory and motor cortical regions. However, it is not yet clear whether this tight oscillatory phase coupling is an intrinsic feature of the auditory-motor loop, or whether it is only elicited by task demands. Further, we do not know if phase synchrony is uniquely enhanced in the auditory-motor system compared to other sensorimotor modalities, or to which degree it is amplified by musical training. In order to resolve these questions, we measured the degree of phase locking between motor regions and auditory or visual areas in musicians and non-musicians using resting-state MEG. We derived phase locking values (PLVs) and phase transfer entropy (PTE) values from 90 healthy young participants. We observed significantly higher PLVs across all auditory-motor pairings compared to all visuomotor pairings in all frequency bands. The pairing with the highest degree of phase synchrony was right primary auditory cortex with right ventral premotor cortex, a connection which has been highlighted in previous literature on auditory-motor coupling. Additionally, we observed that auditory-motor and visuomotor PLVs were significantly higher across all structures in the right hemisphere, and we found the highest differences between auditory and visual PLVs in the theta, alpha, and beta frequency bands. Last, we found that the theta and beta bands exhibited a preference for a motor-to-auditory PTE direction and that the alpha and gamma bands exhibited the opposite preference for an auditory-to-motor PTE direction. Taken together, these findings confirm our hypotheses that motor phase synchrony is significantly enhanced in auditory compared to visual cortical regions at rest, that these differences are highest across the theta-beta spectrum of frequencies, and that there exist alternating information flow loops across auditory-motor structures as a function of frequency. In our view, this supports the existence of an intrinsic, time-based coupling for low-latency integration of sounds and movements which involves synchronized phasic activity between primary auditory cortex and motor and premotor cortical areas.
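As a concrete illustration of the phase locking value (PLV) metric used above, the sketch below computes PLV between two band-limited signals from their instantaneous phase difference. It is a minimal numpy/scipy example with synthetic signals and arbitrary parameters, not the authors' MEG pipeline (which additionally involves source reconstruction and phase transfer entropy).

```python
import numpy as np
from scipy.signal import hilbert

def phase_locking_value(x, y):
    """PLV between two band-limited, equal-length 1-D signals."""
    phase_x = np.angle(hilbert(x))   # instantaneous phase via the analytic signal
    phase_y = np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * (phase_x - phase_y))))

# Toy example: two noisy signals sharing a 20 Hz (beta-band) component with a fixed lag
fs = 1000.0
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 20 * t) + 0.5 * rng.normal(size=t.size)
y = np.sin(2 * np.pi * 20 * t + 0.4) + 0.5 * rng.normal(size=t.size)
print(phase_locking_value(x, y))     # high when the phase difference is consistent over time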
Affiliation(s)
- Oscar Bedford
- Montreal Neurological Institute, McGill University, Montréal, Quebec, Canada
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montréal, Quebec, Canada
- Centre for Research on Brain, Language and Music (CRBLM), McGill University, Montréal, Quebec, Canada
- Alix Noly-Gandon
- Montreal Neurological Institute, McGill University, Montréal, Quebec, Canada
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montréal, Quebec, Canada
- Centre for Research on Brain, Language and Music (CRBLM), McGill University, Montréal, Quebec, Canada
- Alberto Ara
- Montreal Neurological Institute, McGill University, Montréal, Quebec, Canada
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montréal, Quebec, Canada
- Centre for Research on Brain, Language and Music (CRBLM), McGill University, Montréal, Quebec, Canada
- Alex I Wiesman
- Montreal Neurological Institute, McGill University, Montréal, Quebec, Canada
- Philippe Albouy
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montréal, Quebec, Canada
- Centre for Research on Brain, Language and Music (CRBLM), McGill University, Montréal, Quebec, Canada
- CERVO Brain Research Centre, School of Psychology, Université Laval, Québec City, Quebec, Canada
- Sylvain Baillet
- Montreal Neurological Institute, McGill University, Montréal, Quebec, Canada
- Virginia Penhune
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montréal, Quebec, Canada
- Centre for Research on Brain, Language and Music (CRBLM), McGill University, Montréal, Quebec, Canada
- Department of Psychology, Concordia University, Montréal, Quebec, Canada
- Robert J Zatorre
- Montreal Neurological Institute, McGill University, Montréal, Quebec, Canada
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montréal, Quebec, Canada
- Centre for Research on Brain, Language and Music (CRBLM), McGill University, Montréal, Quebec, Canada
3
Arya R, Ervin B, Greiner HM, Buroker J, Byars AW, Tenney JR, Arthur TM, Fong SL, Lin N, Frink C, Rozhkov L, Scholle C, Skoch J, Leach JL, Mangano FT, Glauser TA, Hickok G, Holland KD. Emotional facial expression and perioral motor functions of the human auditory cortex. Clin Neurophysiol 2024; 163:102-111. PMID: 38729074; PMCID: PMC11176009; DOI: 10.1016/j.clinph.2024.04.017.
Abstract
OBJECTIVE We investigated the role of the transverse temporal gyrus and adjacent cortex (TTG+) in facial expressions and perioral movements. METHODS In 31 patients undergoing stereo-electroencephalography monitoring, we describe behavioral responses elicited by electrical stimulation within the TTG+. Task-induced high-gamma modulation (HGM), auditory evoked responses, and resting-state connectivity were used to investigate the cortical sites showing different types of responses to electrical stimulation. RESULTS Changes in facial expressions and perioral movements were elicited by electrical stimulation within the TTG+ in 9 (29%) and 10 (32%) patients, respectively, in addition to the more common language responses (naming interruptions, auditory hallucinations, paraphasic errors). All functional sites showed auditory task-induced HGM and evoked responses, validating their location within the auditory cortex; however, motor sites showed lower peak amplitudes and longer peak latencies compared to language sites. Significant first-degree connections for motor sites included the precentral, anterior cingulate, parahippocampal, and anterior insular gyri, whereas those for language sites included the posterior superior temporal, posterior middle temporal, inferior frontal, supramarginal, and angular gyri. CONCLUSIONS Multimodal data suggest that TTG+ may participate in auditory-motor integration. SIGNIFICANCE TTG+ likely participates in facial expressions in response to emotional cues during an auditory discourse.
Affiliation(s)
- Ravindra Arya
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA; Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, OH, USA.
- Brian Ervin
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, OH, USA
- Hansel M Greiner
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Jason Buroker
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Anna W Byars
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Jeffrey R Tenney
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Todd M Arthur
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Susan L Fong
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Nan Lin
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Clayton Frink
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Leonid Rozhkov
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Craig Scholle
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Jesse Skoch
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA; Division of Pediatric Neurosurgery, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- James L Leach
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA; Division of Pediatric Neuro-radiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Francesco T Mangano
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA; Division of Pediatric Neurosurgery, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Tracy A Glauser
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Gregory Hickok
- Department of Cognitive Sciences, Department of Language Science, University of California, Irvine, CA, USA
- Katherine D Holland
- Comprehensive Epilepsy Center, Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
4
Jeschke L, Mathias B, von Kriegstein K. Inhibitory TMS over Visual Area V5/MT Disrupts Visual Speech Recognition. J Neurosci 2023; 43:7690-7699. PMID: 37848284; PMCID: PMC10634547; DOI: 10.1523/jneurosci.0975-23.2023.
Abstract
During face-to-face communication, the perception and recognition of facial movements can facilitate individuals' understanding of what is said. Facial movements are a form of complex biological motion. Separate neural pathways are thought to process (1) simple, nonbiological motion, with an obligatory waypoint in the motion-sensitive visual middle temporal area (V5/MT); and (2) complex biological motion. Here, we present findings that challenge this dichotomy. Neuronavigated offline transcranial magnetic stimulation (TMS) over V5/MT in 24 participants (17 females and 7 males) led to increased response times in the recognition of simple, nonbiological motion as well as in visual speech recognition compared with TMS over the vertex, an active control region. TMS of area V5/MT also reduced the practice effects on response times that are typically observed in both visual speech and motion recognition tasks over time. Our findings provide the first indication that area V5/MT causally influences the recognition of visual speech. SIGNIFICANCE STATEMENT In everyday face-to-face communication, speech comprehension is often facilitated by viewing a speaker's facial movements. Several brain areas contribute to the recognition of visual speech. One area of interest is the motion-sensitive visual middle temporal area (V5/MT), which has been associated with the perception of simple, nonbiological motion such as moving dots, as well as more complex, biological motion such as visual speech. Here, we demonstrate using noninvasive brain stimulation that area V5/MT is causally relevant in recognizing visual speech. This finding provides new insights into the neural mechanisms that support the perception of human communication signals, which will help guide future research in typically developed individuals and populations with communication difficulties.
Affiliation(s)
- Lisa Jeschke
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, 01069 Dresden, Germany
- Brian Mathias
- School of Psychology, University of Aberdeen, Aberdeen AB243FX, United Kingdom
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, 01069 Dresden, Germany
5
Dopierała AAW, López Pérez D, Mercure E, Pluta A, Malinowska-Korczak A, Evans S, Wolak T, Tomalski P. Watching talking faces: The development of cortical representation of visual syllables in infancy. Brain Lang 2023; 244:105304. PMID: 37481794; DOI: 10.1016/j.bandl.2023.105304.
Abstract
From birth, we perceive speech by hearing and seeing people talk. In adults cortical representations of visual speech are processed in the putative temporal visual speech area (TVSA), but it remains unknown how these representations develop. We measured infants' cortical responses to silent visual syllables and non-communicative mouth movements using functional Near-Infrared Spectroscopy. Our results indicate that cortical specialisation for visual speech may emerge during infancy. The putative TVSA was active to both visual syllables and gurning around 5 months of age, and more active to gurning than to visual syllables around 10 months of age. Multivariate pattern analysis classification of distinct cortical responses to visual speech and gurning was successful at 10, but not at 5 months of age. These findings imply that cortical representations of visual speech change between 5 and 10 months of age, showing that the putative TVSA is initially broadly tuned and becomes selective with age.
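For readers unfamiliar with multivariate pattern analysis (MVPA) classification of the kind mentioned above, the sketch below shows the general idea: train a linear classifier to distinguish two stimulus classes from trial-wise response patterns and evaluate it with cross-validation. The data shape, labels, and classifier choice are illustrative assumptions, not the infant fNIRS pipeline used in the study.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import LinearSVC

# Hypothetical trial-by-channel response matrix: 40 trials x 20 fNIRS channels,
# labels 0 = visual syllables, 1 = gurning (non-speech mouth movements).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 20))
y = np.repeat([0, 1], 20)

clf = LinearSVC(max_iter=10_000)
acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {acc:.2f}")   # ~0.5 for unstructured random data
```

Above-chance accuracy on real data would indicate that the two conditions evoke distinct, decodable cortical response patterns.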
Affiliation(s)
- Aleksandra A W Dopierała
- Faculty of Psychology, University of Warsaw, Warsaw, Poland; Department of Psychology, University of British Columbia, Vancouver, Canada.
- David López Pérez
- Institute of Psychology, Polish Academy of Sciences, Warsaw, Poland.
- Agnieszka Pluta
- Faculty of Psychology, University of Warsaw, Warsaw, Poland; Institute of Physiology and Pathology of Hearing, Bioimaging Research Center, World Hearing Centre, Warsaw, Poland.
- Samuel Evans
- University of Westminster, London, UK; King's College London, London, UK.
- Tomasz Wolak
- Institute of Physiology and Pathology of Hearing, Bioimaging Research Center, World Hearing Centre, Warsaw, Poland.
- Przemysław Tomalski
- Faculty of Psychology, University of Warsaw, Warsaw, Poland; Institute of Psychology, Polish Academy of Sciences, Warsaw, Poland.
6
Saalasti S, Alho J, Lahnakoski JM, Bacha-Trams M, Glerean E, Jääskeläinen IP, Hasson U, Sams M. Lipreading a naturalistic narrative in a female population: Neural characteristics shared with listening and reading. Brain Behav 2023; 13:e2869. PMID: 36579557; PMCID: PMC9927859; DOI: 10.1002/brb3.2869.
Abstract
INTRODUCTION Few of us are skilled lipreaders, while most struggle with the task. Neural substrates that enable comprehension of connected natural speech via lipreading are not yet well understood. METHODS We used a data-driven approach to identify brain areas underlying the lipreading of an 8-min narrative with participants whose lipreading skills varied extensively (range 6-100%, mean = 50.7%). The participants also listened to and read the same narrative. The similarity between individual participants' brain activity during the whole narrative, within and between conditions, was estimated by a voxel-wise comparison of the Blood Oxygenation Level Dependent (BOLD) signal time courses. RESULTS Inter-subject correlation (ISC) of the time courses revealed that lipreading, listening to, and reading the narrative were largely supported by the same brain areas in the temporal, parietal and frontal cortices, precuneus, and cerebellum. Additionally, listening to and reading connected naturalistic speech particularly activated higher-level linguistic processing in the parietal and frontal cortices more consistently than lipreading, probably paralleling the limited understanding obtained via lipreading. Importantly, a higher lipreading test score and subjective estimate of comprehension of the lipread narrative were associated with activity in the superior and middle temporal cortex. CONCLUSIONS Our new data illustrate that findings from prior studies using well-controlled repetitive speech stimuli and stimulus-driven data analyses are also valid for naturalistic connected speech. Our results might suggest an efficient use of brain areas dealing with phonological processing in skilled lipreaders.
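A common way to quantify inter-subject correlation (ISC) is the leave-one-out variant sketched below: each participant's time course is correlated with the average time course of all other participants. This is a minimal numpy illustration on synthetic data for a single voxel; the study's voxel-wise, within- and between-condition comparisons are more involved.

```python
import numpy as np

def leave_one_out_isc(bold):
    """bold: (n_subjects, n_timepoints) array for one voxel or region.
    Returns each subject's correlation with the mean of all other subjects."""
    n = bold.shape[0]
    out = np.empty(n)
    for s in range(n):
        others = np.delete(bold, s, axis=0).mean(axis=0)
        out[s] = np.corrcoef(bold[s], others)[0, 1]
    return out

# Synthetic example: a shared stimulus-driven signal plus subject-specific noise
rng = np.random.default_rng(0)
shared = rng.normal(size=300)                   # one "narrative-driven" time course
data = shared + rng.normal(size=(20, 300))      # 20 subjects, 300 time points
print(leave_one_out_isc(data).mean())           # clearly above zero (~0.7 here)
```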
Affiliation(s)
- Satu Saalasti
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Advanced Magnetic Imaging (AMI) Centre, Aalto NeuroImaging, School of Science, Aalto University, Espoo, Finland
- Jussi Alho
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Juha M Lahnakoski
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Independent Max Planck Research Group for Social Neuroscience, Max Planck Institute of Psychiatry, Munich, Germany; Institute of Neuroscience and Medicine, Brain & Behaviour (INM-7), Research Center Jülich, Jülich, Germany; Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Mareike Bacha-Trams
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Enrico Glerean
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Department of Psychology and the Neuroscience Institute, Princeton University, Princeton, USA
- Iiro P Jääskeläinen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Uri Hasson
- Department of Psychology and the Neuroscience Institute, Princeton University, Princeton, USA
- Mikko Sams
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto Studios - MAGICS, Aalto University, Espoo, Finland
7
Alp N, Ozkan H. Neural correlates of integration processes during dynamic face perception. Sci Rep 2022; 12:118. PMID: 34996892; PMCID: PMC8742062; DOI: 10.1038/s41598-021-02808-9.
Abstract
Integrating the spatiotemporal information acquired from the highly dynamic world around us is essential to navigate, reason, and decide properly. Although this is particularly important in a face-to-face conversation, very little research to date has specifically examined the neural correlates of temporal integration in dynamic face perception. Here we present statistically robust observations regarding the brain activations measured via electroencephalography (EEG) that are specific to the temporal integration. To that end, we generate videos of neutral faces of individuals and non-face objects, modulate the contrast of the even and odd frames at two specific frequencies ($f_1$ and $f_2$) in an interlaced manner, and measure the steady-state visual evoked potential as participants view the videos. Then, we analyze the intermodulation components (IMs: $nf_1 \pm mf_2$, a linear combination of the fundamentals with integer multipliers) that consequently reflect the nonlinear processing and indicate temporal integration by design. We show that electrodes around the medial temporal, inferior, and medial frontal areas respond strongly and selectively when viewing dynamic faces, which manifests the essential processes underlying our ability to perceive and understand our social world. The generation of IMs is only possible if even and odd frames are processed in succession and integrated temporally; therefore, the strong IMs in our frequency spectrum analysis show that the time between frames (1/60 s) is sufficient for temporal integration.
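The intermodulation frequencies are simply integer combinations of the two tagging frequencies. The snippet below enumerates $nf_1 \pm mf_2$ up to a chosen order and cutoff; the example values of $f_1$ and $f_2$ are arbitrary placeholders, not the frequencies used in the experiment.

```python
# Intermodulation (IM) frequencies are integer combinations n*f1 ± m*f2 of the two
# tagging frequencies. f1, f2 and the limits below are arbitrary illustrative values.
f1, f2, max_order, cutoff = 6.0, 7.5, 3, 30.0

ims = sorted({
    round(abs(n * f1 + sign * m * f2), 3)
    for n in range(1, max_order + 1)
    for m in range(1, max_order + 1)
    for sign in (1, -1)
    if 0.0 < abs(n * f1 + sign * m * f2) <= cutoff
})
print(ims)   # includes f2 - f1 = 1.5 Hz and f1 + f2 = 13.5 Hz, among others
```

Energy at these combination frequencies can only arise from a nonlinear interaction of the two tagged inputs, which is why IMs serve as a marker of integration.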
Affiliation(s)
- Nihan Alp
- Psychology, Sabanci University, Istanbul, Turkey.
- Huseyin Ozkan
- Electronics Engineering, Sabanci University, Istanbul, Turkey
8
Yang T, Formuli A, Paolini M, Zeki S. The neural determinants of beauty. Eur J Neurosci 2021; 55:91-106. PMID: 34837282; DOI: 10.1111/ejn.15543.
Abstract
The perception of faces correlates with activity in a number of brain areas, but only when a face is perceived as beautiful is the medial orbitofrontal cortex (mOFC) also engaged. Here, we enquire whether it is the emergence of a particular pattern of neural activity in face perceptive areas during the experience of a face as beautiful that determines whether there is, as a correlate, activity in mOFC. Seventeen subjects of both genders viewed and rated facial stimuli according to how beautiful they perceived them to be while the activity in their brains was imaged with functional magnetic resonance imaging. A univariate analysis revealed parametrically scaled activity within several areas, including the occipital face area (OFA), fusiform face area (FFA) and the cuneus; the strength of activity in these areas correlated with the declared intensity of the aesthetic experience of faces; multivariate analyses showed strong patterns of activation in the FFA and the cuneus and weaker patterns in the OFA and the posterior superior temporal sulcus (pSTS). The mOFC was only engaged when specific patterns of activity emerged in these areas. A psychophysiological interaction analysis with mOFC as the seed area revealed the involvement of the right FFA and the right OFA. We conjecture that it is the collective specific pattern-based activity in these face perceptive areas, with activity in the mOFC as a correlate, that constitutes the neural basis for the experience of facial beauty, bringing us a step closer to understanding the neural determinants of aesthetic experience.
Affiliation(s)
- Taoxi Yang
- Laboratory of Neurobiology, Division of Cell & Developmental Biology, University College London, London, UK
- Arusu Formuli
- Institute of Medical Psychology, Ludwig-Maximilians-Universität, Munich, Germany
- Marco Paolini
- Department of Radiology, University Hospital, Ludwig-Maximilians-Universität, Munich, Germany
- Semir Zeki
- Laboratory of Neurobiology, Division of Cell & Developmental Biology, University College London, London, UK
9
Ito T, Ohashi H, Gracco VL. Somatosensory contribution to audio-visual speech processing. Cortex 2021; 143:195-204. PMID: 34450567; DOI: 10.1016/j.cortex.2021.07.013.
Abstract
Recent studies have demonstrated that the auditory speech perception of a listener can be modulated by somatosensory input applied to the facial skin, suggesting that perception is an embodied process. However, speech perception is a multisensory process involving both the auditory and visual modalities. It is unknown whether and to what extent somatosensory stimulation to the facial skin modulates audio-visual speech perception. If speech perception is an embodied process, then somatosensory stimulation applied to the perceiver should influence audio-visual speech processing. Using the McGurk effect (the perceptual illusion that occurs when a sound is paired with the visual representation of a different sound, resulting in the perception of a third sound), we tested this prediction using a simple behavioral paradigm and at the neural level using event-related potentials (ERPs) and their cortical sources. We recorded ERPs from 64 scalp sites in response to congruent and incongruent audio-visual speech randomly presented with and without somatosensory stimulation associated with facial skin deformation. Subjects judged whether the production was /ba/ or not under all stimulus conditions. In the congruent audio-visual condition, subjects identified the sound as /ba/, but not in the incongruent condition, consistent with the McGurk effect. Concurrent somatosensory stimulation improved participants' ability to correctly identify the production as /ba/ relative to the non-somatosensory condition in both congruent and incongruent conditions. ERPs in response to the somatosensory stimulation in the incongruent condition reliably diverged 220 msec after stimulation onset. Cortical sources were estimated around the left anterior temporal gyrus, the right middle temporal gyrus, the right posterior superior temporal lobe and the right occipital region. The results demonstrate a clear multisensory convergence of somatosensory and audio-visual processing in both behavioral and neural processing, consistent with the perspective that speech perception is a self-referenced, sensorimotor process.
Affiliation(s)
- Takayuki Ito
- University Grenoble-Alpes, CNRS, Grenoble-INP, GIPSA-Lab, Saint Martin D'heres Cedex, France; Haskins Laboratories, New Haven, CT, USA.
- Vincent L Gracco
- Haskins Laboratories, New Haven, CT, USA; McGill University, Montréal, QC, Canada
10
Liu M, Liu CH, Zheng S, Zhao K, Fu X. Reexamining the neural network involved in perception of facial expression: A meta-analysis. Neurosci Biobehav Rev 2021; 131:179-191. PMID: 34536463; DOI: 10.1016/j.neubiorev.2021.09.024.
Abstract
Perception of facial expression is essential for social interactions. Although a few competing models have enjoyed some success in mapping brain regions, they also face difficult challenges. The current study used an updated activation likelihood estimation (ALE) method of meta-analysis to explore the involvement of brain regions in facial expression processing. The sample contained 96 functional magnetic resonance imaging (fMRI) and positron emission tomography (PET) studies of healthy adults with the results of whole-brain analyses. The key findings revealed that the ventral pathway, especially the left fusiform face area (FFA) region, was more responsive to facial expression. The left posterior FFA showed strong involvement when participants passively viewed emotional faces without being asked to judge the type of expression or other attributes of the stimuli. Through meta-analytic connectivity modeling (MACM) of the main brain regions in the ventral pathway, we constructed a co-activating neural network as a revised model of facial expression processing that assigns prominent roles to the amygdala, FFA, the occipital gyrus, and the inferior frontal gyrus.
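The core of an ALE analysis can be sketched as follows: each experiment's reported foci are blurred into a modeled activation (MA) map, and the maps are combined across experiments into a voxel-wise activation likelihood. This is a toy numpy version with a fixed kernel width and made-up foci; the published algorithm additionally uses sample-size-dependent kernels, handles within-experiment focus overlap differently, and applies permutation-based thresholding.

```python
import numpy as np

def gaussian_kernel_map(shape, focus, sigma):
    """Probability map for one reported focus: isotropic 3-D Gaussian at `focus` (voxel coords)."""
    zz, yy, xx = np.indices(shape)
    d2 = (zz - focus[0])**2 + (yy - focus[1])**2 + (xx - focus[2])**2
    g = np.exp(-d2 / (2 * sigma**2))
    return g / g.max()

shape, sigma = (20, 20, 20), 2.0           # toy grid and smoothing width (voxels)
experiments = [[(5, 5, 5), (10, 10, 10)],  # hypothetical foci, one list per experiment
               [(5, 6, 5)],
               [(11, 10, 9), (15, 4, 12)]]

# Modeled activation (MA) map per experiment: probabilistic union of its focus kernels
ma_maps = []
for foci in experiments:
    kernels = np.stack([gaussian_kernel_map(shape, f, sigma) for f in foci])
    ma_maps.append(1 - np.prod(1 - kernels, axis=0))

# ALE map: probability that at least one experiment activates each voxel
ale = 1 - np.prod(1 - np.stack(ma_maps), axis=0)
print(ale.max(), np.unravel_index(ale.argmax(), shape))
```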
Affiliation(s)
- Mingtong Liu
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Chang Hong Liu
- Department of Psychology, Bournemouth University, Dorset, United Kingdom
- Shuang Zheng
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Ke Zhao
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, 100049, China.
- Xiaolan Fu
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, 100049, China.
11
Maffei V, Indovina I, Mazzarella E, Giusti MA, Macaluso E, Lacquaniti F, Viviani P. Sensitivity of occipito-temporal cortex, premotor and Broca's areas to visible speech gestures in a familiar language. PLoS One 2020; 15:e0234695. PMID: 32559213; PMCID: PMC7304574; DOI: 10.1371/journal.pone.0234695.
Abstract
When looking at a speaking person, the analysis of facial kinematics contributes to language discrimination and to the decoding of the time flow of visual speech. To disentangle these two factors, we investigated behavioural and fMRI responses to familiar and unfamiliar languages when observing speech gestures with natural or reversed kinematics. Twenty Italian volunteers viewed silent video-clips of speech shown as recorded (Forward, biological motion) or reversed in time (Backward, non-biological motion), in Italian (familiar language) or Arabic (non-familiar language). fMRI revealed that language (Italian/Arabic) and time-rendering (Forward/Backward) modulated distinct areas in the ventral occipito-temporal cortex, suggesting that visual speech analysis begins in this region, earlier than previously thought. Left premotor ventral (superior subdivision) and dorsal areas were preferentially activated by the familiar language independently of time-rendering, challenging the view that the role of these regions in speech processing is purely articulatory. The left premotor ventral region in the frontal operculum, thought to include part of Broca's area, responded to the natural familiar language, consistent with the hypothesis of motor simulation of speech gestures.
Affiliation(s)
- Vincenzo Maffei
- Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy
- Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
- Data Lake & BI, DOT - Technology, Poste Italiane, Rome, Italy
- Iole Indovina
- Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy
- Departmental Faculty of Medicine and Surgery, Saint Camillus International University of Health and Medical Sciences, Rome, Italy
- Maria Assunta Giusti
- Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
- Emiliano Macaluso
- ImpAct Team, Lyon Neuroscience Research Center, Lyon, France
- Laboratory of Neuroimaging, IRCCS Santa Lucia Foundation, Rome, Italy
- Francesco Lacquaniti
- Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy
- Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
- Paolo Viviani
- Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy
- Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
12
Borowiak K, Maguinness C, von Kriegstein K. Dorsal-movement and ventral-form regions are functionally connected during visual-speech recognition. Hum Brain Mapp 2020; 41:952-972. PMID: 31749219; PMCID: PMC7267922; DOI: 10.1002/hbm.24852.
Abstract
Faces convey social information such as emotion and speech. Facial emotion processing is supported via interactions between dorsal-movement and ventral-form visual cortex regions. Here, we explored, for the first time, whether similar dorsal-ventral interactions (assessed via functional connectivity) might also exist for visual-speech processing. We then examined whether altered dorsal-ventral connectivity is observed in adults with high-functioning autism spectrum disorder (ASD), a disorder associated with impaired visual-speech recognition. We acquired functional magnetic resonance imaging (fMRI) data with concurrent eye tracking in pairwise matched control and ASD participants. In both groups, dorsal-movement regions in the visual motion area 5 (V5/MT) and the temporal visual speech area (TVSA) were functionally connected to ventral-form regions (i.e., the occipital face area [OFA] and the fusiform face area [FFA]) during the recognition of visual speech, in contrast to the recognition of face identity. Notably, parts of this functional connectivity were decreased in the ASD group compared to the controls (i.e., right V5/MT-right OFA, left TVSA-left FFA). The results confirmed our hypothesis that functional connectivity between dorsal-movement and ventral-form regions exists during visual-speech processing. Its partial dysfunction in ASD might contribute to difficulties in the recognition of dynamic face information relevant for successful face-to-face communication.
Affiliation(s)
- Kamila Borowiak
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Berlin School of Mind and Brain, Humboldt University of Berlin, Berlin, Germany
- Corrina Maguinness
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
13
Integrating faces and bodies: Psychological and neural perspectives on whole person perception. Neurosci Biobehav Rev 2020; 112:472-486. PMID: 32088346; DOI: 10.1016/j.neubiorev.2020.02.021.
Abstract
The human "person" is a common percept we encounter. Research on person perception has focused either on face or body perception, with less attention paid to whole person perception. We review psychological and neuroscience studies aimed at understanding how face and body processing operate in concert to support intact person perception. We address this question considering: (a) the task to be accomplished (identification, emotion processing, detection), (b) the neural stage of processing (early/late visual mechanisms), and (c) the relevant brain regions for face/body/person processing. From the psychological perspective, we conclude that the integration of faces and bodies is mediated by the goal of the processing (e.g., emotion analysis, identification, etc.). From the neural perspective, we propose a hierarchical functional neural architecture of face-body integration that retains a degree of separation between the dorsal and ventral visual streams. We argue for two centers of integration: a ventral semantic integration hub that is the result of progressive, posterior-to-anterior, face-body integration; and a social agent integration hub in the dorsal-stream STS.
14
Lip-Reading Enables the Brain to Synthesize Auditory Features of Unknown Silent Speech. J Neurosci 2019; 40:1053-1065. PMID: 31889007; DOI: 10.1523/jneurosci.1101-19.2019.
Abstract
Lip-reading is crucial for understanding speech in challenging conditions. But how the brain extracts meaning from silent, visual speech is still under debate. Lip-reading in silence activates the auditory cortices, but it is not known whether such activation reflects immediate synthesis of the corresponding auditory stimulus or imagery of unrelated sounds. To disentangle these possibilities, we used magnetoencephalography to evaluate how cortical activity in 28 healthy adult humans (17 females) entrained to the auditory speech envelope and lip movements (mouth opening) when listening to a spoken story without visual input (audio-only), and when seeing a silent video of a speaker articulating another story (video-only). In video-only, auditory cortical activity entrained to the absent auditory signal at frequencies <1 Hz more than to the seen lip movements. This entrainment process was characterized by an auditory-speech-to-brain delay of ∼70 ms in the left hemisphere, compared with ∼20 ms in audio-only. Entrainment to mouth opening was found in the right angular gyrus at <1 Hz, and in early visual cortices at 1-8 Hz. These findings demonstrate that the brain can use a silent lip-read signal to synthesize a coarse-grained auditory speech representation in early auditory cortices. Our data indicate the following underlying oscillatory mechanism: seeing lip movements first modulates neuronal activity in early visual cortices at frequencies that match articulatory lip movements; the right angular gyrus then extracts slower features of lip movements, mapping them onto the corresponding speech sound features; this information is fed to auditory cortices, most likely facilitating speech parsing. SIGNIFICANCE STATEMENT Lip-reading consists in decoding speech based on visual information derived from observation of a speaker's articulatory facial gestures. Lip-reading is known to improve auditory speech understanding, especially when speech is degraded. Interestingly, lip-reading in silence still activates the auditory cortices, even when participants do not know what the absent auditory signal should be. However, it was uncertain what such activation reflected. Here, using magnetoencephalographic recordings, we demonstrate that it reflects fast synthesis of the auditory stimulus rather than mental imagery of unrelated speech or non-speech sounds. Our results also shed light on the oscillatory dynamics underlying lip-reading.
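Speech-brain entrainment of the kind reported here is often quantified as coherence between the speech temporal envelope and a cortical signal at low frequencies. The sketch below is a schematic scipy example with a synthetic 0.5 Hz "envelope" and a lagged, noisy "cortical" signal; it is not the authors' MEG coherence pipeline, and all values are illustrative.

```python
import numpy as np
from scipy.signal import coherence

fs = 200                                   # Hz, illustrative sampling rate
t = np.arange(0, 120, 1 / fs)
rng = np.random.default_rng(0)

# Stand-in slow speech envelope (<1 Hz modulation) and a cortical signal that
# tracks it with a ~70 ms lag plus noise, mimicking the delay reported above.
envelope = 1 + np.sin(2 * np.pi * 0.5 * t) + 0.2 * rng.normal(size=t.size)
lag = int(0.07 * fs)
brain = np.roll(envelope, lag) + rng.normal(scale=0.5, size=t.size)

f, coh = coherence(envelope, brain, fs=fs, nperseg=20 * fs)   # 0.05 Hz resolution
print(f[np.argmax(coh)])                   # peak coherence should fall near 0.5 Hz
```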
15
Visual inputs decrease brain activity in frontal areas during silent lipreading. PLoS One 2019; 14:e0223782. PMID: 31600311; PMCID: PMC6786756; DOI: 10.1371/journal.pone.0223782.
Abstract
Aim The aim of the present work is to analyze the modulation of the brain activity within the areas involved in lipreading when an additional visual stimulus is included. Methods The experiment consisted of two fMRI runs (lipreading_only and lipreading+picture) where two conditions were considered in each one (oral speech sentences condition [OSS] and oral speech syllables condition [OSSY]). Results During lipreading-only, higher activity in the left middle temporal gyrus (MTG) was identified for OSS than OSSY; during lipreading+picture, apart from the left MTG, higher activity was also present in the supplementary motor area (SMA), the left precentral gyrus (PreCG) and the left inferior frontal gyrus (IFG). The comparison between these two runs revealed higher activity for lipreading-only in the SMA and the left IFG. Conclusion The presence of a visual reference during a lipreading task leads to a decrease in activity in frontal areas.
16
Detection and Attention for Auditory, Visual, and Audiovisual Speech in Children with Hearing Loss. Ear Hear 2019; 41:508-520. PMID: 31592903; DOI: 10.1097/aud.0000000000000798.
Abstract
OBJECTIVES Efficient multisensory speech detection is critical for children who must quickly detect/encode a rapid stream of speech to participate in conversations and have access to the audiovisual cues that underpin speech and language development, yet multisensory speech detection remains understudied in children with hearing loss (CHL). This research assessed detection, along with vigilant/goal-directed attention, for multisensory versus unisensory speech in CHL versus children with normal hearing (CNH). DESIGN Participants were 60 CHL who used hearing aids and communicated successfully aurally/orally and 60 age-matched CNH. Simple response times determined how quickly children could detect a preidentified easy-to-hear stimulus (70 dB SPL, utterance "buh" presented in auditory only [A], visual only [V], or audiovisual [AV] modes). The V mode formed two facial conditions: static versus dynamic face. Faster detection for multisensory (AV) than unisensory (A or V) input indicates multisensory facilitation. We assessed mean responses and faster versus slower responses (defined by first versus third quartiles of response-time distributions), which were respectively conceptualized as: faster responses (first quartile) reflect efficient detection with efficient vigilant/goal-directed attention and slower responses (third quartile) reflect less efficient detection associated with attentional lapses. Finally, we studied associations between these results and personal characteristics of CHL. RESULTS Unisensory A versus V modes: Both groups showed better detection and attention for A than V input. The A input more readily captured children's attention and minimized attentional lapses, which supports A-bound processing even by CHL who were processing low fidelity A input. CNH and CHL did not differ in ability to detect A input at conversational speech level. Multisensory AV versus A modes: Both groups showed better detection and attention for AV than A input. The advantage for AV input was a facial effect (both static and dynamic faces), a pattern suggesting that communication is a social interaction that is more than just words. Attention did not differ between groups; detection was faster in CHL than CNH for AV input, but not for A input. Associations between personal characteristics/degree of hearing loss of CHL and results: CHL with greatest deficits in detection of V input had poorest word recognition skills and CHL with greatest reduction of attentional lapses from AV input had poorest vocabulary skills. Both outcomes are consistent with the idea that CHL who are processing low fidelity A input depend disproportionately on V and AV input to learn to identify words and associate them with concepts. As CHL aged, attention to V input improved. Degree of hearing loss did not influence results. CONCLUSIONS Understanding speech, a daily challenge for CHL, is a complex task that demands efficient detection of and attention to AV speech cues. Our results support the clinical importance of multisensory approaches to understand and advance spoken communication by CHL.
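The quartile-based response-time analysis described above can be illustrated in a few lines of numpy: split each condition's RT distribution at the first and third quartiles and summarize the fast and slow tails separately. The RT distributions below are simulated placeholders, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical simple-detection response times (ms) for one child, per modality
rts = {
    "A":  rng.lognormal(mean=5.9, sigma=0.25, size=80),
    "V":  rng.lognormal(mean=6.1, sigma=0.25, size=80),
    "AV": rng.lognormal(mean=5.8, sigma=0.25, size=80),
}

for mode, rt in rts.items():
    q1, q3 = np.percentile(rt, [25, 75])
    fast = rt[rt <= q1].mean()   # "faster" responses: efficient detection/attention
    slow = rt[rt >= q3].mean()   # "slower" responses: attentional lapses
    print(f"{mode}: mean={rt.mean():.0f} ms, fast={fast:.0f} ms, slow={slow:.0f} ms")
```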
17
Classification of schizophrenia and normal controls using 3D convolutional neural network and outcome visualization. Schizophr Res 2019; 212:186-195. PMID: 31395487; DOI: 10.1016/j.schres.2019.07.034.
Abstract
BACKGROUND The recent deep learning-based studies on the classification of schizophrenia (SCZ) using MRI data rely on manual extraction of feature vectors, which destroys the 3D structure of the MRI data. In order to both identify SCZ and find relevant biomarkers, preserving the 3D structure in the classification pipeline is critical. OBJECTIVES The present study investigated whether the proposed 3D convolutional neural network (CNN) model produces higher accuracy compared to the support vector machine (SVM) and other 3D-CNN models in distinguishing individuals with SCZ spectrum disorders (SSDs) from healthy controls. We sought to construct a saliency map using the class saliency visualization (CSV) method. METHODS Task-based fMRI data were obtained from 103 patients with SSDs and 41 normal controls. To preserve spatial locality, we used 3D activation maps as input for the 3D convolutional autoencoder (3D-CAE)-based CNN model. Data on 62 patients with SSDs were used for unsupervised pretraining with the 3D-CAE. Data on the remaining 41 patients and 41 normal controls were processed for training and testing with the CNN. The performance of our model was analyzed and compared with SVM and other 3D-CNN models. The learned CNN model was visualized using the CSV method. RESULTS Using task-based fMRI data, our model achieved 84.15%∼84.43% classification accuracies, outperforming SVM and other 3D-CNN models. The inferior and middle temporal lobes were identified as key regions for classification. CONCLUSIONS Our findings suggest that the proposed 3D-CAE-based CNN can classify patients with SSDs and controls with higher accuracy compared to other models. Visualization of salient regions provides important clinical information.
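To make the 3D-CNN idea concrete, the sketch below defines a tiny PyTorch network that takes a whole-brain activation map as a single-channel 3D volume and outputs two class logits. The layer sizes, input resolution, and the absence of autoencoder pretraining are simplifying assumptions; this is not the authors' 3D-CAE-based architecture.

```python
import torch
import torch.nn as nn

class Small3DCNN(nn.Module):
    """Toy 3-D CNN over whole-brain activation maps (illustrative, not the published model)."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, n_classes)
        )

    def forward(self, x):                    # x: (batch, 1, D, H, W) activation volume
        return self.classifier(self.features(x))

# Illustrative forward pass on random 64 x 64 x 64 "activation maps"
model = Small3DCNN()
logits = model(torch.randn(4, 1, 64, 64, 64))
print(logits.shape)                          # torch.Size([4, 2])
```

Working directly on the 3D volume, rather than a flattened feature vector, is what preserves the spatial structure the abstract emphasizes.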
18
Sato W, Kochiyama T, Uono S, Sawada R, Kubota Y, Yoshimura S, Toichi M. Widespread and lateralized social brain activity for processing dynamic facial expressions. Hum Brain Mapp 2019; 40:3753-3768. PMID: 31090126; DOI: 10.1002/hbm.24629.
Abstract
Dynamic facial expressions of emotions constitute natural and powerful means of social communication in daily life. A number of previous neuroimaging studies have explored the neural mechanisms underlying the processing of dynamic facial expressions, and indicated the activation of certain social brain regions (e.g., the amygdala) during such tasks. However, the activated brain regions were inconsistent across studies, and their laterality was rarely evaluated. To investigate these issues, we measured brain activity using functional magnetic resonance imaging in a relatively large sample (n = 51) during the observation of dynamic facial expressions of anger and happiness and their corresponding dynamic mosaic images. The observation of dynamic facial expressions, compared with dynamic mosaics, elicited stronger activity in the bilateral posterior cortices, including the inferior occipital gyri, fusiform gyri, and superior temporal sulci. The dynamic facial expressions also activated bilateral limbic regions, including the amygdalae and ventromedial prefrontal cortices, more strongly than the mosaics. In the same manner, activation was found in the right inferior frontal gyrus (IFG) and left cerebellum. Laterality analyses comparing original and flipped images revealed right hemispheric dominance in the superior temporal sulcus and IFG and left hemispheric dominance in the cerebellum. These results indicated that the neural mechanisms underlying processing of dynamic facial expressions include widespread social brain regions associated with perceptual, emotional, and motor functions, and include a clearly lateralized (right cortical and left cerebellar) network like that involved in language processing.
Affiliation(s)
- Wataru Sato
- Kokoro Research Center, Kyoto University, Kyoto, Japan
- Shota Uono
- Department of Neurodevelopmental Psychiatry, Habilitation and Rehabilitation, Kyoto University, Kyoto, Japan
- Reiko Sawada
- Department of Neurodevelopmental Psychiatry, Habilitation and Rehabilitation, Kyoto University, Kyoto, Japan
- Yasutaka Kubota
- Health and Medical Services Center, Shiga University, Hikone, Shiga, Japan
- Sayaka Yoshimura
- Department of Neurodevelopmental Psychiatry, Habilitation and Rehabilitation, Kyoto University, Kyoto, Japan
- Motomi Toichi
- Faculty of Human Health Science, Kyoto University, Kyoto, Japan; The Organization for Promoting Neurodevelopmental Disorder Research, Kyoto, Japan
19
Abstract
Speech research during recent years has moved progressively away from its traditional focus on audition toward a more multisensory approach. In addition to audition and vision, many somatosenses including proprioception, pressure, vibration and aerotactile sensation are all highly relevant modalities for experiencing and/or conveying speech. In this article, we review both long-standing cross-modal effects stemming from decades of audiovisual speech research as well as new findings related to somatosensory effects. Cross-modal effects in speech perception to date are found to be constrained by temporal congruence and signal relevance, but appear to be unconstrained by spatial congruence. Far from taking place in a one-, two- or even three-dimensional space, the literature reveals that speech occupies a highly multidimensional sensory space. We argue that future research in cross-modal effects should expand to consider each of these modalities both separately and in combination with other modalities in speech.
Affiliation(s)
- Megan Keough
- Interdisciplinary Speech Research Lab, Department of Linguistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Donald Derrick
- New Zealand Institute of Brain and Behaviour, University of Canterbury, Christchurch 8140, New Zealand
- MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, New South Wales 2751, Australia
| | - Bryan Gick
- Interdisciplinary Speech Research Lab, Department of Linguistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Haskins Laboratories, Yale University, New Haven, CT 06511, USA
| |
Collapse
|
20
|
Jerger S, Damian MF, Karl C, Abdi H. Developmental Shifts in Detection and Attention for Auditory, Visual, and Audiovisual Speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2018; 61:3095-3112. [PMID: 30515515 PMCID: PMC6440305 DOI: 10.1044/2018_jslhr-h-17-0343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/10/2017] [Revised: 01/02/2018] [Accepted: 07/16/2018] [Indexed: 06/09/2023]
Abstract
PURPOSE Successful speech processing depends on our ability to detect and integrate multisensory cues, yet there is minimal research on multisensory speech detection and integration by children. To address this need, we studied the development of speech detection for auditory (A), visual (V), and audiovisual (AV) input. METHOD Participants were 115 typically developing children clustered into age groups between 4 and 14 years. Speech detection (quantified by response times [RTs]) was determined for 1 stimulus, /buh/, presented in A, V, and AV modes (articulating vs. static facial conditions). Performance was analyzed not only in terms of traditional mean RTs but also in terms of the faster versus slower RTs (defined by the 1st vs. 3rd quartiles of RT distributions). These time regions were conceptualized respectively as reflecting optimal detection with efficient focused attention versus less optimal detection with inefficient focused attention due to attentional lapses. RESULTS Mean RTs indicated better detection (a) of multisensory AV speech than A speech only in 4- to 5-year-olds and (b) of A and AV inputs than V input in all age groups. The faster RTs revealed that AV input did not improve detection in any group. The slower RTs indicated that (a) the processing of silent V input was significantly faster for the articulating than static face and (b) AV speech or facial input significantly minimized attentional lapses in all groups except 6- to 7-year-olds (a peaked U-shaped curve). Apparently, the AV benefit observed for mean performance in 4- to 5-year-olds arose from effects of attention. CONCLUSIONS The faster RTs indicated that AV input did not enhance detection in any group, but the slower RTs indicated that AV speech and dynamic V speech (mouthing) significantly minimized attentional lapses and thus did influence performance. Overall, A and AV inputs were detected consistently faster than V input; this result endorsed stimulus-bound auditory processing by these children.
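The 1st- vs. 3rd-quartile split of the RT distributions described above can be illustrated with a short, generic sketch (Python/NumPy; the synthetic data and variable names are illustrative assumptions, not the authors' analysis scripts): the fastest quartile approximates optimal detection with efficient attention, while the slowest quartile captures trials affected by attentional lapses.

import numpy as np

def quartile_summary(rts):
    """Mean RT in the fastest quartile, middle half, and slowest quartile of a distribution."""
    q1, q3 = np.percentile(rts, [25, 75])
    fast = rts[rts <= q1]
    middle = rts[(rts > q1) & (rts < q3)]
    slow = rts[rts >= q3]
    return fast.mean(), middle.mean(), slow.mean()

# Hypothetical use: compare audiovisual (AV) and auditory-only (A) detection RTs (in ms).
rng = np.random.default_rng(0)
rt_av = rng.lognormal(mean=6.00, sigma=0.25, size=200)
rt_a = rng.lognormal(mean=6.05, sigma=0.30, size=200)
print("AV fast/middle/slow:", quartile_summary(rt_av))
print("A  fast/middle/slow:", quartile_summary(rt_a))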
Collapse
Affiliation(s)
- Susan Jerger
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, Richardson
- Callier Center for Communication Disorders, Richardson, TX
| | - Markus F. Damian
- School of Experimental Psychology, University of Bristol, United Kingdom
| | - Cassandra Karl
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, Richardson
- Callier Center for Communication Disorders, Richardson, TX
| | - Hervé Abdi
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, Richardson
| |
Collapse
|
21
|
Altvater-Mackensen N, Grossmann T. Modality-independent recruitment of inferior frontal cortex during speech processing in human infants. Dev Cogn Neurosci 2018; 34:130-138. [PMID: 30391756 PMCID: PMC6969291 DOI: 10.1016/j.dcn.2018.10.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2017] [Revised: 08/25/2018] [Accepted: 10/25/2018] [Indexed: 11/22/2022] Open
Abstract
Despite increasing interest in the development of audiovisual speech perception in infancy, the underlying mechanisms and neural processes are still only poorly understood. In addition to regions in temporal cortex associated with speech processing and multimodal integration, such as superior temporal sulcus, left inferior frontal cortex (IFC) has been suggested to be critically involved in mapping information from different modalities during speech perception. To further illuminate the role of IFC during infant language learning and speech perception, the current study examined the processing of auditory, visual and audiovisual speech in 6-month-old infants using functional near-infrared spectroscopy (fNIRS). Our results revealed that infants recruit speech-sensitive regions in frontal cortex including IFC regardless of whether they processed unimodal or multimodal speech. We argue that IFC may play an important role in associating multimodal speech information during the early steps of language learning.
Collapse
Affiliation(s)
- Nicole Altvater-Mackensen
- Department of Psychology, Johannes-Gutenberg-University Mainz, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.
| | - Tobias Grossmann
- Department of Psychology, University of Virginia, USA; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
22
|
Borowiak K, Schelinski S, von Kriegstein K. Recognizing visual speech: Reduced responses in visual-movement regions, but not other speech regions in autism. Neuroimage Clin 2018; 20:1078-1091. [PMID: 30368195 PMCID: PMC6202694 DOI: 10.1016/j.nicl.2018.09.019] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2018] [Revised: 09/19/2018] [Accepted: 09/21/2018] [Indexed: 12/23/2022]
Abstract
Speech information inherent in face movements is important for understanding what is said in face-to-face communication. Individuals with autism spectrum disorders (ASD) have difficulties in extracting speech information from face movements, a process called visual-speech recognition. Currently, it is unknown what dysfunctional brain regions or networks underlie the visual-speech recognition deficit in ASD. We conducted a functional magnetic resonance imaging (fMRI) study with concurrent eye tracking to investigate visual-speech recognition in adults diagnosed with high-functioning autism and pairwise matched typically developed controls. Compared to the control group (n = 17), the ASD group (n = 17) showed decreased Blood Oxygenation Level Dependent (BOLD) response during visual-speech recognition in the right visual area 5 (V5/MT) and left temporal visual speech area (TVSA), brain regions implicated in visual-movement perception. The right V5/MT showed a positive correlation with visual-speech task performance in the ASD group, but not in the control group. Psychophysiological interaction analysis (PPI) revealed that functional connectivity between the left TVSA and the bilateral V5/MT and between the right V5/MT and the left inferior frontal gyrus (IFG) was lower in the ASD than in the control group. In contrast, responses in other speech-motor regions and their connectivity were at the neurotypical level. Reduced responses and network connectivity of the visual-movement regions in conjunction with intact speech-related mechanisms indicate that perceptual mechanisms might be at the core of the visual-speech recognition deficit in ASD. Communication deficits in ASD might at least partly stem from atypical sensory processing and not higher-order cognitive processing of socially relevant information.
Collapse
Affiliation(s)
- Kamila Borowiak
- Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany; Berlin School of Mind and Brain, Humboldt University of Berlin, Luisenstraße 56, 10117 Berlin, Germany; Technische Universität Dresden, Bamberger Straße 7, 01187 Dresden, Germany.
| | - Stefanie Schelinski
- Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany; Technische Universität Dresden, Bamberger Straße 7, 01187 Dresden, Germany
| | - Katharina von Kriegstein
- Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany; Technische Universität Dresden, Bamberger Straße 7, 01187 Dresden, Germany
| |
Collapse
|
23
|
Park H, Ince RAA, Schyns PG, Thut G, Gross J. Representational interactions during audiovisual speech entrainment: Redundancy in left posterior superior temporal gyrus and synergy in left motor cortex. PLoS Biol 2018; 16:e2006558. [PMID: 30080855 PMCID: PMC6095613 DOI: 10.1371/journal.pbio.2006558] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2018] [Revised: 08/16/2018] [Accepted: 07/24/2018] [Indexed: 11/24/2022] Open
Abstract
Integration of multimodal sensory information is fundamental to many aspects of human behavior, but the neural mechanisms underlying these processes remain mysterious. For example, during face-to-face communication, we know that the brain integrates dynamic auditory and visual inputs, but we do not yet understand where and how such integration mechanisms support speech comprehension. Here, we quantify representational interactions between dynamic audio and visual speech signals and show that different brain regions exhibit different types of representational interaction. With a novel information theoretic measure, we found that theta (3-7 Hz) oscillations in the posterior superior temporal gyrus/sulcus (pSTG/S) represent auditory and visual inputs redundantly (i.e., represent common features of the two), whereas the same oscillations in left motor and inferior temporal cortex represent the inputs synergistically (i.e., the instantaneous relationship between audio and visual inputs is also represented). Importantly, redundant coding in the left pSTG/S and synergistic coding in the left motor cortex predict behavior, i.e., speech comprehension performance. Our findings therefore demonstrate that processes classically described as integration can have different statistical properties and may reflect distinct mechanisms that occur in different brain regions to support audiovisual speech comprehension.
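The redundancy/synergy distinction used in this entry can be made concrete with a toy interaction-information calculation. The sketch below (plain NumPy on discretized signals; an illustration under simplifying assumptions, not the authors' Gaussian-copula estimator, with hypothetical variable names) computes I(A,V;B) - I(A;B) - I(V;B): positive values indicate net synergy, negative values net redundancy.

import numpy as np

def mutual_info(x, y):
    """Discrete mutual information (in bits) between two integer-coded signals."""
    xi = np.unique(x, return_inverse=True)[1]
    yi = np.unique(y, return_inverse=True)[1]
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)           # joint histogram
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def net_synergy(a, v, b):
    """Interaction information I(A,V;B) - I(A;B) - I(V;B)."""
    av = a * (v.max() + 1) + v               # encode the joint (A, V) symbol
    return mutual_info(av, b) - mutual_info(a, b) - mutual_info(v, b)

rng = np.random.default_rng(0)
a = rng.integers(0, 2, 5000)
v = rng.integers(0, 2, 5000)
print(net_synergy(a, v, a ^ v))              # XOR target: > 0, synergistic coding

source = rng.integers(0, 2, 5000)
noisy = lambda s: np.where(rng.random(s.size) < 0.1, 1 - s, s)
print(net_synergy(noisy(source), noisy(source), source))   # shared source: < 0, redundant coding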
Collapse
Affiliation(s)
- Hyojin Park
- School of Psychology, Centre for Human Brain Health (CHBH), University of Birmingham, Birmingham, United Kingdom
| | - Robin A. A. Ince
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
| | - Philippe G. Schyns
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
| | - Gregor Thut
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
| | - Joachim Gross
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
- Institute for Biomagnetism and Biosignalanalysis, University of Muenster, Muenster, Germany
| |
Collapse
|
24
|
Proverbio AM, Raso G, Zani A. Electrophysiological Indexes of Incongruent Audiovisual Phonemic Processing: Unraveling the McGurk Effect. Neuroscience 2018; 385:215-226. [PMID: 29932985 DOI: 10.1016/j.neuroscience.2018.06.021] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2017] [Revised: 06/11/2018] [Accepted: 06/12/2018] [Indexed: 11/15/2022]
Abstract
In this study the timing of electromagnetic signals recorded during incongruent and congruent audiovisual (AV) stimulation in 14 Italian healthy volunteers was examined. In a previous study (Proverbio et al., 2016) we investigated the McGurk effect in the Italian language and identified which visual and auditory inputs provided the most compelling illusory effects (e.g., bilabial phonemes presented acoustically and paired with non-labials, especially alveolar-nasal and velar-occlusive phonemes). In this study EEG was recorded from 128 scalp sites while participants observed a female and a male actor uttering 288 syllables (each lasting approximately 600 ms) selected on the basis of the previous investigation, and responded to rare targets (/re/, /ri/, /ro/, /ru/). In half of the cases the AV information was incongruent, except for targets that were always congruent. A pMMN (phonological Mismatch Negativity) to incongruent AV stimuli was identified 500 ms after voice onset time. This automatic response indexed the detection of an incongruity between the labial and phonetic information. SwLORETA (Low-Resolution Electromagnetic Tomography) analysis applied to the difference voltage incongruent-congruent in the same time window revealed that the strongest sources of this activity were the right superior temporal (STG) and superior frontal gyri, which supports their involvement in AV integration.
Collapse
Affiliation(s)
- Alice Mado Proverbio
- Neuro-Mi Center for Neuroscience, Dept. of Psychology, University of Milano-Bicocca, Italy.
| | - Giulia Raso
- Neuro-Mi Center for Neuroscience, Dept. of Psychology, University of Milano-Bicocca, Italy
| | | |
Collapse
|
25
|
Glanz Iljina O, Derix J, Kaur R, Schulze-Bonhage A, Auer P, Aertsen A, Ball T. Real-life speech production and perception have a shared premotor-cortical substrate. Sci Rep 2018; 8:8898. [PMID: 29891885 PMCID: PMC5995900 DOI: 10.1038/s41598-018-26801-x] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2017] [Accepted: 05/09/2018] [Indexed: 11/25/2022] Open
Abstract
Motor-cognitive accounts assume that the articulatory cortex is involved in language comprehension, but previous studies may have observed such an involvement as an artefact of experimental procedures. Here, we employed electrocorticography (ECoG) during natural, non-experimental behavior combined with electrocortical stimulation mapping to study the neural basis of real-life human verbal communication. We took advantage of ECoG's ability to capture high-gamma activity (70–350 Hz) as a spatially and temporally precise index of cortical activation during unconstrained, naturalistic speech production and perception conditions. Our findings show that an electrostimulation-defined mouth motor region located in the superior ventral premotor cortex is consistently activated during both conditions. This region became active early relative to the onset of speech production and was recruited during speech perception regardless of acoustic background noise. Our study thus pinpoints a shared ventral premotor substrate for real-life speech production and perception and characterizes its basic properties.
Collapse
Affiliation(s)
- Olga Glanz Iljina
- GRK 1624 'Frequency Effects in Language', University of Freiburg, Freiburg, Germany; Department of German Linguistics, University of Freiburg, Freiburg, Germany; Hermann Paul School of Linguistics, University of Freiburg, Freiburg, Germany; Translational Neurotechnology Lab, Department of Neurosurgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany; Neurobiology and Biophysics, Faculty of Biology, University of Freiburg, Freiburg, Germany.
| | - Johanna Derix
- Translational Neurotechnology Lab, Department of Neurosurgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany; Neurobiology and Biophysics, Faculty of Biology, University of Freiburg, Freiburg, Germany
| | - Rajbir Kaur
- Translational Neurotechnology Lab, Department of Neurosurgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Faculty of Medicine, University of Cologne, Cologne, Germany
| | - Andreas Schulze-Bonhage
- BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany; Epilepsy Center, Department of Neurosurgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany
| | - Peter Auer
- GRK 1624 'Frequency Effects in Language', University of Freiburg, Freiburg, Germany; Department of German Linguistics, University of Freiburg, Freiburg, Germany; Hermann Paul School of Linguistics, University of Freiburg, Freiburg, Germany
| | - Ad Aertsen
- Neurobiology and Biophysics, Faculty of Biology, University of Freiburg, Freiburg, Germany; Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany
| | - Tonio Ball
- Translational Neurotechnology Lab, Department of Neurosurgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany; Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany.
| |
Collapse
|
26
|
Treille A, Vilain C, Schwartz JL, Hueber T, Sato M. Electrophysiological evidence for Audio-visuo-lingual speech integration. Neuropsychologia 2018; 109:126-133. [DOI: 10.1016/j.neuropsychologia.2017.12.024] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2017] [Revised: 11/21/2017] [Accepted: 12/13/2017] [Indexed: 01/25/2023]
|
27
|
Burnham D, Dodd B. Language–General Auditory–Visual Speech Perception: Thai–English and Japanese–English McGurk Effects. Multisens Res 2018; 31:79-110. [DOI: 10.1163/22134808-00002590] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2017] [Accepted: 06/19/2017] [Indexed: 11/19/2022]
Abstract
Cross-language McGurk Effects are used to investigate the locus of auditory–visual speech integration. Experiment 1 uses the fact that [ŋ], as in 'sing', is phonotactically legal in word-final position in English and Thai, but in word-initial position only in Thai. English and Thai language participants were tested for 'n' perception from auditory [m]/visual [ŋ] (A[m]V[ŋ]) in word-initial and -final positions. Despite English speakers' native language bias to label word-initial [ŋ] as 'n', the incidence of 'n' percepts to A[m]V[ŋ] was equivalent for English and Thai speakers in final and initial positions. Experiment 2 used the facts that (i) [ð] as in 'that' is not present in Japanese, and (ii) English speakers respond more often with 'tha' than 'da' to A[ba]V[ga], but more often with 'di' than 'thi' to A[bi]V[gi]. English and three groups of Japanese language participants (Beginner, Intermediate, Advanced English knowledge) were presented with A[ba]V[ga] and A[bi]V[gi] by an English (Experiment 2a) or a Japanese (Experiment 2b) speaker. Despite Japanese participants' native language bias to perceive 'd' more often than 'th', the four groups showed a similar phonetic level effect of [a]/[i] vowel context × 'th' vs. 'd' responses to A[b]V[g] presentations. In Experiment 2b this phonetic level interaction held, but was more one-sided as very few 'th' responses were evident, even in Australian English participants. Results are discussed in terms of a phonetic plus postcategorical model, in which incoming auditory and visual information is integrated at a phonetic level, after which there are postcategorical phonemic influences.
Collapse
Affiliation(s)
- Denis Burnham
- MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, Australia
| | | |
Collapse
|
28
|
Ciumas C, Laurent A, Saignavongs M, Ilski F, de Bellescize J, Panagiotakaki E, Ostrowsky-Coste K, Arzimanoglou A, Herbillon V, Ibarrola D, Ryvlin P. Behavioral and fMRI responses to fearful faces are altered in benign childhood epilepsy with centrotemporal spikes (BCECTS). Epilepsia 2017; 58:1716-1727. [DOI: 10.1111/epi.13858] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 07/03/2017] [Indexed: 11/30/2022]
Affiliation(s)
- Carolina Ciumas
- Translational and Integrative Group in Epilepsy Research (TIGER), INSERM U1028, CNRS UMR5292, Lyon Neuroscience Research Center, University Lyon 1, Lyon, France
- Institute of Epilepsies (IDEE), Lyon, France
- Department of Clinical Neurosciences, CHUV, Lausanne, Switzerland
| | - Agathe Laurent
- Department of Neurosurgery, Sainte-Anne Hospital, Paris, France
- Department of Clinical Epileptology, Sleep Disorders and Functional Neurology in Children, University Hospitals of Lyon (HCL), Lyon, France
| | - Mani Saignavongs
- Translational and Integrative Group in Epilepsy Research (TIGER), INSERM U1028, CNRS UMR5292, Lyon Neuroscience Research Center, University Lyon 1, Lyon, France
| | - Faustine Ilski
- Translational and Integrative Group in Epilepsy Research (TIGER), INSERM U1028, CNRS UMR5292, Lyon Neuroscience Research Center, University Lyon 1, Lyon, France
- Department of Clinical Epileptology, Sleep Disorders and Functional Neurology in Children, University Hospitals of Lyon (HCL), Lyon, France
| | - Julitta de Bellescize
- Department of Clinical Epileptology, Sleep Disorders and Functional Neurology in Children, University Hospitals of Lyon (HCL), Lyon, France
| | - Eleni Panagiotakaki
- Department of Clinical Epileptology, Sleep Disorders and Functional Neurology in Children, University Hospitals of Lyon (HCL), Lyon, France
| | - Karine Ostrowsky-Coste
- Department of Clinical Epileptology, Sleep Disorders and Functional Neurology in Children, University Hospitals of Lyon (HCL), Lyon, France
| | - Alexis Arzimanoglou
- Department of Clinical Epileptology, Sleep Disorders and Functional Neurology in Children, University Hospitals of Lyon (HCL), Lyon, France
- Brain Dynamics and Cognition Team (DYCOG), INSERM U1028, CNRS UMR5292, Lyon Neuroscience Research Center, University Lyon 1, Lyon, France
| | - Vania Herbillon
- Department of Clinical Epileptology, Sleep Disorders and Functional Neurology in Children, University Hospitals of Lyon (HCL), Lyon, France
- Brain Dynamics and Cognition Team (DYCOG), INSERM U1028, CNRS UMR5292, Lyon Neuroscience Research Center, University Lyon 1, Lyon, France
| | | | - Philippe Ryvlin
- Institute of Epilepsies (IDEE), Lyon, France
- Department of Clinical Neurosciences, CHUV, Lausanne, Switzerland
| |
Collapse
|
29
|
Electrophysiological evidence for a self-processing advantage during audiovisual speech integration. Exp Brain Res 2017; 235:2867-2876. [PMID: 28676921 DOI: 10.1007/s00221-017-5018-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2017] [Accepted: 06/23/2017] [Indexed: 10/19/2022]
Abstract
Previous electrophysiological studies have provided strong evidence for early multisensory integrative mechanisms during audiovisual speech perception. From these studies, one unanswered issue is whether hearing our own voice and seeing our own articulatory gestures facilitate speech perception, possibly through a better processing and integration of sensory inputs with our own sensory-motor knowledge. The present EEG study examined the impact of self-knowledge during the perception of auditory (A), visual (V) and audiovisual (AV) speech stimuli that were previously recorded from the participant or from a speaker he/she had never met. Audiovisual interactions were estimated by comparing N1 and P2 auditory evoked potentials during the bimodal condition (AV) with the sum of those observed in the unimodal conditions (A + V). In line with previous EEG studies, our results revealed an amplitude decrease of P2 auditory evoked potentials in AV compared to A + V conditions. Crucially, a temporal facilitation of N1 responses was observed during the visual perception of self speech movements compared to those of another speaker. This facilitation was negatively correlated with the saliency of visual stimuli. These results provide evidence for a temporal facilitation of the integration of auditory and visual speech signals when the visual situation involves our own speech gestures.
Collapse
|
30
|
Venezia JH, Vaden KI, Rong F, Maddox D, Saberi K, Hickok G. Auditory, Visual and Audiovisual Speech Processing Streams in Superior Temporal Sulcus. Front Hum Neurosci 2017; 11:174. [PMID: 28439236 PMCID: PMC5383672 DOI: 10.3389/fnhum.2017.00174] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2016] [Accepted: 03/24/2017] [Indexed: 11/30/2022] Open
Abstract
The human superior temporal sulcus (STS) is responsive to visual and auditory information, including sounds and facial cues during speech recognition. We investigated the functional organization of STS with respect to modality-specific and multimodal speech representations. Twenty younger adult participants were instructed to perform an oddball detection task and were presented with auditory, visual, and audiovisual speech stimuli, as well as auditory and visual nonspeech control stimuli in a block fMRI design. Consistent with a hypothesized anterior-posterior processing gradient in STS, auditory, visual and audiovisual stimuli produced the largest BOLD effects in anterior, posterior and middle STS (mSTS), respectively, based on whole-brain, linear mixed effects and principal component analyses. Notably, the mSTS exhibited preferential responses to multisensory stimulation, as well as speech compared to nonspeech. Within the mid-posterior and mSTS regions, response preferences changed gradually from visual, to multisensory, to auditory moving posterior to anterior. Post hoc analysis of visual regions in the posterior STS revealed that a single subregion bordering the mSTS was insensitive to differences in low-level motion kinematics yet distinguished between visual speech and nonspeech based on multi-voxel activation patterns. These results suggest that auditory and visual speech representations are elaborated gradually within anterior and posterior processing streams, respectively, and may be integrated within the mSTS, which is sensitive to more abstract speech information within and across presentation modalities. The spatial organization of STS is consistent with processing streams that are hypothesized to synthesize perceptual speech representations from sensory signals that provide convergent information from visual and auditory modalities.
Collapse
Affiliation(s)
| | - Kenneth I Vaden
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, Charleston, SC, USA
| | - Feng Rong
- Department of Cognitive Sciences, Center for Cognitive Neuroscience and Engineering, University of California, Irvine, CA, USA
| | - Dale Maddox
- Department of Cognitive Sciences, Center for Cognitive Neuroscience and Engineering, University of California, Irvine, CA, USA
| | - Kourosh Saberi
- Department of Cognitive Sciences, Center for Cognitive Neuroscience and Engineering, University of California, Irvine, CA, USA
| | - Gregory Hickok
- Department of Cognitive Sciences, Center for Cognitive Neuroscience and Engineering, University of California, Irvine, CA, USA
| |
Collapse
|
31
|
Aparicio M, Peigneux P, Charlier B, Balériaux D, Kavec M, Leybaert J. The Neural Basis of Speech Perception through Lipreading and Manual Cues: Evidence from Deaf Native Users of Cued Speech. Front Psychol 2017; 8:426. [PMID: 28424636 PMCID: PMC5371603 DOI: 10.3389/fpsyg.2017.00426] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2016] [Accepted: 03/07/2017] [Indexed: 11/13/2022] Open
Abstract
We present here the first neuroimaging data for perception of Cued Speech (CS) by deaf adults who are native users of CS. CS is a visual mode of communicating a spoken language through a set of manual cues which accompany lipreading and disambiguate it. With CS, sublexical units of the oral language are conveyed clearly and completely through the visual modality without requiring hearing. The comparison of neural processing of CS in deaf individuals with processing of audiovisual (AV) speech in normally hearing individuals represents a unique opportunity to explore the similarities and differences in neural processing of an oral language delivered in a visuo-manual vs. an AV modality. The study included two groups: deaf adults who were early CS users, and hearing adults who were native users of French and process speech audiovisually. Words were presented in an event-related fMRI design. Three conditions were presented to each group of participants. The deaf participants saw CS words (manual + lipread), words presented as manual cues alone, and words presented to be lipread without manual cues. The hearing group saw AV spoken words, audio-alone and lipread-alone. Three findings are highlighted. First, the middle and superior temporal gyrus (excluding Heschl's gyrus) and left inferior frontal gyrus pars triangularis constituted a common, amodal neural basis for AV and CS perception. Second, integration was inferred in posterior parts of superior temporal sulcus for audio and lipread information in AV speech, but in the occipito-temporal junction, including MT/V5, for the manual cues and lipreading in CS. Third, the perception of manual cues showed a much greater overlap with the regions activated by CS (manual + lipreading) than lipreading alone did. This supports the notion that manual cues play a larger role than lipreading for CS processing. The present study contributes to a better understanding of the role of manual cues as support of visual speech perception in the framework of the multimodal nature of human communication.
Collapse
Affiliation(s)
- Mario Aparicio
- Laboratory of Cognition, Language and Development, Centre de Recherches Neurosciences et Cognition, Université Libre de Bruxelles, Brussels, Belgium
| | - Philippe Peigneux
- Neuropsychology and Functional Neuroimaging Research Unit (UR2NF), Centre de Recherches Cognition et Neurosciences, Université Libre de Bruxelles, Brussels, Belgium
| | - Brigitte Charlier
- Laboratory of Cognition, Language and Development, Centre de Recherches Neurosciences et Cognition, Université Libre de Bruxelles, Brussels, Belgium
| | - Danielle Balériaux
- Department of Radiology, Clinics of Magnetic Resonance, Erasme Hospital, Brussels, Belgium
| | - Martin Kavec
- Department of Radiology, Clinics of Magnetic Resonance, Erasme Hospital, Brussels, Belgium
| | - Jacqueline Leybaert
- Laboratory of Cognition, Language and Development, Centre de Recherches Neurosciences et Cognition, Université Libre de Bruxelles, Brussels, Belgium
| |
Collapse
|
32
|
Wu C, Zheng Y, Li J, Zhang B, Li R, Wu H, She S, Liu S, Peng H, Ning Y, Li L. Activation and Functional Connectivity of the Left Inferior Temporal Gyrus during Visual Speech Priming in Healthy Listeners and Listeners with Schizophrenia. Front Neurosci 2017; 11:107. [PMID: 28360829 PMCID: PMC5350153 DOI: 10.3389/fnins.2017.00107] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2016] [Accepted: 02/20/2017] [Indexed: 11/13/2022] Open
Abstract
Under a "cocktail-party" listening condition with multiple-people talking, compared to healthy people, people with schizophrenia benefit less from the use of visual-speech (lipreading) priming (VSP) cues to improve speech recognition. The neural mechanisms underlying the unmasking effect of VSP remain unknown. This study investigated the brain substrates underlying the unmasking effect of VSP in healthy listeners and the schizophrenia-induced changes in the brain substrates. Using functional magnetic resonance imaging, brain activation and functional connectivity for the contrasts of the VSP listening condition vs. the visual non-speech priming (VNSP) condition were examined in 16 healthy listeners (27.4 ± 8.6 years old, 9 females and 7 males) and 22 listeners with schizophrenia (29.0 ± 8.1 years old, 8 females and 14 males). The results showed that in healthy listeners, but not listeners with schizophrenia, the VSP-induced activation (against the VNSP condition) of the left posterior inferior temporal gyrus (pITG) was significantly correlated with the VSP-induced improvement in target-speech recognition against speech masking. Compared to healthy listeners, listeners with schizophrenia showed significantly lower VSP-induced activation of the left pITG and reduced functional connectivity of the left pITG with the bilateral Rolandic operculum, bilateral STG, and left insular. Thus, the left pITG and its functional connectivity may be the brain substrates related to the unmasking effect of VSP, assumedly through enhancing both the processing of target visual-speech signals and the inhibition of masking-speech signals. In people with schizophrenia, the reduced unmasking effect of VSP on speech recognition may be associated with a schizophrenia-related reduction of VSP-induced activation and functional connectivity of the left pITG.
Collapse
Affiliation(s)
- Chao Wu
- Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception, Ministry of Education, School of Psychological and Cognitive Sciences, Peking University, Beijing, China; School of Life Sciences, Peking University, Beijing, China; School of Psychology, Beijing Normal University, Beijing, China
| | - Yingjun Zheng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Juanhua Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Bei Zhang
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Ruikeng Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Haibo Wu
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Shenglin She
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Sha Liu
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Hongjun Peng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Yuping Ning
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Liang Li
- Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception, Ministry of Education, School of Psychological and Cognitive Sciences, Peking University, Beijing, China; The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China; Beijing Institute for Brain Disorder, Capital Medical University, Beijing, China
| |
Collapse
|
33
|
Treille A, Vilain C, Hueber T, Lamalle L, Sato M. Inside Speech: Multisensory and Modality-specific Processing of Tongue and Lip Speech Actions. J Cogn Neurosci 2017; 29:448-466. [DOI: 10.1162/jocn_a_01057] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Action recognition has been found to rely not only on sensory brain areas but also partly on the observer's motor system. However, whether distinct auditory and visual experiences of an action modulate sensorimotor activity remains largely unknown. In the present sparse sampling fMRI study, we determined to what extent sensory and motor representations interact during the perception of tongue and lip speech actions. Tongue and lip speech actions were selected because tongue movements of our interlocutor are accessible via their impact on speech acoustics but not visible because of their position inside the vocal tract, whereas lip movements are both "audible" and visible. Participants were presented with auditory, visual, and audiovisual speech actions, with the visual inputs related to either a sagittal view of the tongue movements or a facial view of the lip movements of a speaker, previously recorded by an ultrasound imaging system and a video camera. Although the neural networks involved in visuolingual and visuofacial perception largely overlapped, stronger motor and somatosensory activations were observed during visuolingual perception. In contrast, stronger activity was found in auditory and visual cortices during visuofacial perception. Complementing these findings, activity in the left premotor cortex and in visual brain areas was found to correlate with visual recognition scores observed for visuolingual and visuofacial speech stimuli, respectively, whereas visual activity correlated with RTs for both stimuli. These results suggest that unimodal and multimodal processing of lip and tongue speech actions rely on common sensorimotor brain areas. They also suggest that visual processing of audible but not visible movements induces motor and visual mental simulation of the perceived actions to facilitate recognition and/or to learn the association between auditory and visual signals.
Collapse
Affiliation(s)
| | | | | | - Laurent Lamalle
- Université Grenoble-Alpes & CHU de Grenoble
- CNRS UMS 3552, Grenoble, France
| | - Marc Sato
- CNRS UMR 7309 & Aix-Marseille Université
| |
Collapse
|
34
|
Weisberg J, Hubbard AL, Emmorey K. Multimodal integration of spontaneously produced representational co-speech gestures: an fMRI study. LANGUAGE, COGNITION AND NEUROSCIENCE 2016; 32:158-174. [PMID: 29130054 PMCID: PMC5675577 DOI: 10.1080/23273798.2016.1245426] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/17/2016] [Accepted: 09/05/2016] [Indexed: 05/31/2023]
Abstract
To examine whether more ecologically valid co-speech gesture stimuli elicit brain responses consistent with those found by studies that relied on scripted stimuli, we presented participants with spontaneously produced, meaningful co-speech gesture during fMRI scanning (n = 28). Speech presented with gesture (versus either presented alone) elicited heightened activity in bilateral posterior superior temporal, premotor, and inferior frontal regions. Within left temporal and premotor, but not inferior frontal regions, we identified small clusters with superadditive responses, suggesting that these discrete regions support both sensory and semantic integration. In contrast, surrounding areas and the inferior frontal gyrus may support either sensory or semantic integration. Reduced activation for speech with gesture in language-related regions indicates allocation of fewer neural resources when meaningful gestures accompany speech. Sign language experience did not affect co-speech gesture activation. Overall, our results indicate that scripted stimuli have minimal confounding influences; however, they may miss subtle superadditive effects.
Collapse
Affiliation(s)
- Jill Weisberg
- Laboratory for Language and Cognitive Neuroscience, San Diego State University, 6495 Alvarado Rd., Suite 200, San Diego, CA 92120, USA, 619-594-8069
| | - Amy Lynn Hubbard
- Laboratory for Language and Cognitive Neuroscience, San Diego State University, 6495 Alvarado Rd., Suite 200, San Diego, CA 92120, USA, 619-594-8069
| | - Karen Emmorey
- Laboratory for Language and Cognitive Neuroscience, San Diego State University, 6495 Alvarado Rd., Suite 200, San Diego, CA 92120, USA, 619-594-8069
| |
Collapse
|
35
|
Kaufmann JM, Schweinberger SR. Speaker Variations Influence Speechreading Speed for Dynamic Faces. Perception 2016; 34:595-610. [PMID: 15991696 DOI: 10.1068/p5104] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
Abstract
We investigated the influence of task-irrelevant speaker variations on speechreading performance. In three experiments with video-digitised faces presented either in dynamic, static-sequential, or static mode, participants performed speeded classifications on vowel utterances (German vowels /u/ and /i/). A Garner interference paradigm was used, in which speaker identity was task-irrelevant but could be either correlated, constant, or orthogonal to the vowel uttered. Reaction times for facial speech classifications were slowed by task-irrelevant speaker variations for dynamic stimuli. The results are discussed with reference to distributed models of face perception (Haxby et al., 2000, Trends in Cognitive Sciences, 4, 223–233) and the relevance of both dynamic information and speaker characteristics for speechreading.
Collapse
Affiliation(s)
- Jürgen M Kaufmann
- Department of Psychology, University of Glasgow, 58 Hillhead Street, Glasgow G12 8QB, Scotland, UK.
| | | |
Collapse
|
36
|
Qi Z, Wang X, Hao S, Zhu C, He W, Luo W. Correlations of Electrophysiological Measurements with Identification Levels of Ancient Chinese Characters. PLoS One 2016; 11:e0151133. [PMID: 26982215 PMCID: PMC4794118 DOI: 10.1371/journal.pone.0151133] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2015] [Accepted: 02/24/2016] [Indexed: 11/18/2022] Open
Abstract
Studies of event-related potentials (ERPs) in the human brain have shown that the N170 component can reliably distinguish among different object categories. However, it is unclear whether this is true for different identifiable levels within a single category. In the present study, we used ERP recording to examine the neural response to different identification levels and orientations (upright vs. inverted) of Chinese characters. The results showed that P1, N170, and P250 were modulated by different identification levels of Chinese characters. Moreover, time-frequency analysis showed similar results, indicating that identification levels were associated with object recognition, particularly during processing of a single categorical stimulus.
Collapse
Affiliation(s)
- Zhengyang Qi
- Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
| | - Xiaolong Wang
- Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
| | - Shuang Hao
- Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
| | - Chuanlin Zhu
- Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
| | - Weiqi He
- Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
| | - Wenbo Luo
- Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China; Laboratory of Cognition and Mental Health, Chongqing University of Arts and Sciences, Chongqing, China
| |
Collapse
|
37
|
Venezia JH, Fillmore P, Matchin W, Isenberg AL, Hickok G, Fridriksson J. Perception drives production across sensory modalities: A network for sensorimotor integration of visual speech. Neuroimage 2016; 126:196-207. [PMID: 26608242 PMCID: PMC4733636 DOI: 10.1016/j.neuroimage.2015.11.038] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2015] [Revised: 11/09/2015] [Accepted: 11/15/2015] [Indexed: 11/22/2022] Open
Abstract
Sensory information is critical for movement control, both for defining the targets of actions and providing feedback during planning or ongoing movements. This holds for speech motor control as well, where both auditory and somatosensory information have been shown to play a key role. Recent clinical research demonstrates that individuals with severe speech production deficits can show a dramatic improvement in fluency during online mimicking of an audiovisual speech signal suggesting the existence of a visuomotor pathway for speech motor control. Here we used fMRI in healthy individuals to identify this new visuomotor circuit for speech production. Participants were asked to perceive and covertly rehearse nonsense syllable sequences presented auditorily, visually, or audiovisually. The motor act of rehearsal, which is prima facie the same whether or not it is cued with a visible talker, produced different patterns of sensorimotor activation when cued by visual or audiovisual speech (relative to auditory speech). In particular, a network of brain regions including the left posterior middle temporal gyrus and several frontoparietal sensorimotor areas activated more strongly during rehearsal cued by a visible talker versus rehearsal cued by auditory speech alone. Some of these brain regions responded exclusively to rehearsal cued by visual or audiovisual speech. This result has significant implications for models of speech motor control, for the treatment of speech output disorders, and for models of the role of speech gesture imitation in development.
Collapse
Affiliation(s)
- Jonathan H Venezia
- Department of Cognitive Sciences, University of California, Irvine, Irvine, CA 92697, United States.
| | - Paul Fillmore
- Department of Communication Sciences and Disorders, Baylor University, Waco, TX 76798, United States
| | - William Matchin
- Department of Linguistics, University of Maryland, College Park, MD 20742, United States
| | - A Lisette Isenberg
- Department of Cognitive Sciences, University of California, Irvine, Irvine, CA 92697, United States
| | - Gregory Hickok
- Department of Cognitive Sciences, University of California, Irvine, Irvine, CA 92697, United States
| | - Julius Fridriksson
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC 29208, United States
| |
Collapse
|
38
|
Rhone AE, Nourski KV, Oya H, Kawasaki H, Howard MA, McMurray B. Can you hear me yet? An intracranial investigation of speech and non-speech audiovisual interactions in human cortex. LANGUAGE, COGNITION AND NEUROSCIENCE 2015; 31:284-302. [PMID: 27182530 PMCID: PMC4865257 DOI: 10.1080/23273798.2015.1101145] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
In everyday conversation, viewing a talker's face can provide information about the timing and content of an upcoming speech signal, resulting in improved intelligibility. Using electrocorticography, we tested whether human auditory cortex in Heschl's gyrus (HG) and on superior temporal gyrus (STG) and motor cortex on precentral gyrus (PreC) were responsive to visual/gestural information prior to the onset of sound and whether early stages of auditory processing were sensitive to the visual content (speech syllable versus non-speech motion). Event-related band power (ERBP) in the high gamma band was content-specific prior to acoustic onset on STG and PreC, and ERBP in the beta band differed in all three areas. Following sound onset, we found no evidence for content-specificity in HG, evidence for visual specificity in PreC, and specificity for both modalities in STG. These results support models of audio-visual processing in which sensory information is integrated in non-primary cortical areas.
Collapse
|
39
|
Two neural pathways of face processing: A critical evaluation of current models. Neurosci Biobehav Rev 2015; 55:536-46. [DOI: 10.1016/j.neubiorev.2015.06.010] [Citation(s) in RCA: 127] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2014] [Revised: 04/22/2015] [Accepted: 06/05/2015] [Indexed: 11/15/2022]
|
40
|
Riedel P, Ragert P, Schelinski S, Kiebel SJ, von Kriegstein K. Visual face-movement sensitive cortex is relevant for auditory-only speech recognition. Cortex 2015; 68:86-99. [DOI: 10.1016/j.cortex.2014.11.016] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2014] [Revised: 10/24/2014] [Accepted: 11/25/2014] [Indexed: 12/31/2022]
|
41
|
Amaral CP, Simões MA, Castelo-Branco MS. Neural signals evoked by stimuli of increasing social scene complexity are detectable at the single-trial level and right lateralized. PLoS One 2015; 10:e0121970. [PMID: 25807525 PMCID: PMC4373781 DOI: 10.1371/journal.pone.0121970] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2014] [Accepted: 02/06/2015] [Indexed: 11/24/2022] Open
Abstract
Classification of neural signals at the single-trial level and the study of their relevance in affective and cognitive neuroscience are still in their infancy. Here we investigated the neurophysiological correlates of conditions of increasing social scene complexity using 3D human models as targets of attention, which may also be important in autism research. Challenging single-trial statistical classification of EEG neural signals was attempted for detection of oddball stimuli with increasing social scene complexity. Stimuli had an oddball structure and were as follows: 1) flashed schematic eyes, 2) simple 3D faces flashed between averted and non-averted gaze (only eye position changing), 3) simple 3D faces flashed between averted and non-averted gaze (head and eye position changing), 4) animated avatar alternated its gaze direction to the left and to the right (head and eye position), 5) environment with 4 animated avatars all of which change gaze and one of which is the target of attention. We found a late (> 300 ms) neurophysiological oddball correlate for all conditions irrespective of their complexity as assessed by repeated measures ANOVA. We attempted single-trial detection of this signal with automatic classifiers and obtained a significant balanced accuracy classification of around 79%, which is noteworthy given the amount of scene complexity. Lateralization analysis showed a specific right lateralization only for more complex realistic social scenes. In sum, complex ecological animations with social content elicit neurophysiological events which can be characterized even at the single-trial level. These signals are right lateralized. These findings pave the way for neuroscientific studies in affective neuroscience based on complex social scenes, and given their detectability at the single-trial level, they suggest the feasibility of brain computer interfaces that can be applied to social cognition disorders such as autism.
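The classification figure of around 79% quoted above is a balanced accuracy, i.e., the mean of per-class recalls, which prevents a classifier from scoring well on an oddball design simply by always predicting the frequent non-target class. A minimal sketch of the metric (Python/NumPy with synthetic labels; an illustration only, not the authors' EEG pipeline):

import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls; insensitive to the class imbalance of oddball designs."""
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))

# Synthetic oddball labels: 10% targets. A detector with 70% target recall and
# 88% non-target recall lands near a 0.79 balanced accuracy.
rng = np.random.default_rng(0)
y_true = (rng.random(5000) < 0.10).astype(int)
p_correct = np.where(y_true == 1, 0.70, 0.88)
hit = rng.random(y_true.size) < p_correct
y_pred = np.where(hit, y_true, 1 - y_true)
print(balanced_accuracy(y_true, y_pred))    # approximately 0.79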
Collapse
Affiliation(s)
- Carlos P Amaral
- IBILI-Institute for Biomedical Imaging in Life Sciences, Faculty of Medicine, University of Coimbra, Coimbra, Portugal
| | - Marco A Simões
- IBILI-Institute for Biomedical Imaging in Life Sciences, Faculty of Medicine, University of Coimbra, Coimbra, Portugal
| | - Miguel S Castelo-Branco
- IBILI-Institute for Biomedical Imaging in Life Sciences, Faculty of Medicine, University of Coimbra, Coimbra, Portugal; ICNAS, Brain Imaging Network of Portugal, Coimbra, Portugal
| |
Collapse
|
42
|
Lateralization for dynamic facial expressions in human superior temporal sulcus. Neuroimage 2015; 106:340-52. [DOI: 10.1016/j.neuroimage.2014.11.020] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2014] [Revised: 10/04/2014] [Accepted: 11/08/2014] [Indexed: 11/24/2022] Open
|
43
|
Bernstein LE, Liebenthal E. Neural pathways for visual speech perception. Front Neurosci 2014; 8:386. [PMID: 25520611 PMCID: PMC4248808 DOI: 10.3389/fnins.2014.00386] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2014] [Accepted: 11/10/2014] [Indexed: 12/03/2022] Open
Abstract
This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA) has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA.
Collapse
Affiliation(s)
- Lynne E Bernstein
- Department of Speech and Hearing Sciences, George Washington University, Washington, DC, USA
| | - Einat Liebenthal
- Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA; Department of Psychiatry, Brigham and Women's Hospital, Boston, MA, USA
| |
Collapse
|
44
|
Visual abilities are important for auditory-only speech recognition: Evidence from autism spectrum disorder. Neuropsychologia 2014; 65:1-11. [DOI: 10.1016/j.neuropsychologia.2014.09.031] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2014] [Revised: 08/25/2014] [Accepted: 09/18/2014] [Indexed: 11/22/2022]
|
45
|
Campbell R, MacSweeney M, Woll B. Cochlear implantation (CI) for prelingual deafness: the relevance of studies of brain organization and the role of first language acquisition in considering outcome success. Front Hum Neurosci 2014; 8:834. [PMID: 25368567 PMCID: PMC4201085 DOI: 10.3389/fnhum.2014.00834] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2014] [Accepted: 09/30/2014] [Indexed: 11/13/2022] Open
Abstract
Cochlear implantation (CI) for profound congenital hearing impairment, while often successful in restoring hearing to the deaf child, does not always result in effective speech processing. Exposure to non-auditory signals during the pre-implantation period is widely held to be responsible for such failures. Here, we question the inference that such exposure irreparably distorts the function of auditory cortex, negatively impacting the efficacy of CI. Animal studies suggest that in congenital early deafness there is a disconnection between (disordered) activation in primary auditory cortex (A1) and activation in secondary auditory cortex (A2). In humans, one factor contributing to this functional decoupling is assumed to be abnormal activation of A1 by visual projections, including exposure to sign language. In this paper we show that this abnormal activation of A1 does not routinely occur, while A2 functions effectively supramodally and multimodally to deliver spoken language irrespective of hearing status. What, then, is responsible for poor outcomes for some individuals with CI and for apparent abnormalities in cortical organization in these people? Since infancy is a critical period for the acquisition of language, deaf children born to hearing parents are at risk of developing inefficient neural structures to support skilled language processing. A sign language, acquired by a deaf child as a first language in a signing environment, is cortically organized like a heard spoken language in terms of specialization of the dominant perisylvian system. However, very few deaf children are exposed to sign language in early infancy. Moreover, no studies to date have examined sign language proficiency in relation to cortical organization in individuals with CI. Given the paucity of such relevant findings, we suggest that the best guarantee of a good language outcome after CI is the establishment of a secure first language pre-implant, however that may be achieved and whatever the success of auditory restoration.
Collapse
Affiliation(s)
- Ruth Campbell
- Deafness Cognition and Language Research Centre, University College London, London, UK
| | - Mairéad MacSweeney
- Deafness Cognition and Language Research Centre, University College London, London, UK
- Institute of Cognitive Neuroscience, University College London, London, UK
| | - Bencie Woll
- Deafness Cognition and Language Research Centre, University College London, London, UK
| |
Collapse
|
46
|
Reinl M, Bartels A. Face processing regions are sensitive to distinct aspects of temporal sequence in facial dynamics. Neuroimage 2014; 102 Pt 2:407-15. [PMID: 25132020 DOI: 10.1016/j.neuroimage.2014.08.011] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2014] [Revised: 07/25/2014] [Accepted: 08/04/2014] [Indexed: 12/16/2022] Open
Abstract
Facial movement conveys important information for social interactions, yet its neural processing is poorly understood. Computational models propose that shape-sensitive and temporal-sequence-sensitive mechanisms interact in processing dynamic faces. While face processing regions are known to respond to facial movement, their sensitivity to particular temporal sequences has barely been studied. Here we used fMRI to examine the sensitivity of human face-processing regions to two aspects of directionality in facial movement trajectories. We presented genuine movie recordings of increasing and decreasing fear expressions, each of which was played in natural or reversed frame order. This two-by-two factorial design, with the factors emotion-direction (increasing or decreasing emotion) and timeline (natural versus artificial frame order), matched low-level visual properties, static content, and motion energy within each factor. The results showed sensitivity to emotion-direction in the fusiform face area (FFA), which was timeline-dependent as it only occurred within the natural frame order, and sensitivity to timeline in the STS, which was emotion-direction-dependent as it only occurred for decreasing fear. The occipital face area (OFA) was sensitive to the factor timeline. These findings reveal interacting temporal-sequence-sensitive mechanisms that are responsive both to ecological meaning and to the prototypical unfolding of facial dynamics. These mechanisms are temporally directional, provide socially relevant information regarding emotional state or naturalness of behavior, and agree with predictions from modeling and predictive coding theory.
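The two-by-two design described above lends itself to standard factorial contrasts. As a minimal illustrative sketch (not code from the cited study), assuming per-subject region-of-interest betas have already been extracted for the four conditions, the two main effects and the interaction can be tested with simple contrast weights; the array names, sample size, and values below are hypothetical.

```python
import numpy as np

# Hypothetical per-subject ROI betas, columns ordered as:
# (increasing/natural, decreasing/natural, increasing/reversed, decreasing/reversed)
betas = np.random.default_rng(0).normal(size=(20, 4))

contrasts = {
    "emotion-direction main effect": np.array([+1, -1, +1, -1]),
    "timeline main effect":          np.array([+1, +1, -1, -1]),
    "direction x timeline":          np.array([+1, -1, -1, +1]),
}

def contrast_t(betas, weights):
    """One-sample t statistic of a condition contrast across subjects."""
    values = betas @ weights
    return values.mean() / (values.std(ddof=1) / np.sqrt(len(values)))

for name, w in contrasts.items():
    print(f"{name}: t = {contrast_t(betas, w):.2f}")
```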
Collapse
Affiliation(s)
- Maren Reinl
- Vision and Cognition Lab, Centre for Integrative Neuroscience, University of Tübingen, and Max Planck Institute for Biological Cybernetics, Tübingen 72076, Germany
| | - Andreas Bartels
- Vision and Cognition Lab, Centre for Integrative Neuroscience, University of Tübingen, and Max Planck Institute for Biological Cybernetics, Tübingen 72076, Germany.
| |
Collapse
|
47
|
Callan DE, Jones JA, Callan A. Multisensory and modality specific processing of visual speech in different regions of the premotor cortex. Front Psychol 2014; 5:389. [PMID: 24860526 PMCID: PMC4017150 DOI: 10.3389/fpsyg.2014.00389] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2014] [Accepted: 04/14/2014] [Indexed: 01/17/2023] Open
Abstract
Behavioral and neuroimaging studies have demonstrated that brain regions involved with speech production also support speech perception, especially under degraded conditions. The premotor cortex (PMC) has been shown to be active during both observation and execution of action (“Mirror System” properties), and may facilitate speech perception by mapping unimodal and multimodal sensory features onto articulatory speech gestures. For this functional magnetic resonance imaging (fMRI) study, participants identified vowels produced by a speaker in audio-visual (saw the speaker's articulating face and heard her voice), visual only (only saw the speaker's articulating face), and audio only (only heard the speaker's voice) conditions with varying audio signal-to-noise ratios, to determine the regions of the PMC involved with multisensory and modality-specific processing of visual speech gestures. The task was designed so that identification could be made with a high level of accuracy from visual-only stimuli, to control for task difficulty and differences in intelligibility. The fMRI results for the visual-only and audio-visual conditions showed overlapping activity in the inferior frontal gyrus and PMC. The left ventral inferior premotor cortex (PMvi) showed properties of multimodal (audio-visual) enhancement with a degraded auditory signal. The left inferior parietal lobule and right cerebellum also showed these properties. The left ventral superior and dorsal premotor cortex (PMvs/PMd) did not show this multisensory enhancement effect, but there was greater activity for the visual-only over audio-visual conditions in these areas. The results suggest that the inferior regions of the ventral premotor cortex are involved with integrating multisensory information, whereas more superior and dorsal regions of the PMC are involved with mapping unimodal (in this case visual) sensory features of the speech signal onto articulatory speech gestures.
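The two activity profiles distinguished here (multisensory enhancement versus a visual-only preference) reduce to simple condition contrasts on region-of-interest estimates. The sketch below is purely illustrative and assumes hypothetical group-mean betas rather than data from the cited study; the region labels and numbers are placeholders.

```python
# Hypothetical group-mean ROI betas (arbitrary units); all values are placeholders
conditions = ["AV_degraded", "A_degraded", "VO", "AV"]
roi_means = {
    "left PMvi":     dict(zip(conditions, [0.9, 0.4, 0.6, 0.7])),
    "left PMvs/PMd": dict(zip(conditions, [0.5, 0.4, 0.8, 0.5])),
}

for roi, m in roi_means.items():
    # Multisensory enhancement: audio-visual exceeds auditory-only under degraded audio
    enhancement = m["AV_degraded"] - m["A_degraded"]
    # Modality-specific (visual) preference: visual-only exceeds audio-visual
    visual_preference = m["VO"] - m["AV"]
    print(f"{roi}: AV-A (degraded) = {enhancement:+.2f}, VO-AV = {visual_preference:+.2f}")
```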
Collapse
Affiliation(s)
- Daniel E Callan
- Center for Information and Neural Networks, National Institute of Information and Communications Technology, Osaka University, Osaka, Japan; Multisensory Cognition and Computation Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology, Kyoto, Japan
| | - Jeffery A Jones
- Psychology Department, Laurier Centre for Cognitive Neuroscience, Wilfrid Laurier University, Waterloo, ON, Canada
| | - Akiko Callan
- Center for Information and Neural Networks, National Institute of Information and Communications Technology, Osaka University, Osaka, Japan; Multisensory Cognition and Computation Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology, Kyoto, Japan
| |
Collapse
|
48
|
Perlman SB, Fournier JC, Bebko G, Bertocci MA, Hinze AK, Bonar L, Almeida JRC, Versace A, Schirda C, Travis M, Gill MK, Demeter C, Diwadkar VA, Sunshine JL, Holland SK, Kowatch RA, Birmaher B, Axelson D, Horwitz SM, Arnold LE, Fristad MA, Youngstrom EA, Findling RL, Phillips ML. Emotional face processing in pediatric bipolar disorder: evidence for functional impairments in the fusiform gyrus. J Am Acad Child Adolesc Psychiatry 2013; 52:1314-1325.e3. [PMID: 24290464 PMCID: PMC3881180 DOI: 10.1016/j.jaac.2013.09.004] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/22/2013] [Revised: 08/07/2013] [Accepted: 09/20/2013] [Indexed: 01/29/2023]
Abstract
OBJECTIVE Pediatric bipolar disorder involves poor social functioning, but the neural mechanisms underlying these deficits are not well understood. Previous neuroimaging studies have found deficits in emotional face processing localized to emotional brain regions. However, few studies have examined dysfunction in other regions of the face processing circuit. This study assessed hypoactivation in key face processing regions of the brain in pediatric bipolar disorder. METHOD Youth with a bipolar spectrum diagnosis (n = 20) were matched to a nonbipolar clinical group (n = 20), with similar demographics and comorbid diagnoses, and a healthy control group (n = 20). Youth underwent functional magnetic resonance imaging (fMRI) scanning that employed a task-irrelevant emotion-processing design, in which processing of facial emotions was not germane to task performance. RESULTS Hypoactivation, isolated to the fusiform gyrus, was found when viewing animated, emerging facial expressions of happiness, sadness, fearfulness, and especially anger in pediatric bipolar participants relative to matched clinical and healthy control groups. CONCLUSIONS The results of the study imply that differences exist in visual regions of the brain's face processing system and are not solely isolated to emotional brain regions such as the amygdala. Findings are discussed in relation to facial emotion recognition and fusiform gyrus deficits previously reported in the autism literature. Behavioral interventions targeting attention to facial stimuli might be explored as possible treatments for bipolar disorder in youth.
Collapse
|
49
|
Chu YH, Lin FH, Chou YJ, Tsai KWK, Kuo WJ, Jääskeläinen IP. Effective cerebral connectivity during silent speech reading revealed by functional magnetic resonance imaging. PLoS One 2013; 8:e80265. [PMID: 24278266 PMCID: PMC3837007 DOI: 10.1371/journal.pone.0080265] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2012] [Accepted: 10/10/2013] [Indexed: 11/29/2022] Open
Abstract
Seeing the articulatory gestures of the speaker ("speech reading") enhances speech perception, especially in noisy conditions. Recent neuroimaging studies tentatively suggest that speech reading activates the speech motor system, which then influences superior-posterior temporal lobe auditory areas via an efference copy. Here, nineteen healthy volunteers were presented with silent video clips of a person articulating Finnish vowels /a/, /i/ (non-targets), and /o/ (targets) during event-related functional magnetic resonance imaging (fMRI). Speech reading significantly activated visual cortex, posterior fusiform gyrus (pFG), posterior superior temporal gyrus and sulcus (pSTG/S), and the speech motor areas, including premotor cortex, parts of the inferior (IFG) and middle (MFG) frontal gyri extending into frontal polar (FP) structures, somatosensory areas, and supramarginal gyrus (SMG). Structural equation modelling (SEM) of these data suggested that information flows first from extrastriate visual cortex to pFG, and from there, in parallel, to pSTG/S and MFG/FP. From pSTG/S, information flow continues to IFG or SMG and eventually somatosensory areas. Feedback connectivity was estimated to run from MFG/FP to IFG and pSTG/S. The direct functional connection from pFG to MFG/FP and the feedback connection from MFG/FP to pSTG/S and IFG support the hypothesis that prefrontal speech motor areas influence auditory speech processing in pSTG/S via an efference copy.
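Path models of this kind are normally fit with dedicated SEM software. As a rough, hedged sketch of the underlying idea only, the recursive feedforward portion of the reported model can be approximated by regressing each region's time series on its hypothesized parents; the time series below are synthetic, and the path structure is a simplified rendering of the model summarized above, not the authors' fitted model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_vol = 200  # hypothetical number of fMRI volumes

# Synthetic ROI time series standing in for extracted regional signals
ts = {"V_extrastriate": rng.normal(size=n_vol)}
ts["pFG"]    = 0.7 * ts["V_extrastriate"] + rng.normal(scale=0.5, size=n_vol)
ts["pSTG_S"] = 0.6 * ts["pFG"] + rng.normal(scale=0.5, size=n_vol)
ts["MFG_FP"] = 0.5 * ts["pFG"] + rng.normal(scale=0.5, size=n_vol)
ts["IFG"]    = 0.4 * ts["pSTG_S"] + 0.3 * ts["MFG_FP"] + rng.normal(scale=0.5, size=n_vol)

# Hypothesized feedforward paths (child: parents), loosely following the abstract
model = {
    "pFG":    ["V_extrastriate"],
    "pSTG_S": ["pFG"],
    "MFG_FP": ["pFG"],
    "IFG":    ["pSTG_S", "MFG_FP"],
}

# Estimate each path coefficient by ordinary least squares on the parent signals
for child, parents in model.items():
    X = np.column_stack([ts[p] for p in parents])
    coefs, *_ = np.linalg.lstsq(X, ts[child], rcond=None)
    for parent, b in zip(parents, coefs):
        print(f"{parent} -> {child}: {b:+.2f}")
```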
Collapse
Affiliation(s)
- Ying-Hua Chu
- Institute of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
| | - Fa-Hsuan Lin
- Institute of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
- Department of Biomedical Engineering and Computational Science, Aalto University School of Science, Espoo, Finland
| | - Yu-Jen Chou
- Institute of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
| | - Kevin W.-K. Tsai
- Institute of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
| | - Wen-Jui Kuo
- Institute of Neuroscience, National Yang-Ming University, Taipei, Taiwan
| | - Iiro P. Jääskeläinen
- Department of Biomedical Engineering and Computational Science, Aalto University School of Science, Espoo, Finland
| |
Collapse
|
50
|
Matchin W, Groulx K, Hickok G. Audiovisual speech integration does not rely on the motor system: evidence from articulatory suppression, the McGurk effect, and fMRI. J Cogn Neurosci 2013; 26:606-20. [PMID: 24236768 DOI: 10.1162/jocn_a_00515] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Visual speech influences the perception of heard speech. A classic example of this is the McGurk effect, whereby an auditory /pa/ overlaid onto a visual /ka/ induces the fusion percept of /ta/. Recent behavioral and neuroimaging research has highlighted the importance of both articulatory representations and motor speech regions of the brain, particularly Broca's area, in audiovisual (AV) speech integration. Alternatively, AV speech integration may be accomplished by the sensory system through multisensory integration in the posterior STS. We assessed the claims regarding the involvement of the motor system in AV integration in two experiments: (i) examining the effect of articulatory suppression on the McGurk effect and (ii) determining whether motor speech regions show an AV integration profile. The hypothesis for experiment (i) was that if the motor system plays a role in McGurk fusion, distracting the motor system through articulatory suppression should reduce McGurk fusion. The results of experiment (i) showed no such reduction, suggesting that the motor system is not responsible for the McGurk effect. The hypothesis for experiment (ii) was that if activation to AV speech in motor regions (such as Broca's area) reflects AV integration, the profile of activity in those regions should show AV > AO (auditory only) and AV > VO (visual only). The results of experiment (ii) demonstrated that motor speech regions do not show this integration profile, whereas the posterior STS does; instead, activity in motor regions is task dependent. The combined results suggest that AV speech integration does not rely on the motor system.
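The "AV integration profile" tested in experiment (ii) is a conjunction of two contrasts: a region must show both AV > AO and AV > VO. A minimal sketch of that logic follows, using fabricated per-subject ROI betas purely for illustration; the region names, sample size, and values are not from the cited study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 20  # hypothetical number of subjects

# Hypothetical per-subject ROI betas for audio-visual (AV), auditory-only (AO), visual-only (VO)
rois = {
    "posterior STS": {"AV": rng.normal(1.0, 0.4, n), "AO": rng.normal(0.6, 0.4, n), "VO": rng.normal(0.5, 0.4, n)},
    "Broca's area":  {"AV": rng.normal(0.6, 0.4, n), "AO": rng.normal(0.7, 0.4, n), "VO": rng.normal(0.6, 0.4, n)},
}

alpha = 0.05
for name, b in rois.items():
    # Paired t-tests for the two contrasts; require positive, significant effects in both
    t1, p1 = stats.ttest_rel(b["AV"], b["AO"])
    t2, p2 = stats.ttest_rel(b["AV"], b["VO"])
    integrates = (t1 > 0 and p1 < alpha) and (t2 > 0 and p2 < alpha)
    print(f"{name}: AV>AO t={t1:.2f}, AV>VO t={t2:.2f}, integration profile: {integrates}")
```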
Collapse
|