1. Ito T, Ogane R. Repetitive Exposure to Orofacial Somatosensory Inputs in Speech Perceptual Training Modulates Vowel Categorization in Speech Perception. Front Psychol 2022; 13:839087. PMID: 35558689; PMCID: PMC9088678; DOI: 10.3389/fpsyg.2022.839087.
Abstract
Orofacial somatosensory inputs may play a role in the link between speech perception and production. Because speech motor learning, which involves paired auditory and somatosensory inputs, results in changes to speech perceptual representations, somatosensory inputs may also be involved in the learning or adaptive processes of speech perception. Here we show that repetitive pairing of somatosensory inputs and sounds, such as occurs during speech production and motor learning, can also induce a change in speech perception. We examined whether the category boundary between /ε/ and /a/ changed as a result of perceptual training with orofacial somatosensory inputs. The experiment consisted of three phases: Baseline, Training, and Aftereffect. In all phases, a vowel identification test was used to identify the perceptual boundary between /ε/ and /a/. In the Baseline and Aftereffect phases, an adaptive method based on the maximum-likelihood procedure was applied to detect the category boundary using a small number of trials. In the Training phase, we used the method of constant stimuli in order to expose participants to stimulus variants that covered the range between /ε/ and /a/ evenly. In this phase, to mimic the sensory input that accompanies speech production and learning, somatosensory stimulation was applied in the upward direction in the experimental group whenever the stimulus sound was presented. A control group (CTL) followed the same training procedure in the absence of somatosensory stimulation. When we compared category boundaries prior to and following paired auditory-somatosensory training, the boundary for participants in the experimental group reliably shifted in the direction of /ε/, indicating that the participants perceived /a/ more often than /ε/ as a consequence of training. In contrast, the CTL group showed no change. Although only a limited number of participants were tested, the perceptual shift was reduced and almost eliminated 1 week later. Our data suggest that repetitive exposure to somatosensory inputs in a task that simulates the sensory pairing occurring during speech production changes the perceptual system, and they support the idea that somatosensory inputs play a role in speech perceptual adaptation, probably contributing to the formation of sound representations for speech perception.
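The boundary estimation described above can be pictured with a minimal sketch: fit a logistic psychometric function to identification responses along the vowel continuum and take its 50% point as the /ε/-/a/ category boundary. This is a generic illustration, not the authors' code; the continuum values, response proportions, and the least-squares fit (standing in for the full adaptive maximum-likelihood procedure) are all assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical 9-step continuum between two vowels (arbitrary morph units)
stim = np.linspace(0.0, 1.0, 9)
# Hypothetical proportion of /a/ identifications at each continuum step
p_a = np.array([0.02, 0.05, 0.10, 0.25, 0.55, 0.80, 0.92, 0.97, 0.99])

def logistic(x, boundary, slope):
    """Psychometric function: P(identify as /a/) for stimulus value x."""
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

# Fit the function; the 'boundary' parameter is the 50% crossover point.
(boundary, slope), _ = curve_fit(logistic, stim, p_a, p0=[0.5, 10.0])
print(f"estimated category boundary: {boundary:.3f}")
```

A training-induced shift of this fitted boundary toward /ε/ corresponds to more of the continuum being identified as /a/.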
Affiliation(s)
- Takayuki Ito
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Haskins Laboratories, New Haven, CT, United States
- Rintaro Ogane
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Haskins Laboratories, New Haven, CT, United States
2. Endo N, Ito T, Mochida T, Ijiri T, Watanabe K, Nakazawa K. Precise force controls enhance loudness discrimination of self-generated sound. Exp Brain Res 2021; 239:1141-1149. PMID: 33555383; DOI: 10.1007/s00221-020-05993-7.
Abstract
Motor execution alters sensory processes. Studies have shown that loudness perception changes when a sound is generated by active movement. However, it is still unknown whether and how motor-related changes in loudness perception depend on the task demands of motor execution. We examined whether different levels of precision demand in motor control affect loudness perception. We carried out a loudness discrimination test in which the sound stimulus was produced in conjunction with a force generation task. We tested three target force amplitude levels. The force target was presented on a monitor as a fixed visual target, and the generated force was displayed on the same monitor as a moving visual cursor. Participants adjusted their force amplitude within a predetermined range, without overshooting, using the visual target and the moving cursor. In the control condition, the sound and visual stimuli were generated externally (without a force generation task). Discrimination performance improved significantly when the sound was produced by the force generation task compared to the control condition, in which the sound was produced externally, although this improvement did not vary with the target force amplitude level. The results suggest that the demand for precise control to produce a fixed amount of force may be key to obtaining the facilitatory effect of motor execution on auditory processing.
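As a small illustration of how loudness discrimination performance like that above is commonly quantified, the signal-detection index d′ can be computed per condition from hit and false-alarm rates. This is a standard measure assumed here for illustration, not the authors' analysis; the rates are invented.

```python
from statistics import NormalDist

def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Sensitivity index: d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical rates of "louder" responses when the comparison sound was
# louder (hits) vs. equally loud (false alarms), for each condition.
print(f"active (force task): d' = {d_prime(0.85, 0.20):.2f}")
print(f"external (control):  d' = {d_prime(0.75, 0.30):.2f}")
```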
Affiliation(s)
- Nozomi Endo
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1, Komaba, Meguro-ku, Tokyo, 153-8902, Japan; Faculty of Science and Engineering, Waseda University, 3-4-1, Ohkubo, Shinjuku-ku, Tokyo, 169-8555, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo, 102-0083, Japan
- Takayuki Ito
- Univ. Grenoble Alpes, Grenoble-INP, CNRS, GIPSA-Lab, 11 rue des Mathématiques, Grenoble Campus BP46, 38402, Saint Martin d'Hères Cedex, France; Haskins Laboratories, 300 George Street, New Haven, CT, 06511, USA
- Takemi Mochida
- NTT Communication Science Laboratories, 3-1, Morinosato Wakamiya, Atsugi-shi, Kanagawa, 243-0198, Japan
- Tetsuya Ijiri
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1, Komaba, Meguro-ku, Tokyo, 153-8902, Japan
- Katsumi Watanabe
- Faculty of Science and Engineering, Waseda University, 3-4-1, Ohkubo, Shinjuku-ku, Tokyo, 169-8555, Japan; Art & Design, University of New South Wales, Oxford St & Greens Rd, Paddington, NSW 202, Australia
- Kimitaka Nakazawa
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1, Komaba, Meguro-ku, Tokyo, 153-8902, Japan
3. Komeilipoor N, Cesari P, Daffertshofer A. Involvement of superior temporal areas in audiovisual and audiomotor speech integration. Neuroscience 2017; 343:276-283. PMID: 27019129; DOI: 10.1016/j.neuroscience.2016.03.047.
Abstract
Perception of speech sounds is affected by observing facial motion. Incongruence between a speech sound and the observed articulation may alter perception of the auditory syllable, a phenomenon referred to as the McGurk effect. We tested the degree to which silent articulation of a syllable also affects speech perception and searched for its neural correlates. Listeners were instructed to identify the auditory syllables /pa/ and /ta/ while silently articulating congruent or incongruent syllables, or while observing videos of a speaker's face articulating them. As a baseline, we included an auditory-only condition without competing visual or sensorimotor input. As expected, perception of the sounds degraded when incongruent syllables were observed, and also when they were silently articulated, albeit to a lesser extent. This degradation was accompanied by significant amplitude modulations in the beta frequency band in right superior temporal areas. In these areas, the event-related beta activity during congruent conditions was phase-locked to responses evoked during the auditory-only condition. We conclude that proper temporal alignment of different input streams in right superior temporal areas is mandatory for both audiovisual and audiomotor speech integration.
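The beta-band amplitude and phase-locking measures mentioned above can be sketched with standard tools: band-pass filtering plus the Hilbert transform. The following is a generic illustration under assumed parameters (sampling rate, filter order, random stand-in signals), not the authors' analysis pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000.0  # assumed sampling rate in Hz

def beta_analytic(x):
    """Band-pass to the beta band (13-30 Hz) and return the analytic signal."""
    b, a = butter(4, [13 / (fs / 2), 30 / (fs / 2)], btype="band")
    return hilbert(filtfilt(b, a, x))

rng = np.random.default_rng(0)
sig_a = rng.standard_normal(5000)  # stand-in: congruent-condition average
sig_b = rng.standard_normal(5000)  # stand-in: auditory-only average

za, zb = beta_analytic(sig_a), beta_analytic(sig_b)
envelope_a = np.abs(za)  # beta-band amplitude envelope
# Phase-locking value between the two conditions (1 = perfect locking).
plv = np.abs(np.mean(np.exp(1j * (np.angle(za) - np.angle(zb)))))
print(f"beta-band phase-locking value: {plv:.3f}")
```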
Affiliation(s)
- N Komeilipoor
- MOVE Research Institute Amsterdam, Faculty of Behavioural and Movement Sciences, Vrije Universiteit, Van der Boechorststraat 9, 1081BT Amsterdam, The Netherlands; Department of Neurological, Biomedical and Movement Sciences, University of Verona, 37131 Verona, Italy
- P Cesari
- Department of Neurological, Biomedical and Movement Sciences, University of Verona, 37131 Verona, Italy
- A Daffertshofer
- MOVE Research Institute Amsterdam, Faculty of Behavioural and Movement Sciences, Vrije Universiteit, Van der Boechorststraat 9, 1081BT Amsterdam, The Netherlands
4. Skipper JI, Devlin JT, Lametti DR. The hearing ear is always found close to the speaking tongue: Review of the role of the motor system in speech perception. Brain Lang 2017; 164:77-105. PMID: 27821280; DOI: 10.1016/j.bandl.2016.10.004.
Abstract
Does "the motor system" play "a role" in speech perception? If so, where, how, and when? We conducted a systematic review that addresses these questions using both qualitative and quantitative methods. The qualitative review of behavioural, computational modelling, non-human animal, brain damage/disorder, electrical stimulation/recording, and neuroimaging research suggests that distributed brain regions involved in producing speech play specific, dynamic, and contextually determined roles in speech perception. The quantitative review employed region and network based neuroimaging meta-analyses and a novel text mining method to describe relative contributions of nodes in distributed brain networks. Supporting the qualitative review, results show a specific functional correspondence between regions involved in non-linguistic movement of the articulators, covertly and overtly producing speech, and the perception of both nonword and word sounds. This distributed set of cortical and subcortical speech production regions are ubiquitously active and form multiple networks whose topologies dynamically change with listening context. Results are inconsistent with motor and acoustic only models of speech perception and classical and contemporary dual-stream models of the organization of language and the brain. Instead, results are more consistent with complex network models in which multiple speech production related networks and subnetworks dynamically self-organize to constrain interpretation of indeterminant acoustic patterns as listening context requires.
Affiliation(s)
- Jeremy I Skipper
- Experimental Psychology, University College London, United Kingdom
- Joseph T Devlin
- Experimental Psychology, University College London, United Kingdom
- Daniel R Lametti
- Experimental Psychology, University College London, United Kingdom; Department of Experimental Psychology, University of Oxford, United Kingdom
5. Tiainen M, Tiippana K, Vainio M, Peromaa T, Komeilipoor N, Vainio L. Selective Influences of Precision and Power Grips on Speech Categorization. PLoS One 2016; 11:e0151688. PMID: 26978074; PMCID: PMC4792373; DOI: 10.1371/journal.pone.0151688.
Abstract
Recent studies have shown that articulatory gestures are systematically associated with specific manual grip actions. Here we show that executing such actions can influence performance on a speech-categorization task. Participants watched and/or listened to speech stimuli while executing either a power or a precision grip. Performing the grip influenced syllable categorization by increasing the proportion of responses for the syllable congruent with the executed grip (power grip with [ke], precision grip with [te]). Two follow-up experiments indicated that the effect was based on an action-induced bias in selecting the syllable.
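The congruency effect implied above reduces to comparing response proportions by grip. A toy computation with invented counts (not the study's data) makes the logic concrete: the bias shows up as a higher proportion of grip-congruent syllable reports in each grip condition.

```python
# Hypothetical response counts (illustrative only):
# outer key = executed grip, inner key = syllable reported.
counts = {
    "power":     {"ke": 62, "te": 38},
    "precision": {"ke": 41, "te": 59},
}

for grip, responses in counts.items():
    # Assumed congruency mapping from the abstract: power-[ke], precision-[te].
    congruent = "ke" if grip == "power" else "te"
    total = sum(responses.values())
    print(f"{grip}: P(congruent syllable) = {responses[congruent] / total:.2f}")
```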
Affiliation(s)
- Mikko Tiainen
- Division of Cognitive and Neuropsychology, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
- Kaisa Tiippana
- Division of Cognitive and Neuropsychology, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
- Martti Vainio
- Phonetics and Speech Synthesis Research Group, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
- Tarja Peromaa
- Division of Cognitive and Neuropsychology, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
- Naeem Komeilipoor
- Division of Cognitive and Neuropsychology, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
- Lari Vainio
- Division of Cognitive and Neuropsychology, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
6. Katz WF, Mehta S. Visual Feedback of Tongue Movement for Novel Speech Sound Learning. Front Hum Neurosci 2015; 9:612. PMID: 26635571; PMCID: PMC4652268; DOI: 10.3389/fnhum.2015.00612.
Abstract
Pronunciation training studies have yielded important insights concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception compared to monolingual individuals. However, little is known about the role of viewing one's own speech articulation during speech training. The current study investigated whether real-time visual feedback of tongue movement can improve a speaker's learning of non-native speech sounds. An interactive 3D tongue visualization system based on electromagnetic articulography (EMA) was used in a speech training experiment. Native speakers of American English produced a novel speech sound (/ɖ/, a voiced, coronal, palatal stop) before, during, and after trials in which they viewed their own speech movements using the 3D model. Talkers' productions were evaluated using kinematic (tongue-tip spatial positioning) and acoustic (burst spectra) measures. The results indicated a rapid gain in accuracy associated with visual feedback training. The findings are discussed with respect to neural models of multimodal speech processing.
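A burst-spectrum measure like the acoustic one above is typically a short-window magnitude spectrum summarized by a spectral moment such as the centroid. The sketch below illustrates that generic computation under assumed parameters (sampling rate, window choice, random stand-in samples); it is not the authors' procedure.

```python
import numpy as np

fs = 22050  # assumed sampling rate in Hz
# Stand-in for a short windowed segment around a stop-consonant burst.
burst = np.random.default_rng(0).standard_normal(512)

# Magnitude spectrum of the Hamming-windowed burst segment.
spec = np.abs(np.fft.rfft(burst * np.hamming(len(burst))))
freqs = np.fft.rfftfreq(len(burst), d=1 / fs)

# Spectral centroid: one common scalar summary of a burst spectrum.
centroid = np.sum(freqs * spec) / np.sum(spec)
print(f"burst spectral centroid: {centroid:.0f} Hz")
```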
Affiliation(s)
- William F Katz
- Speech Production Lab, Callier Center for Communication Disorders, School of Behavioral and Brain Sciences, The University of Texas at Dallas, Dallas, TX, USA
- Sonya Mehta
- Speech Production Lab, Callier Center for Communication Disorders, School of Behavioral and Brain Sciences, The University of Texas at Dallas, Dallas, TX, USA
7. Guellaï B, Streri A, Yeung HH. The development of sensorimotor influences in the audiovisual speech domain: some critical questions. Front Psychol 2014; 5:812. PMID: 25147528; PMCID: PMC4123602; DOI: 10.3389/fpsyg.2014.00812.
Abstract
Speech researchers have long been interested in how auditory and visual speech signals are integrated, and recent work has revived interest in the role of speech production with respect to this process. Here, we discuss these issues from a developmental perspective. Because speech perception abilities typically outstrip speech production abilities in infancy and childhood, it is unclear how speech-like movements could influence audiovisual speech perception in development. While work on this question is still in its preliminary stages, there is nevertheless increasing evidence that sensorimotor processes (defined here as any motor or proprioceptive process related to orofacial movements) affect developmental audiovisual speech processing. We suggest three areas on which to focus in future research: (i) the relation between audiovisual speech perception and sensorimotor processes at birth, (ii) the pathways through which sensorimotor processes interact with audiovisual speech processing in infancy, and (iii) developmental change in sensorimotor pathways as speech production emerges in childhood.
Affiliation(s)
- Bahia Guellaï
- Laboratoire Ethologie, Cognition, Développement, Université Paris Ouest Nanterre La Défense, Nanterre, France
- Arlette Streri
- CNRS, Laboratoire Psychologie de la Perception, UMR 8242, Paris, France
- H. Henny Yeung
- CNRS, Laboratoire Psychologie de la Perception, UMR 8242, Paris, France
- Université Paris Descartes, Paris Sorbonne Cité, Paris, France
8. Carbonell KM, Lotto AJ. Speech is not special… again. Front Psychol 2014; 5:427. PMID: 24917830; PMCID: PMC4042079; DOI: 10.3389/fpsyg.2014.00427.
Affiliation(s)
- Andrew J. Lotto
- Department of Speech, Language and Hearing Sciences, University of Arizona, Tucson, AZ, USA