1
Rizzi R, Bidelman GM. Functional benefits of continuous vs. categorical listening strategies on the neural encoding and perception of noise-degraded speech. bioRxiv: The Preprint Server for Biology 2024:2024.05.15.594387. [PMID: 38798410 PMCID: PMC11118460 DOI: 10.1101/2024.05.15.594387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Acoustic information in speech changes continuously, yet listeners form discrete perceptual categories to ease the demands of perception. Being a more continuous/gradient as opposed to a discrete/categorical listener may be further advantageous for understanding speech in noise by increasing perceptual flexibility and resolving ambiguity. The degree to which a listener's responses to a continuum of speech sounds are categorical versus continuous can be quantified using visual analog scaling (VAS) during speech labeling tasks. Here, we recorded event-related brain potentials (ERPs) to vowels along an acoustic-phonetic continuum (/u/ to /a/) while listeners categorized phonemes in both clean and noise conditions. Behavior was assessed using standard two alternative forced choice (2AFC) and VAS paradigms to evaluate categorization under task structures that promote discrete (2AFC) vs. continuous (VAS) hearing, respectively. Behaviorally, identification curves were steeper under 2AFC vs. VAS categorization but were relatively immune to noise, suggesting robust access to abstract, phonetic categories even under signal degradation. Behavioral slopes were positively correlated with listeners' QuickSIN scores, suggesting a behavioral advantage for speech in noise comprehension conferred by gradient listening strategy. At the neural level, electrode level data revealed P2 peak amplitudes of the ERPs were modulated by task and noise; responses were larger under VAS vs. 2AFC categorization and showed larger noise-related delay in latency in the VAS vs. 2AFC condition. More gradient responders also had smaller shifts in ERP latency with noise, suggesting their neural encoding of speech was more resilient to noise degradation. Interestingly, source-resolved ERPs showed that more gradient listening was also correlated with stronger neural responses in left superior temporal gyrus. Our results demonstrate that listening strategy (i.e., being a discrete vs. continuous listener) modulates the categorical organization of speech and behavioral success, with continuous/gradient listening being more advantageous to speech in noise perception.
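The slope comparison described in this abstract can be illustrated with a minimal sketch. It assumes identification data are summarized as the proportion of one response at each continuum step, and it fits a two-parameter logistic by linear regression on the logit scale. The function name `fit_logistic_slope`, the clipping value `eps`, and the toy numbers are illustrative assumptions, not the authors' analysis pipeline.

```python
import math


def fit_logistic_slope(steps, proportions, eps=0.01):
    """Estimate slope k and boundary x0 of p(x) = 1 / (1 + exp(-k * (x - x0))).

    Hypothetical helper: uses ordinary least squares on logit-transformed
    proportions, which is only a rough stand-in for a full psychometric fit.
    """
    # Clip proportions away from 0 and 1 so the logit stays finite.
    logits = []
    for p in proportions:
        p = min(max(p, eps), 1.0 - eps)
        logits.append(math.log(p / (1.0 - p)))

    n = len(steps)
    mean_x = sum(steps) / n
    mean_y = sum(logits) / n
    sxx = sum((x - mean_x) ** 2 for x in steps)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(steps, logits))

    k = sxy / sxx                      # slope on the logit scale
    intercept = mean_y - k * mean_x    # equals -k * x0
    x0 = -intercept / k                # category boundary (p = 0.5)
    return k, x0


# Toy data (made up): a steeper and a shallower identification curve.
steps = [1, 2, 3, 4, 5, 6, 7]
steep = [1 / (1 + math.exp(-1.0 * (x - 4))) for x in steps]
shallow = [1 / (1 + math.exp(-0.5 * (x - 4))) for x in steps]

k_steep, x0_steep = fit_logistic_slope(steps, steep)
k_shallow, x0_shallow = fit_logistic_slope(steps, shallow)
```

Under this convention a larger `k` corresponds to a more categorical (step-like) identification function and a smaller `k` to a more gradient one.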
2
Ren J, Cai L, Jia G, Niu H. Cortical specialization associated with native speech category acquisition in early infancy. Cereb Cortex 2024; 34:bhae124. [PMID: 38566511 DOI: 10.1093/cercor/bhae124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 02/29/2024] [Accepted: 03/08/2024] [Indexed: 04/04/2024]
Abstract
This study investigates neural processes in infant speech processing, with a focus on left frontal brain regions and hemispheric lateralization in Mandarin-speaking infants' acquisition of native tonal categories. We tested 2- to 6-month-old Mandarin learners to explore age-related improvements in tone discrimination, the role of inferior frontal regions in abstract speech category representation, and left hemisphere lateralization during tone processing. Using a block design, we presented four Mandarin tones via [ta] and measured oxygenated hemoglobin concentration with functional near-infrared spectroscopy. Results showed age-related improvements in tone discrimination, greater involvement of frontal regions in older infants indicating abstract tonal representation development and increased bilateral activation mirroring native adult Mandarin speakers. These findings contribute to our broader understanding of the relationship between native speech acquisition and infant brain development during the critical period of early language learning.
Affiliation(s)
- Jie Ren
- Longy School of Music of Bard College, 27 Garden Street, Cambridge, MA 02138, United States
- Lin Cai
- Center for Evolutionary Cognitive Sciences, Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1, Komaba, Meguro-ku, Tokyo 153-8902, Japan
- Gaoding Jia
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, No. 19, Xinjiekouwai St, Haidian District, Beijing 100875, China
- Haijing Niu
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, No. 19, Xinjiekouwai St, Haidian District, Beijing 100875, China
3
Elmer S, Kurthen I, Meyer M, Giroud N. A multidimensional characterization of the neurocognitive architecture underlying age-related temporal speech processing. Neuroimage 2023; 278:120285. [PMID: 37481009 DOI: 10.1016/j.neuroimage.2023.120285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Revised: 07/11/2023] [Accepted: 07/19/2023] [Indexed: 07/24/2023]
Abstract
Healthy aging is often associated with speech comprehension difficulties in everyday life situations despite a pure-tone hearing threshold in the normative range. Drawing on this background, we used a multidimensional approach to assess the functional and structural neural correlates underlying age-related temporal speech processing while controlling for pure-tone hearing acuity. Accordingly, we combined structural magnetic resonance imaging and electroencephalography, and collected behavioral data while younger and older adults completed a phonetic categorization and discrimination task with consonant-vowel syllables varying along a voice-onset time continuum. The behavioral results confirmed age-related temporal speech processing singularities which were reflected in a shift of the boundary of the psychometric categorization function, with older adults perceiving more syllables characterized by a short voice-onset time as /ta/ compared to younger adults. Furthermore, despite the absence of any between-group differences in phonetic discrimination abilities, older adults demonstrated longer N100/P200 latencies as well as increased P200 amplitudes while processing the consonant-vowel syllables varying in voice-onset time. Finally, older adults also exhibited a divergent anatomical gray matter infrastructure in bilateral auditory-related and frontal brain regions, as manifested in reduced cortical thickness and surface area. Notably, in the younger adults but not in the older adult cohort, cortical surface area in these two gross anatomical clusters correlated with the categorization of consonant-vowel syllables characterized by a short voice-onset time, suggesting the existence of a critical gray matter threshold that is crucial for consistent mapping of phonetic categories varying along the temporal dimension. Taken together, our results highlight the multifaceted dimensions of age-related temporal speech processing characteristics, and pave the way toward a better understanding of the relationships between hearing, speech and the brain in older age.
Affiliation(s)
- Stefan Elmer
- Department of Computational Linguistics, Computational Neuroscience of Speech & Hearing, University of Zurich, Zurich, Switzerland; Competence center Language & Medicine, University of Zurich, Switzerland.
- Ira Kurthen
- Department of Computational Linguistics, Computational Neuroscience of Speech & Hearing, University of Zurich, Zurich, Switzerland
- Martin Meyer
- Department of Comparative Language Science, University of Zurich, Zurich, Switzerland; Center for Neuroscience Zurich, University and ETH of Zurich, Zurich, Switzerland; Center for the Interdisciplinary Study of Language Evolution (ISLE), University of Zurich, Zurich, Switzerland; Cognitive Psychology Unit, Alpen-Adria University, Klagenfurt, Austria
- Nathalie Giroud
- Department of Computational Linguistics, Computational Neuroscience of Speech & Hearing, University of Zurich, Zurich, Switzerland; Center for Neuroscience Zurich, University and ETH of Zurich, Zurich, Switzerland; Competence center Language & Medicine, University of Zurich, Switzerland
4
Kong EJ, Kang S. Individual Differences in Categorical Judgment of L2 Stops: A Link to Proficiency and Acoustic Cue-Weighting. Language and Speech 2023; 66:354-380. [PMID: 35822267 DOI: 10.1177/00238309221108647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
This study investigated individual differences in Korean adult learners' categorical perception of L2 English stops with an aim to explore the relationship of gradient categorizations to perceptual sensitivity to acoustic cues and L2 proficiency. Korean young adult L2 learners of English (N = 49) participated in two speech perception tasks (visual analog scaling and forced-choice identification) in which they listened to English voiced and voiceless stops and Korean lax and aspirated stops with Voice Onset Time (VOT) and F0 manipulated to form a continuum. It was found that in both L1 and L2 stop perception, listeners' gradient category judgment was associated with greater reliance on language-specific redundant cues (i.e., F0 in L2 English and VOT in L1 Korean) and that in the perception of L2 stops, categorical listeners who tended to be less sensitive to F0 were the ones with a higher level of L2 English proficiency. The results suggest that the categorical manner of judging L2 stops reflects learners' better knowledge of L2-specific acoustic cue-weightings, based on which less relevant acoustic information is effectively suppressed.
Affiliation(s)
- Eun Jong Kong
- Department of English, Korea Aerospace University, South Korea
- Soyoung Kang
- School of Linguistics and Language Studies, Carleton University, Canada
5
Luthra S, Mechtenberg H, Giorio C, Theodore RM, Magnuson JS, Myers EB. Using TMS to evaluate a causal role for right posterior temporal cortex in talker-specific phonetic processing. Brain and Language 2023; 240:105264. [PMID: 37087863 PMCID: PMC10286152 DOI: 10.1016/j.bandl.2023.105264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Revised: 04/06/2023] [Accepted: 04/08/2023] [Indexed: 05/03/2023]
Abstract
Theories suggest that speech perception is informed by listeners' beliefs of what phonetic variation is typical of a talker. A previous fMRI study found right middle temporal gyrus (RMTG) sensitivity to whether a phonetic variant was typical of a talker, consistent with literature suggesting that the right hemisphere may play a key role in conditioning phonetic identity on talker information. The current work used transcranial magnetic stimulation (TMS) to test whether the RMTG plays a causal role in processing talker-specific phonetic variation. Listeners were exposed to talkers who differed in how they produced voiceless stop consonants while TMS was applied to RMTG, left MTG, or scalp vertex. Listeners subsequently showed near-ceiling performance in indicating which of two variants was typical of a trained talker, regardless of previous stimulation site. Thus, even though the RMTG is recruited for talker-specific phonetic processing, modulation of its function may have only modest consequences.
Affiliation(s)
- James S Magnuson
- University of Connecticut, United States; BCBL. Basque Center on Cognition Brain and Language, Donostia-San Sebastián, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain
6
Shekari E, Nozari N. A narrative review of the anatomy and function of the white matter tracts in language production and comprehension. Front Hum Neurosci 2023; 17:1139292. [PMID: 37051488 PMCID: PMC10083342 DOI: 10.3389/fnhum.2023.1139292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 02/24/2023] [Indexed: 03/28/2023]
Abstract
Much is known about the role of cortical areas in language processing. The shift towards network approaches in recent years has highlighted the importance of uncovering the role of white matter in connecting these areas. However, despite a large body of research, many of these tracts’ functions are not well-understood. We present a comprehensive review of the empirical evidence on the role of eight major tracts that are hypothesized to be involved in language processing (inferior longitudinal fasciculus, inferior fronto-occipital fasciculus, uncinate fasciculus, extreme capsule, middle longitudinal fasciculus, superior longitudinal fasciculus, arcuate fasciculus, and frontal aslant tract). For each tract, we hypothesize its role based on the function of the cortical regions it connects. We then evaluate these hypotheses with data from three sources: studies in neurotypical individuals, neuropsychological data, and intraoperative stimulation studies. Finally, we summarize the conclusions supported by the data and highlight the areas needing further investigation.
Affiliation(s)
- Ehsan Shekari
- Department of Neuroscience, Iran University of Medical Sciences, Tehran, Iran
- Nazbanou Nozari
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, United States
- Center for the Neural Basis of Cognition (CNBC), Pittsburgh, PA, United States
- *Correspondence: Nazbanou Nozari
7
Luthra S, Magnuson JS, Myers EB. Right Posterior Temporal Cortex Supports Integration of Phonetic and Talker Information. Neurobiology of Language (Cambridge, Mass.) 2023; 4:145-177. [PMID: 37229142 PMCID: PMC10205075 DOI: 10.1162/nol_a_00091] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/04/2022] [Accepted: 11/08/2022] [Indexed: 05/27/2023]
Abstract
Though the right hemisphere has been implicated in talker processing, it is thought to play a minimal role in phonetic processing, at least relative to the left hemisphere. Recent evidence suggests that the right posterior temporal cortex may support learning of phonetic variation associated with a specific talker. In the current study, listeners heard a male talker and a female talker, one of whom produced an ambiguous fricative in /s/-biased lexical contexts (e.g., epi?ode) and one who produced it in /ʃ/-biased contexts (e.g., friend?ip). Listeners in a behavioral experiment (Experiment 1) showed evidence of lexically guided perceptual learning, categorizing ambiguous fricatives in line with their previous experience. Listeners in an fMRI experiment (Experiment 2) showed differential phonetic categorization as a function of talker, allowing for an investigation of the neural basis of talker-specific phonetic processing, though they did not exhibit perceptual learning (likely due to characteristics of our in-scanner headphones). Searchlight analyses revealed that the patterns of activation in the right superior temporal sulcus (STS) contained information about who was talking and what phoneme they produced. We take this as evidence that talker information and phonetic information are integrated in the right STS. Functional connectivity analyses suggested that the process of conditioning phonetic identity on talker information depends on the coordinated activity of a left-lateralized phonetic processing system and a right-lateralized talker processing system. Overall, these results clarify the mechanisms through which the right hemisphere supports talker-specific phonetic processing.
Affiliation(s)
- Sahil Luthra
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- James S. Magnuson
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- Basque Center on Cognition Brain and Language (BCBL), Donostia-San Sebastián, Spain
- Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Emily B. Myers
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT, USA
8
Moinuddin KA, Havugimana F, Al-Fahad R, Bidelman GM, Yeasin M. Unraveling Spatial-Spectral Dynamics of Speech Categorization Speed Using Convolutional Neural Networks. Brain Sci 2022; 13:brainsci13010075. [PMID: 36672055 PMCID: PMC9856675 DOI: 10.3390/brainsci13010075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 12/22/2022] [Accepted: 12/24/2022] [Indexed: 12/31/2022]
Abstract
The process of categorizing sounds into distinct phonetic categories is known as categorical perception (CP). Response times (RTs) provide a measure of perceptual difficulty during labeling decisions (i.e., categorization). The RT is quasi-stochastic in nature due to individuality and variations in perceptual tasks. To identify the source of RT variation in CP, we have built models to decode the brain regions and frequency bands driving fast, medium and slow response decision speeds. In particular, we implemented a parameter-optimized convolutional neural network (CNN) to classify listeners' behavioral RTs from their neural EEG data. We adopted visual interpretation of model response using Guided-GradCAM to identify spatial-spectral correlates of RT. Our framework includes (but is not limited to): (i) a data augmentation technique designed to reduce noise and control the overall variance of the EEG dataset; (ii) bandpower topomaps to learn the spatial-spectral representation using CNN; (iii) large-scale Bayesian hyper-parameter optimization to find the best-performing CNN model; (iv) ANOVA and posthoc analysis on Guided-GradCAM activation values to measure the effect of neural regions and frequency bands on behavioral responses. Using this framework, we observe that α-β (10-20 Hz) activity over left frontal, right prefrontal/frontal, and right cerebellar regions is correlated with RT variation. Our results indicate that attention, template matching, temporal prediction of acoustics, motor control, and decision uncertainty are the most probable factors in RT variation.
Affiliation(s)
- Felix Havugimana
- Department of EECE, University of Memphis, Memphis, TN 38152, USA
- Rakib Al-Fahad
- Department of EECE, University of Memphis, Memphis, TN 38152, USA
- Gavin M. Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN 47408, USA
- Mohammed Yeasin
- Department of EECE, University of Memphis, Memphis, TN 38152, USA
9
Dole M, Vilain C, Haldin C, Baciu M, Cousin E, Lamalle L, Lœvenbruck H, Vilain A, Schwartz JL. Comparing the selectivity of vowel representations in cortical auditory vs. motor areas: A repetition-suppression study. Neuropsychologia 2022; 176:108392. [DOI: 10.1016/j.neuropsychologia.2022.108392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2022] [Revised: 09/22/2022] [Accepted: 10/03/2022] [Indexed: 10/31/2022]
10
McMurray B, Sarrett ME, Chiu S, Black AK, Wang A, Canale R, Aslin RN. Decoding the temporal dynamics of spoken word and nonword processing from EEG. Neuroimage 2022; 260:119457. [PMID: 35842096 PMCID: PMC10875705 DOI: 10.1016/j.neuroimage.2022.119457] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/13/2022] [Revised: 07/02/2022] [Accepted: 07/06/2022] [Indexed: 11/23/2022]
Abstract
The efficiency of spoken word recognition is essential for real-time communication. There is consensus that this efficiency relies on an implicit process of activating multiple word candidates that compete for recognition as the acoustic signal unfolds in real-time. However, few methods capture the neural basis of this dynamic competition on a msec-by-msec basis. This is crucial for understanding the neuroscience of language, and for understanding hearing, language and cognitive disorders in people for whom current behavioral methods are not suitable. We applied machine-learning techniques to standard EEG signals to decode which word was heard on each trial and analyzed the patterns of confusion over time. Results mirrored psycholinguistic findings: Early on, the decoder was equally likely to report the target (e.g., baggage) or a similar sounding competitor (badger), but by around 500 msec, competitors were suppressed. Follow up analyses show that this is robust across EEG systems (gel and saline), with fewer channels, and with fewer trials. Results are robust within individuals and show high reliability. This suggests a powerful and simple paradigm that can assess the neural dynamics of speech decoding, with potential applications for understanding lexical development in a variety of clinical disorders.
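The "patterns of confusion over time" described in this abstract can be sketched as follows. This is a minimal illustration that assumes the decoder has already produced one word label per trial in each time bin; the function name `report_proportions`, the data layout, and the toy labels (including the unrelated word `mushroom`) are hypothetical and are not taken from the study.

```python
from collections import Counter


def report_proportions(predictions, target, competitor):
    """Return (p_target, p_competitor, p_other) for each time bin.

    predictions: list of time bins; each bin is a list of decoded labels,
    one per trial on which `target` was actually heard (assumed layout).
    """
    curves = []
    for labels in predictions:
        counts = Counter(labels)
        n = len(labels)
        p_target = counts[target] / n
        p_competitor = counts[competitor] / n
        curves.append((p_target, p_competitor, 1.0 - p_target - p_competitor))
    return curves


# Toy decoder output (made up): an early bin and a late bin, four trials each.
early = ["baggage", "badger", "baggage", "badger"]
late = ["baggage", "baggage", "baggage", "mushroom"]
curves = report_proportions([early, late], target="baggage", competitor="badger")
```

In this toy case the target and competitor are reported equally often in the early bin, while the competitor is no longer reported in the late bin, which is the qualitative pattern the abstract describes.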
Affiliation(s)
- Bob McMurray
- Dept. of Psychological and Brain Sciences, Dept. of Communication Sciences and Disorders, Dept. of Linguistics and Dept. of Otolaryngology, University of Iowa.
- McCall E Sarrett
- Interdisciplinary Graduate Program in Neuroscience, University of Iowa
- Samantha Chiu
- Dept. of Psychological and Brain Sciences, University of Iowa
- Alexis K Black
- School of Audiology and Speech Sciences, University of British Columbia, Haskins Laboratories
- Alice Wang
- Dept. of Psychology, University of Oregon, Haskins Laboratories
- Rebecca Canale
- Dept. of Psychological Sciences, University of Connecticut, Haskins Laboratories
- Richard N Aslin
- Haskins Laboratories, Department of Psychology and Child Study Center, Yale University, Department of Psychology, University of Connecticut
11
Preisig BC, Riecke L, Hervais-Adelman A. Speech sound categorization: The contribution of non-auditory and auditory cortical regions. Neuroimage 2022; 258:119375. [PMID: 35700949 DOI: 10.1016/j.neuroimage.2022.119375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 05/13/2022] [Accepted: 06/10/2022] [Indexed: 11/26/2022]
Abstract
Which processes in the human brain lead to the categorical perception of speech sounds? Investigation of this question is hampered by the fact that categorical speech perception is normally confounded by acoustic differences in the stimulus. By using ambiguous sounds, however, it is possible to dissociate acoustic from perceptual stimulus representations. Twenty-seven normally hearing individuals took part in an fMRI study in which they were presented with an ambiguous syllable (intermediate between /da/ and /ga/) in one ear and with disambiguating acoustic feature (third formant, F3) in the other ear. Multi-voxel pattern searchlight analysis was used to identify brain areas that consistently differentiated between response patterns associated with different syllable reports. By comparing responses to different stimuli with identical syllable reports and identical stimuli with different syllable reports, we disambiguated whether these regions primarily differentiated the acoustics of the stimuli or the syllable report. We found that BOLD activity patterns in left perisylvian regions (STG, SMG), left inferior frontal regions (vMC, IFG, AI), left supplementary motor cortex (SMA/pre-SMA), and right motor and somatosensory regions (M1/S1) represent listeners' syllable report irrespective of stimulus acoustics. Most of these regions are outside of what is traditionally regarded as auditory or phonological processing areas. Our results indicate that the process of speech sound categorization implicates decision-making mechanisms and auditory-motor transformations.
Affiliation(s)
- Basil C Preisig
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6500 HB Nijmegen, The Netherlands; Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands; Department of Psychology, Neurolinguistics, University of Zurich, 8050 Zurich, Switzerland; Department of Comparative Language Science, Evolutionary Neuroscience of Language, University of Zurich, 8050 Zurich, Switzerland; Neuroscience Center Zurich, University of Zurich and Eidgenössische Technische Hochschule Zurich, 8057 Zurich, Switzerland.
- Lars Riecke
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 ER Maastricht, The Netherlands
- Alexis Hervais-Adelman
- Department of Psychology, Neurolinguistics, University of Zurich, 8050 Zurich, Switzerland; Neuroscience Center Zurich, University of Zurich and Eidgenössische Technische Hochschule Zurich, 8057 Zurich, Switzerland
12
Tamura S, Hirose N, Mitsudo T, Hoaki N, Nakamura I, Onitsuka T, Hirano Y. Multi-modal imaging of the auditory-larynx motor network for voicing perception. Neuroimage 2022; 251:118981. [PMID: 35150835 DOI: 10.1016/j.neuroimage.2022.118981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2021] [Revised: 12/20/2021] [Accepted: 02/07/2022] [Indexed: 10/19/2022]
Abstract
Voicing is one of the most important characteristics of phonetic speech sounds. Despite its importance, voicing perception mechanisms remain largely unknown. To explore auditory-motor networks associated with voicing perception, we first examined the brain regions that showed common activity for voicing production and perception using functional magnetic resonance imaging. Results indicated that the auditory and speech motor areas were activated with the operculum parietale 4 (OP4) during both voicing production and perception. Second, we used magnetoencephalography and examined the dynamical functional connectivity of the auditory-motor networks during a perceptual categorization task of /da/-/ta/ continuum stimuli varying in voice onset time (VOT) from 0 to 40 ms in 10 ms steps. Significant functional connectivity from the auditory cortical regions to the larynx motor area via OP4 was observed only when perceiving the stimulus with VOT 30 ms. In addition, regional activity analysis showed that the neural representation of VOT in the auditory cortical regions was mostly correlated with categorical perception of voicing but did not reflect the perception of the stimulus with VOT 30 ms. We suggest that the larynx motor area, which is considered to play a crucial role in voicing production, contributes to categorical perception of voicing by complementing the temporal processing in the auditory cortical regions.
Affiliation(s)
- Shunsuke Tamura
- Department of Neuropsychiatry, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashiku, Fukuoka 812-8582, Japan.
- Nobuyuki Hirose
- Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
- Takako Mitsudo
- Department of Neuropsychiatry, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashiku, Fukuoka 812-8582, Japan
- Itta Nakamura
- Department of Neuropsychiatry, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashiku, Fukuoka 812-8582, Japan
- Toshiaki Onitsuka
- Department of Neuropsychiatry, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashiku, Fukuoka 812-8582, Japan
- Yoji Hirano
- Department of Neuropsychiatry, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashiku, Fukuoka 812-8582, Japan; Neural Dynamics Laboratory, Research Service, VA Boston Healthcare System, and Department of Psychiatry, Harvard Medical School, Boston, United States
13
Abstract
Human speech perception results from neural computations that transform external acoustic speech signals into internal representations of words. The superior temporal gyrus (STG) contains the nonprimary auditory cortex and is a critical locus for phonological processing. Here, we describe how speech sound representation in the STG relies on fundamentally nonlinear and dynamical processes, such as categorization, normalization, contextual restoration, and the extraction of temporal structure. A spatial mosaic of local cortical sites on the STG exhibits complex auditory encoding for distinct acoustic-phonetic and prosodic features. We propose that as a population ensemble, these distributed patterns of neural activity give rise to abstract, higher-order phonemic and syllabic representations that support speech perception. This review presents a multi-scale, recurrent model of phonological processing in the STG, highlighting the critical interface between auditory and language systems. Expected final online publication date for the Annual Review of Psychology, Volume 73 is January 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Ilina Bhaya-Grossman
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA; Joint Graduate Program in Bioengineering, University of California, Berkeley and San Francisco, California 94720, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA
14
Scaglione M, Battaglia A, Lamanna A, Cerrato N, Di Donna P, Bertagnin E, Muro M, Alberto Caruzzo C, Gagliardi M, Caponi D. Adjunctive hypnotic communication for analgosedation in subcutaneous implantable cardioverter defibrillator implantation. A prospective single center pilot study. IJC Heart & Vasculature 2021; 35:100839. [PMID: 34307829 PMCID: PMC8287220 DOI: 10.1016/j.ijcha.2021.100839] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/20/2021] [Revised: 06/18/2021] [Accepted: 06/30/2021] [Indexed: 11/04/2022]
Abstract
Background: Subcutaneous implantable cardioverter defibrillator (S-ICD) is a well-established therapy for sudden death prevention. Considering the painful nature of the procedure, anaesthesia may be required for analgo-sedation. Hypnosis is emerging as a promising therapeutic strategy for pain control. Few data are available regarding the use of hypnosis as an adjunctive technique for pain control during S-ICD implantation. Methods: Thirty consecutive patients referred to our centre for S-ICD implantation were prospectively and alternately allocated in a 1:1 ratio to two groups: A) standard analgo-sedation approach (hypnosis non-responder patients); B) standard analgo-sedation approach with the addition of hypnotic communication (hypnosis responder patients). Peri-procedural pain perception and anxiety, perceived procedural length, and type and dosage of administered analgesic drugs were measured using validated scores and compared. Results: Hypnotic communication was offered to 15 patients and was successful in 11 (73%). There were no statistical differences between the two study groups according to baseline characteristics. Hypnotic communication resulted in significantly reduced pain perception (Group A 6.9 ± 1.6 vs. Group B 1.1 ± 0.9, p < 0.01), peri-procedural anxiety (Group A 3.5 ± 1.6 vs. Group B 1.9 ± 0.5, p < 0.01) and perceived procedural length (Group A 58.7 ± 13.4 min vs. Group B 44.7 ± 5.5 min, p < 0.01). Fentanyl dosage was significantly lower in Group B patients. Conclusions: Our results demonstrated a significant reduction of perceived pain, anxiety, procedural time and use of analgesic drugs in hypnosis responder patients. These results reinforce the beneficial effects of the hypnotic technique in patients undergoing S-ICD implantation.
Affiliation(s)
- Marco Scaglione
  - Division of Cardiology, Cardinal G. Massaia Hospital, Asti, Italy
- Andrea Lamanna
  - Division of Cardiology, Cardinal G. Massaia Hospital, Asti, Italy
- Natascia Cerrato
  - Division of Cardiology, Cardinal G. Massaia Hospital, Asti, Italy
- Paolo Di Donna
  - Division of Cardiology, Cardinal G. Massaia Hospital, Asti, Italy
- Enrico Bertagnin
  - Division of Cardiology, Cardinal G. Massaia Hospital, Asti, Italy
- Milena Muro
  - Pain Therapy and Palliative Care, Azienda Ospedaliero-Universitaria Città della Salute e della Scienza di Torino, Italy
- Marco Gagliardi
  - Division of Cardiology, Cardinal G. Massaia Hospital, Asti, Italy
- Domenico Caponi
  - Division of Cardiology, Cardinal G. Massaia Hospital, Asti, Italy
|
15
|
Fuhrmeister P, Myers EB. Structural neural correlates of individual differences in categorical perception. BRAIN AND LANGUAGE 2021; 215:104919. [PMID: 33524740 DOI: 10.1016/j.bandl.2021.104919] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/26/2020] [Revised: 11/18/2020] [Accepted: 01/12/2021] [Indexed: 06/12/2023]
Abstract
Listeners perceive speech sounds categorically. While group-level differences in categorical perception have been observed in children or individuals with reading disorders, recent findings suggest that typical adults vary in how categorically they perceive sounds. The current study investigated neural sources of individual variability in categorical perception of speech. Fifty-seven participants rated phonetic tokens on a visual analogue scale; categoricity and response consistency were measured and related to measures of brain structure from MRI. Increased surface area of the right middle frontal gyrus predicted more categorical perception of a fricative continuum. This finding supports the idea that frontal regions are sensitive to phonetic category-level information and extends it to make behavioral predictions at the individual level. Additionally, more gyrification in bilateral transverse temporal gyri predicted less consistent responses on the task, perhaps reflecting subtle variation in language ability across the population.
Affiliation(s)
- Pamela Fuhrmeister
  - Department of Speech, Language, and Hearing Sciences, University of Connecticut, 2 Alethia Drive, Storrs, CT 06269, United States
- Emily B Myers
  - Department of Speech, Language, and Hearing Sciences, University of Connecticut, 2 Alethia Drive, Storrs, CT 06269, United States
|
16
|
Luthra S. The Role of the Right Hemisphere in Processing Phonetic Variability Between Talkers. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2021; 2:138-151. [PMID: 37213418 PMCID: PMC10174361 DOI: 10.1162/nol_a_00028] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/17/2020] [Accepted: 11/13/2020] [Indexed: 05/23/2023]
Abstract
Neurobiological models of speech perception posit that both left and right posterior temporal brain regions are involved in the early auditory analysis of speech sounds. However, frank deficits in speech perception are not readily observed in individuals with right hemisphere damage. Instead, damage to the right hemisphere is often associated with impairments in vocal identity processing. Herein lies an apparent paradox: The mapping between acoustics and speech sound categories can vary substantially across talkers, so why might right hemisphere damage selectively impair vocal identity processing without obvious effects on speech perception? In this review, I attempt to clarify the role of the right hemisphere in speech perception through a careful consideration of its role in processing vocal identity. I review evidence showing that right posterior superior temporal, right anterior superior temporal, and right inferior / middle frontal regions all play distinct roles in vocal identity processing. In considering the implications of these findings for neurobiological accounts of speech perception, I argue that the recruitment of right posterior superior temporal cortex during speech perception may specifically reflect the process of conditioning phonetic identity on talker information. I suggest that the relative lack of involvement of other right hemisphere regions in speech perception may be because speech perception does not necessarily place a high burden on talker processing systems, and I argue that the extant literature hints at potential subclinical impairments in the speech perception abilities of individuals with right hemisphere damage.
|
17
|
Quillen IA, Yen M, Wilson SM. Distinct neural correlates of linguistic demand and non-linguistic demand. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2021; 2:202-225. [PMID: 34585141 PMCID: PMC8475781 DOI: 10.1162/nol_a_00031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 04/27/2023]
Abstract
In this study, we investigated how the brain responds to task difficulty in linguistic and non-linguistic contexts. This is important for the interpretation of functional imaging studies of neuroplasticity in post-stroke aphasia, because of the inherent difficulty of matching or controlling task difficulty in studies with neurological populations. Twenty neurologically normal individuals were scanned with fMRI as they performed a linguistic task and a non-linguistic task, each of which had two levels of difficulty. Critically, the tasks were matched across domains (linguistic, non-linguistic) for accuracy and reaction time, such that the differences between the easy and difficult conditions were equivalent across domains. We found that non-linguistic demand modulated the same set of multiple demand (MD) regions that have been identified in many prior studies. In contrast, linguistic demand modulated MD regions to a much lesser extent, especially nodes belonging to the dorsal attention network. Linguistic demand modulated a subset of language regions, with the left inferior frontal gyrus most strongly modulated. The right hemisphere region homotopic to Broca's area was also modulated by linguistic but not non-linguistic demand. When linguistic demand was mapped relative to non-linguistic demand, we also observed domain by difficulty interactions in temporal language regions as well as a widespread bilateral semantic network. In sum, linguistic and non-linguistic demand have strikingly different neural correlates. These findings can be used to better interpret studies of patients recovering from aphasia. Some reported activations in these studies may reflect task performance differences, while others can be more confidently attributed to neuroplasticity.
Affiliation(s)
- Ian A Quillen
  - Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Melodie Yen
  - Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Stephen M Wilson
  - Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
|
18
|
Chien PJ, Friederici AD, Hartwigsen G, Sammler D. Intonation processing increases task-specific fronto-temporal connectivity in tonal language speakers. Hum Brain Mapp 2020; 42:161-174. [PMID: 32996647 PMCID: PMC7721241 DOI: 10.1002/hbm.25214] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/16/2020] [Revised: 09/08/2020] [Accepted: 09/13/2020] [Indexed: 01/08/2023]
Abstract
Language comprehension depends on tight functional interactions between distributed brain regions. While these interactions are established for semantic and syntactic processes, the functional network of speech intonation – the linguistic variation of pitch – has been scarcely defined. Particularly little is known about intonation in tonal languages, in which pitch not only serves intonation but also expresses meaning via lexical tones. The present study used psychophysiological interaction analyses of functional magnetic resonance imaging data to characterise the neural networks underlying intonation and tone processing in native Mandarin Chinese speakers. Participants categorised either intonation or tone of monosyllabic Mandarin words that gradually varied between statement and question and between Tone 2 and Tone 4. Intonation processing induced bilateral fronto‐temporal activity and increased functional connectivity between left inferior frontal gyrus and bilateral temporal regions, likely linking auditory perception and labelling of intonation categories in a phonological network. Tone processing induced bilateral temporal activity, associated with the auditory representation of tonal (phonemic) categories. Together, the present data demonstrate the breadth of the functional intonation network in a tonal language including higher‐level phonological processes in addition to auditory representations common to both intonation and tone.
Affiliation(s)
- Pei-Ju Chien
  - International Max Planck Research School NeuroCom, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Otto Hahn Group 'Neural Bases of Intonation in Speech and Music', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Lise Meitner Research Group 'Cognition and Plasticity', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Angela D Friederici
  - Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Gesa Hartwigsen
  - Lise Meitner Research Group 'Cognition and Plasticity', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Daniela Sammler
  - Otto Hahn Group 'Neural Bases of Intonation in Speech and Music', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
|
19
|
Fox NP, Leonard M, Sjerps MJ, Chang EF. Transformation of a temporal speech cue to a spatial neural code in human auditory cortex. eLife 2020; 9:e53051. [PMID: 32840483 PMCID: PMC7556862 DOI: 10.7554/elife.53051] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/25/2019] [Accepted: 08/21/2020] [Indexed: 11/28/2022]
Abstract
In speech, listeners extract continuously-varying spectrotemporal cues from the acoustic signal to perceive discrete phonetic categories. Spectral cues are spatially encoded in the amplitude of responses in phonetically-tuned neural populations in auditory cortex. It remains unknown whether similar neurophysiological mechanisms encode temporal cues like voice-onset time (VOT), which distinguishes sounds like /b/ and /p/. We used direct brain recordings in humans to investigate the neural encoding of temporal speech cues with a VOT continuum from /ba/ to /pa/. We found that distinct neural populations respond preferentially to VOTs from one phonetic category, and are also sensitive to sub-phonetic VOT differences within a population's preferred category. In a simple neural network model, simulated populations tuned to detect either temporal gaps or coincidences between spectral cues captured encoding patterns observed in real neural data. These results demonstrate that a spatial/amplitude neural code underlies the cortical representation of both spectral and temporal speech cues.
Affiliation(s)
- Neal P Fox
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Matthew Leonard
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Matthias J Sjerps
  - Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
  - Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Edward F Chang
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
  - Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, United States
|
20
|
Getz LM, Toscano JC. The time-course of speech perception revealed by temporally-sensitive neural measures. WILEY INTERDISCIPLINARY REVIEWS. COGNITIVE SCIENCE 2020; 12:e1541. [PMID: 32767836 DOI: 10.1002/wcs.1541] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/30/2019] [Revised: 05/28/2020] [Accepted: 06/26/2020] [Indexed: 11/07/2022]
Abstract
Recent advances in cognitive neuroscience have provided a detailed picture of the early time-course of speech perception. In this review, we highlight this work, placing it within the broader context of research on the neurobiology of speech processing, and discuss how these data point us toward new models of speech perception and spoken language comprehension. We focus, in particular, on temporally-sensitive measures that allow us to directly measure early perceptual processes. Overall, the data provide support for two key principles: (a) speech perception is based on gradient representations of speech sounds and (b) speech perception is interactive and receives input from higher-level linguistic context at the earliest stages of cortical processing. Implications for models of speech processing and the neurobiology of language more broadly are discussed. This article is categorized under: Psychology > Language; Psychology > Perception and Psychophysics; Neuroscience > Cognition.
Affiliation(s)
- Laura M Getz
  - Department of Psychological Sciences, University of San Diego, San Diego, California, USA
- Joseph C Toscano
  - Department of Psychological and Brain Sciences, Villanova University, Villanova, Pennsylvania, USA
|
21
|
Rodrigues de Almeida L, Pope PA, Hansen PC. Task load modulates tDCS effects on brain network for phonological processing. Cogn Process 2020; 21:341-363. [PMID: 32152767 PMCID: PMC7381442 DOI: 10.1007/s10339-020-00964-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/06/2019] [Accepted: 02/20/2020] [Indexed: 02/07/2023]
Abstract
Motor participation in phonological processing can be modulated by task nature across the speech perception to speech production range. The pars opercularis of the left inferior frontal gyrus (LIFG) would be increasingly active across this range, because of changing motor demands. Here, we investigated with simultaneous tDCS and fMRI whether the task load modulation of tDCS effects translates into predictable patterns of functional connectivity. Findings were analysed under the "multi-node framework", according to which task load and the network structure underlying cognitive functions are modulators of tDCS effects. In a within-subject study, participants (N = 20) performed categorical perception, lexical decision and word naming tasks [which differentially recruit the target of stimulation (LIFG)], which were repeatedly administered in three tDCS sessions (anodal, cathodal and sham). The LIFG, left superior temporal gyrus and their right homologues formed the target network subserving phonological processing. C-tDCS inhibition and A-tDCS excitation should increase with task load. Correspondingly, the larger the task load, the larger the relevance of the target for the task and smaller the room for compensation of C-tDCS inhibition by less relevant nodes. Functional connectivity analyses were performed with partial correlations, and network compensation globally inferred by comparing the relative number of significant connections each condition induced relative to sham. Overall, simultaneous tDCS and fMRI was adequate to show that motor participation in phonological processing is modulated by task nature. Network responses induced by C-tDCS across phonological processing tasks matched predictions. A-tDCS effects were attributed to optimisation of network efficiency.
Affiliation(s)
- Paul A Pope
  - School of Psychology, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
- Peter C Hansen
  - School of Psychology, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
|
22
|
Saltzman DI, Myers EB. Neural Representation of Articulable and Inarticulable Novel Sound Contrasts: The Role of the Dorsal Stream. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2020; 1:339-364. [PMID: 35784619 PMCID: PMC9248853 DOI: 10.1162/nol_a_00016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2019] [Accepted: 05/23/2020] [Indexed: 06/15/2023]
Abstract
The extent that articulatory information embedded in incoming speech contributes to the formation of new perceptual categories for speech sounds has been a matter of discourse for decades. It has been theorized that the acquisition of new speech sound categories requires a network of sensory and speech motor cortical areas (the "dorsal stream") to successfully integrate auditory and articulatory information. However, it is possible that these brain regions are not sensitive specifically to articulatory information, but instead are sensitive to the abstract phonological categories being learned. We tested this hypothesis by training participants over the course of several days on an articulable non-native speech contrast and acoustically matched inarticulable nonspeech analogues. After reaching comparable levels of proficiency with the two sets of stimuli, activation was measured in fMRI as participants passively listened to both sound types. Decoding of category membership for the articulable speech contrast alone revealed a series of left and right hemisphere regions outside of the dorsal stream that have previously been implicated in the emergence of non-native speech sound categories, while no regions could successfully decode the inarticulable nonspeech contrast. Although activation patterns in the left inferior frontal gyrus, the middle temporal gyrus, and the supplementary motor area provided better information for decoding articulable (speech) sounds compared to the inarticulable (sine wave) sounds, the finding that dorsal stream regions do not emerge as good decoders of the articulable contrast alone suggests that other factors, including the strength and structure of the emerging speech categories are more likely drivers of dorsal stream activation for novel sound learning.
|
23
|
Musicians use speech-specific areas when processing tones: The key to their superior linguistic competence? Behav Brain Res 2020; 390:112662. [PMID: 32442547 DOI: 10.1016/j.bbr.2020.112662] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/07/2019] [Revised: 04/21/2020] [Accepted: 04/22/2020] [Indexed: 11/23/2022]
Abstract
It is known that musicians compared to non-musicians have some superior speech and language competence, yet the mechanisms how musical training leads to this advantage are not well specified. This event-related fMRI study confirmed that musicians outperformed non-musicians in processing not only of musical tones but also syllables and identified a network differentiating musicians from non-musicians during processing of linguistic sounds. Within this network, the activation of bilateral superior temporal gyrus was shared with all subjects during processing of the acoustically well-matched musical and linguistic sounds, and with the activation distinguishing tones with a complex harmonic spectrum (bowed tone) from a simpler one (plucked tone). These results confirm that better speech processing in musicians relies on improved cross-domain spectral analysis. Activation of left posterior superior temporal sulcus (pSTS), premotor cortex, inferior frontal and fusiform gyrus (FG) also distinguishing musicians from non-musicians during syllable processing overlapped with the activation segregating linguistic from musical sounds in all subjects. Since these brain-regions were not involved during tone processing in non-musicians, they could code for functions which are specialized for speech. Musicians recruited pSTS and FG during tone processing, thus these speech-specialized brain-areas processed musical sounds in the presence of musical training. This study shows that the linguistic advantage of musicians is linked not only to improved cross-domain spectral analysis, but also to the functional adaptation of brain resources that are specialized for speech, but accessible to the domain of music in the presence of musical training.
|
24
|
Guediche S, Zhu Y, Minicucci D, Blumstein SE. Written sentence context effects on acoustic-phonetic perception: fMRI reveals cross-modal semantic-perceptual interactions. BRAIN AND LANGUAGE 2019; 199:104698. [PMID: 31586792 DOI: 10.1016/j.bandl.2019.104698] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/13/2018] [Revised: 09/15/2019] [Accepted: 09/18/2019] [Indexed: 06/10/2023]
Abstract
This study examines cross-modality effects of a semantically-biased written sentence context on the perception of an acoustically-ambiguous word target identifying neural areas sensitive to interactions between sentential bias and phonetic ambiguity. Of interest is whether the locus or nature of the interactions resembles those previously demonstrated for auditory-only effects. FMRI results show significant interaction effects in right mid-middle temporal gyrus (RmMTG) and bilateral anterior superior temporal gyri (aSTG), regions along the ventral language comprehension stream that map sound onto meaning. These regions are more anterior than those previously identified for auditory-only effects; however, the same cross-over interaction pattern emerged implying similar underlying computations at play. The findings suggest that the mechanisms that integrate information across modality and across sentence and phonetic levels of processing recruit amodal areas where reading and spoken lexical and semantic access converge. Taken together, results support interactive accounts of speech and language processing.
Affiliation(s)
- Sara Guediche
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States
  - BCBL - Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, Spain
- Yuli Zhu
  - Neuroscience Department, Brown University, United States
- Domenic Minicucci
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States
- Sheila E Blumstein
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States
  - Brown Institute for Brain Science, Brown University, United States
|
25
|
Feng G, Gan Z, Wang S, Wong PCM, Chandrasekaran B. Task-General and Acoustic-Invariant Neural Representation of Speech Categories in the Human Brain. Cereb Cortex 2019; 28:3241-3254. [PMID: 28968658 DOI: 10.1093/cercor/bhx195] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 12/14/2016] [Accepted: 07/13/2017] [Indexed: 11/14/2022]
Abstract
A significant neural challenge in speech perception includes extracting discrete phonetic categories from continuous and multidimensional signals despite varying task demands and surface-acoustic variability. While neural representations of speech categories have been previously identified in frontal and posterior temporal-parietal regions, the task dependency and dimensional specificity of these neural representations are still unclear. Here, we asked native Mandarin participants to listen to speech syllables carrying 4 distinct lexical tone categories across passive listening, repetition, and categorization tasks while they underwent functional magnetic resonance imaging (fMRI). We used searchlight classification and representational similarity analysis (RSA) to identify the dimensional structure underlying neural representation across tasks and surface-acoustic properties. Searchlight classification analyses revealed significant "cross-task" lexical tone decoding within the bilateral superior temporal gyrus (STG) and left inferior parietal lobule (LIPL). RSA revealed that the LIPL and LSTG, in contrast to the RSTG, relate to 2 critical dimensions (pitch height, pitch direction) underlying tone perception. Outside this core representational network, we found greater activation in the inferior frontal and parietal regions for stimuli that are more perceptually similar during tone categorization. Our findings reveal the specific characteristics of fronto-tempo-parietal regions that support speech representation and categorization processing.
Affiliation(s)
- Gangyi Feng
  - Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
  - Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
  - Department of Communication Sciences & Disorders, Moody College of Communication, The University of Texas at Austin, 2504A Whitis Avenue (A1100), Austin, TX, USA
- Zhenzhong Gan
  - Center for the Study of Applied Psychology and School of Psychology, South China Normal University, Guangzhou, China
- Suiping Wang
  - Center for the Study of Applied Psychology and School of Psychology, South China Normal University, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou, China
- Patrick C M Wong
  - Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
  - Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Bharath Chandrasekaran
  - Department of Communication Sciences & Disorders, Moody College of Communication, The University of Texas at Austin, 2504A Whitis Avenue (A1100), Austin, TX, USA
  - Department of Psychology, The University of Texas at Austin, 108 E. Dean Keeton Stop, Austin, TX, USA
  - Department of Linguistics, The University of Texas at Austin, 305 E. 23rd Street STOP, Austin, TX, USA
  - Institute for Mental Health Research, College of Liberal Arts, The University of Texas at Austin, 305 E. 23rd St. Stop, Austin, TX, USA
  - The Institute for Neuroscience, The University of Texas at Austin, 1 University Station Stop, Austin, TX, USA
|
26
|
Qi Z, Han M, Wang Y, de los Angeles C, Liu Q, Garel K, Chen ES, Whitfield-Gabrieli S, Gabrieli JD, Perrachione TK. Speech processing and plasticity in the right hemisphere predict variation in adult foreign language learning. Neuroimage 2019; 192:76-87. [DOI: 10.1016/j.neuroimage.2019.03.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 10/15/2018] [Revised: 02/20/2019] [Accepted: 03/04/2019] [Indexed: 02/04/2023]
|
27
|
Luthra S, Guediche S, Blumstein SE, Myers EB. Neural substrates of subphonemic variation and lexical competition in spoken word recognition. LANGUAGE, COGNITION AND NEUROSCIENCE 2019; 34:151-169. [PMID: 31106225 PMCID: PMC6516505 DOI: 10.1080/23273798.2018.1531140] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 05/09/2023]
Abstract
In spoken word recognition, subphonemic variation influences lexical activation, with sounds near a category boundary increasing phonetic competition as well as lexical competition. The current study investigated the interplay of these factors using a visual world task in which participants were instructed to look at a picture of an auditory target (e.g., peacock). Eyetracking data indicated that participants were slowed when a voiced onset competitor (e.g., beaker) was also displayed, and this effect was amplified when acoustic-phonetic competition was increased. Simultaneously-collected fMRI data showed that several brain regions were sensitive to the presence of the onset competitor, including the supramarginal, middle temporal, and inferior frontal gyri, and functional connectivity analyses revealed that the coordinated activity of left frontal regions depends on both acoustic-phonetic and lexical factors. Taken together, results suggest a role for frontal brain structures in resolving lexical competition, particularly as atypical acoustic-phonetic information maps on to the lexicon.
Affiliation(s)
- Sahil Luthra
  - Department of Psychological Sciences, University of Connecticut, 406 Babbidge Road, Unit 1020, Storrs, CT, USA 06269
- Sara Guediche
  - BCBL. Basque Center on Cognition, Brain and Language, Mikeletegi Pasealekua, 69, 20009 Donostia, Gipuzkoa, Spain
- Sheila E Blumstein
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, 190 Thayer Street, Providence, RI, USA 02912
  - Brown Institute for Brain Science, Brown University, 2 Stimson Ave, Providence, RI, USA 02912
- Emily B Myers
  - Department of Psychological Sciences, University of Connecticut, 406 Babbidge Road, Unit 1020, Storrs, CT, USA 06269
  - Department of Speech, Language & Hearing Sciences, University of Connecticut, 850 Bolton Road, Unit 1085, Storrs, CT, USA 06269
  - Haskins Laboratories, 300 George Street, Suite 900, New Haven, CT, USA 06511
|
28
|
Toscano JC, Anderson ND, Fabiani M, Gratton G, Garnsey SM. The time-course of cortical responses to speech revealed by fast optical imaging. BRAIN AND LANGUAGE 2018; 184:32-42. [PMID: 29960165 PMCID: PMC6102048 DOI: 10.1016/j.bandl.2018.06.006] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/28/2017] [Revised: 04/03/2018] [Accepted: 06/12/2018] [Indexed: 05/31/2023]
Abstract
Recent work has sought to describe the time-course of spoken word recognition, from initial acoustic cue encoding through lexical activation, and identify cortical areas involved in each stage of analysis. However, existing methods are limited in either temporal or spatial resolution, and as a result, have only provided partial answers to the question of how listeners encode acoustic information in speech. We present data from an experiment using a novel neuroimaging method, fast optical imaging, to directly assess the time-course of speech perception, providing non-invasive measurement of speech sound representations, localized to specific cortical areas. We find that listeners encode speech in terms of continuous acoustic cues at early stages of processing (ca. 96 ms post-stimulus onset), and begin activating phonological category representations rapidly (ca. 144 ms post-stimulus). Moreover, cue-based representations are widespread in the brain and overlap in time with graded category-based representations, suggesting that spoken word recognition involves simultaneous activation of both continuous acoustic cues and phonological categories.
Affiliation(s)
- Joseph C Toscano
- Department of Psychological & Brain Sciences, Villanova University, United States; Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States
- Nathaniel D Anderson
- Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States; Department of Psychology, University of Illinois at Urbana-Champaign, United States
- Monica Fabiani
- Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States; Department of Psychology, University of Illinois at Urbana-Champaign, United States
- Gabriele Gratton
- Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States; Department of Psychology, University of Illinois at Urbana-Champaign, United States
- Susan M Garnsey
- Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States; Department of Psychology, University of Illinois at Urbana-Champaign, United States
|
29
|
In Spoken Word Recognition, the Future Predicts the Past. J Neurosci 2018; 38:7585-7599. [PMID: 30012695 DOI: 10.1523/jneurosci.0065-18.2018] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 01/10/2018] [Revised: 06/06/2018] [Accepted: 07/09/2018] [Indexed: 11/21/2022] Open
Abstract
Speech is an inherently noisy and ambiguous signal. To fluently derive meaning, a listener must integrate contextual information to guide interpretations of the sensory input. Although many studies have demonstrated the influence of prior context on speech perception, the neural mechanisms supporting the integration of subsequent context remain unknown. Using MEG to record from human auditory cortex, we analyzed responses to spoken words with a varyingly ambiguous onset phoneme, the identity of which is later disambiguated at the lexical uniqueness point. Fifty participants (both male and female) were recruited across two MEG experiments. Our findings suggest that primary auditory cortex is sensitive to phonological ambiguity very early during processing at just 50 ms after onset. Subphonemic detail is preserved in auditory cortex over long timescales and re-evoked at subsequent phoneme positions. Commitments to phonological categories occur in parallel, resolving on the shorter timescale of ∼450 ms. These findings provide evidence that future input determines the perception of earlier speech sounds by maintaining sensory features until they can be integrated with top-down lexical information.
SIGNIFICANCE STATEMENT The perception of a speech sound is determined by its surrounding context in the form of words, sentences, and other speech sounds. Often, such contextual information becomes available later than the sensory input. The present study is the first to unveil how the brain uses this subsequent information to aid speech comprehension. Concretely, we found that the auditory system actively maintains the acoustic signal in auditory cortex while concurrently making guesses about the identity of the words being said. Such a processing strategy allows the content of the message to be accessed quickly while also permitting reanalysis of the acoustic signal to minimize parsing mistakes.
|
30
|
Magnuson JS, Mirman D, Luthra S, Strauss T, Harris HD. Interaction in Spoken Word Recognition Models: Feedback Helps. Front Psychol 2018; 9:369. [PMID: 29666593 PMCID: PMC5891609 DOI: 10.3389/fpsyg.2018.00369] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 06/27/2017] [Accepted: 03/06/2018] [Indexed: 11/13/2022] Open
Abstract
Human perception, cognition, and action require fast integration of bottom-up signals with top-down knowledge and context. A key theoretical perspective in cognitive science is the interactive activation hypothesis: forward and backward flow in bidirectionally connected neural networks allows humans and other biological systems to approximate optimal integration of bottom-up and top-down information under real-world constraints. An alternative view is that online feedback is neither necessary nor helpful; purely feedforward alternatives can be constructed for any feedback system, and online feedback could not improve processing and would preclude veridical perception. In the domain of spoken word recognition, the latter view was apparently supported by simulations using the interactive activation model, TRACE, with and without feedback: as many words were recognized more quickly without feedback as were recognized faster with feedback. However, these simulations used only a small set of words and did not address a primary motivation for interaction: making a model robust in noise. We conducted simulations using hundreds of words, and found that the majority were recognized more quickly with feedback than without. More importantly, as we added noise to inputs, accuracy and recognition times were better with feedback than without. We follow these simulations with a critical review of recent arguments that online feedback in interactive activation models like TRACE is distinct from other potentially helpful forms of feedback. We conclude that in addition to providing the benefits demonstrated in our simulations, online feedback provides a plausible means of implementing putatively distinct forms of feedback, supporting the interactive activation hypothesis.
Affiliation(s)
- James S. Magnuson
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, United States
- Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, Storrs, CT, United States
- Daniel Mirman
- Department of Psychology, University of Alabama at Birmingham, Birmingham, AL, United States
- Sahil Luthra
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, United States
- Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, Storrs, CT, United States
- Ted Strauss
- McConnell Brain Imaging Centre, McGill University, Montreal, QC, Canada
- Harlan D. Harris
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, United States
- Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, Storrs, CT, United States
|
31
|
Ghaleh M, Skipper-Kallal LM, Xing S, Lacey E, DeWitt I, DeMarco A, Turkeltaub P. Phonotactic processing deficit following left-hemisphere stroke. Cortex 2018; 99:346-357. [PMID: 29351881 PMCID: PMC5801128 DOI: 10.1016/j.cortex.2017.12.010] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 05/03/2017] [Revised: 10/19/2017] [Accepted: 12/11/2017] [Indexed: 11/25/2022]
Abstract
The neural basis of speech processing is still a matter of great debate. Phonotactic knowledge (knowledge of the allowable sound combinations in a language) remains particularly understudied. The purpose of this study was to investigate the brain regions crucial to phonotactic knowledge in left-hemisphere stroke survivors. Results were compared to areas in which gray matter anatomy related to phonotactic knowledge in healthy controls. Forty-four patients with chronic left-hemisphere stroke and 32 controls performed an English-likeness rating task on 60 auditory non-words of varying phonotactic regularities. They were asked to rate, on a 1-5 scale, how close each non-word sounded to English. Patients' performance was compared to that of healthy controls, using mixed effects modeling. Multivariate lesion-symptom mapping and voxel-based morphometry were used to find the brain regions important for phonotactic processing in patients and controls respectively. The results showed that compared to controls, stroke survivors were less sensitive to phonotactic regularity differences. Lesion-symptom mapping demonstrated that a loss of sensitivity to phonotactic regularities was associated with lesions in left angular gyrus and posterior middle temporal gyrus. Voxel-based morphometry also revealed a positive correlation between gray matter density in left angular gyrus and sensitivity to phonotactic regularities in controls. We suggest that the angular gyrus is used to compare the incoming speech stream to internal predictions based on the frequency of sound sequences in the language derived from stored lexical representations in the posterior middle temporal gyrus.
Affiliation(s)
- Maryam Ghaleh
- Georgetown University Medical Center, Neurology Department, Washington, DC, USA
- Shihui Xing
- Georgetown University Medical Center, Neurology Department, Washington, DC, USA; Department of Neurology, First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Elizabeth Lacey
- Georgetown University Medical Center, Neurology Department, Washington, DC, USA; MedStar National Rehabilitation Hospital, Washington, DC, USA
- Iain DeWitt
- Brain Imaging and Modeling Section, NIH/NIDCD, Bethesda, MD, USA
- Andrew DeMarco
- Georgetown University Medical Center, Neurology Department, Washington, DC, USA
- Peter Turkeltaub
- Georgetown University Medical Center, Neurology Department, Washington, DC, USA; MedStar National Rehabilitation Hospital, Washington, DC, USA
|
32
|
Focal versus distributed temporal cortex activity for speech sound category assignment. Proc Natl Acad Sci U S A 2018; 115:E1299-E1308. [PMID: 29363598 PMCID: PMC5819402 DOI: 10.1073/pnas.1714279115] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 01/13/2023] Open
Abstract
When listening to speech, phonemes are represented in a distributed fashion in our temporal and prefrontal cortices. How these representations are selected in a phonemic decision context, and in particular whether distributed or focal neural information is required for explicit phoneme recognition, is unclear. We hypothesized that focal and early neural encoding of acoustic signals is sufficiently informative to access speech sound representations and permit phoneme recognition. We tested this hypothesis by combining a simple speech-phoneme categorization task with univariate and multivariate analyses of fMRI, magnetoencephalography, intracortical, and clinical data. We show that neural information available focally in the temporal cortex prior to decision-related neural activity is specific enough to account for human phonemic identification. Percepts and words can be decoded from distributed neural activity measures. However, the existence of widespread representations might conflict with the more classical notions of hierarchical processing and efficient coding, which are especially relevant in speech processing. Using fMRI and magnetoencephalography during syllable identification, we show that sensory and decisional activity colocalize to a restricted part of the posterior superior temporal gyrus (pSTG). Next, using intracortical recordings, we demonstrate that early and focal neural activity in this region distinguishes correct from incorrect decisions and can be machine-decoded to classify syllables. Crucially, significant machine decoding was possible from neuronal activity sampled across different regions of the temporal and frontal lobes, despite weak or absent sensory or decision-related responses. 
These findings show that speech-sound categorization relies on an efficient readout of focal pSTG neural activity, while more distributed activity patterns, although classifiable by machine learning, instead reflect collateral processes of sensory perception and decision.
|
33
|
Abstract
Categorical effects are found across speech sound categories, with the degree of these effects ranging from extremely strong categorical perception in consonants to nearly continuous perception in vowels. We show that both strong and weak categorical effects can be captured by a unified model. We treat speech perception as a statistical inference problem, assuming that listeners use their knowledge of categories as well as the acoustics of the signal to infer the intended productions of the speaker. Simulations show that the model provides close fits to empirical data, unifying past findings of categorical effects in consonants and vowels and capturing differences in the degree of categorical effects through a single parameter.
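As an illustration only (the abstract gives no equations), one common Gaussian formulation of this kind of inference can be sketched in Python. The sketch assumes equal category priors, a shared category variance `var_c`, and a perceptual noise variance `var_s`. All function and parameter names here are hypothetical, and the ratio `var_c / var_s` merely stands in for the single parameter mentioned in the abstract; it is not taken from the paper itself.

```python
import math

def posterior_category_probs(s, means, var_c, var_s):
    """p(c | S) under equal priors, assuming S | c ~ N(mu_c, var_c + var_s)."""
    total_var = var_c + var_s
    likes = [math.exp(-(s - mu) ** 2 / (2 * total_var)) for mu in means]
    z = sum(likes)
    return [like / z for like in likes]

def expected_target(s, means, var_c, var_s):
    """E[T | S]: posterior-weighted shrinkage of the signal toward each category mean."""
    probs = posterior_category_probs(s, means, var_c, var_s)
    w = var_c / (var_c + var_s)  # weight on the acoustic signal itself
    return sum(p * (w * s + (1 - w) * mu) for p, mu in zip(probs, means))

# Hypothetical two-category continuum with means at -3 and +3.
means = [-3.0, 3.0]
low_ratio = expected_target(1.0, means, var_c=0.1, var_s=1.0)    # small var_c / var_s
high_ratio = expected_target(1.0, means, var_c=10.0, var_s=1.0)  # large var_c / var_s
```

Under these assumptions, a small `var_c / var_s` pulls the estimate for a stimulus that clearly favors one category strongly toward that category's mean (a strongly categorical, consonant-like pattern), whereas a large ratio leaves the estimate near the signal (a nearly continuous, vowel-like pattern).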
|
34
|
Xie X, Myers E. Left Inferior Frontal Gyrus Sensitivity to Phonetic Competition in Receptive Language Processing: A Comparison of Clear and Conversational Speech. J Cogn Neurosci 2017; 30:267-280. [PMID: 29160743 DOI: 10.1162/jocn_a_01208] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 12/20/2022]
Abstract
The speech signal is rife with variations in phonetic ambiguity. For instance, when talkers speak in a conversational register, they demonstrate less articulatory precision, leading to greater potential for confusability at the phonetic level compared with a clear speech register. Current psycholinguistic models assume that ambiguous speech sounds activate more than one phonological category and that competition at prelexical levels cascades to lexical levels of processing. Imaging studies have shown that the left inferior frontal gyrus (LIFG) is modulated by phonetic competition between simultaneously activated categories, with increases in activation for more ambiguous tokens. Yet, these studies have often used artificially manipulated speech and/or metalinguistic tasks, which arguably may recruit neural regions that are not critical for natural speech recognition. Indeed, a prominent model of speech processing, the dual-stream model, posits that the LIFG is not involved in prelexical processing in receptive language processing. In the current study, we exploited natural variation in phonetic competition in the speech signal to investigate the neural systems sensitive to phonetic competition as listeners engage in a receptive language task. Participants heard nonsense sentences spoken in either a clear or conversational register as neural activity was monitored using fMRI. Conversational sentences contained greater phonetic competition, as estimated by measures of vowel confusability, and these sentences also elicited greater activation in a region in the LIFG. Sentence-level phonetic competition metrics uniquely correlated with LIFG activity as well. This finding is consistent with the hypothesis that the LIFG responds to competition at multiple levels of language processing and that recruitment of this region does not require an explicit phonological judgment.
|
35
|
Kapnoula EC, Winn MB, Kong EJ, Edwards J, McMurray B. Evaluating the sources and functions of gradiency in phoneme categorization: An individual differences approach. J Exp Psychol Hum Percept Perform 2017; 43:1594-1611. [PMID: 28406683 PMCID: PMC5561468 DOI: 10.1037/xhp0000410] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 11/08/2022]
Abstract
During spoken language comprehension listeners transform continuous acoustic cues into categories (e.g., /b/ and /p/). While long-standing research suggests that phonetic categories are activated in a gradient way, there are also clear individual differences in that more gradient categorization has been linked to various communication impairments such as dyslexia and specific language impairments (Joanisse, Manis, Keating, & Seidenberg, 2000; López-Zamora, Luque, Álvarez, & Cobos, 2012; Serniclaes, Van Heghe, Mousty, Carré, & Sprenger-Charolles, 2004; Werker & Tees, 1987). Crucially, most studies have used 2-alternative forced choice (2AFC) tasks to measure the sharpness of between-category boundaries. Here we propose an alternative paradigm that allows us to measure categorization gradiency in a more direct way. Furthermore, we follow an individual differences approach to (a) link this measure of gradiency to multiple cue integration, (b) explore its relationship to a set of other cognitive processes, and (c) evaluate its role in individuals' ability to perceive speech in noise. Our results provide validation for this new method of assessing phoneme categorization gradiency and offer preliminary insights into how different aspects of speech perception may be linked to each other and to more general cognitive processes.
Affiliation(s)
- Efthymia C Kapnoula
- Department of Psychological and Brain Sciences, DeLTA Center, University of Iowa
- Matthew B Winn
- Department of Speech and Hearing Sciences, University of Washington
- Jan Edwards
- Department of Communication Sciences and Disorders, Waisman Center, University of Wisconsin-Madison
- Bob McMurray
- Department of Psychological and Brain Sciences, DeLTA Center, University of Iowa
|
36
|
Rogers JC, Davis MH. Inferior Frontal Cortex Contributions to the Recognition of Spoken Words and Their Constituent Speech Sounds. J Cogn Neurosci 2017; 29:919-936. [PMID: 28129061 DOI: 10.1162/jocn_a_01096] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023]
Abstract
Speech perception and comprehension are often challenged by the need to recognize speech sounds that are degraded or ambiguous. Here, we explore the cognitive and neural mechanisms involved in resolving ambiguity in the identity of speech sounds using syllables that contain ambiguous phonetic segments (e.g., intermediate sounds between /b/ and /g/ as in "blade" and "glade"). We used an audio-morphing procedure to create a large set of natural sounding minimal pairs that contain phonetically ambiguous onset or offset consonants (differing in place, manner, or voicing). These ambiguous segments occurred in different lexical contexts (i.e., in words or pseudowords, such as blade-glade or blem-glem) and in different phonological environments (i.e., with neighboring syllables that differed in lexical status, such as blouse-glouse). These stimuli allowed us to explore the impact of phonetic ambiguity on the speed and accuracy of lexical decision responses (Experiment 1), semantic categorization responses (Experiment 2), and the magnitude of BOLD fMRI responses during attentive comprehension (Experiment 3). For both behavioral and neural measures, observed effects of phonetic ambiguity were influenced by lexical context leading to slower responses and increased activity in the left inferior frontal gyrus for high-ambiguity syllables that distinguish pairs of words, but not for equivalent pseudowords. These findings suggest lexical involvement in the resolution of phonetic ambiguity. Implications for speech perception and the role of inferior frontal regions are discussed.
Affiliation(s)
- Jack C Rogers
- MRC Cognition & Brain Sciences Unit, Cambridge, UK; University of Birmingham
|
37
|
Oscillatory Dynamics Underlying Perceptual Narrowing of Native Phoneme Mapping from 6 to 12 Months of Age. J Neurosci 2016; 36:12095-12105. [PMID: 27903720 DOI: 10.1523/jneurosci.1162-16.2016] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 04/06/2016] [Revised: 09/08/2016] [Accepted: 10/07/2016] [Indexed: 11/21/2022] Open
Abstract
During the first months of life, human infants process phonemic elements from all languages similarly. However, by 12 months of age, as language-specific phonemic maps are established, infants respond preferentially to their native language. This process, known as perceptual narrowing, supports neural representation and thus efficient processing of the distinctive phonemes within the sound environment. Although oscillatory mechanisms underlying processing of native and non-native phonemic contrasts were recently delineated in 6-month-old infants, the maturational trajectory of these mechanisms remained unclear. A group of typically developing infants born into monolingual English families were followed from 6 to 12 months and presented with English and Spanish syllable contrasts varying in voice-onset time. Brain responses were recorded with high-density electroencephalogram, and sources of event-related potential generators identified at right and left auditory cortices at 6 and 12 months and also at frontal cortex at 6 months. Time-frequency analyses conducted at source level found variations in both θ and γ ranges across age. Compared with 6-month-olds, 12-month-olds' responses to native phonemes showed smaller and faster phase synchronization and less spectral power in the θ range, and increases in left phase synchrony as well as induced high-γ activity in both frontal and left auditory sources. These results demonstrate that infants become more automatized and efficient in processing their native language as they approach 12 months of age via the interplay between θ and γ oscillations. We suggest that, while θ oscillations support syllable processing, γ oscillations underlie phonemic perceptual narrowing, progressively favoring mapping of native over non-native language across the first year of life.
SIGNIFICANCE STATEMENT During early language acquisition, typically developing infants gradually construct phonemic maps of their native language in auditory cortex. It is well known that, by 12 months of age, human infants move from universal discrimination of most linguistic phonemic contrasts to phonemic expertise in their native language. This perceptual narrowing occurs at the expense of the ability to process non-native phonemes. However, the neural mechanisms underlying this process are still poorly understood. Here we demonstrate that perceptual narrowing is, at least in part, accomplished by decreasing power and phase coherence in the θ range while increasing activity in high-γ in left auditory cortex. Understanding the normative neural mechanisms that support early language acquisition is crucial to understanding and perhaps ameliorating developmental language disorders.
|
38
|
Gow DW, Olson BB. Using effective connectivity analyses to understand processing architecture: Response to commentaries by Samuel, Spivey and McQueen, Eisner and Norris. Language, Cognition and Neuroscience 2016; 31:869-875. [PMID: 28090547 PMCID: PMC5232413 DOI: 10.1080/23273798.2016.1192656] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/28/2016] [Accepted: 05/10/2016] [Indexed: 06/06/2023]
Affiliation(s)
- David W. Gow
- Neuropsychology Laboratory, Massachusetts General Hospital, 100 Cambridge St., rm 2030, Boston, MA 02114
- Department of Psychology, Salem State University, 352 Lafayette St., Salem, MA 01970
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, 149 Thirteenth St., S2301, Charlestown, MA 02129
- Harvard-MIT Division of Health Sciences and Technology, 77 Massachusetts Ave., E25-519, Cambridge, MA 02139
- Bruna B. Olson
- University of Massachusetts Medical School, 55 Lake Avenue North, Worcester, MA 01655
|
39
|
Evans S, McGettigan C, Agnew ZK, Rosen S, Scott SK. Getting the Cocktail Party Started: Masking Effects in Speech Perception. J Cogn Neurosci 2015; 28:483-500. [PMID: 26696297 DOI: 10.1162/jocn_a_00913] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Indexed: 12/21/2022]
Abstract
Spoken conversations typically take place in noisy environments, and different kinds of masking sounds place differing demands on cognitive resources. Previous studies, examining the modulation of neural activity associated with the properties of competing sounds, have shown that additional speech streams engage the superior temporal gyrus. However, the absence of a condition in which target speech was heard without additional masking made it difficult to identify brain networks specific to masking and to ascertain the extent to which competing speech was processed equivalently to target speech. In this study, we scanned young healthy adults with continuous fMRI, while they listened to stories masked by sounds that differed in their similarity to speech. We show that auditory attention and control networks are activated during attentive listening to masked speech in the absence of an overt behavioral task. We demonstrate that competing speech is processed predominantly in the left hemisphere within the same pathway as target speech but is not treated equivalently within that stream and that individuals who perform better in speech in noise tasks activate the left mid-posterior superior temporal gyrus more. Finally, we identify neural responses associated with the onset of sounds in the auditory environment; activity was found within right lateralized frontal regions consistent with a phasic alerting response. Taken together, these results provide a comprehensive account of the neural processes involved in listening in noise.
Affiliation(s)
- Zarinah K Agnew
- University College London; University of California, San Francisco
|
40
|
Krieger-Redwood K, Teige C, Davey J, Hymers M, Jefferies E. Conceptual control across modalities: graded specialisation for pictures and words in inferior frontal and posterior temporal cortex. Neuropsychologia 2015; 76:92-107. [PMID: 25726898 PMCID: PMC4582805 DOI: 10.1016/j.neuropsychologia.2015.02.030] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.8] [Received: 09/29/2014] [Revised: 02/20/2015] [Accepted: 02/21/2015] [Indexed: 11/16/2022]
Abstract
Controlled semantic retrieval to words elicits co-activation of inferior frontal (IFG) and left posterior temporal cortex (pMTG), but research has not yet established (i) the distinct contributions of these regions or (ii) whether the same processes are recruited for non-verbal stimuli. Words have relatively flexible meanings - as a consequence, identifying the context that links two specific words is relatively demanding. In contrast, pictures are richer stimuli and their precise meaning is better specified by their visible features - however, not all of these features will be relevant to uncovering a given association, tapping selection/inhibition processes. To explore potential differences across modalities, we took a commonly-used manipulation of controlled retrieval demands, namely the identification of weak vs. strong associations, and compared word and picture versions. There were 4 key findings: (1) Regions of interest (ROIs) in posterior IFG (BA44) showed graded effects of modality (e.g., words>pictures in left BA44; pictures>words in right BA44). (2) An equivalent response was observed in left mid-IFG (BA45) across modalities, consistent with the multimodal semantic control deficits that typically follow LIFG lesions. (3) The anterior IFG (BA47) ROI showed a stronger response to verbal than pictorial associations, potentially reflecting a role for this region in establishing a meaningful context that can be used to direct semantic retrieval. (4) The left pMTG ROI also responded to difficulty across modalities yet showed a stronger response overall to verbal stimuli, helping to reconcile two distinct literatures that have implicated this site in semantic control and lexical-semantic access respectively. We propose that left anterior IFG and pMTG work together to maintain a meaningful context that shapes ongoing semantic processing, and that this process is more strongly taxed by word than picture associations.
Affiliation(s)
- Catarina Teige
- Department of Psychology and York Neuroimaging Centre, University of York, UK
- James Davey
- Department of Psychology and York Neuroimaging Centre, University of York, UK
- Mark Hymers
- Department of Psychology and York Neuroimaging Centre, University of York, UK
- Elizabeth Jefferies
- Department of Psychology and York Neuroimaging Centre, University of York, UK
|
41
|
Myers EB, Mesite LM. Neural Systems Underlying Perceptual Adjustment to Non-Standard Speech Tokens. Journal of Memory and Language 2014; 76:80-93. [PMID: 25092949 PMCID: PMC4118215 DOI: 10.1016/j.jml.2014.06.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 05/09/2023]
Abstract
It has long been noted that listeners use top-down information from context to guide perception of speech sounds. A recent line of work employing a phenomenon termed 'perceptual learning for speech' shows that listeners use top-down information to not only resolve the identity of perceptually ambiguous speech sounds, but also to adjust perceptual boundaries in subsequent processing of speech from the same talker. Even so, the neural mechanisms that underlie this process are not well understood. Of particular interest is whether this type of adjustment comes about because of a retuning of sensitivities to phonetic category structure early in the neural processing stream or whether the boundary shift results from decision-related or attentional mechanisms further downstream. In the current study, neural activation was measured using fMRI as participants categorized speech sounds that were perceptually shifted as a result of exposure to these sounds in lexically-unambiguous contexts. Sensitivity to lexically-mediated shifts in phonetic categorization emerged in right hemisphere frontal and middle temporal regions, suggesting that the perceptual learning for speech phenomenon relies on the adjustment of perceptual criteria downstream from primary auditory cortex. By the end of the session, this same sensitivity was seen in left superior temporal areas, which suggests that a rapidly-adapting system may be accompanied by more slowly evolving shifts in regions of the brain related to phonetic processing.
Affiliation(s)
- Emily B. Myers
- University of Connecticut, Department of Speech, Language, and Hearing Sciences, 850 Bolton Road, Storrs, CT 06269
- University of Connecticut, Department of Psychology, 406 Babbidge Road, Storrs, CT 06269
- Brown University, Department of Cognitive, Linguistic, and Psychological Sciences, 190 Thayer Street, Providence, RI 02912
- Haskins Laboratories, 300 George Street #900, New Haven, CT 06511
- Corresponding Author: Emily Myers, Department of Speech, Language, and Hearing Sciences, University of Connecticut, 850 Bolton Road, Storrs, CT 06269, , 860-486-2630
| | - Laura M. Mesite
- Brown University, Department of Cognitive, Linguistic, and Psychological Sciences, 190 Thayer Street, Providence, RI 02912
- Haskins Laboratories, 300 George Street #900, New Haven, CT 06511
|
42
|
Specht K, Baumgartner F, Stadler J, Hugdahl K, Pollmann S. Functional asymmetry and effective connectivity of the auditory system during speech perception is modulated by the place of articulation of the consonant - A 7T fMRI study. Front Psychol 2014; 5:549. [PMID: 24966841 PMCID: PMC4052338 DOI: 10.3389/fpsyg.2014.00549] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/11/2013] [Accepted: 05/18/2014] [Indexed: 11/16/2022]
Abstract
To differentiate between stop-consonants, the auditory system has to detect subtle differences in place of articulation (PoA) and voice-onset time (VOT). How this differential processing is represented on the cortical level remains unclear. The present functional magnetic resonance imaging (fMRI) study takes advantage of the superior spatial resolution and high sensitivity of ultra-high-field 7 T MRI. Subjects listened attentively to consonant–vowel (CV) syllables with an alveolar or bilabial stop-consonant and either a short or long VOT. The results showed an overall bilateral activation pattern in the posterior temporal lobe during the processing of the CV syllables. This pattern, however, was modulated most strongly by PoA, such that syllables with an alveolar stop-consonant showed stronger left-lateralized activation. In addition, analysis of underlying functional and effective connectivity revealed an inhibitory effect of the left planum temporale (PT) onto the right auditory cortex (AC) during the processing of alveolar CV syllables. Furthermore, the connectivity results also indicated a directed information flow from the right to the left AC, and further to the left PT, for all syllables. These results indicate that auditory speech perception relies on an interplay between the left and right ACs, with the left PT as modulator. Furthermore, the degree of functional asymmetry is determined by the acoustic properties of the CV syllables.
Affiliation(s)
- Karsten Specht
- Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway; Department of Medical Engineering, Haukeland University Hospital, Bergen, Norway
- Florian Baumgartner
- Department of Experimental Psychology, Otto-von-Guericke University Magdeburg, Germany
- Jörg Stadler
- Leibniz Institute for Neurobiology, Magdeburg, Germany
- Kenneth Hugdahl
- Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway; Division of Psychiatry, Haukeland University Hospital, Bergen, Norway; Department of Radiology, Haukeland University Hospital, Bergen, Norway; NORMENT Senter for Fremragende Forskning, Oslo, Norway
- Stefan Pollmann
- Department of Experimental Psychology, Otto-von-Guericke University Magdeburg, Germany; Center for Behavioral Brain Sciences, Magdeburg, Germany
|
43
|
Guediche S, Blumstein SE, Fiez JA, Holt LL. Speech perception under adverse conditions: insights from behavioral, computational, and neuroscience research. Front Syst Neurosci 2014; 7:126. [PMID: 24427119 PMCID: PMC3879477 DOI: 10.3389/fnsys.2013.00126] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 09/16/2013] [Accepted: 12/16/2013] [Indexed: 01/06/2023]
Abstract
Adult speech perception reflects the long-term regularities of the native language, but it is also flexible such that it accommodates and adapts to adverse listening conditions and short-term deviations from native-language norms. The purpose of this article is to examine how the broader neuroscience literature can inform and advance research efforts in understanding the neural basis of flexibility and adaptive plasticity in speech perception. Specifically, we highlight the potential role of learning algorithms that rely on prediction error signals and discuss specific neural structures that are likely to contribute to such learning. To this end, we review behavioral studies, computational accounts, and neuroimaging findings related to adaptive plasticity in speech perception. Already, a few studies have alluded to a potential role of these mechanisms in adaptive plasticity in speech perception. Furthermore, we consider research topics in neuroscience that offer insight into how perception can be adaptively tuned to short-term deviations while balancing the need to maintain stability in the perception of learned long-term regularities. Consideration of the application and limitations of these algorithms in characterizing flexible speech perception under adverse conditions promises to inform theoretical models of speech.
Affiliation(s)
- Sara Guediche
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Sheila E. Blumstein
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Department of Cognitive, Linguistic, and Psychological Sciences, Brain Institute, Brown University, Providence, RI, USA
- Julie A. Fiez
- Department of Neuroscience, Center for Neuroscience at the University of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Psychology at Carnegie Mellon University and Department of Neuroscience at the University of Pittsburgh, Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Lori L. Holt
- Department of Neuroscience, Center for Neuroscience at the University of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Psychology at Carnegie Mellon University and Department of Neuroscience at the University of Pittsburgh, Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
|
44
|
Lawyer L, Corina D. An Investigation of Place and Voice Features Using fMRI-Adaptation. JOURNAL OF NEUROLINGUISTICS 2014; 27:10.1016/j.jneuroling.2013.07.001. [PMID: 24187438 PMCID: PMC3810966 DOI: 10.1016/j.jneuroling.2013.07.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 06/02/2023]
Abstract
A widely accepted view of speech perception holds that in order to comprehend language, the variable acoustic signal must be parsed into a set of abstract linguistic representations. However, the neural basis of early phonological processing, including the nature of featural encoding of speech, is still poorly understood. In part, progress in this domain has been constrained by the difficulty inherent in extricating the influence of acoustic modulations from those which can be ascribed to the abstract, featural content of the stimuli. A further concern is that group averaging techniques may obscure subtle individual differences in cortical regions involved in early language processing. In this paper we present the results of an fMRI-adaptation experiment which finds evidence of areas in the superior and medial temporal lobes which respond selectively to changes in the major feature categories of voicing and place of articulation. We present both single-subject and group-averaged analyses.
Affiliation(s)
- Laurel Lawyer
- Shields Ave, Department of Linguistics, University of California, Davis, CA, 95616
- David Corina
- Shields Ave, Department of Linguistics, University of California, Davis, CA, 95616
|
45
|
Scharinger M, Henry MJ, Erb J, Meyer L, Obleser J. Thalamic and parietal brain morphology predicts auditory category learning. Neuropsychologia 2013; 53:75-83. [PMID: 24035788 DOI: 10.1016/j.neuropsychologia.2013.09.012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/28/2013] [Revised: 09/02/2013] [Accepted: 09/04/2013] [Indexed: 01/13/2023]
Abstract
Auditory categorization is a vital skill involving the attribution of meaning to acoustic events, engaging domain-specific (i.e., auditory) as well as domain-general (e.g., executive) brain networks. A listener's ability to categorize novel acoustic stimuli should therefore depend on both, with the domain-general network being particularly relevant for adaptively changing listening strategies and directing attention to relevant acoustic cues. Here we assessed adaptive listening behavior, using complex acoustic stimuli with an initially salient (but later degraded) spectral cue and a secondary, duration cue that remained nondegraded. We employed voxel-based morphometry (VBM) to identify cortical and subcortical brain structures whose individual neuroanatomy predicted task performance and the ability to optimally switch to making use of temporal cues after spectral degradation. Behavioral listening strategies were assessed by logistic regression and revealed mainly strategy switches in the expected direction, with considerable individual differences. Gray-matter probability in the left inferior parietal lobule (BA 40) and left precentral gyrus was predictive of "optimal" strategy switch, while gray-matter probability in thalamic areas, comprising the medial geniculate body, co-varied with overall performance. Taken together, our findings suggest that successful auditory categorization relies on domain-specific neural circuits in the ascending auditory pathway, while adaptive listening behavior depends more on brain structure in parietal cortex, enabling the (re)direction of attention to salient stimulus properties.
Affiliation(s)
- Mathias Scharinger
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Molly J Henry
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Julia Erb
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Lars Meyer
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Jonas Obleser
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
|
46
|
Thompson HE, Jefferies E. Semantic control and modality: An input processing deficit in aphasia leading to deregulated semantic cognition in a single modality. Neuropsychologia 2013; 51:1998-2015. [DOI: 10.1016/j.neuropsychologia.2013.06.030] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 01/18/2013] [Revised: 06/27/2013] [Accepted: 06/29/2013] [Indexed: 10/26/2022]
|
47
|
An fMRI examination of the effects of acoustic-phonetic and lexical competition on access to the lexical-semantic network. Neuropsychologia 2013; 51:1980-8. [PMID: 23816958 DOI: 10.1016/j.neuropsychologia.2013.06.016] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 01/29/2013] [Revised: 06/11/2013] [Accepted: 06/13/2013] [Indexed: 11/23/2022]
Abstract
The current study explored how factors of acoustic-phonetic and lexical competition affect access to the lexical-semantic network during spoken word recognition. An auditory semantic priming lexical decision task was presented to subjects while in the MR scanner. Prime-target pairs consisted of prime words with the initial voiceless stop consonants /p/, /t/, and /k/ followed by word and nonword targets. To examine the neural consequences of lexical and sound structure competition, primes either had voiced minimal pair competitors or they did not, and they were either acoustically modified to be poorer exemplars of the voiceless phonetic category or not. Neural activation associated with semantic priming (Unrelated-Related conditions) revealed a bilateral fronto-temporo-parietal network. Within this network, clusters in the left insula/inferior frontal gyrus (IFG), left superior temporal gyrus (STG), and left posterior middle temporal gyrus (pMTG) showed sensitivity to lexical competition. The pMTG also demonstrated sensitivity to acoustic modification, and the insula/IFG showed an interaction between lexical competition and acoustic modification. These findings suggest the posterior lexical-semantic network is modulated by both acoustic-phonetic and lexical structure, and that the resolution of these two sources of competition recruits frontal structures.
|
48
|
Junger J, Pauly K, Bröhr S, Birkholz P, Neuschaefer-Rube C, Kohler C, Schneider F, Derntl B, Habel U. Sex matters: Neural correlates of voice gender perception. Neuroimage 2013; 79:275-87. [PMID: 23660030 DOI: 10.1016/j.neuroimage.2013.04.105] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 09/27/2012] [Revised: 04/12/2013] [Accepted: 04/24/2013] [Indexed: 10/26/2022]
Abstract
The basis for different neural activations in response to male and female voices, as well as the question of whether men and women perceive male and female voices differently, has not been thoroughly investigated. Therefore, the aim of the present study was to examine the behavioral and neural correlates of gender-related voice perception in healthy male and female volunteers. fMRI data were collected while 39 participants (19 female) were asked to indicate the gender of 240 voice stimuli. These stimuli included recordings of 3-syllable nouns as well as the same recordings pitch-shifted in 2, 4 and 6 semitone steps in the direction of the other gender. Data analysis revealed a) equal voice discrimination sensitivity in men and women but better performance in the categorization of opposite-sex stimuli at least in men, b) increased responses to increasing gender ambiguity in the mid cingulate cortex and bilateral inferior frontal gyri, and c) stronger activation in a fronto-temporal neural network in response to voices of the opposite sex. Our results indicate gender-specific processing of male and female voices at the behavioral and neuronal levels. We suggest that our results reflect higher sensitivity, probably due to the evolutionary relevance of voice perception in mate selection.
Affiliation(s)
- Jessica Junger
- Department of Psychiatry, Medical School, RWTH Aachen University, Aachen, Germany
|
49
|
Guediche S, Salvata C, Blumstein SE. Temporal cortex reflects effects of sentence context on phonetic processing. J Cogn Neurosci 2013; 25:706-18. [PMID: 23281778 DOI: 10.1162/jocn_a_00351] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 11/04/2022]
Abstract
Listeners' perception of acoustically presented speech is constrained by many different sources of information that arise from other sensory modalities and from more abstract higher-level language context. An open question is how perceptual processes are influenced by and interact with these other sources of information. In this study, we use fMRI to examine the effect of a prior sentence fragment meaning on the categorization of two possible target words that differ in an acoustic phonetic feature of the initial consonant, VOT. Specifically, we manipulate the bias of the sentence context (biased, neutral) and the target type (ambiguous, unambiguous). Our results show that an interaction between these two factors emerged in a cluster in temporal cortex encompassing the left middle temporal gyrus and the superior temporal gyrus. The locus and pattern of these interactions support an interactive view of speech processing and suggest that both the quality of the input and the potential bias of the context together interact and modulate neural activation patterns.
Affiliation(s)
- Sara Guediche
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA.
|
50
|
Gow DW. The cortical organization of lexical knowledge: a dual lexicon model of spoken language processing. BRAIN AND LANGUAGE 2012; 121:273-88. [PMID: 22498237 PMCID: PMC3348354 DOI: 10.1016/j.bandl.2012.03.005] [Citation(s) in RCA: 105] [Impact Index Per Article: 8.8] [Received: 06/20/2011] [Revised: 02/08/2012] [Accepted: 03/13/2012] [Indexed: 05/14/2023]
Abstract
Current accounts of spoken language assume the existence of a lexicon where wordforms are stored and interact during spoken language perception, understanding and production. Despite the theoretical importance of the wordform lexicon, the exact localization and function of the lexicon in the broader context of language use is not well understood. This review draws on evidence from aphasia, functional imaging, neuroanatomy, laboratory phonology and behavioral results to argue for the existence of parallel lexica that facilitate different processes in the dorsal and ventral speech pathways. The dorsal lexicon, localized in the inferior parietal region including the supramarginal gyrus, serves as an interface between phonetic and articulatory representations. The ventral lexicon, localized in the posterior superior temporal sulcus and middle temporal gyrus, serves as an interface between phonetic and semantic representations. In addition to their interface roles, the two lexica contribute to the robustness of speech processing.
Affiliation(s)
- David W Gow
- Neuropsychology Laboratory, Massachusetts General Hospital, Boston, MA 02114, USA.
|