1. Themistocleous C. Linguistic and Emotional Prosody: A Systematic Review and ALE Meta-Analysis. Neurosci Biobehav Rev 2025:106210. PMID: 40379231. DOI: 10.1016/j.neubiorev.2025.106210.
Abstract
Prosody is a cover term for the melodic aspects of speech, which carry both linguistic and affective (emotional) meanings. This review provides an overview of linguistic and affective prosody in healthy individuals and evaluates two hypotheses: that the biological nature of affective prosody triggers activations unrelated to language (biological hypothesis), and that aspects of affective prosody have been grammaticalized, i.e., incorporated into the language (linguistic hypothesis). We used a systematic ALE meta-analytic approach to identify neural correlates of prosody in the literature, assessing papers that report brain coordinates from healthy individuals and were identified through systematic searches of academic databases such as PubMed (NLM), Scopus, and Web of Science. We found that affective and linguistic prosody both activate bilateral frontotemporal regions, such as the superior temporal gyrus (STG). A key difference is that affective prosody additionally involves subcortical structures such as the amygdala, whereas linguistic prosody activates language areas and regions associated with social cognition and engagement. The shared activations suggest that linguistic and affective meanings are combined, drawing on shared brain connectivity mechanisms and acoustic manifestations. We suggest that the traditional distinction between linguistic and affective prosody may be overly rigid. Much like speech, the lexicon, and grammar, which convey affective, social, and linguistic meanings without reflecting separate categories, prosody functions as a system interfacing with affective, social, and linguistic domains. We conclude by proposing a novel blending hypothesis: prosody should be viewed as an integrated system that serves both affective and linguistic functions.
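For readers unfamiliar with the ALE method referenced above, the following is a minimal sketch of the core computation, assuming toy coordinates and an isotropic Gaussian kernel; published analyses use dedicated tools (e.g., GingerALE or NiMARE) with sample-size-dependent kernels and permutation-based significance testing.

```python
# Minimal sketch of the Activation Likelihood Estimation (ALE) idea on a toy grid.
import numpy as np

def gaussian_blob(shape, center, sigma):
    """Probability-like map for a single reported focus (peak-normalized Gaussian)."""
    zz, yy, xx = np.indices(shape)
    d2 = (zz - center[0])**2 + (yy - center[1])**2 + (xx - center[2])**2
    return np.exp(-d2 / (2 * sigma**2))

def modeled_activation(shape, foci, sigma):
    """Per-experiment map: union of the Gaussian blobs of its reported foci."""
    ma = np.zeros(shape)
    for focus in foci:
        ma = 1 - (1 - ma) * (1 - gaussian_blob(shape, focus, sigma))
    return ma

def ale_map(shape, experiments, sigma=2.0):
    """ALE value: chance that at least one experiment activates each voxel."""
    ale = np.zeros(shape)
    for foci in experiments:
        ale = 1 - (1 - ale) * (1 - modeled_activation(shape, foci, sigma))
    return ale

# Two hypothetical experiments reporting foci on a small 20x20x20 grid.
experiments = [
    [(10, 10, 10), (12, 9, 11)],   # e.g., affective-prosody contrasts
    [(11, 10, 10)],                # e.g., linguistic-prosody contrasts
]
ale = ale_map((20, 20, 20), experiments)
print("peak ALE value:", float(ale.max()))
```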
2. Gnanateja GN, Rupp K, Llanos F, Hect J, German JS, Teichert T, Abel TJ, Chandrasekaran B. Cortical processing of discrete prosodic patterns in continuous speech. Nat Commun 2025; 16:1947. PMID: 40032850. PMCID: PMC11876672. DOI: 10.1038/s41467-025-56779-w.
Abstract
Prosody has a vital function in speech, structuring a speaker's intended message for the listener. The superior temporal gyrus (STG) is considered a critical hub for prosody, but the role of earlier auditory regions like Heschl's gyrus (HG), associated with pitch processing, remains unclear. Using intracerebral recordings in humans and non-human primate models, we investigated prosody processing in narrative speech, focusing on pitch accents, abstract phonological units that signal word prominence and communicative intent. In humans, HG encoded pitch accents as abstract representations beyond spectrotemporal features, distinct from segmental speech processing, and outperformed STG in disambiguating pitch accents. Multivariate models confirmed HG's unique representation of pitch accent categories. In the non-human primate, pitch accents were not abstractly encoded, despite robust spectrotemporal processing, highlighting the role of experience in shaping abstract representations. These findings emphasize a key role for HG in early prosodic abstraction and advance our understanding of human speech processing.
Affiliation(s)
- G Nike Gnanateja: Speech Processing and Auditory Neuroscience Lab, Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, USA
- Kyle Rupp: Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- Fernando Llanos: UT Austin Neurolinguistics Lab, Department of Linguistics, The University of Texas at Austin, Austin, TX, USA
- Jasmine Hect: Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- James S German: Aix-Marseille University, CNRS, LPL, Aix-en-Provence, France
- Tobias Teichert: Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Taylor J Abel: Pediatric Brain Electrophysiology Laboratory, Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA; Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA
- Bharath Chandrasekaran: Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA; Roxelyn and Richard Pepper Department of Communication Sciences & Disorders, Northwestern University, Evanston, IL, USA; Knowles Hearing Center, Evanston, IL 60208, USA

3. Keshishian M, Mischler G, Thomas S, Kingsbury B, Bickel S, Mehta AD, Mesgarani N. Parallel hierarchical encoding of linguistic representations in the human auditory cortex and recurrent automatic speech recognition systems. bioRxiv [Preprint] 2025:2025.01.30.635775. PMID: 39975377. PMCID: PMC11838305. DOI: 10.1101/2025.01.30.635775.
Abstract
The human brain's ability to transform acoustic speech signals into rich linguistic representations has inspired advances in automatic speech recognition (ASR) systems. While ASR systems now achieve human-level performance under controlled conditions, prior research on their parallels with the brain has been limited by the use of biologically implausible models, narrow feature sets, and comparisons that primarily emphasize predictability of brain activity without fully exploring shared underlying representations. Additionally, studies comparing the brain to text-based language models overlook the acoustic stages of speech processing, an essential step in transforming sound into meaning. Leveraging high-resolution intracranial recordings and a recurrent ASR model, this study bridges these gaps by uncovering a striking correspondence in the hierarchical encoding of linguistic features, from low-level acoustic signals to high-level semantic processing. Specifically, we demonstrate that neural activity in distinct regions of the auditory cortex aligns with representations in corresponding layers of the ASR model and, crucially, that both systems encode similar features at each stage of processing, from acoustic to phonetic, lexical, and semantic information. These findings suggest that both systems, despite their distinct architectures, converge on similar strategies for language processing, providing insight into the optimal computational principles underlying linguistic representation and the shared constraints shaping human and artificial speech processing.
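The layer-by-layer comparison described above can be illustrated with a small regression sketch; the arrays below are synthetic stand-ins for ASR-layer activations and an electrode's response, not the authors' data or model.

```python
# Sketch of layer-wise encoding-model comparison: which model layer best
# predicts a given recording site on held-out data?
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_time, n_units = 2000, 64
layers = {f"layer{i}": rng.standard_normal((n_time, n_units)) for i in range(4)}
# Hypothetical electrode driven by "layer2" features plus noise.
true_w = rng.standard_normal(n_units)
response = layers["layer2"] @ true_w + rng.standard_normal(n_time)

def heldout_correlation(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    pred = Ridge(alpha=10.0).fit(X_tr, y_tr).predict(X_te)
    return np.corrcoef(pred, y_te)[0, 1]

for name, X in layers.items():
    print(name, round(heldout_correlation(X, response), 3))
# The layer whose features best predict the electrode indicates where in the
# model hierarchy that cortical site's representation falls.
```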
Affiliation(s)
- Menoua Keshishian: Department of Electrical Engineering, Columbia University, New York, NY, USA; Zuckerman Institute, Columbia University, New York, NY, USA
- Gavin Mischler: Department of Electrical Engineering, Columbia University, New York, NY, USA; Zuckerman Institute, Columbia University, New York, NY, USA
- Stephan Bickel: The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA; Department of Neurosurgery, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
- Ashesh D. Mehta: The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA; Department of Neurosurgery, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
- Nima Mesgarani: Department of Electrical Engineering, Columbia University, New York, NY, USA; Zuckerman Institute, Columbia University, New York, NY, USA

4. Persson A, Barreda S, Jaeger TF. Comparing accounts of formant normalization against US English listeners' vowel perception. J Acoust Soc Am 2025; 157:1458-1482. PMID: 39998127. DOI: 10.1121/10.0035476.
Abstract
Human speech recognition tends to be robust despite substantial cross-talker variability. Believed to be critical to this ability are auditory normalization mechanisms whereby listeners adapt to individual differences in vocal tract physiology. This study investigates the computations involved in such normalization. Two 8-way alternative forced-choice experiments assessed L1 listeners' categorizations across the entire US English vowel space, for both unaltered and synthesized stimuli. Listeners' responses in these experiments were compared against the predictions of 20 influential normalization accounts that differ starkly in the inference and memory capacities they imply for speech perception. These include variants of estimation-free transformations into psycho-acoustic spaces, intrinsic normalizations relative to concurrent acoustic properties, and extrinsic normalizations relative to talker-specific statistics. Listeners' responses were best explained by extrinsic normalization, suggesting that listeners learn and store distributional properties of talkers' speech. Specifically, computationally simple (single-parameter) extrinsic normalization best fit listeners' responses. This simple extrinsic normalization also clearly outperformed Lobanov normalization, a computationally more complex account that remains popular in research on phonetics and phonology, sociolinguistics, typology, and language acquisition.
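To make the contrast between normalization accounts concrete, here is a hedged sketch using toy formant values: Lobanov normalization z-scores each formant within a talker (two parameters per formant), while a single-parameter extrinsic account (in the spirit of Nearey-style log-mean or uniform-scaling normalization, not necessarily the exact account tested in the paper) subtracts one talker-specific constant from all log formants.

```python
# Two normalization accounts applied to toy formant data (Hz).
import numpy as np

def lobanov(formants):
    """formants: (n_tokens, n_formants) for one talker -> per-formant z-scores."""
    return (formants - formants.mean(axis=0)) / formants.std(axis=0)

def single_param_logmean(formants):
    """Subtract one talker-level mean from all log-transformed formants."""
    logf = np.log(formants)
    return logf - logf.mean()          # a single scalar per talker

talker_a = np.array([[300.0, 2300.0], [700.0, 1200.0], [500.0, 1500.0]])  # F1, F2
talker_b = talker_a * 1.2              # a talker with a uniformly shorter vocal tract
print(np.allclose(single_param_logmean(talker_a), single_param_logmean(talker_b)))
# -> True: uniform scaling differences vanish under the single-parameter account.
print(lobanov(talker_a).round(2))
```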
Affiliation(s)
- Anna Persson: Swedish Language and Multilingualism, Stockholm University, Stockholm, SE-106 91, Sweden
- Santiago Barreda: Linguistics, University of California, Davis, California 95616, USA
- T Florian Jaeger: Brain and Cognitive Sciences, Goergen Institute for Data Science and Artificial Intelligence, University of Rochester, Rochester, New York 14627, USA

5. Cervantes Constantino F, Caputi Á. Cortical tracking of speakers' spectral changes predicts selective listening. Cereb Cortex 2024; 34:bhae472. PMID: 39656649. DOI: 10.1093/cercor/bhae472.
Abstract
A social scene is particularly informative when people are distinguishable. To understand somebody amid "cocktail party" chatter, we automatically index their voice. This ability is underpinned by parallel processing of vocal spectral contours from speech sounds, but how this occurs in the cortex has not yet been established. We investigated single-trial neural tracking of slow frequency modulations in speech using electroencephalography. Participants briefly listened to unfamiliar single speakers and, in addition, performed a cocktail party comprehension task. Quantified through stimulus reconstruction methods, robust tracking was found in neural responses to slow (delta-theta range) modulations of frequency contours in the fourth and fifth formant band, equivalent to the 3.5-5 kHz audible range. The spectral spacing between neighboring instantaneous frequency contours (ΔF), which also yields indexical information about the vocal tract, was similarly decodable. Moreover, EEG evidence of listeners' spectral tracking abilities predicted their chances of succeeding at selective listening when faced with two-speaker speech mixtures. In summary, the results indicate that the communicating brain can rely on locking of cortical rhythms to major changes led by the upper resonances of the vocal tract, whose articulatory mechanics thus continuously provide an indexical cue that listeners can target in real time.
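The stimulus reconstruction approach mentioned above can be sketched with a lagged ridge regression on synthetic data; real analyses use recorded EEG and carefully cross-validated, mTRF-style backward models.

```python
# Minimal sketch of stimulus reconstruction (a backward model) from synthetic EEG.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
fs, n_sec, n_ch = 64, 60, 32
t = np.arange(fs * n_sec) / fs
contour = np.sin(2 * np.pi * 3.0 * t)              # a 3 Hz (delta-theta) modulation
mixing = rng.standard_normal(n_ch)
eeg = np.outer(contour, mixing) + rng.standard_normal((t.size, n_ch))

def lagged(X, max_lag):
    """Stack time-lagged copies of each channel (0..max_lag samples)."""
    return np.hstack([np.roll(X, lag, axis=0) for lag in range(max_lag + 1)])

X = lagged(eeg, max_lag=16)                         # 250 ms of lags at 64 Hz
half = t.size // 2
model = Ridge(alpha=100.0).fit(X[:half], contour[:half])
recon = model.predict(X[half:])
print("reconstruction accuracy r =", round(np.corrcoef(recon, contour[half:])[0, 1], 2))
```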
Affiliation(s)
- Francisco Cervantes Constantino: Instituto de Investigaciones Biológicas Clemente Estable, Department of Integrative and Computational Neurosciences, Av. Italia 3318, Montevideo, 11.600, Uruguay; Facultad de Psicología, Universidad de la República
- Ángel Caputi: Instituto de Investigaciones Biológicas Clemente Estable, Department of Integrative and Computational Neurosciences, Av. Italia 3318, Montevideo, 11.600, Uruguay

6. Vaziri PA, McDougle SD, Clark DA. Humans can use positive and negative spectrotemporal correlations to detect rising and falling pitch. bioRxiv [Preprint] 2024:2024.08.03.606481. PMID: 39131316. PMCID: PMC11312537. DOI: 10.1101/2024.08.03.606481.
Abstract
To discern speech or appreciate music, the human auditory system detects how pitch increases or decreases over time. However, the algorithms used to detect changes in pitch, or pitch motion, are incompletely understood. Here, using psychophysics, computational modeling, functional neuroimaging, and analysis of recorded speech, we ask whether humans can detect pitch motion using computations analogous to those used by the visual system. We adapted stimuli from studies of vision to create novel auditory correlated noise stimuli that elicited robust pitch motion percepts. Crucially, these stimuli are inharmonic and possess no persistent features across frequency or time, but they do possess positive or negative local spectrotemporal correlations in intensity. In psychophysical experiments, we found clear evidence that humans can judge pitch direction based only on positive or negative spectrotemporal intensity correlations. The key behavioral result, robust sensitivity to the negative spectrotemporal correlations, is a direct analogue of illusory "reverse-phi" motion in vision and thus constitutes a new auditory illusion. Our behavioral results and computational modeling led us to hypothesize that human auditory processing may employ pitch direction opponency. fMRI measurements in auditory cortex supported this hypothesis. To link our psychophysical findings to real-world pitch perception, we analyzed recordings of English and Mandarin speech and found that pitch direction was robustly signaled by both positive and negative spectrotemporal correlations, suggesting that sensitivity to both types of correlations confers ecological benefits. Overall, this work reveals how motion detection algorithms sensitive to local correlations are deployed by the central nervous system across disparate modalities (vision and audition) and dimensions (space and frequency).
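As an illustration of the class of spectrotemporal correlation detectors discussed here (not the authors' exact model), a simple opponent correlator compares intensity correlations along rising versus falling diagonals of a spectrogram:

```python
# Opponent spectrotemporal correlator: positive output favors rising pitch.
import numpy as np

def pitch_direction_opponency(spec):
    """spec: (n_freq, n_time) spectrogram."""
    rising  = np.sum(spec[:-1, :-1] * spec[1:, 1:])   # (f, t) with (f+1, t+1)
    falling = np.sum(spec[1:, :-1] * spec[:-1, 1:])   # (f, t) with (f-1, t+1)
    return rising - falling

# A toy upward-gliding component: energy moves up one frequency bin per frame.
spec = np.zeros((40, 40))
for frame in range(40):
    spec[frame, frame] = 1.0
print(pitch_direction_opponency(spec) > 0)   # -> True (detects rising pitch)
```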
7. Kurteff GL, Field AM, Asghar S, Tyler-Kabara EC, Clarke D, Weiner HL, Anderson AE, Watrous AJ, Buchanan RJ, Modur PN, Hamilton LS. Spatiotemporal Mapping of Auditory Onsets during Speech Production. J Neurosci 2024; 44:e1109242024. PMID: 39455254. PMCID: PMC11580786. DOI: 10.1523/jneurosci.1109-24.2024.
Abstract
The human auditory cortex is organized according to the timing and spectral characteristics of speech sounds during speech perception. During listening, the posterior superior temporal gyrus is organized according to onset responses, which segment acoustic boundaries in speech, and sustained responses, which further process phonological content. When we speak, the auditory system is actively processing the sound of our own voice to detect and correct speech errors in real time. This manifests in neural recordings as suppression of auditory responses during speech production compared with perception, but whether this differentially affects the onset and sustained temporal profiles is not known. Here, we investigated this question using intracranial EEG recorded from seventeen pediatric, adolescent, and adult patients with medication-resistant epilepsy while they performed a reading/listening task. We identified onset and sustained responses to speech in the bilateral auditory cortex and observed a selective suppression of onset responses during speech production. We conclude that onset responses provide a temporal landmark during speech perception that is redundant with forward prediction during speech production and are therefore suppressed. Phonological feature tuning in these "onset suppression" electrodes remained stable between perception and production. Notably, auditory onset responses and phonological feature tuning were present in the posterior insula during both speech perception and production, suggesting an anatomically and functionally separate auditory processing zone that we believe to be involved in multisensory integration during speech perception and feedback control.
Affiliation(s)
- Garret Lynn Kurteff: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas 78712
- Alyssa M Field: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas 78712
- Saman Asghar: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas 78712; Department of Neurosurgery, Baylor College of Medicine, Houston, Texas 77030
- Elizabeth C Tyler-Kabara: Department of Neurosurgery, Dell Medical School, The University of Texas at Austin, Austin, Texas 78712; Department of Pediatrics, Dell Medical School, The University of Texas at Austin, Austin, Texas 78712
- Dave Clarke: Department of Neurosurgery, Dell Medical School, The University of Texas at Austin, Austin, Texas 78712; Department of Pediatrics, Dell Medical School, The University of Texas at Austin, Austin, Texas 78712; Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, Texas 78712
- Howard L Weiner: Department of Neurosurgery, Baylor College of Medicine, Houston, Texas 77030
- Anne E Anderson: Department of Pediatrics, Baylor College of Medicine, Houston, Texas 77030
- Andrew J Watrous: Department of Neurosurgery, Baylor College of Medicine, Houston, Texas 77030
- Robert J Buchanan: Department of Neurosurgery, Dell Medical School, The University of Texas at Austin, Austin, Texas 78712
- Pradeep N Modur: Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, Texas 78712
- Liberty S Hamilton: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas 78712; Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, Texas 78712

8. Hoarau C, Pralus A, Moulin A, Bedoin N, Ginzburg J, Fornoni L, Aguera PE, Tillmann B, Caclin A. Deficits in congenital amusia: Pitch, music, speech, and beyond. Neuropsychologia 2024; 202:108960. PMID: 39032629. DOI: 10.1016/j.neuropsychologia.2024.108960.
Abstract
Congenital amusia is a neurodevelopmental disorder characterized by deficits of music perception and production, which are related to altered pitch processing. The present study used a wide variety of tasks to test potential patterns of processing impairment in individuals with congenital amusia (N = 18) in comparison to matched controls (N = 19), notably classical pitch processing tests (i.e., pitch change detection, pitch direction of change identification, and pitch short-term memory tasks) together with tasks assessing other aspects of pitch-related auditory cognition, such as emotion recognition in speech, sound segregation in tone sequences, and speech-in-noise perception. Additional behavioral measures were also collected, including text reading/copying tests, visual control tasks, and a subjective assessment of hearing abilities. As expected, amusics' performance was impaired on the three pitch-specific tasks compared to controls. This deficit of pitch perception had a self-perceived impact on amusics' quality of hearing. Moreover, participants with amusia were impaired in emotion recognition in vowels compared to controls, but no group difference was observed for emotion recognition in sentences, replicating previous data. Despite pitch processing deficits, participants with amusia did not differ from controls in sound segregation and speech-in-noise perception. Text reading and visual control tests did not reveal any impairments in participants with amusia compared to controls. However, the copying test revealed more numerous eye movements and a smaller memory span. These results refine the pattern of pitch processing and memory deficits in congenital amusia, thus contributing further to the understanding of pitch-related auditory cognition. Together with previous reports suggesting a comorbidity between congenital amusia and dyslexia, the findings call for further investigation of language-related abilities in this disorder, even in the absence of a diagnosed neurodevelopmental language disorder.
Affiliation(s)
- Caliani Hoarau: Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France; Humans Matter, Lyon, France
- Agathe Pralus: Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France; Humans Matter, Lyon, France
- Annie Moulin: Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France
- Nathalie Bedoin: Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France; Université Lumière Lyon 2, Lyon, France
- Jérémie Ginzburg: Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France
- Lesly Fornoni: Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France
- Pierre-Emmanuel Aguera: Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France
- Barbara Tillmann: Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France; Laboratory for Research on Learning and Development, Université de Bourgogne, LEAD-CNRS UMR5022, Dijon, France
- Anne Caclin: Université Claude Bernard Lyon 1, INSERM, CNRS, Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, F-69500, Bron, France

9. Desai M, Field AM, Hamilton LS. A comparison of EEG encoding models using audiovisual stimuli and their unimodal counterparts. PLoS Comput Biol 2024; 20:e1012433. PMID: 39250485. PMCID: PMC11412666. DOI: 10.1371/journal.pcbi.1012433.
Abstract
Communication in the real world is inherently multimodal. When having a conversation, typically sighted and hearing people use both auditory and visual cues to understand one another. For example, objects may make sounds as they move in space, or we may use the movement of a person's mouth to better understand what they are saying in a noisy environment. Still, many neuroscience experiments rely on unimodal stimuli to understand encoding of sensory features in the brain. The extent to which visual information may influence encoding of auditory information and vice versa in natural environments is thus unclear. Here, we addressed this question by recording scalp electroencephalography (EEG) in 11 subjects as they listened to and watched movie trailers in audiovisual (AV), visual-only (V), and audio-only (A) conditions. We then fit linear encoding models that described the relationship between the brain responses and the acoustic, phonetic, and visual information in the stimuli. We also compared whether auditory and visual feature tuning was the same when stimuli were presented in the original AV format versus when visual or auditory information was removed. In these stimuli, visual and auditory information was relatively uncorrelated, and included spoken narration over a scene as well as animated or live-action characters talking with and without their face visible. For these stimuli, we found that auditory feature tuning was similar in the AV and A-only conditions, and similarly, tuning for visual information was similar when stimuli were presented with the audio present (AV) and when the audio was removed (V-only). In a cross-prediction analysis, we investigated whether models trained on AV data predicted responses to A-only or V-only test data as well as models trained on unimodal data. Overall, prediction performance using AV training and V-only test sets was similar to using V-only training and test sets, suggesting that the auditory information has a relatively smaller effect on EEG. In contrast, prediction performance using AV training and A-only test sets was slightly worse than using matching A-only training and test sets. This suggests the visual information has a stronger influence on EEG, though this makes no qualitative difference in the derived feature tuning. In effect, our results show that researchers may benefit from the richness of multimodal datasets, which can then be used to answer more than one research question.
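The cross-prediction logic can be sketched as follows with synthetic features and a synthetic EEG channel standing in for the real recordings; condition-specific tuning is assumed identical here purely for illustration.

```python
# Train an encoding model on AV data, test it on A-only data, and compare with
# a model trained and tested within the A-only condition.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n_time, n_feat = 4000, 20

def simulate(condition_gain):
    X = rng.standard_normal((n_time, n_feat))             # acoustic/phonetic features
    w = np.linspace(1.0, 0.1, n_feat) * condition_gain    # hypothetical tuning
    y = X @ w + rng.standard_normal(n_time)               # one EEG channel
    return X, y

X_av, y_av = simulate(1.0)      # AV condition
X_a,  y_a  = simulate(1.0)      # A-only condition (same tuning here by construction)

def heldout_r(model, X, y):
    return np.corrcoef(model.predict(X), y)[0, 1]

within = Ridge(alpha=1.0).fit(X_a[:2000], y_a[:2000])
cross  = Ridge(alpha=1.0).fit(X_av, y_av)
print("A->A  r =", round(heldout_r(within, X_a[2000:], y_a[2000:]), 2))
print("AV->A r =", round(heldout_r(cross,  X_a[2000:], y_a[2000:]), 2))
# Similar correlations would indicate that AV training captures the same
# auditory feature tuning as unimodal training.
```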
Affiliation(s)
- Maansi Desai: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas, United States of America
- Alyssa M Field: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas, United States of America
- Liberty S Hamilton: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas, United States of America; Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, Texas, United States of America

10. Kurumada C, Rivera R, Allen P, Bennetto L. Perception and adaptation of receptive prosody in autistic adolescents. Sci Rep 2024; 14:16409. PMID: 39013983. PMCID: PMC11252140. DOI: 10.1038/s41598-024-66569-x.
Abstract
A fundamental aspect of language processing is inferring others' minds from subtle variations in speech. The same word or sentence can often convey different meanings depending on its tempo, timing, and intonation, features often referred to as prosody. Although autistic children and adults are known to experience difficulty in making such inferences, it remains unclear why. We hypothesize that detail-oriented perception in autism may interfere with the inference process if it lacks the adaptivity required to cope with the variability ubiquitous in human speech. Using a novel prosodic continuum that shifts the sentence meaning gradiently from a statement (e.g., "It's raining") to a question (e.g., "It's raining?"), we investigated the perception and adaptation of receptive prosody in autistic adolescents and two groups of non-autistic controls. Autistic adolescents showed attenuated adaptivity in categorizing prosody, whereas they were equivalent to controls in terms of discrimination accuracy. Combined with recent findings in segmental (e.g., phoneme) recognition, the current results provide the basis for an emerging research framework for attenuated flexibility and reduced influence of contextual feedback as a possible source of deficits that hinder linguistic and social communication in autism.
Affiliation(s)
- Chigusa Kurumada: Brain and Cognitive Sciences, University of Rochester, Rochester, 14627, USA
- Rachel Rivera: Psychology, University of Rochester, Rochester, 14627, USA
- Paul Allen: Psychology, University of Rochester, Rochester, 14627, USA; Otolaryngology, University of Rochester Medical Center, Rochester, 14642, USA
- Loisa Bennetto: Psychology, University of Rochester, Rochester, 14627, USA

11. Adl Zarrabi A, Jeulin M, Bardet P, Commère P, Naccache L, Aucouturier JJ, Ponsot E, Villain M. A simple psychophysical procedure separates representational and noise components in impairments of speech prosody perception after right-hemisphere stroke. Sci Rep 2024; 14:15194. PMID: 38956187. PMCID: PMC11219855. DOI: 10.1038/s41598-024-64295-y.
Abstract
After a right-hemisphere stroke, more than half of patients are impaired in their capacity to produce or comprehend speech prosody. Yet, despite its social-cognitive consequences for patients, aprosodia following stroke has received scant attention. In this report, we introduce a novel, simple psychophysical procedure which, by combining systematic digital manipulations of speech stimuli with reverse-correlation analysis, allows us to estimate the internal sensory representations that subtend how individual patients perceive speech prosody, and the level of internal noise that governs behavioral variability in how patients apply these representations. Tested on a sample of N = 22 right-hemisphere stroke survivors and N = 21 age-matched controls, the representation + noise model provides a promising alternative to the clinical gold standard for evaluating aprosodia (MEC): both parameters strongly associate with receptive, and not expressive, aprosodia measured by the MEC within the patient group; they have better sensitivity than the MEC for separating high-functioning patients from controls; and they have good specificity with respect to non-prosody-related impairments of auditory attention and processing. Taken together, individual differences in either internal representation, internal noise, or both paint a potent portrait of the variety of sensory/cognitive mechanisms that can explain impairments of prosody processing after stroke.
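The reverse-correlation logic behind the procedure can be sketched with a simulated observer (the template, noise level, and trial structure below are illustrative assumptions, not the published protocol); internal noise would in practice be estimated from double-pass response consistency.

```python
# Reverse correlation: recover a listener's internal pitch template from
# trial-by-trial random perturbations and binary responses.
import numpy as np

rng = np.random.default_rng(3)
n_trials, n_segments = 2000, 6
template = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 2.0])   # hypothetical rising-end template
internal_noise = 1.0

perturbations = rng.standard_normal((n_trials, n_segments))   # pitch offsets per segment
decision = perturbations @ template + internal_noise * rng.standard_normal(n_trials)
responses = decision > 0                                       # True = "heard a question"

# Classification image: mean profile for one response minus the other.
kernel = perturbations[responses].mean(0) - perturbations[~responses].mean(0)
kernel /= np.linalg.norm(kernel)
print("recovered kernel:", kernel.round(2))
print("similarity to template:",
      round(float(kernel @ (template / np.linalg.norm(template))), 2))
```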
Affiliation(s)
- Aynaz Adl Zarrabi: Université de Franche-Comté, SUPMICROTECH, CNRS, Institut FEMTO-ST, 25000, Besançon, France
- Mélissa Jeulin: Department of Physical Medicine & Rehabilitation, APHP/Hôpital Pitié-Salpêtrière, 75013, Paris, France
- Pauline Bardet: Department of Physical Medicine & Rehabilitation, APHP/Hôpital Pitié-Salpêtrière, 75013, Paris, France
- Pauline Commère: Department of Physical Medicine & Rehabilitation, APHP/Hôpital Pitié-Salpêtrière, 75013, Paris, France
- Lionel Naccache: Department of Physical Medicine & Rehabilitation, APHP/Hôpital Pitié-Salpêtrière, 75013, Paris, France; Paris Brain Institute (ICM), Inserm, CNRS, PICNIC-Lab, 75013, Paris, France
- Emmanuel Ponsot: Science & Technology of Music and Sound, IRCAM/CNRS/Sorbonne Université, 75004, Paris, France
- Marie Villain: Department of Physical Medicine & Rehabilitation, APHP/Hôpital Pitié-Salpêtrière, 75013, Paris, France; Paris Brain Institute (ICM), Inserm, CNRS, PICNIC-Lab, 75013, Paris, France

12. Degano G, Donhauser PW, Gwilliams L, Merlo P, Golestani N. Speech prosody enhances the neural processing of syntax. Commun Biol 2024; 7:748. PMID: 38902370. PMCID: PMC11190187. DOI: 10.1038/s42003-024-06444-7.
Abstract
Human language relies on the correct processing of syntactic information, as it is essential for successful communication between speakers. As an abstract level of language, syntax has often been studied separately from the physical form of the speech signal, thus often masking the interactions that can promote better syntactic processing in the human brain. However, behavioral and neural evidence from adults supports the idea that prosody and syntax interact, and studies in infants suggest that prosody assists language learning. Here we analyze an MEG dataset to investigate how acoustic cues, specifically prosody, interact with syntactic representations in the brains of native English speakers. More specifically, to examine whether prosody enhances the cortical encoding of syntactic representations, we decode syntactic phrase boundaries directly from brain activity and evaluate possible modulations of this decoding by prosodic boundaries. Our findings demonstrate that the presence of prosodic boundaries improves the neural representation of phrase boundaries, indicating the facilitative role of prosodic cues in processing abstract linguistic features. This work has implications for interactive models of how the brain processes different linguistic features. Future research is needed to establish the neural underpinnings of prosody-syntax interactions in languages with different typological characteristics.
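A hedged sketch of the decode-and-compare analysis, using synthetic MEG-like features and an assumed (purely illustrative) boost of the syntactic signal at prosodic boundaries:

```python
# Decode syntactic phrase boundaries with a linear classifier, then compare
# accuracy for words with vs. without a prosodic boundary.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n_words, n_features = 1200, 50
syntactic_boundary = rng.integers(0, 2, n_words)
prosodic_boundary = rng.integers(0, 2, n_words)

signal = np.outer(syntactic_boundary, rng.standard_normal(n_features))
# Illustrative assumption: the syntactic signal is stronger when a prosodic
# boundary is also present.
signal *= (1.0 + 0.8 * prosodic_boundary)[:, None]
X = signal + rng.standard_normal((n_words, n_features)) * 3.0

for label, mask in [("with prosodic boundary", prosodic_boundary == 1),
                    ("without prosodic boundary", prosodic_boundary == 0)]:
    acc = cross_val_score(LogisticRegression(max_iter=1000),
                          X[mask], syntactic_boundary[mask], cv=5).mean()
    print(label, "decoding accuracy:", round(acc, 2))
```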
Affiliation(s)
- Giulio Degano: Department of Psychology, Faculty of Psychology and Educational Sciences, University of Geneva, Geneva, Switzerland
- Peter W Donhauser: Ernst Strüngmann Institute for Neuroscience in Cooperation with Max Planck Society, Frankfurt am Main, Germany
- Laura Gwilliams: Department of Psychology, Stanford University, Stanford, CA, USA
- Paola Merlo: Department of Linguistics, University of Geneva, Geneva, Switzerland; University Centre for Informatics, University of Geneva, Geneva, Switzerland
- Narly Golestani: Department of Psychology, Faculty of Psychology and Educational Sciences, University of Geneva, Geneva, Switzerland; Brain and Language Lab, Cognitive Science Hub, University of Vienna, Vienna, Austria; Department of Behavioral and Cognitive Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria

13. Kurteff GL, Field AM, Asghar S, Tyler-Kabara EC, Clarke D, Weiner HL, Anderson AE, Watrous AJ, Buchanan RJ, Modur PN, Hamilton LS. Processing of auditory feedback in perisylvian and insular cortex. bioRxiv [Preprint] 2024:2024.05.14.593257. PMID: 38798574. PMCID: PMC11118286. DOI: 10.1101/2024.05.14.593257.
Abstract
When we speak, we not only make movements with our mouth, lips, and tongue, but we also hear the sound of our own voice. Thus, speech production in the brain involves not only controlling the movements we make, but also auditory and sensory feedback. Auditory responses are typically suppressed during speech production compared to perception, but how this suppression manifests across space and time is unclear. Here we recorded intracranial EEG in seventeen pediatric, adolescent, and adult patients with medication-resistant epilepsy who performed a reading/listening task to investigate how auditory responses are modulated during speech production. We identified onset and sustained responses to speech in bilateral auditory cortex, with a selective suppression of onset responses during speech production. Onset responses provide a temporal landmark during speech perception that is redundant with forward prediction during speech production. Phonological feature tuning in these "onset suppression" electrodes remained stable between perception and production. Notably, the posterior insula responded at sentence onset for both perception and production, suggesting a role in multisensory integration during feedback control.
Affiliation(s)
- Garret Lynn Kurteff: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, USA
- Alyssa M. Field: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, USA
- Saman Asghar: Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, USA; Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Elizabeth C. Tyler-Kabara: Department of Neurosurgery, Dell Medical School, The University of Texas at Austin, Austin, TX, USA; Department of Pediatrics, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Dave Clarke: Department of Neurosurgery, Dell Medical School, The University of Texas at Austin, Austin, TX, USA; Department of Pediatrics, Dell Medical School, The University of Texas at Austin, Austin, TX, USA; Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Howard L. Weiner: Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Anne E. Anderson: Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Andrew J. Watrous: Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Robert J. Buchanan: Department of Neurosurgery, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Pradeep N. Modur: Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Liberty S. Hamilton (lead contact): Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, USA; Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, USA

14. Clarke A, Tyler LK, Marslen-Wilson W. Hearing what is being said: the distributed neural substrate for early speech interpretation. Lang Cogn Neurosci 2024; 39:1097-1116. PMID: 39439863. PMCID: PMC11493057. DOI: 10.1080/23273798.2024.2345308.
Abstract
Speech comprehension is remarkable for the immediacy with which the listener hears what is being said. Here, we focus on the neural underpinnings of this process in isolated spoken words. We analysed source-localised MEG data for nouns using Representational Similarity Analysis to probe the spatiotemporal coordinates of phonology, lexical form, and the semantics of emerging word candidates. Phonological model fit was detectable within 40-50 ms, engaging a bilateral network including superior and middle temporal cortex and extending into anterior temporal and inferior parietal regions. Lexical form emerged within 60-70 ms, and model fit to semantics from 100-110 ms. Strikingly, the majority of vertices in a central core showed model fit to all three dimensions, consistent with a distributed neural substrate for early speech analysis. The early interpretation of speech seems to be conducted in a unified integrative representational space, in conflict with conventional views of a linguistically stratified representational hierarchy.
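The Representational Similarity Analysis used here can be sketched in a few lines on synthetic data: correlate the upper triangle of a neural dissimilarity matrix with that of a model dissimilarity matrix, as one would at each time point of the source-localised MEG.

```python
# Minimal RSA sketch: model RDM vs. neural RDM.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(5)
n_words, n_model_dims, n_vertices = 30, 8, 100
model_features = rng.standard_normal((n_words, n_model_dims))     # e.g., phonological features
neural_patterns = (model_features @ rng.standard_normal((n_model_dims, n_vertices))
                   + rng.standard_normal((n_words, n_vertices)))

model_rdm = pdist(model_features, metric="correlation")    # condensed upper triangle
neural_rdm = pdist(neural_patterns, metric="correlation")
rho, _ = spearmanr(model_rdm, neural_rdm)
print("model fit (Spearman rho):", round(rho, 2))
```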
Affiliation(s)
- Alex Clarke: Department of Psychology, University of Cambridge, Cambridge, UK

15. Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. Sci Adv 2024; 10:eadk0010. PMID: 38363839. PMCID: PMC10871532. DOI: 10.1126/sciadv.adk0010.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception varies along several pitch-based dimensions: (i) the absolute pitch of notes, (ii) the difference in pitch between successive notes, and (iii) the statistical expectation of each note given prior context. How the brain represents these dimensions and whether their encoding is specialized for music remains unknown. We recorded high-density neurophysiological activity directly from the human auditory cortex while participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial map for representing distinct melodic dimensions. The same participants listened to spoken English, and we compared responses to music and speech. Cortical sites selective for music encoded expectation, while sites that encoded pitch and pitch-change in music used the same neural code to represent equivalent properties of speech. Findings reveal how the perception of melody recruits both music-specific and general-purpose sound representations.
Affiliation(s)
- Narayan Sankaran: Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K. Leonard: Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Frederic Theunissen: Department of Psychology, University of California, Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
- Edward F. Chang: Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA

16. Dahl KL, Cádiz MD, Zuk J, Guenther FH, Stepp CE. Controlling Pitch for Prosody: Sensorimotor Adaptation in Linguistically Meaningful Contexts. J Speech Lang Hear Res 2024; 67:440-454. PMID: 38241671. PMCID: PMC11000799. DOI: 10.1044/2023_jslhr-23-00460.
Abstract
Purpose: This study examined how speakers adapt to fundamental frequency (fo) errors that affect the use of prosody to convey linguistic meaning, whether fo adaptation in that context relates to adaptation in linguistically neutral sustained vowels, and whether cue trading is reflected in responses in the prosodic cues of fo and amplitude. Method: Twenty-four speakers said vowels and sentences while fo was digitally altered to induce predictable errors. Shifts in fo (±200 cents) were applied to the entire sustained vowel and one word (emphasized or unemphasized) in sentences. Two prosodic cues, fo and amplitude, were extracted. The effects of fo shifts, shift direction, and emphasis on fo response magnitude were evaluated with repeated-measures analyses of variance. Relationships between adaptive fo responses in sentences and vowels, and between adaptive fo and amplitude responses, were evaluated with Spearman correlations. Results: Speakers adapted to fo errors in both linguistically meaningful sentences and linguistically neutral vowels. Adaptive fo responses of unemphasized words were smaller than those of emphasized words when fo was shifted upward. There was no relationship between adaptive fo responses in vowels and emphasized words, but adaptive fo and amplitude responses were strongly, positively correlated. Conclusions: Sensorimotor adaptation occurs in response to fo errors regardless of how disruptive the error is to linguistic meaning. Adaptation to fo errors during sustained vowels may not involve the exact same mechanisms as sensorimotor adaptation as it occurs in meaningful speech. The relationship between adaptive responses in fo and amplitude supports an integrated model of prosody. Supplemental Material: https://doi.org/10.23641/asha.25008908
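For reference, the ±200-cent shifts mentioned above follow the standard cents formula (100 cents = 1 semitone); a minimal sketch:

```python
# Cents express a frequency ratio on a logarithmic scale: 1200 * log2(f1 / f0).
import numpy as np

def cents(f_shifted_hz, f_original_hz):
    return 1200.0 * np.log2(f_shifted_hz / f_original_hz)

def apply_shift(f_hz, shift_cents):
    return f_hz * 2.0 ** (shift_cents / 1200.0)

fo = 220.0                               # a speaker's fundamental frequency in Hz
print(round(apply_shift(fo, 200), 1))    # +200 cents -> ~246.9 Hz (two semitones up)
print(round(cents(246.94, 220.0)))       # -> ~200
```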
Affiliation(s)
- Kimberly L. Dahl: Department of Speech, Language and Hearing Sciences, Boston University, MA
- Manuel Díaz Cádiz: Department of Speech, Language and Hearing Sciences, Boston University, MA
- Jennifer Zuk: Department of Speech, Language and Hearing Sciences, Boston University, MA
- Frank H. Guenther: Department of Speech, Language and Hearing Sciences, Boston University, MA; Department of Biomedical Engineering, Boston University, MA
- Cara E. Stepp: Department of Speech, Language and Hearing Sciences, Boston University, MA; Department of Biomedical Engineering, Boston University, MA; Department of Otolaryngology–Head and Neck Surgery, Boston University School of Medicine, MA

17. Leonard MK, Gwilliams L, Sellers KK, Chung JE, Xu D, Mischler G, Mesgarani N, Welkenhuysen M, Dutta B, Chang EF. Large-scale single-neuron speech sound encoding across the depth of human cortex. Nature 2024; 626:593-602. PMID: 38093008. PMCID: PMC10866713. DOI: 10.1038/s41586-023-06839-2.
Abstract
Understanding the neural basis of speech perception requires that we study the human brain both at the scale of the fundamental computational unit of neurons and in their organization across the depth of cortex. Here we used high-density Neuropixels arrays to record from 685 neurons across cortical layers at nine sites in a high-level auditory region that is critical for speech, the superior temporal gyrus, while participants listened to spoken sentences. Single neurons encoded a wide range of speech sound cues, including features of consonants and vowels, relative vocal pitch, onsets, amplitude envelope, and sequence statistics. Neurons at each cross-laminar recording site exhibited dominant tuning to a primary speech feature while also containing a substantial proportion of neurons that encoded other features, contributing to heterogeneous selectivity. Spatially, neurons at similar cortical depths tended to encode similar speech features. Activity across all cortical layers was predictive of high-frequency field potentials (electrocorticography), providing a neuronal origin for macroelectrode recordings from the cortical surface. Together, these results establish single-neuron tuning across the cortical laminae as an important dimension of speech encoding in human superior temporal gyrus.
Affiliation(s)
- Matthew K Leonard: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Laura Gwilliams: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Kristin K Sellers: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Jason E Chung: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Duo Xu: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Gavin Mischler: Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; Department of Electrical Engineering, Columbia University, New York, NY, USA
- Nima Mesgarani: Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; Department of Electrical Engineering, Columbia University, New York, NY, USA
- Edward F Chang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA

18. Bowling DL. Biological principles for music and mental health. Transl Psychiatry 2023; 13:374. PMID: 38049408. PMCID: PMC10695969. DOI: 10.1038/s41398-023-02671-4.
Abstract
Efforts to integrate music into healthcare systems and wellness practices are accelerating, but the biological foundations supporting these initiatives remain underappreciated. As a result, music-based interventions are often sidelined in medicine. Here, I bring together advances in music research from neuroscience, psychology, and psychiatry to bridge music's specific foundations in human biology with its specific therapeutic applications. The framework I propose organizes the neurophysiological effects of music around four core elements of human musicality: tonality, rhythm, reward, and sociality. For each, I review key concepts, biological bases, and evidence of clinical benefits. Within this framework, I outline a strategy to increase music's impact on health based on standardizing treatments and aligning them with individual differences in responsivity to these musical elements. I propose that an integrated biological understanding of human musicality, describing each element's functional origins, development, phylogeny, and neural bases, is critical to advancing rational applications of music in mental health and wellness.
Affiliation(s)
- Daniel L Bowling: Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA, USA; Center for Computer Research in Music and Acoustics (CCRMA), Stanford University School of Humanities and Sciences, Stanford, CA, USA

19. Li Y, Anumanchipalli GK, Mohamed A, Chen P, Carney LH, Lu J, Wu J, Chang EF. Dissecting neural computations in the human auditory pathway using deep neural networks for speech. Nat Neurosci 2023; 26:2213-2225. PMID: 37904043. PMCID: PMC10689246. DOI: 10.1038/s41593-023-01468-4.
Abstract
The human auditory system extracts rich linguistic abstractions from speech signals. Traditional approaches to understanding this complex process have used linear feature-encoding models, with limited success. Artificial neural networks excel in speech recognition tasks and offer promising computational models of speech processing. We used speech representations in state-of-the-art deep neural network (DNN) models to investigate neural coding from the auditory nerve to the speech cortex. Representations in hierarchical layers of the DNN correlated well with the neural activity throughout the ascending auditory system. Unsupervised speech models performed at least as well as other purely supervised or fine-tuned models. Deeper DNN layers were better correlated with the neural activity in the higher-order auditory cortex, with computations aligned with phonemic and syllabic structures in speech. Accordingly, DNN models trained on either English or Mandarin predicted cortical responses in native speakers of each language. These results reveal convergence between DNN model representations and the biological auditory pathway, offering new approaches for modeling neural coding in the auditory cortex.
Affiliation(s)
- Yuanning Li: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China
- Gopala K Anumanchipalli: Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA; Department of Electrical Engineering and Computer Science, University of California, Berkeley, Berkeley, CA, USA
- Peili Chen: School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China
- Laurel H Carney: Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA
- Junfeng Lu: Neurologic Surgery Department, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; Brain Function Laboratory, Neurosurgical Institute, Fudan University, Shanghai, China
- Jinsong Wu: Neurologic Surgery Department, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; Brain Function Laboratory, Neurosurgical Institute, Fudan University, Shanghai, China
- Edward F Chang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA

20. Lu J, Li Y, Zhao Z, Liu Y, Zhu Y, Mao Y, Wu J, Chang EF. Neural control of lexical tone production in human laryngeal motor cortex. Nat Commun 2023; 14:6917. PMID: 37903780. PMCID: PMC10616086. DOI: 10.1038/s41467-023-42175-9.
Abstract
In tonal languages, which are spoken by nearly one-third of the world's population, speakers precisely control the tension of vocal folds in the larynx to modulate pitch in order to distinguish words with completely different meanings. The specific pitch trajectories for a given tonal language are called lexical tones. Here, we used high-density direct cortical recordings to determine the neural basis of lexical tone production in native Mandarin-speaking participants. We found that instead of a tone category-selective coding, local populations in the bilateral laryngeal motor cortex (LMC) encode articulatory kinematic information to generate the pitch dynamics of lexical tones. Using a computational model of tone production, we discovered two distinct patterns of population activity in LMC commanding pitch rising and lowering. Finally, we showed that direct electrocortical stimulation of different local populations in LMC evoked pitch rising and lowering during tone production, respectively. Together, these results reveal the neural basis of vocal pitch control of lexical tones in tonal languages.
Collapse
Affiliation(s)
- Junfeng Lu
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
| | - Yuanning Li
- School of Biomedical Engineering, ShanghaiTech University, Shanghai, 201210, China
- Department of Neurological Surgery, University of California, San Francisco, CA, 94143, USA
- Weill Institute for Neurosciences, University of California, San Francisco, CA, 94158, USA
- State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, 201210, China
| | - Zehao Zhao
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
| | - Yan Liu
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
| | - Yanming Zhu
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China
- Speech and Hearing Bioscience & Technology Program, Division of Medical Sciences, Harvard University, Boston, MA, 02215, USA
| | - Ying Mao
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China.
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China.
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China.
| | - Jinsong Wu
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China.
- Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, 200040, China.
- National Center for Neurological Disorders, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, 200040, China.
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, CA, 94143, USA.
- Weill Institute for Neurosciences, University of California, San Francisco, CA, 94158, USA.
| |
Collapse
|
21
|
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.17.562771. [PMID: 37905047 PMCID: PMC10614915 DOI: 10.1101/2023.10.17.562771] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/02/2023]
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception of melody varies along several pitch-based dimensions: (1) the absolute pitch of notes, (2) the difference in pitch between successive notes, and (3) the higher-order statistical expectation of each note conditioned on its prior context. While humans readily perceive melody, how these dimensions are collectively represented in the brain and whether their encoding is specialized for music remains unknown. Here, we recorded high-density neurophysiological activity directly from the surface of human auditory cortex while Western participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial code for representing distinct dimensions of melody. The same participants listened to spoken English, and we compared evoked responses to music and speech. Cortical sites selective for music were systematically driven by the encoding of expectation. In contrast, sites that encoded pitch and pitch-change used the same neural code to represent equivalent properties of speech. These findings reveal the multidimensional nature of melody encoding, consisting of both music-specific and domain-general sound representations in auditory cortex. Teaser: The human brain contains both general-purpose and music-specific neural populations for processing distinct attributes of melody.
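The three melodic dimensions named in this abstract (absolute pitch, pitch change, and contextual expectation) can be computed as per-note regressors in a few lines. Below is a minimal sketch that derives expectation from a smoothed bigram model fit to a toy corpus; the corpus, melodies, and smoothing choice are hypothetical, and published work typically uses richer statistical models (e.g., variable-order models of melodic expectation).

```python
import numpy as np
from collections import Counter

# A toy "corpus" of melodies as MIDI note numbers (hypothetical values).
corpus = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [60, 64, 67, 64, 60, 62, 64, 62, 60],
    [67, 65, 64, 62, 60, 62, 64, 67, 67],
]

# Fit a bigram model of note transitions with add-one smoothing.
bigrams = Counter((a, b) for mel in corpus for a, b in zip(mel, mel[1:]))
unigrams = Counter(n for mel in corpus for n in mel[:-1])
vocab = sorted({n for mel in corpus for n in mel})

def surprisal(prev, note):
    """-log2 P(note | prev) under the smoothed bigram model."""
    p = (bigrams[(prev, note)] + 1) / (unigrams[prev] + len(vocab))
    return -np.log2(p)

# Per-note predictors for a new melody (first note skipped: it has no context).
melody = [60, 62, 64, 62, 67]
for prev, note in zip(melody, melody[1:]):
    print(f"note {note}: pitch={note}, change={note - prev:+d}, "
          f"surprisal={surprisal(prev, note):.2f} bits")
```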
Collapse
|
22
|
Tillmann B, Graves JE, Talamini F, Lévêque Y, Fornoni L, Hoarau C, Pralus A, Ginzburg J, Albouy P, Caclin A. Auditory cortex and beyond: Deficits in congenital amusia. Hear Res 2023; 437:108855. [PMID: 37572645 DOI: 10.1016/j.heares.2023.108855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/04/2023] [Revised: 06/14/2023] [Accepted: 07/21/2023] [Indexed: 08/14/2023]
Abstract
Congenital amusia is a neuro-developmental disorder of music perception and production, with the observed deficits contrasting with the sophisticated music processing reported for the general population. Musical deficits within amusia have been hypothesized to arise from altered pitch processing, with impairments in pitch discrimination and, notably, short-term memory. We here review research investigating its behavioral and neural correlates, in particular the impairments at encoding, retention, and recollection of pitch information, as well as how these impairments extend to the processing of pitch cues in speech and emotion. The impairments have been related to altered brain responses in a distributed fronto-temporal network, which can be observed also at rest. Neuroimaging studies revealed changes in connectivity patterns within this network and beyond, shedding light on the brain dynamics underlying auditory cognition. Interestingly, some studies revealed spared implicit pitch processing in congenital amusia, showing the power of implicit cognition in the music domain. Building on these findings, together with audiovisual integration and other beneficial mechanisms, we outline perspectives for training and rehabilitation and the future directions of this research domain.
Collapse
Affiliation(s)
- Barbara Tillmann
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France; Laboratory for Research on Learning and Development, Université de Bourgogne, LEAD - CNRS UMR5022, Dijon, France; LEAD-CNRS UMR5022; Université Bourgogne Franche-Comté; Pôle AAFE; 11 Esplanade Erasme; 21000 Dijon, France.
| | - Jackson E Graves
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, PSL University, Paris 75005, France
| | | | - Yohana Lévêque
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
| | - Lesly Fornoni
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
| | - Caliani Hoarau
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
| | - Agathe Pralus
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
| | - Jérémie Ginzburg
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
| | - Philippe Albouy
- CERVO Brain Research Center, School of Psychology, Laval University, Québec, G1J 2G3; International Laboratory for Brain, Music and Sound Research (BRAMS), CRBLM, Montreal QC, H2V 2J2, Canada
| | - Anne Caclin
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France.
| |
Collapse
|
23
|
Xie X, Jaeger TF, Kurumada C. What we do (not) know about the mechanisms underlying adaptive speech perception: A computational framework and review. Cortex 2023; 166:377-424. [PMID: 37506665 DOI: 10.1016/j.cortex.2023.05.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2021] [Revised: 12/23/2022] [Accepted: 05/05/2023] [Indexed: 07/30/2023]
Abstract
Speech from unfamiliar talkers can be difficult to comprehend initially. These difficulties tend to dissipate with exposure, sometimes within minutes or less. Adaptivity in response to unfamiliar input is now considered a fundamental property of speech perception, and research over the past two decades has made substantial progress in identifying its characteristics. The mechanisms underlying adaptive speech perception, however, remain unknown. Past work has attributed facilitatory effects of exposure to any one of three qualitatively different hypothesized mechanisms: (1) low-level, pre-linguistic, signal normalization, (2) changes in/selection of linguistic representations, or (3) changes in post-perceptual decision-making. Direct comparisons of these hypotheses, or combinations thereof, have been lacking. We describe a general computational framework for adaptive speech perception (ASP) that-for the first time-implements all three mechanisms. We demonstrate how the framework can be used to derive predictions for experiments on perception from the acoustic properties of the stimuli. Using this approach, we find that-at the level of data analysis presently employed by most studies in the field-the signature results of influential experimental paradigms do not distinguish between the three mechanisms. This highlights the need for a change in research practices, so that future experiments provide more informative results. We recommend specific changes to experimental paradigms and data analysis. All data and code for this study are shared via OSF, including the R markdown document that this article is generated from, and an R library that implements the models we present.
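The point that qualitatively different mechanisms can leave the same behavioral signature can be seen in a toy Bayesian categorization sketch (not the ASP framework itself, whose implementation the authors share via OSF in R). Here a cue at the unadapted /b/-/p/ boundary shifts toward /b/ under (1) normalization of the signal, (2) re-learning of the /p/ category, or (3) a post-perceptual bias; the category means, variances, and bias value are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Toy cue dimension (e.g., a VOT-like value) with two Gaussian categories, /b/ and /p/.
mu_b, mu_p, sd = 10.0, 50.0, 12.0
true_shift = 15.0                                   # this talker's /p/ cues are shifted upward
exposure = rng.normal(mu_p + true_shift, sd, 500)   # exposure tokens, all intended as /p/
est_shift = exposure.mean() - mu_p                  # listener's estimate of the talker shift

def p_of_p(x, mu_b, mu_p, sd, bias=0.0):
    """Posterior probability of /p/, with an optional post-perceptual bias in log-odds."""
    log_odds = norm.logpdf(x, mu_p, sd) - norm.logpdf(x, mu_b, sd) + bias
    return 1.0 / (1.0 + np.exp(-log_odds))

test = (mu_b + mu_p) / 2    # an ambiguous token at the unadapted category boundary

print(f"no adaptation:            P(/p/) = {p_of_p(test, mu_b, mu_p, sd):.2f}")
print(f"(1) signal normalization: P(/p/) = {p_of_p(test - est_shift, mu_b, mu_p, sd):.2f}")
print(f"(2) category re-learning: P(/p/) = {p_of_p(test, mu_b, mu_p + est_shift, sd):.2f}")
print(f"(3) decision-level bias:  P(/p/) = {p_of_p(test, mu_b, mu_p, sd, bias=-2.5):.2f}")
```

All three adapted listeners categorize the same ambiguous token differently from the unadapted listener, and in the same direction, which is the kind of degeneracy the abstract argues standard analyses cannot resolve.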
Collapse
Affiliation(s)
- Xin Xie
- Language Science, University of California, Irvine, USA.
| | - T Florian Jaeger
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA; Computer Science, University of Rochester, Rochester, NY, USA
| | - Chigusa Kurumada
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
| |
Collapse
|
24
|
McPherson MJ, McDermott JH. Relative pitch representations and invariance to timbre. Cognition 2023; 232:105327. [PMID: 36495710 PMCID: PMC10016107 DOI: 10.1016/j.cognition.2022.105327] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2022] [Revised: 09/13/2022] [Accepted: 11/10/2022] [Indexed: 12/12/2022]
Abstract
Information in speech and music is often conveyed through changes in fundamental frequency (f0), perceived by humans as "relative pitch". Relative pitch judgments are complicated by two facts. First, sounds can simultaneously vary in timbre due to filtering imposed by a vocal tract or instrument body. Second, relative pitch can be extracted in two ways: by measuring changes in constituent frequency components from one sound to another, or by estimating the f0 of each sound and comparing the estimates. We examined the effects of timbral differences on relative pitch judgments, and whether any invariance to timbre depends on whether judgments are based on constituent frequencies or their f0. Listeners performed up/down and interval discrimination tasks with pairs of spoken vowels, instrument notes, or synthetic tones, synthesized to be either harmonic or inharmonic. Inharmonic sounds lack a well-defined f0, such that relative pitch must be extracted from changes in individual frequencies. Pitch judgments were less accurate when vowels/instruments were different compared to when they were the same, and were biased by the associated timbre differences. However, this bias was similar for harmonic and inharmonic sounds, and was observed even in conditions where judgments of harmonic sounds were based on f0 representations. Relative pitch judgments are thus not invariant to timbre, even when timbral variation is naturalistic, and when such judgments are based on representations of f0.
Collapse
Affiliation(s)
- Malinda J McPherson
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard University, Boston, MA 02115, United States of America; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States of America.
| | - Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard University, Boston, MA 02115, United States of America; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States of America; Center for Brains Minds and Machines, MIT, Cambridge, MA 02139, United States of America
| |
Collapse
|
25
|
Desai M, Field AM, Hamilton LS. Dataset size considerations for robust acoustic and phonetic speech encoding models in EEG. Front Hum Neurosci 2023; 16:1001171. [PMID: 36741776 PMCID: PMC9895838 DOI: 10.3389/fnhum.2022.1001171] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2022] [Accepted: 12/22/2022] [Indexed: 01/21/2023] Open
Abstract
In many experiments that investigate auditory and speech processing in the brain using electroencephalography (EEG), the experimental paradigm is often lengthy and tedious. Typically, the experimenter errs on the side of including more data, more trials, and therefore conducting a longer task to ensure that the data are robust and effects are measurable. Recent studies used naturalistic stimuli to investigate the brain's response to individual or a combination of multiple speech features using system identification techniques, such as multivariate temporal receptive field (mTRF) analyses. The neural data collected from such experiments must be divided into a training set and a test set to fit and validate the mTRF weights. While a good strategy is clearly to collect as much data as is feasible, it is unclear how much data are needed to achieve stable results. Furthermore, it is unclear whether the specific stimulus used for mTRF fitting and the choice of feature representation affects how much data would be required for robust and generalizable results. Here, we used previously collected EEG data from our lab using sentence stimuli and movie stimuli as well as EEG data from an open-source dataset using audiobook stimuli to better understand how much data needs to be collected for naturalistic speech experiments measuring acoustic and phonetic tuning. We found that the EEG receptive field structure tested here stabilizes after collecting a training dataset of approximately 200 s of TIMIT sentences, around 600 s of movie trailers training set data, and approximately 460 s of audiobook training set data. Thus, we provide suggestions on the minimum amount of data that would be necessary for fitting mTRFs from naturalistic listening data. Our findings are motivated by highly practical concerns when working with children, patient populations, or others who may not tolerate long study sessions. These findings will aid future researchers who wish to study naturalistic speech processing in healthy and clinical populations while minimizing participant fatigue and retaining signal quality.
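The mTRF training-set question can be made concrete with a small simulation: fit a time-lagged ridge model on increasing amounts of data and track held-out prediction accuracy until it plateaus. The sketch below uses a synthetic envelope-like feature and a synthetic EEG channel; the sampling rate, lag range, and regularization are illustrative choices, not the values used in the study.

```python
import numpy as np

rng = np.random.default_rng(2)
fs = 128                                  # EEG sampling rate in Hz (hypothetical)
lags = np.arange(0, int(0.4 * fs))        # 0-400 ms of time lags

def lagged(stim, lags):
    """Build a time-lagged design matrix (time x lags) from a 1-D stimulus feature."""
    X = np.zeros((len(stim), len(lags)))
    for j, lag in enumerate(lags):
        X[lag:, j] = stim[:len(stim) - lag]
    return X

# Synthetic data: an acoustic-envelope-like feature and an EEG channel generated
# from a known TRF plus noise, standing in for real recordings.
n = 10 * 60 * fs                          # 10 minutes of data
stim = np.abs(rng.standard_normal(n))
true_trf = np.exp(-lags / (0.05 * fs)) * np.sin(2 * np.pi * lags / (0.2 * fs))
eeg = lagged(stim, lags) @ true_trf + 2.0 * rng.standard_normal(n)

def fit_ridge(X, y, alpha=1e3):
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# Hold out the last 2 minutes for testing; vary the amount of training data.
test_start = n - 2 * 60 * fs
X_test, y_test = lagged(stim, lags)[test_start:], eeg[test_start:]
for minutes in (1, 2, 4, 8):
    n_train = minutes * 60 * fs
    w = fit_ridge(lagged(stim[:n_train], lags), eeg[:n_train])
    r = np.corrcoef(X_test @ w, y_test)[0, 1]
    print(f"{minutes:>2} min training: test r = {r:.3f}")
```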
Collapse
Affiliation(s)
- Maansi Desai
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States
| | - Alyssa M. Field
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States
| | - Liberty S. Hamilton
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States; Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, United States
| |
Collapse
|
26
|
Hullett PW, Kandahari N, Shih TT, Kleen JK, Knowlton RC, Rao VR, Chang EF. Intact speech perception after resection of dominant hemisphere primary auditory cortex for the treatment of medically refractory epilepsy: illustrative case. JOURNAL OF NEUROSURGERY. CASE LESSONS 2022; 4:CASE22417. [PMID: 36443954 PMCID: PMC9705521 DOI: 10.3171/case22417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/26/2022] [Accepted: 10/27/2022] [Indexed: 11/29/2022]
Abstract
BACKGROUND: In classic speech network models, the primary auditory cortex is the source of auditory input to Wernicke's area in the posterior superior temporal gyrus (pSTG). Because resection of the primary auditory cortex in the dominant hemisphere removes inputs to the pSTG, there is a risk of speech impairment. However, recent research has shown the existence of other, nonprimary auditory cortex inputs to the pSTG, potentially reducing the risk of primary auditory cortex resection in the dominant hemisphere. OBSERVATIONS: Here, the authors present a clinical case of a woman with severe medically refractory epilepsy with a lesional epileptic focus in the left (dominant) Heschl's gyrus. Analysis of neural responses to speech stimuli was consistent with primary auditory cortex localization to Heschl's gyrus. Although the primary auditory cortex was within the proposed resection margins, she underwent lesionectomy with total resection of Heschl's gyrus. Postoperatively, she had no speech deficits and her seizures were fully controlled. LESSONS: While resection of the dominant hemisphere Heschl's gyrus/primary auditory cortex warrants caution, this case illustrates the ability to resect the primary auditory cortex without speech impairment and supports recent models of multiple parallel inputs to the pSTG.
Collapse
Affiliation(s)
- Patrick W. Hullett
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Nazineen Kandahari
- Department of Neurosurgery, University of California San Francisco, San Francisco, California; and Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Tina T. Shih
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Jonathan K. Kleen
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Robert C. Knowlton
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Vikram R. Rao
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Edward F. Chang
- Department of Neurosurgery, University of California San Francisco, San Francisco, California
| |
Collapse
|
27
|
Bairnsfather JE, Osborne MS, Martin C, Mosing MA, Wilson SJ. Use of explicit priming to phenotype absolute pitch ability. PLoS One 2022; 17:e0273828. [PMID: 36103463 PMCID: PMC9473427 DOI: 10.1371/journal.pone.0273828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2021] [Accepted: 08/16/2022] [Indexed: 11/24/2022] Open
Abstract
Musicians with absolute pitch (AP) can name the pitch of a musical note in isolation. Expression of this unusual ability is thought to be influenced by heritability, early music training and current practice. However, our understanding of factors shaping its expression is hampered by testing and scoring methods that treat AP as dichotomous. These fail to capture the observed variability in pitch-naming accuracy among reported AP possessors. The aim of this study was to trial a novel explicit priming paradigm to explore phenotypic variability of AP. Thirty-five musically experienced individuals (mean age = 29 years, range 18–68; 14 males) with varying AP ability completed a standard AP task and the explicit priming AP task. Results showed: 1) phenotypic variability of AP ability, including high-accuracy AP, heterogeneous intermediate performers, and chance-level performers; 2) intermediate performance profiles that were either reliant on or independent of relative pitch strategies, as identified by the priming task; and 3) the emergence of a bimodal distribution of AP performance when adopting scoring criteria that assign credit to semitone errors. These findings show the importance of methods in studying behavioural traits, and are a key step towards identifying AP phenotypes. Replication of our results in larger samples will further establish the usefulness of this priming paradigm in AP research.
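A scoring rule that assigns partial credit to semitone errors, as mentioned above, can be written compactly. The 0.5 credit and the pitch-class representation below are illustrative choices, not necessarily the study's exact criteria.

```python
def ap_score(responses, targets, semitone_credit=0.5):
    """Score pitch naming: full credit for exact matches and partial credit
    (here 0.5, an illustrative choice) for errors of one semitone."""
    total = 0.0
    for resp, target in zip(responses, targets):
        diff = abs(resp - target) % 12          # distance in semitones within the octave
        diff = min(diff, 12 - diff)
        if diff == 0:
            total += 1.0
        elif diff == 1:
            total += semitone_credit
    return total / len(targets)

# Example: responses and targets as pitch classes (C=0, C#=1, ..., B=11).
print(ap_score(responses=[0, 2, 5, 8, 11], targets=[0, 3, 5, 9, 0]))  # -> 0.7
```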
Collapse
Affiliation(s)
- Jane E. Bairnsfather
- Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, Victoria, Australia
| | - Margaret S. Osborne
- Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, Victoria, Australia
- Melbourne Conservatorium of Music, The University of Melbourne, Melbourne, Victoria, Australia
| | - Catherine Martin
- Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, Victoria, Australia
| | - Miriam A. Mosing
- Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, Victoria, Australia
- Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Behaviour Genetics Unit, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
| | - Sarah J. Wilson
- Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, Victoria, Australia
| |
Collapse
|
28
|
Brodbeck C, Simon JZ. Cortical tracking of voice pitch in the presence of multiple speakers depends on selective attention. Front Neurosci 2022; 16:828546. [PMID: 36003957 PMCID: PMC9393379 DOI: 10.3389/fnins.2022.828546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2021] [Accepted: 07/08/2022] [Indexed: 11/13/2022] Open
Abstract
Voice pitch carries linguistic and non-linguistic information. Previous studies have described cortical tracking of voice pitch in clean speech, with responses reflecting both pitch strength and pitch value. However, pitch is also a powerful cue for auditory stream segregation, especially when competing streams have pitch differing in fundamental frequency, as is the case when multiple speakers talk simultaneously. We therefore investigated how cortical speech pitch tracking is affected in the presence of a second, task-irrelevant speaker. We analyzed human magnetoencephalography (MEG) responses to continuous narrative speech, presented either as a single talker in a quiet background or as a two-talker mixture of a male and a female speaker. In clean speech, voice pitch was associated with a right-dominant response, peaking at a latency of around 100 ms, consistent with previous electroencephalography and electrocorticography results. The response tracked both the presence of pitch and the relative value of the speaker's fundamental frequency. In the two-talker mixture, the pitch of the attended speaker was tracked bilaterally, regardless of whether or not there was simultaneously present pitch in the speech of the irrelevant speaker. Pitch tracking for the irrelevant speaker was reduced: only the right hemisphere still significantly tracked pitch of the unattended speaker, and only during intervals in which no pitch was present in the attended talker's speech. Taken together, these results suggest that pitch-based segregation of multiple speakers, at least as measured by macroscopic cortical tracking, is not entirely automatic but strongly dependent on selective attention.
Collapse
Affiliation(s)
- Christian Brodbeck
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, United States
- Institute for Systems Research, University of Maryland, College Park, College Park, MD, United States
| | - Jonathan Z. Simon
- Institute for Systems Research, University of Maryland, College Park, College Park, MD, United States
- Department of Electrical and Computer Engineering, University of Maryland, College Park, College Park, MD, United States
- Department of Biology, University of Maryland, College Park, College Park, MD, United States
| |
Collapse
|
29
|
Zhou D, Zhang G, Dang J, Unoki M, Liu X. Detection of Brain Network Communities During Natural Speech Comprehension From Functionally Aligned EEG Sources. Front Comput Neurosci 2022; 16:919215. [PMID: 35874316 PMCID: PMC9301328 DOI: 10.3389/fncom.2022.919215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2022] [Accepted: 06/14/2022] [Indexed: 11/30/2022] Open
Abstract
In recent years, electroencephalography (EEG) studies on speech comprehension have been extended from a controlled paradigm to a natural paradigm. Under the hypothesis that the brain can be approximated as a linear time-invariant system, the neural response to natural speech has been investigated extensively using temporal response functions (TRFs). However, most studies have modeled TRFs in the electrode space, which is a mixture of brain sources and thus cannot fully reveal the functional mechanism underlying speech comprehension. In this paper, we propose methods for investigating the brain networks of natural speech comprehension using TRFs on the basis of EEG source reconstruction. We first propose a functional hyper-alignment method with an additive average method to reduce EEG noise. Then, we reconstruct neural sources within the brain based on the EEG signals to estimate TRFs from speech stimuli to source areas, and then investigate the brain networks in the neural source space on the basis of the community detection method. To evaluate TRF-based brain networks, EEG data were recorded in story listening tasks with normal speech and time-reversed speech. To obtain reliable structures of brain networks, we detected TRF-based communities from multiple scales. As a result, the proposed functional hyper-alignment method could effectively reduce the noise caused by individual settings in an EEG experiment and thus improve the accuracy of source reconstruction. The detected brain networks for normal speech comprehension were clearly distinctive from those for non-semantically driven (time-reversed speech) audio processing. Our result indicates that the proposed source TRFs can reflect the cognitive processing of spoken language and that the multi-scale community detection method is powerful for investigating brain networks.
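The final step described here, community detection on TRF-based connectivity, can be sketched with stand-in source TRFs as follows. The example uses greedy modularity maximization from networkx as a generic substitute for the paper's multi-scale community detection method; the TRFs, threshold, and graph construction are illustrative.

```python
import numpy as np
import networkx as nx
from networkx.algorithms import community

rng = np.random.default_rng(3)

# Stand-in source-space TRFs: 20 sources x 64 time points, built from two latent
# response shapes so that two communities exist by construction.
t = np.linspace(0, 0.5, 64)
shapes = [np.sin(2 * np.pi * 4 * t), np.exp(-t / 0.1)]
trfs = np.vstack([shapes[i // 10] + 0.5 * rng.standard_normal(64) for i in range(20)])

# Functional similarity between sources = correlation of their TRFs,
# thresholded into an undirected weighted graph.
corr = np.corrcoef(trfs)
threshold = 0.3
G = nx.Graph()
G.add_nodes_from(range(20))
for i in range(20):
    for j in range(i + 1, 20):
        if corr[i, j] > threshold:
            G.add_edge(i, j, weight=corr[i, j])

# Detect communities by greedy modularity maximization.
comms = community.greedy_modularity_communities(G, weight="weight")
for k, c in enumerate(comms):
    print(f"community {k}: sources {sorted(c)}")
```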
Collapse
Affiliation(s)
- Di Zhou
- School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
| | - Gaoyan Zhang
- College of Intelligence and Computing, Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin, China
| | - Jianwu Dang
- School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
- College of Intelligence and Computing, Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin, China
| | - Masashi Unoki
- School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
| | - Xin Liu
- School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
| |
Collapse
|
30
|
Aberrant Beta-band Brain Connectivity Predicts Speech Motor Planning Deficits in Post-Stroke Aphasia. Cortex 2022; 155:75-89. [DOI: 10.1016/j.cortex.2022.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2022] [Revised: 05/24/2022] [Accepted: 07/06/2022] [Indexed: 11/22/2022]
|
31
|
Neural correlates of impaired vocal feedback control in post-stroke aphasia. Neuroimage 2022; 250:118938. [PMID: 35092839 PMCID: PMC8920755 DOI: 10.1016/j.neuroimage.2022.118938] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2021] [Revised: 12/31/2021] [Accepted: 01/25/2022] [Indexed: 01/16/2023] Open
Abstract
We used left-hemisphere stroke as a model to examine how damage to sensorimotor brain networks impairs vocal auditory feedback processing and control. Individuals with post-stroke aphasia and matched neurotypical control subjects vocalized speech vowel sounds and listened to the playback of their self-produced vocalizations under normal (NAF) and pitch-shifted altered auditory feedback (AAF) while their brain activity was recorded using electroencephalography (EEG) signals. Event-related potentials (ERPs) were utilized as a neural index to probe the effect of vocal production on auditory feedback processing with high temporal resolution, while lesion data in the stroke group was used to determine how brain abnormality accounted for the impairment of such mechanisms. Results revealed that ERP activity was aberrantly modulated during vocalization vs. listening in aphasia, and this effect was accompanied by the reduced magnitude of compensatory vocal responses to pitch-shift alterations in the auditory feedback compared with control subjects. Lesion-mapping revealed that the aberrant pattern of ERP modulation in response to NAF was accounted for by damage to sensorimotor networks within the left-hemisphere inferior frontal, precentral, inferior parietal, and superior temporal cortices. For responses to AAF, neural deficits were predicted by damage to a distinguishable network within the inferior frontal and parietal cortices. These findings define the left-hemisphere sensorimotor networks implicated in auditory feedback processing, error detection, and vocal motor control. Our results provide translational synergy to inform the theoretical models of sensorimotor integration while having clinical applications for diagnosis and treatment of communication disabilities in individuals with stroke and other neurological conditions.
Collapse
|
32
|
Abstract
Does the brain perceive song as speech with melody? A new study using intracranial recordings and functional brain imaging in humans suggests that it does not. Instead, singing, instrumental music, and speech are represented by different neural populations.
Collapse
Affiliation(s)
- Liberty S Hamilton
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX 78712, USA; Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA.
| |
Collapse
|
33
|
Ye H, Fan Z, Li G, Wu Z, Hu J, Sheng X, Chen L, Zhu X. Spontaneous State Detection Using Time-Frequency and Time-Domain Features Extracted From Stereo-Electroencephalography Traces. Front Neurosci 2022; 16:818214. [PMID: 35368269 PMCID: PMC8968069 DOI: 10.3389/fnins.2022.818214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2021] [Accepted: 02/15/2022] [Indexed: 11/23/2022] Open
Abstract
As a minimally invasive recording technique, stereo-electroencephalography (SEEG) measures intracranial signals directly by inserting depth electrode shafts into the human brain, and thus can capture neural activities in both cortical layers and subcortical structures. Despite a gradually increasing number of SEEG-based brain-computer interface (BCI) studies, the features utilized were usually confined to the amplitude of the event-related potential (ERP) or band power, and the decoding capabilities of other time-frequency and time-domain features have not been demonstrated for SEEG recordings yet. In this study, we aimed to verify the validity of time-domain and time-frequency features of SEEG, where classification performances served as evaluating indicators. To do this, using SEEG signals under intermittent auditory stimuli, we extracted features including the average amplitude, root mean square, slope of linear regression, and line-length from the ERP trace and three traces of band power activities (high-gamma, beta, and alpha). These features were used to detect the active state (including activations to two types of names) against the idle state. Results suggested that valid time-domain and time-frequency features were distributed across multiple regions, including the temporal lobe, parietal lobe, and deeper structures such as the insula. Among all feature types, the average amplitude, root mean square, and line-length extracted from high-gamma (60–140 Hz) power and the line-length extracted from ERP were the most informative. Using a hidden Markov model (HMM), we could precisely detect the onset and the end of the active state with a sensitivity of 95.7 ± 1.3% and a precision of 91.7 ± 1.6%. The valid features derived from high-gamma power and ERP in this work provided new insights into the feature selection procedure for further SEEG-based BCI applications.
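The four time-domain features named above have simple closed forms; the sketch below implements them for a single feature trace. The synthetic high-gamma power trace and the sampling rate are placeholders, and windowing, normalization, and the downstream HMM classifier are omitted.

```python
import numpy as np

def average_amplitude(x):
    return np.mean(np.abs(x))

def root_mean_square(x):
    return np.sqrt(np.mean(x ** 2))

def regression_slope(x, fs):
    """Slope of a straight line fit to the trace over time (units per second)."""
    t = np.arange(len(x)) / fs
    return np.polyfit(t, x, 1)[0]

def line_length(x):
    """Sum of absolute sample-to-sample differences."""
    return np.sum(np.abs(np.diff(x)))

# Example on a synthetic 1-s trace standing in for a high-gamma power estimate.
fs = 1000
t = np.arange(fs) / fs
rng = np.random.default_rng(4)
trace = 1.0 + 0.5 * t + 0.2 * rng.standard_normal(fs)   # drifting, noisy power estimate

features = {
    "average amplitude": average_amplitude(trace),
    "RMS": root_mean_square(trace),
    "slope (1/s)": regression_slope(trace, fs),
    "line-length": line_length(trace),
}
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```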
Collapse
Affiliation(s)
- Huanpeng Ye
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Zhen Fan
- Department of Neurosurgery of Huashan Hospital, Fudan University, Shanghai, China
| | - Guangye Li
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Zehan Wu
- Department of Neurosurgery of Huashan Hospital, Fudan University, Shanghai, China
| | - Jie Hu
- Department of Neurosurgery of Huashan Hospital, Fudan University, Shanghai, China
| | - Xinjun Sheng
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Liang Chen
- Department of Neurosurgery of Huashan Hospital, Fudan University, Shanghai, China
| | - Xiangyang Zhu
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
| |
Collapse
|
34
|
Giampiccolo D, Duffau H. Controversy over the temporal cortical terminations of the left arcuate fasciculus: a reappraisal. Brain 2022; 145:1242-1256. [PMID: 35142842 DOI: 10.1093/brain/awac057] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2021] [Revised: 12/19/2021] [Accepted: 01/20/2022] [Indexed: 11/12/2022] Open
Abstract
The arcuate fasciculus has been considered a major dorsal fronto-temporal white matter pathway linking frontal language production regions with auditory perception in the superior temporal gyrus, the so-called Wernicke's area. In line with this tradition, both historical and contemporary models of language function have assigned primacy to superior temporal projections of the arcuate fasciculus. However, classical anatomical descriptions and emerging behavioural data are at odds with this assumption. On one hand, fronto-temporal projections to Wernicke's area may not be unique to the arcuate fasciculus. On the other hand, dorsal stream language deficits have been reported also for damage to middle, inferior and basal temporal gyri which may be linked to arcuate disconnection. These findings point to a reappraisal of arcuate projections in the temporal lobe. Here, we review anatomical and functional evidence regarding the temporal cortical terminations of the left arcuate fasciculus by incorporating dissection and tractography findings with stimulation data using cortico-cortical evoked potentials and direct electrical stimulation mapping in awake patients. Firstly, we discuss the fibers of the arcuate fasciculus projecting to the superior temporal gyrus and the functional rostro-caudal gradient in this region where both phonological encoding and auditory-motor transformation may be performed. Caudal regions within the temporoparietal junction may be involved in articulation and associated with temporoparietal projections of the third branch of the superior longitudinal fasciculus, while more rostral regions may support encoding of acoustic phonetic features, supported by arcuate fibres. We then move to examine clinical data showing that multimodal phonological encoding is facilitated by projections of the arcuate fasciculus to superior, but also middle, inferior and basal temporal regions. Hence, we discuss how projections of the arcuate fasciculus may contribute to acoustic (middle-posterior superior and middle temporal gyri), visual (posterior inferior temporal/fusiform gyri comprising the visual word form area) and lexical (anterior-middle inferior temporal/fusiform gyri in the basal temporal language area) information in the temporal lobe to be processed, encoded and translated into a dorsal phonological route to the frontal lobe. Finally, we point out surgical implications for this model in terms of the prediction and avoidance of neurological deficit.
Collapse
Affiliation(s)
- Davide Giampiccolo
- Section of Neurosurgery, Department of Neurosciences, Biomedicine and Movement Sciences, University Hospital, Verona, Italy; Institute of Neuroscience, Cleveland Clinic London, Grosvenor Place, London, UK; Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, University College London, London, UK; Victor Horsley Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, Queen Square, London, UK
| | - Hugues Duffau
- Department of Neurosurgery, Gui de Chauliac Hospital, Montpellier University Medical Center, Montpellier, France; Team "Neuroplasticity, Stem Cells and Low-grade Gliomas," INSERM U1191, Institute of Genomics of Montpellier, University of Montpellier, Montpellier, France
| |
Collapse
|
35
|
Teoh ES, Ahmed F, Lalor EC. Attention Differentially Affects Acoustic and Phonetic Feature Encoding in a Multispeaker Environment. J Neurosci 2022; 42:682-691. [PMID: 34893546 PMCID: PMC8805628 DOI: 10.1523/jneurosci.1455-20.2021] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2020] [Revised: 09/28/2021] [Accepted: 09/29/2021] [Indexed: 11/21/2022] Open
Abstract
Humans have the remarkable ability to selectively focus on a single talker in the midst of other competing talkers. The neural mechanisms that underlie this phenomenon remain incompletely understood. In particular, there has been longstanding debate over whether attention operates at an early or late stage in the speech processing hierarchy. One way to better understand this is to examine how attention might differentially affect neurophysiological indices of hierarchical acoustic and linguistic speech representations. In this study, we do this by using encoding models to identify neural correlates of speech processing at various levels of representation. Specifically, we recorded EEG from fourteen human subjects (nine female and five male) during a "cocktail party" attention experiment. Model comparisons based on these data revealed phonetic feature processing for attended, but not unattended speech. Furthermore, we show that attention specifically enhances isolated indices of phonetic feature processing, but that such attention effects are not apparent for isolated measures of acoustic processing. These results provide new insights into the effects of attention on different prelexical representations of speech, insights that complement recent anatomic accounts of the hierarchical encoding of attended speech. Furthermore, our findings support the notion that, for attended speech, phonetic features are processed as a distinct stage, separate from the processing of the speech acoustics. SIGNIFICANCE STATEMENT: Humans are very good at paying attention to one speaker in an environment with multiple speakers. However, the details of how attended and unattended speech are processed differently by the brain are not completely clear. Here, we explore how attention affects the processing of the acoustic sounds of speech as well as the mapping of those sounds onto categorical phonetic features. We find evidence of categorical phonetic feature processing for attended, but not unattended speech. Furthermore, we find evidence that categorical phonetic feature processing is enhanced by attention, but acoustic processing is not. These findings add an important new layer in our understanding of how the human brain solves the cocktail party problem.
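The model-comparison logic, testing whether phonetic features explain variance beyond acoustics, can be sketched as a nested ridge comparison on synthetic data. Feature counts, regularization, and the data-generating assumptions below are illustrative; real analyses would use time-lagged (TRF-style) predictors and proper noise ceilings.

```python
import numpy as np

rng = np.random.default_rng(5)
n, n_acoustic, n_phonetic = 5000, 8, 19

# Stand-in predictors: acoustic features (e.g., spectrogram bands) and sparse
# binary phonetic-feature regressors, plus an EEG channel that carries both.
acoustic = rng.standard_normal((n, n_acoustic))
phonetic = (rng.random((n, n_phonetic)) < 0.1).astype(float)
eeg = (acoustic @ rng.standard_normal(n_acoustic)
       + phonetic @ rng.standard_normal(n_phonetic)
       + 3.0 * rng.standard_normal(n))

def cv_correlation(X, y, alpha=10.0, n_folds=5):
    """Mean cross-validated prediction correlation for a ridge encoding model."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    rs = []
    for f in folds:
        train = np.setdiff1d(np.arange(len(y)), f)
        w = np.linalg.solve(X[train].T @ X[train] + alpha * np.eye(X.shape[1]),
                            X[train].T @ y[train])
        rs.append(np.corrcoef(X[f] @ w, y[f])[0, 1])
    return float(np.mean(rs))

r_acoustic = cv_correlation(acoustic, eeg)
r_combined = cv_correlation(np.hstack([acoustic, phonetic]), eeg)
print(f"acoustic-only model:       r = {r_acoustic:.3f}")
print(f"acoustic + phonetic model: r = {r_combined:.3f}")
print(f"gain attributable to phonetic features: {r_combined - r_acoustic:.3f}")
```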
Collapse
Affiliation(s)
- Emily S Teoh
- School of Engineering, Trinity Centre for Biomedical Engineering, and Trinity College Institute of Neuroscience, Trinity College, University of Dublin, Dublin 2, Ireland
| | - Farhin Ahmed
- Department of Neuroscience, Department of Biomedical Engineering, and Del Monte Neuroscience Institute, University of Rochester, Rochester, New York 14627
| | - Edmund C Lalor
- School of Engineering, Trinity Centre for Biomedical Engineering, and Trinity College Institute of Neuroscience, Trinity College, University of Dublin, Dublin 2, Ireland
- Department of Neuroscience, Department of Biomedical Engineering, and Del Monte Neuroscience Institute, University of Rochester, Rochester, New York 14627
| |
Collapse
|
36
|
Bachmann FL, MacDonald EN, Hjortkjær J. Neural Measures of Pitch Processing in EEG Responses to Running Speech. Front Neurosci 2022; 15:738408. [PMID: 35002597 PMCID: PMC8729880 DOI: 10.3389/fnins.2021.738408] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2021] [Accepted: 11/01/2021] [Indexed: 11/13/2022] Open
Abstract
Linearized encoding models are increasingly employed to model cortical responses to running speech. Recent extensions to subcortical responses suggest clinical perspectives, potentially complementing auditory brainstem responses (ABRs) or frequency-following responses (FFRs) that are current clinical standards. However, while it is well-known that the auditory brainstem responds both to transient amplitude variations and the stimulus periodicity that gives rise to pitch, these features co-vary in running speech. Here, we discuss challenges in disentangling the features that drive the subcortical response to running speech. Cortical and subcortical electroencephalographic (EEG) responses to running speech from 19 normal-hearing listeners (12 female) were analyzed. Using forward regression models, we confirm that responses to the rectified broadband speech signal yield temporal response functions consistent with wave V of the ABR, as shown in previous work. Peak latency and amplitude of the speech-evoked brainstem response were correlated with standard click-evoked ABRs recorded at the vertex electrode (Cz). Similar responses could be obtained using the fundamental frequency (F0) of the speech signal as model predictor. However, simulations indicated that dissociating responses to temporal fine structure at the F0 from broadband amplitude variations is not possible given the high co-variance of the features and the poor signal-to-noise ratio (SNR) of subcortical EEG responses. In cortex, both simulations and data replicated previous findings indicating that envelope tracking on frontal electrodes can be dissociated from responses to slow variations in F0 (relative pitch). Yet, no association between subcortical F0-tracking and cortical responses to relative pitch could be detected. These results indicate that while subcortical speech responses are comparable to click-evoked ABRs, dissociating pitch-related processing in the auditory brainstem may be challenging with natural speech stimuli.
Collapse
Affiliation(s)
- Florine L Bachmann
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
| | - Ewen N MacDonald
- Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada
| | - Jens Hjortkjær
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital - Amager and Hvidovre, Copenhagen, Denmark
| |
Collapse
|
37
|
Abstract
Human speech perception results from neural computations that transform external acoustic speech signals into internal representations of words. The superior temporal gyrus (STG) contains the nonprimary auditory cortex and is a critical locus for phonological processing. Here, we describe how speech sound representation in the STG relies on fundamentally nonlinear and dynamical processes, such as categorization, normalization, contextual restoration, and the extraction of temporal structure. A spatial mosaic of local cortical sites on the STG exhibits complex auditory encoding for distinct acoustic-phonetic and prosodic features. We propose that as a population ensemble, these distributed patterns of neural activity give rise to abstract, higher-order phonemic and syllabic representations that support speech perception. This review presents a multi-scale, recurrent model of phonological processing in the STG, highlighting the critical interface between auditory and language systems.
Collapse
Affiliation(s)
- Ilina Bhaya-Grossman
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA;
- Joint Graduate Program in Bioengineering, University of California, Berkeley and San Francisco, California 94720, USA
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA;
| |
Collapse
|
38
|
Pruvost-Robieux E, André-Obadia N, Marchi A, Sharshar T, Liuni M, Gavaret M, Aucouturier JJ. It’s not what you say, it’s how you say it: a retrospective study of the impact of prosody on own-name P300 in comatose patients. Clin Neurophysiol 2022; 135:154-161. [DOI: 10.1016/j.clinph.2021.12.015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2021] [Revised: 12/16/2021] [Accepted: 12/18/2021] [Indexed: 02/05/2023]
|
39
|
Tomasello R, Grisoni L, Boux I, Sammler D, Pulvermüller F. OUP accepted manuscript. Cereb Cortex 2022; 32:4885-4901. [PMID: 35136980 PMCID: PMC9626830 DOI: 10.1093/cercor/bhab522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2021] [Revised: 11/16/2021] [Accepted: 12/17/2021] [Indexed: 11/20/2022] Open
Abstract
During conversations, speech prosody provides important clues about the speaker’s communicative intentions. In many languages, a rising vocal pitch at the end of a sentence typically expresses a question function, whereas a falling pitch suggests a statement. Here, the neurophysiological basis of intonation and speech act understanding was investigated with high-density electroencephalography (EEG) to determine whether prosodic features are reflected at the neurophysiological level. Already approximately 100 ms after the sentence-final word differing in prosody, questions and statements expressed with the same sentences led to different neurophysiological activity recorded in the event-related potential. Interestingly, low-pass filtered sentences and acoustically matched nonvocal musical signals failed to show any neurophysiological dissociations, thus suggesting that the physical intonation alone cannot explain this modulation. Our results show rapid neurophysiological indexes of prosodic communicative information processing that emerge only when pragmatic and lexico-semantic information are fully expressed. The early enhancement of question-related activity compared with statements was due to sources in the articulatory-motor region, which may reflect the richer action knowledge immanent to questions, namely the expectation of the partner action of answering the question. The present findings demonstrate a neurophysiological correlate of prosodic communicative information processing, which enables humans to rapidly detect and understand speaker intentions in linguistic interactions.
Collapse
Affiliation(s)
- Rosario Tomasello
- Brain Language Laboratory, Department of Philosophy and Humanities, WE4, Freie Universität Berlin, Habelschwerdter Allee 45, 14195 Berlin, Germany
| | - Luigi Grisoni
- Brain Language Laboratory, Department of Philosophy and Humanities, Freie Universität Berlin, 14195 Berlin, Germany
- Cluster of Excellence ‘Matters of Activity. Image Space Material’, Humboldt Universität zu Berlin, 10099 Berlin, Germany
| | - Isabella Boux
- Brain Language Laboratory, Department of Philosophy and Humanities, Freie Universität Berlin, 14195 Berlin, Germany
- Berlin School of Mind and Brain, Humboldt Universität zu Berlin, 10117 Berlin, Germany
- Einstein Center for Neurosciences, 10117 Berlin, Germany
| | - Daniela Sammler
- Research Group ‘Neurocognition of Music and Language’, Max Planck Institute for Empirical Aesthetics, 60322 Frankfurt am Main, Germany
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, 04103 Leipzig, Germany
| | - Friedemann Pulvermüller
- Brain Language Laboratory, Department of Philosophy and Humanities, Freie Universität Berlin, 14195 Berlin, Germany
- Cluster of Excellence ‘Matters of Activity. Image Space Material’, Humboldt Universität zu Berlin, 10099 Berlin, Germany
- Berlin School of Mind and Brain, Humboldt Universität zu Berlin, 10117 Berlin, Germany
- Einstein Center for Neurosciences, 10117 Berlin, Germany
| |
Collapse
|
40
|
Saddler MR, Gonzalez R, McDermott JH. Deep neural network models reveal interplay of peripheral coding and stimulus statistics in pitch perception. Nat Commun 2021; 12:7278. [PMID: 34907158 PMCID: PMC8671597 DOI: 10.1038/s41467-021-27366-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2020] [Accepted: 11/12/2021] [Indexed: 11/15/2022] Open
Abstract
Perception is thought to be shaped by the environments for which organisms are optimized. These influences are difficult to test in biological organisms but may be revealed by machine perceptual systems optimized under different conditions. We investigated environmental and physiological influences on pitch perception, whose properties are commonly linked to peripheral neural coding limits. We first trained artificial neural networks to estimate fundamental frequency from biologically faithful cochlear representations of natural sounds. The best-performing networks replicated many characteristics of human pitch judgments. To probe the origins of these characteristics, we then optimized networks given altered cochleae or sound statistics. Human-like behavior emerged only when cochleae had high temporal fidelity and when models were optimized for naturalistic sounds. The results suggest pitch perception is critically shaped by the constraints of natural environments in addition to those of the cochlea, illustrating the use of artificial neural networks to reveal underpinnings of behavior.
Collapse
Affiliation(s)
- Mark R Saddler
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA.
| | - Ray Gonzalez
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA
| | - Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA.
- Program in Speech and Hearing Biosciences and Technology, Harvard University, Cambridge, MA, USA.
| |
Collapse
|
41
|
Generalizable EEG Encoding Models with Naturalistic Audiovisual Stimuli. J Neurosci 2021; 41:8946-8962. [PMID: 34503996 DOI: 10.1523/jneurosci.2891-20.2021] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2020] [Revised: 08/24/2021] [Accepted: 08/29/2021] [Indexed: 11/21/2022] Open
Abstract
In natural conversations, listeners must attend to what others are saying while ignoring extraneous background sounds. Recent studies have used encoding models to predict electroencephalography (EEG) responses to speech in noise-free listening situations, sometimes referred to as "speech tracking." Researchers have analyzed how speech tracking changes with different types of background noise. It is unclear, however, whether neural responses from acoustically rich, naturalistic environments with and without background noise can be generalized to more controlled stimuli. If encoding models for acoustically rich, naturalistic stimuli are generalizable to other tasks, this could aid in data collection from populations of individuals who may not tolerate listening to more controlled and less engaging stimuli for long periods of time. We recorded noninvasive scalp EEG while 17 human participants (8 male/9 female) listened to speech without noise and audiovisual speech stimuli containing overlapping speakers and background sounds. We fit multivariate temporal receptive field encoding models to predict EEG responses to pitch, the acoustic envelope, phonological features, and visual cues in both stimulus conditions. Our results suggested that neural responses to naturalistic stimuli were generalizable to more controlled datasets. EEG responses to speech in isolation were predicted accurately using phonological features alone, while responses to speech in a rich acoustic background were more accurate when including both phonological and acoustic features. Our findings suggest that naturalistic audiovisual stimuli can be used to measure receptive fields that are comparable and generalizable to more controlled audio-only stimuli. SIGNIFICANCE STATEMENT: Understanding spoken language in natural environments requires listeners to parse acoustic and linguistic information in the presence of other distracting stimuli. However, most studies of auditory processing rely on highly controlled stimuli with no background noise, or with background noise inserted at specific times. Here, we compare models where EEG data are predicted based on a combination of acoustic, phonetic, and visual features in highly disparate stimuli: sentences from a speech corpus and speech embedded within movie trailers. We show that modeling neural responses to highly noisy, audiovisual movies can uncover tuning for acoustic and phonetic information that generalizes to simpler stimuli typically used in sensory neuroscience experiments.
Collapse
|
42
|
Kurumada C, Roettger TB. Thinking probabilistically in the study of intonational speech prosody. WILEY INTERDISCIPLINARY REVIEWS. COGNITIVE SCIENCE 2021; 13:e1579. [PMID: 34599647 DOI: 10.1002/wcs.1579] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/31/2021] [Revised: 08/09/2021] [Accepted: 08/26/2021] [Indexed: 11/07/2022]
Abstract
Speech prosody, the melodic and rhythmic properties of a language, plays a critical role in our everyday communication. Researchers have identified unique patterns of prosody that segment words and phrases, highlight focal elements in a sentence, and convey holistic meanings and speech acts that interact with the information shared in context. The mapping between the sound and meaning represented in prosody is suggested to be probabilistic: the same physical instance of sounds can support multiple meanings across talkers and contexts, while the same meaning can be encoded in physically distinct sound patterns (e.g., pitch movements). The current overview presents an analysis framework for probing the nature of this probabilistic relationship. Illustrated by examples from the literature and a dataset of German focus marking, we discuss production variability within and across talkers and consider the challenges that this variability imposes on the comprehension system. A better understanding of these challenges, we argue, will illuminate how human perceptual, cognitive, and computational mechanisms may navigate the variability to arrive at a coherent understanding of speech prosody. The current paper is intended as an introduction for those interested in thinking probabilistically about the sound-meaning mapping in prosody. Open questions for future research are discussed, with proposals for examining prosodic production and comprehension within a comprehensive, mathematically motivated framework of probabilistic inference under uncertainty. This article is categorized under: Linguistics > Language in Mind and Brain; Psychology > Language.
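As a toy illustration of the probabilistic sound-meaning mapping discussed in this framework, the sketch below applies Bayes' rule to infer a speaker's intended meaning from a single observed pitch cue. The two meaning categories, the Gaussian production distributions, and the prior are invented for illustration and are not taken from the paper's German focus-marking data.

```python
import numpy as np
from scipy.stats import norm

# Illustrative (made-up) production distributions: size of the F0 rise (semitones)
# produced under two intended meanings; the distributions overlap across talkers.
likelihoods = {
    "contrastive_focus": norm(loc=6.0, scale=2.0),
    "broad_focus":       norm(loc=3.0, scale=2.0),
}
prior = {"contrastive_focus": 0.3, "broad_focus": 0.7}

def posterior(observed_rise_st):
    """P(meaning | acoustic cue) for a single observed pitch rise."""
    joint = {m: prior[m] * likelihoods[m].pdf(observed_rise_st) for m in prior}
    z = sum(joint.values())
    return {m: p / z for m, p in joint.items()}

print(posterior(5.0))  # the same token supports both meanings to different degrees
```

Because the production distributions overlap, a single acoustic token yields only graded support for each meaning, which is exactly the variability the comprehension system must navigate.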
Collapse
Affiliation(s)
- Chigusa Kurumada
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA
| | - Timo B Roettger
- Department of Linguistics & Scandinavian Studies, Universitetet i Oslo, Oslo, Norway
| |
Collapse
|
43
|
Learning nonnative speech sounds changes local encoding in the adult human cortex. Proc Natl Acad Sci U S A 2021; 118:2101777118. [PMID: 34475209 DOI: 10.1073/pnas.2101777118] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2021] [Accepted: 07/12/2021] [Indexed: 11/18/2022] Open
Abstract
Adults can learn to identify nonnative speech sounds with training, albeit with substantial variability in learning behavior. Increases in behavioral accuracy are associated with increased separability for sound representations in cortical speech areas. However, it remains unclear whether individual auditory neural populations all show the same types of changes with learning, or whether there are heterogeneous encoding patterns. Here, we used high-resolution direct neural recordings to examine local population response patterns, while native English listeners learned to recognize unfamiliar vocal pitch patterns in Mandarin Chinese tones. We found a distributed set of neural populations in bilateral superior temporal gyrus and ventrolateral frontal cortex, where the encoding of Mandarin tones changed throughout training as a function of trial-by-trial accuracy ("learning effect"), including both increases and decreases in the separability of tones. These populations were distinct from populations that showed changes as a function of exposure to the stimuli regardless of trial-by-trial accuracy. These learning effects were driven in part by more variable neural responses to repeated presentations of acoustically identical stimuli. Finally, learning effects could be predicted from speech-evoked activity even before training, suggesting that intrinsic properties of these populations make them amenable to behavior-related changes. Together, these results demonstrate that nonnative speech sound learning involves a wide array of changes in neural representations across a distributed set of brain regions.
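One way to quantify the "separability" of tone representations described above, and to track how it changes over training, is cross-validated decoding of tone category from a neural population's responses within each training block. The sketch below assumes trial-wise response features, tone labels, and block indices as inputs; the classifier and cross-validation scheme are illustrative choices, not the paper's analysis.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def tone_separability(responses, tone_labels, block_ids):
    """Cross-validated tone-decoding accuracy per training block.

    responses   : (trials x features), e.g. high-gamma amplitude in a time window
    tone_labels : (trials,) Mandarin tone category (1-4) on each trial
    block_ids   : (trials,) training block index, to track change over learning
    """
    separability = {}
    for block in np.unique(block_ids):
        idx = block_ids == block
        acc = cross_val_score(LinearDiscriminantAnalysis(),
                              responses[idx], tone_labels[idx], cv=5)
        separability[int(block)] = acc.mean()
    return separability  # may increase, decrease, or stay flat with learning
```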
Collapse
|
44
|
Hamilton LS, Oganian Y, Hall J, Chang EF. Parallel and distributed encoding of speech across human auditory cortex. Cell 2021; 184:4626-4639.e13. [PMID: 34411517 PMCID: PMC8456481 DOI: 10.1016/j.cell.2021.07.019] [Citation(s) in RCA: 111] [Impact Index Per Article: 27.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2020] [Revised: 02/11/2021] [Accepted: 07/19/2021] [Indexed: 12/27/2022]
Abstract
Speech perception is thought to rely on a cortical feedforward serial transformation of acoustic into linguistic representations. Using intracranial recordings across the entire human auditory cortex, electrocortical stimulation, and surgical ablation, we show that cortical processing across areas is not consistent with a serial hierarchical organization. Instead, response latency and receptive field analyses demonstrate parallel and distinct information processing in the primary and nonprimary auditory cortices. This functional dissociation was also observed with direct stimulation: stimulating the primary auditory cortex evoked auditory hallucinations but did not distort or interfere with speech perception, whereas stimulating nonprimary cortex in the superior temporal gyrus had the opposite effect. Ablation of the primary auditory cortex did not affect speech perception. These results establish a distributed functional organization of parallel information processing throughout the human auditory cortex and demonstrate an essential, independent role for nonprimary auditory cortex in speech processing.
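One simple way to operationalize the latency analysis mentioned above is to estimate, for each electrode, the first post-onset time at which the trial-averaged response exceeds a baseline-derived threshold, then compare the latency distributions of primary and nonprimary sites. The threshold rule below is an illustrative assumption, not the paper's actual method.

```python
import numpy as np

def onset_latency(erp, times, baseline_mask, n_sd=3.0):
    """First time at which a trial-averaged response exceeds baseline mean + n_sd * SD.

    erp           : (timepoints,) trial-averaged high-gamma for one electrode
    times         : (timepoints,) time in seconds relative to sentence onset
    baseline_mask : boolean mask selecting pre-stimulus samples
    """
    mu, sd = erp[baseline_mask].mean(), erp[baseline_mask].std()
    above = np.where(erp > mu + n_sd * sd)[0]
    return times[above[0]] if above.size else np.nan

# Under a strictly serial model, primary-cortex latencies should precede STG
# latencies; overlapping latency distributions instead point to parallel inputs.
```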
Collapse
Affiliation(s)
- Liberty S Hamilton
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
| | - Yulia Oganian
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
| | - Jeffery Hall
- Department of Neurology and Neurosurgery, McGill University Montreal Neurological Institute, Montreal, QC, H3A 2B4, Canada
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA.
| |
Collapse
|
45
|
Encoding and decoding of meaning through structured variability in intonational speech prosody. Cognition 2021; 211:104619. [DOI: 10.1016/j.cognition.2021.104619] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2020] [Revised: 11/25/2020] [Accepted: 01/27/2021] [Indexed: 11/17/2022]
|
46
|
Clinical applications of neurolinguistics in neurosurgery. Front Med 2021; 15:562-574. [PMID: 33983605 DOI: 10.1007/s11684-020-0771-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2019] [Accepted: 03/05/2020] [Indexed: 11/27/2022]
Abstract
The protection of language function is one of the major challenges of brain surgery. Over the past century, neurosurgeons have sought the optimal strategy for preoperative and intraoperative identification of language-related brain regions. Using intraoperative cortical and subcortical electrical stimulation, neurosurgeons have investigated the neural mechanisms of language, contributed to neurolinguistic theory, and provided unique evidence for understanding the neural basis of language functions. With the emergence of modern neuroscience techniques and dramatic advances in language models over the last 25 years, novel language mapping methods have been applied in neurosurgical practice to help neurosurgeons protect the brain and reduce morbidity. Rapid advances in brain-computer interfaces have also provided a natural platform for combining neurosurgery and neurolinguistics. This review presents the history of neurolinguistic models, advances in modern technology, the role of neurosurgery in language mapping, and modern language mapping methods, including noninvasive neuroimaging techniques and invasive cortical electroencephalography.
Collapse
|
47
|
Llanos F, German JS, Gnanateja GN, Chandrasekaran B. The neural processing of pitch accents in continuous speech. Neuropsychologia 2021; 158:107883. [PMID: 33989647 DOI: 10.1016/j.neuropsychologia.2021.107883] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Revised: 04/29/2021] [Accepted: 05/03/2021] [Indexed: 12/21/2022]
Abstract
Pitch accents are local pitch patterns that convey differences in word prominence and modulate the information structure of the discourse. Despite their importance to discourse in languages like English, the neural processing of pitch accents remains understudied. The current study investigates the neural processing of pitch accents by native and non-native English speakers while they listened to, or ignored, 45 min of continuous, natural speech. Leveraging an approach used to study phonemes in natural speech, we analyzed thousands of electroencephalography (EEG) segments time-locked to pitch accents in a prosodic transcription. The optimal neural discrimination between pitch accent categories emerged at latencies between 100 and 200 ms. During these latencies, we found a strong structural alignment between neural and phonetic representations of pitch accent categories, and native listeners exhibited more robust processing of pitch accent contrasts than non-native listeners. These group differences attenuated, however, when the speech signal was ignored. Together, the results show that the neural processing of discrete, contrastive pitch accent categories can be captured reliably in continuous speech, and that this processing is shaped by language-specific knowledge and selective attention.
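A minimal sketch of the time-locked analysis described above: sliding-window decoding of pitch accent category from EEG epochs aligned to accent onsets, which should peak around 100-200 ms if the reported discrimination is present. The window length, step size, and classifier are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def timecourse_decoding(epochs, labels, times, win=0.05, step=0.01):
    """Sliding-window decoding of pitch accent category from time-locked EEG.

    epochs : (trials x channels x timepoints), segments aligned to accent onsets
    labels : (trials,) pitch accent category from the prosodic transcription
    times  : (timepoints,) seconds relative to accent onset
    Returns (window centers, cross-validated accuracy per window).
    """
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    centers, accs = [], []
    t = times[0] + win / 2
    while t + win / 2 <= times[-1]:
        mask = (times >= t - win / 2) & (times < t + win / 2)
        X = epochs[:, :, mask].mean(axis=2)      # average within the window
        accs.append(cross_val_score(clf, X, labels, cv=5).mean())
        centers.append(t)
        t += step
    return np.array(centers), np.array(accs)     # expect a peak near 100-200 ms
```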
Collapse
Affiliation(s)
- Fernando Llanos
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA; Department of Linguistics, The University of Texas at Austin, Austin, TX, USA
| | - James S German
- Aix-Marseille University, CNRS, LPL, Aix-en-Provence, France
| | - G Nike Gnanateja
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA
| | - Bharath Chandrasekaran
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA.
| |
Collapse
|
48
|
Convergence of heteromodal lexical retrieval in the lateral prefrontal cortex. Sci Rep 2021; 11:6305. [PMID: 33737672 PMCID: PMC7973515 DOI: 10.1038/s41598-021-85802-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2020] [Accepted: 03/03/2021] [Indexed: 01/31/2023] Open
Abstract
Lexical retrieval requires selecting and retrieving the most appropriate word from the lexicon to express a desired concept. Few studies have probed lexical retrieval with tasks other than picture naming, and when non-picture-naming lexical retrieval tasks have been applied, both convergent and divergent results have emerged. The presence of a single construct for auditory and visual processes of lexical retrieval would influence cognitive rehabilitation strategies for patients with aphasia. In this study, we performed support vector regression lesion-symptom mapping, using a brain tumor model, to test the hypothesis that brain regions specifically involved in lexical retrieval from visual and auditory stimuli represent overlapping neural systems. Principal component analysis of the language tasks revealed multicollinearity between picture naming, auditory naming, and a validated measure of word finding, implying redundant cognitive constructs. Nonparametric, multivariate lesion-symptom mapping across participants was used to model accuracy on each of the four language tasks. Lesions within overlapping clusters of 8,333 voxels and 21,512 voxels in the left lateral prefrontal cortex (PFC) were predictive of impaired picture naming and auditory naming, respectively. These data indicate a convergence of heteromodal lexical retrieval within the PFC.
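The core of multivariate lesion-symptom mapping can be sketched as a regression of behavioral scores on voxelwise lesion status, with the learned weights mapped back onto voxels. The linear-kernel support vector regression below is a simplification of published SVR-LSM pipelines (which typically add lesion-volume correction and permutation-based thresholding); the parameter value is an illustrative assumption.

```python
import numpy as np
from sklearn.svm import SVR

def svr_lsm_weights(lesion_maps, scores, C=30.0):
    """Multivariate lesion-symptom map for one task.

    lesion_maps : (patients x voxels), binary lesion masks, flattened
    scores      : (patients,) task accuracy (e.g. picture or auditory naming)
    Returns a voxelwise weight map; significance would require permutation testing.
    """
    model = SVR(kernel="linear", C=C).fit(lesion_maps, scores)
    return model.coef_.ravel()  # one weight per voxel

# Voxels with strongly negative weights are those whose damage is associated with
# poorer performance; overlap between the picture-naming and auditory-naming maps
# would indicate a shared heteromodal retrieval region in lateral PFC.
```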
Collapse
|
49
|
Trumpis M, Chiang CH, Orsborn AL, Bent B, Li J, Rogers JA, Pesaran B, Cogan G, Viventi J. Sufficient sampling for kriging prediction of cortical potential in rat, monkey, and human µECoG. J Neural Eng 2021; 18. [PMID: 33326943 DOI: 10.1088/1741-2552/abd460] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2020] [Accepted: 12/16/2020] [Indexed: 12/22/2022]
Abstract
Objective. Large channel count surface-based electrophysiology arrays (e.g. µECoG) are high-throughput neural interfaces with good chronic stability. Electrode spacing remains ad hoc due to redundancy and nonstationarity of field dynamics. Here, we establish a criterion for electrode spacing based on the expected accuracy of predicting unsampled field potential from sampled sites. Approach. We applied spatial covariance modeling and field prediction techniques based on geospatial kriging to quantify sufficient sampling for thousands of 500 ms µECoG snapshots in human, monkey, and rat. We calculated a probably approximately correct (PAC) spacing based on kriging that would be required to predict µECoG fields at ≤10% error for most cases (95% of observations). Main results. Kriging theory accurately explained the competing effects of electrode density and noise on predicting field potential. Across five frequency bands from 4-7 to 75-300 Hz, PAC spacing was sub-millimeter for auditory cortex in anesthetized and awake rats, and posterior superior temporal gyrus in anesthetized human. At 75-300 Hz, sub-millimeter PAC spacing was required in all species and cortical areas. Significance. PAC spacing accounted for the effect of signal-to-noise on prediction quality and was sensitive to the full distribution of non-stationary covariance states. Our results show that µECoG arrays should sample at sub-millimeter resolution for applications in diverse cortical areas and for noise resilience.
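The kriging idea at the heart of this analysis can be sketched as follows: fit a spatial covariance model to the sampled electrodes and use it to predict the potential at an unsampled location; sweeping the spacing and measuring held-out prediction error then yields the required sampling density. The exponential covariance, its parameter values, and the simple zero-mean kriging formulation below are assumptions for illustration, not the paper's fitted model.

```python
import numpy as np

def exp_cov(d, sill=1.0, length=0.5, nugget=0.1):
    """Isotropic exponential covariance with a nugget term for sensor noise."""
    return sill * np.exp(-d / length) + nugget * (d == 0)

def krige_predict(coords, values, target, **cov_kw):
    """Simple kriging of a single µECoG snapshot at an unsampled location.

    coords : (n x 2) electrode positions in mm
    values : (n,) zero-mean field potential at the sampled electrodes
    target : (2,) location to predict
    """
    d_ss = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    d_st = np.linalg.norm(coords - target, axis=-1)
    K = exp_cov(d_ss, **cov_kw)            # sample-sample covariance
    k = exp_cov(d_st, **cov_kw)            # sample-target covariance
    weights = np.linalg.solve(K, k)
    return weights @ values

# Comparing krige_predict() output against a held-out electrode's measured
# potential, across many snapshots and candidate spacings, gives the spacing at
# which unsampled sites can be recovered at <=10% error.
```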
Collapse
Affiliation(s)
- Michael Trumpis
- Department of Biomedical Engineering, Duke University, Durham, NC 27708, United States of America
| | - Chia-Han Chiang
- Department of Biomedical Engineering, Duke University, Durham, NC 27708, United States of America
| | - Amy L Orsborn
- Center for Neural Science, New York University, New York, NY 10003, United States of America; Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, United States of America; Department of Bioengineering, University of Washington, Seattle, Washington 98105, United States of America; Washington National Primate Research Center, Seattle, Washington 98195, United States of America
| | - Brinnae Bent
- Department of Biomedical Engineering, Duke University, Durham, NC 27708, United States of America
| | - Jinghua Li
- Department of Materials Science and Engineering, Northwestern University, Evanston, IL 60208, United States of America; Department of Materials Science and Engineering, The Ohio State University, Columbus, OH 43210, United States of America; Chronic Brain Injury Program, The Ohio State University, Columbus, OH 43210, United States of America
| | - John A Rogers
- Department of Materials Science and Engineering, Northwestern University, Evanston, IL 60208, United States of America; Simpson Querrey Institute, Northwestern University, Chicago, IL 60611, United States of America; Department of Biomedical Engineering, Northwestern University, Evanston, IL 60208, United States of America; Department of Neurological Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, United States of America
| | - Bijan Pesaran
- Center for Neural Science, New York University, New York, NY 10003, United States of America
| | - Gregory Cogan
- Department of Neurosurgery, Duke School of Medicine, Durham, NC 27710, United States of America; Department of Psychology and Neuroscience, Duke University, Durham, NC 27708, United States of America; Center for Cognitive Neuroscience, Duke University, Durham, NC 27708, United States of America; Duke Comprehensive Epilepsy Center, Duke School of Medicine, Durham, NC 27710, United States of America
| | - Jonathan Viventi
- Department of Biomedical Engineering, Duke University, Durham, NC 27708, United States of America; Department of Neurosurgery, Duke School of Medicine, Durham, NC 27710, United States of America; Duke Comprehensive Epilepsy Center, Duke School of Medicine, Durham, NC 27710, United States of America; Department of Neurobiology, Duke School of Medicine, Durham, NC 27710, United States of America
| |
Collapse
|
50
|
Li Y, Tang C, Lu J, Wu J, Chang EF. Human cortical encoding of pitch in tonal and non-tonal languages. Nat Commun 2021; 12:1161. [PMID: 33608548 PMCID: PMC7896081 DOI: 10.1038/s41467-021-21430-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2020] [Accepted: 01/26/2021] [Indexed: 11/09/2022] Open
Abstract
Languages can use a common repertoire of vocal sounds to signify distinct meanings. In tonal languages, such as Mandarin Chinese, pitch contours of syllables distinguish one word from another, whereas in non-tonal languages, such as English, pitch is used to convey intonation. The neural computations underlying language specialization in speech perception are unknown. Here, we use a cross-linguistic approach to address this. Native Mandarin- and English-speaking participants each listened to both Mandarin and English speech while neural activity was directly recorded from the non-primary auditory cortex. Both groups show language-general coding of speaker-invariant pitch at the single-electrode level. At the electrode population level, we find a language-specific distribution of cortical tuning parameters in Mandarin speakers only, with enhanced sensitivity to Mandarin tone categories. Our results show that speech perception relies upon a shared cortical auditory feature processing mechanism, which may be tuned to the statistics of a given language. Different languages rely on different vocal sounds to convey meaning. Here the authors show that language-general coding of pitch occurs in the non-primary auditory cortex for both tonal (Mandarin Chinese) and non-tonal (English) languages, with some language specificity at the population level.
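A rough sketch of the style of single-electrode pitch-encoding analysis described here: regress one electrode's high-gamma activity on absolute pitch, speaker-normalized (relative) pitch, and pitch change, and compare the explained variance across electrodes or listener groups. The predictor set, normalization, and cross-validated ridge model are illustrative assumptions rather than the paper's exact pipeline.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

def pitch_encoding_r2(high_gamma, f0_hz, speaker_ids):
    """Variance in one electrode's high-gamma explained by pitch features.

    high_gamma  : (samples,) neural activity at one electrode
    f0_hz       : (samples,) fundamental frequency of the speech (NaN when unvoiced)
    speaker_ids : (samples,) speaker label, used to compute relative pitch
    """
    voiced = ~np.isnan(f0_hz)
    log_f0 = np.log2(f0_hz[voiced])
    # Relative (speaker-normalized) pitch: z-score log F0 within each speaker
    rel = np.zeros_like(log_f0)
    for spk in np.unique(speaker_ids[voiced]):
        m = speaker_ids[voiced] == spk
        rel[m] = (log_f0[m] - log_f0[m].mean()) / log_f0[m].std()
    d_f0 = np.gradient(log_f0)   # pitch change (coarse: spans voicing gaps)
    X = np.column_stack([log_f0, rel, d_f0])
    model = RidgeCV(alphas=np.logspace(-2, 4, 13))
    return cross_val_score(model, X, high_gamma[voiced], cv=5, scoring="r2").mean()
```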
Collapse
Affiliation(s)
- Yuanning Li
- Department of Neurological Surgery, University of California, San Francisco, CA, USA; Center for Integrative Neuroscience, University of California, San Francisco, CA, USA
| | - Claire Tang
- Department of Neurological Surgery, University of California, San Francisco, CA, USA; Center for Integrative Neuroscience, University of California, San Francisco, CA, USA
| | - Junfeng Lu
- Brain Function Laboratory, Neurosurgical Institute of Fudan University, Shanghai, China; Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, China
| | - Jinsong Wu
- Brain Function Laboratory, Neurosurgical Institute of Fudan University, Shanghai, China; Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, China; Neurologic Surgery Department, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; Institute of Brain-Intelligence Technology, Zhangjiang Lab, Shanghai, China
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, CA, USA; Center for Integrative Neuroscience, University of California, San Francisco, CA, USA
| |
Collapse
|