51
Lin FH, Lee HJ, Ahveninen J, Jääskeläinen IP, Yu HY, Lee CC, Chou CC, Kuo WJ. Distributed source modeling of intracranial stereoelectroencephalographic measurements. Neuroimage 2021; 230:117746. PMID: 33454414. DOI: 10.1016/j.neuroimage.2021.117746.
Abstract
Intracranial stereoelectroencephalography (sEEG) provides unsurpassed sensitivity and specificity for human neurophysiology. However, functional mapping of the brain has been limited because the implantations have sparse coverage and differ greatly across individuals. Here, we developed a distributed, anatomically realistic sEEG source-modeling approach for within- and between-subject analyses. In addition to intracranial event-related potentials (iERP), we estimated the sources of high broadband gamma activity (HBBG), a putative correlate of local neural firing. Our novel approach accounted for a significant portion of the variance of the sEEG measurements in leave-one-out cross-validation. After logarithmic transformation, the sensitivity and signal-to-noise ratio were linearly inversely related to the minimal distance between the brain location and the electrode contacts (slope ≈ -3.6). The signal-to-noise ratio and sensitivity in the thalamus and brainstem were comparable to those at locations in the vicinity of implanted electrode contacts. The HBBG source estimates were remarkably consistent with analyses of intracranial-contact data. In conclusion, distributed sEEG source modeling provides a powerful neuroimaging tool, which facilitates anatomically normalized functional mapping of the human brain using both iERP and HBBG data.
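The estimator in this family is a regularized distributed linear inverse; the paper's exact solver and priors are not reproduced here. A minimal numpy sketch of a minimum-norm estimate, in which the lead field L, the noise level, and the regularization lambda_ are hypothetical stand-ins for quantities a real pipeline derives from anatomy and data:

```python
import numpy as np

rng = np.random.default_rng(0)
n_contacts, n_sources = 64, 5000                    # sEEG contacts, brain locations
L = rng.standard_normal((n_contacts, n_sources))    # hypothetical lead field (forward model)
x_true = np.zeros(n_sources)
x_true[123] = 1.0                                   # one active source
y = L @ x_true + 0.01 * rng.standard_normal(n_contacts)   # simulated measurements

lambda_ = 0.1                                       # regularization; set from noise in practice
G = L @ L.T + lambda_ * np.eye(n_contacts)          # regularized sensor-space Gram matrix
x_hat = L.T @ np.linalg.solve(G, y)                 # minimum-norm distributed source estimate
print(int(np.abs(x_hat).argmax()))                  # strongest source (spatial spread expected)
```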
Affiliation(s)
- Fa-Hsuan Lin
- Physical Sciences Platform, Sunnybrook Research Institute, Toronto, Canada; Department of Medical Biophysics, University of Toronto, Toronto, Canada; Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Hsin-Ju Lee
- Physical Sciences Platform, Sunnybrook Research Institute, Toronto, Canada; Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Jyrki Ahveninen
- Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA, USA
- Iiro P Jääskeläinen
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; International Laboratory of Social Neurobiology, Institute of Cognitive Neuroscience, National Research University Higher School of Economics, Moscow, Russian Federation
- Hsiang-Yu Yu
- Department of Neurology, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan; Institute of Brain Science, Brain Research Center, National Yang-Ming University, Taipei, Taiwan
- Cheng-Chia Lee
- Institute of Brain Science, Brain Research Center, National Yang-Ming University, Taipei, Taiwan; Department of Neurosurgery, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan
- Chien-Chen Chou
- Department of Neurology, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan; Institute of Brain Science, Brain Research Center, National Yang-Ming University, Taipei, Taiwan
- Wen-Jui Kuo
- Institute of Neuroscience, National Yang Ming University, Taipei, Taiwan; Brain Research Center, National Yang Ming University, Taipei, Taiwan.
52
McPherson MJ, McDermott JH. Time-dependent discrimination advantages for harmonic sounds suggest efficient coding for memory. Proc Natl Acad Sci U S A 2020; 117:32169-32180. PMID: 33262275. PMCID: PMC7749397. DOI: 10.1073/pnas.2008956117.
Abstract
Perceptual systems have finite memory resources and must store incoming signals in compressed formats. To explore whether representations of a sound's pitch might derive from this need for compression, we compared discrimination of harmonic and inharmonic sounds across delays. In contrast to inharmonic spectra, harmonic spectra can be summarized, and thus compressed, using their fundamental frequency (f0). Participants heard two sounds and judged which was higher. Despite being comparable for sounds presented back-to-back, discrimination was better for harmonic than inharmonic stimuli when sounds were separated in time, implicating memory representations unique to harmonic sounds. Patterns of individual differences (correlations between thresholds in different conditions) indicated that listeners use different representations depending on the time delay between sounds, directly comparing the spectra of temporally adjacent sounds, but transitioning to comparing f0s across delays. The need to store sound in memory appears to determine reliance on f0-based pitch and may explain its importance in music, in which listeners must extract relationships between notes separated in time.
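The harmonic/inharmonic contrast can be illustrated with synthetic complex tones; the uniform frequency jitter below is an assumed stand-in for the study's exact inharmonicity manipulation:

```python
import numpy as np

sr, dur, f0 = 16000, 0.5, 200.0
t = np.arange(int(sr * dur)) / sr
rng = np.random.default_rng(1)

def complex_tone(freqs):
    """Sum of equal-amplitude sinusoids at the given component frequencies."""
    return sum(np.sin(2 * np.pi * f * t) for f in freqs) / len(freqs)

harmonics = f0 * np.arange(1, 11)              # components at 200, 400, ..., 2000 Hz
jitter = rng.uniform(-0.5, 0.5, 10) * f0       # break the common fundamental
harmonic_sound = complex_tone(harmonics)       # summarizable by its f0
inharmonic_sound = complex_tone(harmonics + jitter)   # no compact f0 summary
```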
Affiliation(s)
- Malinda J McPherson
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Boston, MA 02115
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Josh H McDermott
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Boston, MA 02115
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
53
Abstract
Speech processing in the human brain is grounded in non-specific auditory processing in the general mammalian brain, but relies on human-specific adaptations for processing speech and language. For this reason, many recent neurophysiological investigations of speech processing have turned to the human brain, with an emphasis on continuous speech. Substantial progress has been made using the phenomenon of "neural speech tracking", in which neurophysiological responses time-lock to the rhythm of auditory (and other) features in continuous speech. One broad category of investigations concerns the extent to which speech tracking measures are related to speech intelligibility, which has clinical applications in addition to its scientific importance. Recent investigations have also focused on disentangling different neural processes that contribute to speech tracking. The two lines of research are closely related, since processing stages throughout auditory cortex contribute to speech comprehension, in addition to subcortical processing and higher order and attentional processes.
Affiliation(s)
- Christian Brodbeck
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742, U.S.A
- Jonathan Z. Simon
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742, U.S.A
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, U.S.A
- Department of Biology, University of Maryland, College Park, Maryland 20742, U.S.A
54
Speech frequency-following response in human auditory cortex is more than a simple tracking. Neuroimage 2020; 226:117545. PMID: 33186711. DOI: 10.1016/j.neuroimage.2020.117545.
Abstract
The human auditory cortex has recently been found to contribute to the frequency-following response (FFR), and this cortical component has been shown to be more relevant to speech perception. However, it is not clear how the cortical FFR contributes to processing the speech fundamental frequency (F0) and dynamic pitch. Using intracranial EEG recordings, we observed a significant FFR at F0 for both speech and speech-like harmonic complex stimuli in the human auditory cortex, even in the missing-fundamental condition. Both the spectral amplitude and the phase coherence of the cortical FFR showed a significant harmonic preference and attenuated from the primary auditory cortex to the surrounding associative auditory cortex. The phase coherence of the speech FFR was significantly higher than that of the harmonic complex stimuli, especially in the left hemisphere, demonstrating the high timing fidelity of the cortical FFR in tracking dynamic F0 in speech. Spectrally, the frequency band of the cortical FFR largely overlapped with the range of the human vocal pitch. Taken together, our study parsed the intrinsic properties of the cortical FFR and revealed a preference for speech-like sounds, supporting its potential role in processing speech intonation and lexical tones.
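Phase coherence of this kind is commonly quantified as inter-trial phase coherence at the frequency of interest; a minimal numpy sketch, assuming time-locked epochs at a common sampling rate (the paper's exact measure may differ):

```python
import numpy as np

def phase_coherence(trials, sr, freq):
    """Inter-trial phase coherence at one frequency (e.g., the speech F0)."""
    n = trials.shape[1]
    spectra = np.fft.rfft(trials, axis=1)
    k = int(round(freq * n / sr))                 # FFT bin nearest the target frequency
    phases = np.angle(spectra[:, k])
    return np.abs(np.mean(np.exp(1j * phases)))   # 0 = random phase, 1 = perfect locking

# toy demo: 50 epochs of a noisy response at 120 Hz
sr, n = 1000, 1000
t = np.arange(n) / sr
rng = np.random.default_rng(2)
trials = np.sin(2 * np.pi * 120 * t) + rng.standard_normal((50, n))
print(phase_coherence(trials, sr, 120.0))         # high despite the noise
```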
55
Fox NP, Leonard M, Sjerps MJ, Chang EF. Transformation of a temporal speech cue to a spatial neural code in human auditory cortex. eLife 2020; 9:e53051. PMID: 32840483. PMCID: PMC7556862. DOI: 10.7554/eLife.53051.
Abstract
In speech, listeners extract continuously varying spectrotemporal cues from the acoustic signal to perceive discrete phonetic categories. Spectral cues are spatially encoded in the amplitude of responses in phonetically tuned neural populations in auditory cortex. It remains unknown whether similar neurophysiological mechanisms encode temporal cues like voice-onset time (VOT), which distinguishes sounds like /b/ and /p/. We used direct brain recordings in humans to investigate the neural encoding of temporal speech cues with a VOT continuum from /ba/ to /pa/. We found that distinct neural populations respond preferentially to VOTs from one phonetic category and are also sensitive to sub-phonetic VOT differences within a population's preferred category. In a simple neural network model, simulated populations tuned to detect either temporal gaps or coincidences between spectral cues captured encoding patterns observed in real neural data. These results demonstrate that a spatial/amplitude neural code underlies the cortical representation of both spectral and temporal speech cues.
Affiliation(s)
- Neal P Fox
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Matthew Leonard
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Matthias J Sjerps
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, United States
56
Leszczyński M, Barczak A, Kajikawa Y, Ulbert I, Falchier AY, Tal I, Haegens S, Melloni L, Knight RT, Schroeder CE. Dissociation of broadband high-frequency activity and neuronal firing in the neocortex. Sci Adv 2020; 6:eabb0977. PMID: 32851172. PMCID: PMC7423365. DOI: 10.1126/sciadv.abb0977.
Abstract
Broadband high-frequency activity (BHA; 70 to 150 Hz), also known as "high gamma," a key analytic signal in human intracranial (electrocorticographic) recordings, is often assumed to reflect local neural firing [multiunit activity (MUA)]. As the precise physiological substrates of BHA are unknown, this assumption remains controversial. Our analysis of laminar multielectrode data from V1 and A1 in monkeys outlines two components of stimulus-evoked BHA distributed across the cortical layers: an "early-deep" and "late-superficial" response. Early-deep BHA has a clear spatial and temporal overlap with MUA. Late-superficial BHA was more prominent and accounted for more of the BHA signal measured near the cortical pial surface. However, its association with local MUA is weak and often undetectable, consistent with the view that it reflects dendritic processes separable from local neuronal firing.
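BHA of this kind is conventionally extracted by band-pass filtering and taking the analytic-signal envelope; a minimal scipy sketch with illustrative parameters, not the authors' exact pipeline:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bha_envelope(x, sr, lo=70.0, hi=150.0, order=4):
    """Broadband high-frequency activity: band-pass 70-150 Hz, then Hilbert envelope."""
    b, a = butter(order, [lo / (sr / 2), hi / (sr / 2)], btype="band")
    return np.abs(hilbert(filtfilt(b, a, x)))

sr = 1000
rng = np.random.default_rng(3)
lfp = rng.standard_normal(10 * sr)     # stand-in for one recorded channel
env = bha_envelope(lfp, sr)            # instantaneous BHA amplitude
```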
Affiliation(s)
- Marcin Leszczyński
- Cognitive Science and Neuromodulation Program, Departments of Psychiatry, Neurology and Neurosurgery, Columbia University College of Physicians and Surgeons, New York, NY, USA
- Translational Neuroscience Division of the Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
- Annamaria Barczak
- Translational Neuroscience Division of the Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
- Yoshinao Kajikawa
- Translational Neuroscience Division of the Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
- Istvan Ulbert
- Institute for Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Arnaud Y. Falchier
- Translational Neuroscience Division of the Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
- Department of Psychiatry, NYU Grossman School of Medicine, New York, NY, USA
- Idan Tal
- Cognitive Science and Neuromodulation Program, Departments of Psychiatry, Neurology and Neurosurgery, Columbia University College of Physicians and Surgeons, New York, NY, USA
- Translational Neuroscience Division of the Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
- Saskia Haegens
- Cognitive Science and Neuromodulation Program, Departments of Psychiatry, Neurology and Neurosurgery, Columbia University College of Physicians and Surgeons, New York, NY, USA
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Lucia Melloni
- Department of Neurology, New York University Langone Health, New York, NY, USA
- Robert T. Knight
- Department of Psychology and Helen Wills Neuroscience Institute, University of California at Berkeley, Berkeley, CA, USA
- Charles E. Schroeder
- Cognitive Science and Neuromodulation Program, Departments of Psychiatry, Neurology and Neurosurgery, Columbia University College of Physicians and Surgeons, New York, NY, USA
- Translational Neuroscience Division of the Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
57
Abstract
This study presents a computational model that reproduces the biological dynamics of "listening to music." A biologically plausible model of periodicity pitch detection is proposed and simulated. Periodicity pitch is computed across a range of the auditory spectrum and detected from subsets of activated auditory nerve fibers (ANFs). These activate connected model octopus cells, which trigger model neurons that detect onsets and offsets; model interval-tuned neurons are then innervated at the appropriate interval times; and finally, a set of common interval-detecting neurons indicates pitch. Octopus cells spike rhythmically with the pitch periodicity of the sound. Batteries of interval-tuned neurons measure the inter-spike intervals of the octopus cells, stopwatch-like, by coding interval durations as first-spike latencies (FSLs). The FSL-triggered spikes coincide synchronously through a monolayer spiking neural network at the corresponding receiver pitch neurons.
Affiliation(s)
- Frank Klefenz
- Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
- Tamas Harczos
- Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
- Auditory Neuroscience and Optogenetics Laboratory, German Primate Center, Göttingen, Germany
- audifon GmbH & Co. KG, Kölleda, Germany
58
Abstract
Perceptual disturbances in psychosis, such as auditory verbal hallucinations, are associated with increased baseline activity in the associative auditory cortex and increased dopamine transmission in the associative striatum. Perceptual disturbances are also associated with perceptual biases that suggest increased reliance on prior expectations. We review theoretical models of perceptual inference and key supporting physiological evidence, as well as the anatomy of associative cortico-striatal loops that may be relevant to auditory perceptual inference. Integrating recent findings, we outline a working framework that bridges neurobiology and the phenomenology of perceptual disturbances via theoretical models of perceptual inference.
59
Gehrig J, Michalareas G, Forster MT, Lei J, Hok P, Laufs H, Senft C, Seifert V, Schoffelen JM, Hanslmayr S, Kell CA. Low-Frequency Oscillations Code Speech during Verbal Working Memory. J Neurosci 2019; 39:6498-6512. PMID: 31196933. PMCID: PMC6697399. DOI: 10.1523/jneurosci.0018-19.2019.
Abstract
How the human brain represents speech in memory is still unknown. An obvious characteristic of speech is that it unfolds over time. During speech processing, neural oscillations are modulated by the temporal properties of the acoustic speech signal, but acquired knowledge of the temporal structure of language also influences speech perception-related brain activity. This suggests that speech could be represented in the temporal domain, a form of representation that the brain also uses to encode autobiographic memories. Empirical evidence for such a memory code has been lacking. We investigated the nature of speech memory representations using direct cortical recordings in the left perisylvian cortex during delayed sentence reproduction in female and male patients undergoing awake tumor surgery. Our results reveal that the brain endogenously represents speech in the temporal domain. Temporal pattern similarity analyses revealed that the phase of frontotemporal low-frequency oscillations, primarily in the beta range, represents sentence identity in working memory. The positive relationship between beta power during working memory and task performance suggests that working memory representations benefit from increased phase separation.
SIGNIFICANCE STATEMENT: Memory is an endogenous source of information based on experience. While neural oscillations encode autobiographic memories in the temporal domain, little is known about their contribution to memory representations of human speech. Our electrocortical recordings in participants who maintained sentences in memory identify the phase of left frontotemporal beta oscillations as the most prominent information carrier of sentence identity. These observations provide evidence for a theoretical model of speech memory representations and explain why interfering with beta oscillations in the left inferior frontal cortex diminishes verbal working memory capacity. The lack of sentence identity coding at the syllabic rate suggests that sentences are represented in memory in a more abstract form than the coding used during speech perception and production.
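A temporal pattern similarity analysis on oscillatory phase can be sketched as the mean cosine of the phase difference between two epochs; the band limits and pipeline below are illustrative assumptions, not the study's exact analysis:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_phase(x, sr, lo, hi, order=4):
    """Instantaneous phase of a band-limited signal (filter + Hilbert transform)."""
    b, a = butter(order, [lo / (sr / 2), hi / (sr / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, x)))

def phase_similarity(x, y, sr, lo=13.0, hi=30.0):
    """Mean cosine of the beta-band phase difference between two epochs."""
    return np.mean(np.cos(band_phase(x, sr, lo, hi) - band_phase(y, sr, lo, hi)))

sr = 1000
t = np.arange(2 * sr) / sr
rng = np.random.default_rng(7)
epoch_a = np.sin(2 * np.pi * 20 * t) + 0.5 * rng.standard_normal(t.size)
epoch_b = np.sin(2 * np.pi * 20 * t) + 0.5 * rng.standard_normal(t.size)
print(phase_similarity(epoch_a, epoch_b, sr))   # near 1 for shared phase patterns
```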
Affiliation(s)
- Johannes Gehrig
- Department of Neurology, Goethe University, 60528 Frankfurt, Germany
- Juan Lei
- Department of Neurology, Goethe University, 60528 Frankfurt, Germany
- Institute for Cell Biology and Neuroscience, Goethe University, 60438 Frankfurt, Germany
- Pavel Hok
- Department of Neurology, Goethe University, 60528 Frankfurt, Germany
- Department of Neurology, Palacky University and University Hospital Olomouc, 77147 Olomouc, Czech Republic
- Helmut Laufs
- Department of Neurology, Goethe University, 60528 Frankfurt, Germany
- Department of Neurology, Christian-Albrechts-University, 24105 Kiel, Germany
- Christian Senft
- Department of Neurosurgery, Goethe University, 60528 Frankfurt, Germany
- Volker Seifert
- Department of Neurosurgery, Goethe University, 60528 Frankfurt, Germany
- Jan-Mathijs Schoffelen
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, 6525 HR Nijmegen, The Netherlands
- Simon Hanslmayr
- School of Psychology at University of Birmingham, B15 2TT Birmingham, United Kingdom
- Christian A Kell
- Department of Neurology, Goethe University, 60528 Frankfurt, Germany
60
Intonation guides sentence processing in the left inferior frontal gyrus. Cortex 2019; 117:122-134. DOI: 10.1016/j.cortex.2019.02.011.
61
Teoh ES, Cappelloni MS, Lalor EC. Prosodic pitch processing is represented in delta-band EEG and is dissociable from the cortical tracking of other acoustic and phonetic features. Eur J Neurosci 2019; 50:3831-3842. PMID: 31287601. DOI: 10.1111/ejn.14510.
Abstract
Speech is central to communication among humans. Meaning is largely conveyed by the selection of linguistic units such as words, phrases and sentences. However, prosody, that is, the variation of acoustic cues that ties linguistic segments together, adds another layer of meaning. Among the various features underlying prosody, one of the most important is pitch and how it is modulated. Recent fMRI and ECoG studies have suggested that there are pitch-specific cortical regions which respond primarily to resolved harmonics, and that high-gamma cortical activity encodes intonation as represented by relative pitch. Importantly, this latter result was shown to be independent of the cortical tracking of the acoustic energy of speech, a commonly used measure. Here, we investigate whether low-frequency EEG indices of pitch processing of continuous narrative speech can be isolated from those reflecting the tracking of other acoustic and phonetic features. Harmonic resolvability was found to contain unique predictive power in delta and theta phase, but it was highly correlated with the envelope and was tracked even when the stimuli were pitch-impoverished. As such, we are circumspect about whether its contribution is truly pitch-specific. Crucially, however, we found a unique contribution of relative pitch to EEG delta-phase prediction, and this tracking was absent when subjects listened to pitch-impoverished stimuli. This finding suggests the possibility of a separate processing stream for prosody that might operate in parallel to acoustic-linguistic processing. Furthermore, it provides a novel neural index that could be useful for testing prosodic encoding in populations with speech processing deficits and for improving cognitively controlled hearing aids.
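Tracking analyses of this kind commonly fit a temporal response function (TRF) by ridge regression from time-lagged stimulus features to EEG; a minimal numpy sketch with a hypothetical lag range and regularization strength:

```python
import numpy as np

def trf_ridge(stim, eeg, sr, tmin=-0.1, tmax=0.4, alpha=1e3):
    """Ridge-regression TRF from one stimulus feature (e.g., relative pitch) to EEG."""
    lags = np.arange(int(tmin * sr), int(tmax * sr) + 1)
    X = np.column_stack([np.roll(stim, lag) for lag in lags])  # lagged design matrix
    m = int(np.abs(lags).max())
    X[:m] = 0
    X[-m:] = 0                                   # discard wrap-around edges from np.roll
    w = np.linalg.solve(X.T @ X + alpha * np.eye(lags.size), X.T @ eeg)
    return lags / sr, w

# toy demo: EEG as a delayed, smoothed copy of the stimulus plus noise
sr = 100
rng = np.random.default_rng(8)
stim = rng.standard_normal(60 * sr)
kernel = np.exp(-np.arange(30) / 10.0)           # a toy 300 ms response
eeg = np.convolve(stim, kernel)[:stim.size] + rng.standard_normal(stim.size)
lag_s, w = trf_ridge(stim, eeg, sr)              # w should approximate the kernel
```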
Affiliation(s)
- Emily S Teoh
- School of Engineering, Trinity Centre for Biomedical Engineering and Trinity College Institute of Neuroscience, Trinity College Dublin, University of Dublin, Dublin, Ireland
- Edmund C Lalor
- School of Engineering, Trinity Centre for Biomedical Engineering and Trinity College Institute of Neuroscience, Trinity College Dublin, University of Dublin, Dublin, Ireland
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA
- Department of Neuroscience and Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA
62
Yi HG, Leonard MK, Chang EF. The Encoding of Speech Sounds in the Superior Temporal Gyrus. Neuron 2019; 102:1096-1110. PMID: 31220442. PMCID: PMC6602075. DOI: 10.1016/j.neuron.2019.04.023.
Abstract
The human superior temporal gyrus (STG) is critical for extracting meaningful linguistic features from speech input. Local neural populations are tuned to acoustic-phonetic features of all consonants and vowels and to dynamic cues for intonational pitch. These populations are embedded throughout broader functional zones that are sensitive to amplitude-based temporal cues. Beyond speech features, STG representations are strongly modulated by learned knowledge and perceptual goals. Currently, a major challenge is to understand how these features are integrated across space and time in the brain during natural speech comprehension. We present a theory that temporally recurrent connections within STG generate context-dependent phonological representations, spanning longer temporal sequences relevant for coherent percepts of syllables, words, and phrases.
Affiliation(s)
- Han Gyol Yi
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA.
63
Sjerps MJ, Fox NP, Johnson K, Chang EF. Speaker-normalized sound representations in the human auditory cortex. Nat Commun 2019; 10:2465. PMID: 31165733. PMCID: PMC6549175. DOI: 10.1038/s41467-019-10365-z.
Abstract
The acoustic dimensions that distinguish speech sounds (like the vowel differences in "boot" and "boat") also differentiate speakers' voices. Therefore, listeners must normalize across speakers without losing linguistic information. Past behavioral work suggests an important role for auditory contrast enhancement in normalization: preceding context affects listeners' perception of subsequent speech sounds. Here, using intracranial electrocorticography in humans, we investigate whether and how such context effects arise in auditory cortex. Participants identified speech sounds that were preceded by phrases from two different speakers whose voices differed along the same acoustic dimension as target words (the lowest resonance of the vocal tract). In every participant, target vowels evoke a speaker-dependent neural response that is consistent with the listener's perception, and which follows from a contrast enhancement model. Auditory cortex processing thus displays a critical feature of normalization, allowing listeners to extract meaningful content from the voices of diverse speakers.
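The contrast-enhancement account can be reduced to a one-line sketch: the encoded value of the target's first formant (F1) is referenced to the mean F1 of the preceding context. This is a deliberate simplification of the model, for illustration only:

```python
import numpy as np

def contrast_enhanced_f1(target_f1_hz, context_f1_hz):
    """Encode the target F1 relative to the mean F1 of the preceding phrase."""
    return target_f1_hz - np.mean(context_f1_hz)

# the same ambiguous vowel is encoded as relatively high after a low-F1 speaker ...
print(contrast_enhanced_f1(500.0, [420.0, 400.0, 430.0]))   # positive shift
# ... and as relatively low after a high-F1 speaker
print(contrast_enhanced_f1(500.0, [620.0, 600.0, 630.0]))   # negative shift
```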
Affiliation(s)
- Matthias J Sjerps
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Kapittelweg 29, Nijmegen, 6525 EN, The Netherlands
- Max Planck Institute for Psycholinguistics, Wundtlaan 1, Nijmegen, 6525 XD, Netherlands
- Neal P Fox
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, California, 94158, USA
- Keith Johnson
- Department of Linguistics, University of California, Berkeley, 1203 Dwinelle Hall #2650, Berkeley, California, 94720, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, California, 94158, USA.
- Weill Institute for Neurosciences, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, California, 94158, USA.
64
Angrick M, Herff C, Mugler E, Tate MC, Slutzky MW, Krusienski DJ, Schultz T. Speech synthesis from ECoG using densely connected 3D convolutional neural networks. J Neural Eng 2019; 16:036019. PMID: 30831567. PMCID: PMC6822609. DOI: 10.1088/1741-2552/ab0c59.
Abstract
OBJECTIVE: Direct synthesis of speech from neural signals could provide a fast and natural way of communication for people with neurological diseases. Invasively measured brain activity (electrocorticography; ECoG) supplies the temporal and spatial resolution needed to decode fast and complex processes such as speech production. A number of impressive advances in speech decoding using neural signals have been achieved in recent years, but the underlying neural dynamics are still not fully understood, and it is unlikely that simple linear models can capture the relation between neural activity and continuous spoken speech.
APPROACH: Here we show that deep neural networks can be used to map ECoG from speech production areas onto an intermediate representation of speech (logMel spectrogram). The proposed method uses a densely connected convolutional neural network topology which is well-suited to work with the small amount of data available from each participant.
MAIN RESULTS: In a study with six participants, we achieved correlations up to r = 0.69 between the reconstructed and original logMel spectrograms. We transferred our predictions back into an audible waveform by applying a WaveNet vocoder. The vocoder was conditioned on logMel features that harnessed a much larger, pre-existing data corpus to provide the most natural acoustic output.
SIGNIFICANCE: To the best of our knowledge, this is the first time that high-quality speech has been reconstructed from neural recordings made during speech production using deep neural networks.
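A minimal PyTorch sketch of a densely connected 3D convolutional network mapping one ECoG window to one logMel frame; the growth rate, depth, pooling, grid size, and output dimensionality are illustrative stand-ins, not the paper's exact topology:

```python
import torch
import torch.nn as nn

class DenseBlock3d(nn.Module):
    """DenseNet-style block: each layer sees the concatenated outputs of all earlier layers."""
    def __init__(self, in_ch, growth=8, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm3d(ch), nn.ReLU(),
                nn.Conv3d(ch, growth, kernel_size=3, padding=1)))
            ch += growth
        self.out_channels = ch

    def forward(self, x):
        for layer in self.layers:
            x = torch.cat([x, layer(x)], dim=1)   # dense connectivity
        return x

class EcogToLogMel(nn.Module):
    def __init__(self, n_mels=40):
        super().__init__()
        self.dense = DenseBlock3d(in_ch=1)
        self.head = nn.Linear(self.dense.out_channels, n_mels)

    def forward(self, x):                         # x: (batch, 1, time, height, width)
        h = self.dense(x).mean(dim=(2, 3, 4))     # global average pooling
        return self.head(h)                       # one logMel frame per window

model = EcogToLogMel()
dummy = torch.randn(2, 1, 9, 8, 8)                # 9 samples on a hypothetical 8x8 grid
print(model(dummy).shape)                         # torch.Size([2, 40])
```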
Affiliation(s)
- Miguel Angrick
- Cognitive Systems Lab, University of Bremen, Bremen, Germany
65
Stolk A, Griffin S, van der Meij R, Dewar C, Saez I, Lin JJ, Piantoni G, Schoffelen JM, Knight RT, Oostenveld R. Integrated analysis of anatomical and electrophysiological human intracranial data. Nat Protoc 2018; 13:1699-1723. PMID: 29988107. PMCID: PMC6548463. DOI: 10.1038/s41596-018-0009-6.
Abstract
Human intracranial electroencephalography (iEEG) recordings provide data with much greater spatiotemporal precision than is possible from data obtained using scalp EEG, magnetoencephalography (MEG), or functional MRI. Until recently, the fusion of anatomical data (MRI and computed tomography (CT) images) with electrophysiological data and their subsequent analysis have required the use of technologically and conceptually challenging combinations of software. Here, we describe a comprehensive protocol that enables complex raw human iEEG data to be converted into more readily comprehensible illustrative representations. The protocol uses an open-source toolbox for electrophysiological data analysis (FieldTrip). This allows iEEG researchers to build on a continuously growing body of scriptable and reproducible analysis methods that, over the past decade, have been developed and used by a large research community. In this protocol, we describe how to analyze complex iEEG datasets by providing an intuitive and rapid approach that can handle both neuroanatomical information and large electrophysiological datasets. We provide a worked example using an example dataset. We also explain how to automate the protocol and adjust the settings to enable analysis of iEEG datasets with other characteristics. The protocol can be implemented by a graduate student or postdoctoral fellow with minimal MATLAB experience and takes approximately an hour to execute, excluding the automated cortical surface extraction.
Affiliation(s)
- Arjen Stolk
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, The Netherlands
- Sandon Griffin
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
- Roemer van der Meij
- Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
- Callum Dewar
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
- College of Medicine, University of Illinois, Chicago, IL, USA
- Ignacio Saez
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
- Jack J Lin
- Department of Neurology, University of California, Irvine, Irvine, CA, USA
- Giovanni Piantoni
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Jan-Mathijs Schoffelen
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, The Netherlands
- Robert T Knight
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
- Department of Psychology, University of California, Berkeley, Berkeley, CA, USA
- Robert Oostenveld
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, The Netherlands
- NatMEG, Karolinska Institutet, Stockholm, Sweden
66
Burred JJ, Ponsot E, Goupil L, Liuni M, Aucouturier JJ. CLEESE: An open-source audio-transformation toolbox for data-driven experiments in speech and music cognition. PLoS One 2019; 14:e0205943. PMID: 30947281. PMCID: PMC6448843. DOI: 10.1371/journal.pone.0205943.
Abstract
Over the past few years, the field of visual social cognition and face processing has been dramatically impacted by a series of data-driven studies employing computer-graphics tools to synthesize arbitrary meaningful facial expressions. In the auditory modality, reverse correlation is traditionally used to characterize sensory processing at the level of spectral or spectro-temporal stimulus properties, but not the higher-level cognitive processing of e.g. words, sentences or music, for lack of tools able to manipulate the stimulus dimensions that are relevant for these processes. Here, we present an open-source audio-transformation toolbox, called CLEESE, able to systematically randomize the prosody/melody of existing speech and music recordings. CLEESE works by cutting recordings into small successive time segments (e.g. every successive 100 milliseconds in a spoken utterance) and applying a random parametric transformation to each segment's pitch, duration or amplitude, using a new Python-language implementation of the phase-vocoder digital audio technique. We present two applications of the tool, generating stimuli for studying intonation processing of interrogative vs. declarative speech and rhythm processing of sung melodies.
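The core manipulation, a random breakpoint function (BPF) drawn segment by segment and interpolated into a smooth contour, can be sketched without reproducing the toolbox's own API (not shown here):

```python
import numpy as np

def random_pitch_bpf(duration_s, seg_s=0.1, max_semitones=2.0, rate_hz=100, seed=None):
    """One random pitch shift (in semitones) per 100 ms segment, linearly interpolated."""
    rng = np.random.default_rng(seed)
    knot_times = np.arange(0.0, duration_s + seg_s, seg_s)
    knots = rng.uniform(-max_semitones, max_semitones, knot_times.size)
    t = np.arange(0.0, duration_s, 1.0 / rate_hz)
    return t, np.interp(t, knot_times, knots)

t, contour = random_pitch_bpf(1.2, seed=4)   # contour to drive a phase vocoder
```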
Affiliation(s)
- Emmanuel Ponsot
- Science and Technology of Music and Sound (UMR9912, IRCAM/CNRS/Sorbonne Université), Paris, France
- Laboratoire des Systèmes Perceptifs (CNRS UMR 8248) and Département d’études cognitives, École Normale Supérieure, PSL Research University, Paris, France
- Louise Goupil
- Science and Technology of Music and Sound (UMR9912, IRCAM/CNRS/Sorbonne Université), Paris, France
- Marco Liuni
- Science and Technology of Music and Sound (UMR9912, IRCAM/CNRS/Sorbonne Université), Paris, France
- Jean-Julien Aucouturier
- Science and Technology of Music and Sound (UMR9912, IRCAM/CNRS/Sorbonne Université), Paris, France
67
Zhang Y, Qiu T, Yuan X, Zhang J, Wang Y, Zhang N, Zhou C, Luo C, Zhang J. Abnormal topological organization of structural covariance networks in amyotrophic lateral sclerosis. Neuroimage Clin 2018; 21:101619. PMID: 30528369. PMCID: PMC6411656. DOI: 10.1016/j.nicl.2018.101619.
Abstract
Neuroimaging studies of patients with amyotrophic lateral sclerosis (ALS) have shown widespread alterations in structure, function, and connectivity in both motor and non-motor brain regions, suggesting multi-systemic neurobiological abnormalities that might impact large-scale brain networks. Here, we examined the alterations in the topological organization of structural covariance networks of ALS patients (N = 60) compared with normal controls (N = 60). We found that structural covariance networks of ALS patients showed a consistent rearrangement towards a regularized architecture evidenced by increased path length, clustering coefficient, small-world index, and modularity, as well as decreased global efficiency, suggesting inefficient global integration and increased local segregation. Locally, ALS patients showed decreased nodal degree and betweenness in the gyrus rectus and/or Heschl's gyrus, and increased betweenness in the supplementary motor area, triangular part of the inferior frontal gyrus, supramarginal gyrus and posterior cingulate cortex. In addition, we identified a different number and distribution of hubs in ALS patients, showing more frontal and subcortical hubs than in normal controls. In conclusion, we reveal abnormal topological organization of structural covariance networks in ALS patients, and provide network-level evidence for the concept that ALS is a multisystem disorder with a cerebral involvement extending beyond the motor areas.
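Graph metrics of this kind can be computed with networkx; a toy sketch on a synthetic small-world graph standing in for a thresholded structural covariance network:

```python
import networkx as nx

# 90 nodes roughly matching a whole-brain atlas parcellation
G = nx.connected_watts_strogatz_graph(n=90, k=6, p=0.1, seed=5)

clustering = nx.average_clustering(G)                 # local segregation
path_length = nx.average_shortest_path_length(G)      # global integration
efficiency = nx.global_efficiency(G)
hubs = sorted(G.degree, key=lambda kv: kv[1], reverse=True)[:5]   # highest-degree nodes
print(clustering, path_length, efficiency, hubs)
```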
Affiliation(s)
- Yuanchao Zhang
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, PR China
- Ting Qiu
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, PR China
- Xinru Yuan
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, PR China
- Jinlei Zhang
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, PR China
- Yue Wang
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, PR China
- Na Zhang
- School of Mathematical Sciences, University of Jinan, Jinan 250022, Shandong Province, PR China
- Chaoyang Zhou
- Department of Radiology, Southwest Hospital, Third Military Medical University, Chongqing 400038, PR China
- Chunxia Luo
- Department of Neurology, Southwest Hospital, Third Military Medical University, Chongqing 400038, PR China
- Jiuquan Zhang
- Department of Radiology, Chongqing University Cancer Hospital, Chongqing Cancer Institute, Chongqing Cancer Hospital, Chongqing 400030, PR China; Key Laboratory for Biorheological Science and Technology of Ministry of Education (Chongqing University), Chongqing University Cancer Hospital, Chongqing Cancer Institute, Chongqing Cancer Hospital, Chongqing 400044, PR China.
68
Venezia JH, Thurman SM, Richards VM, Hickok G. Hierarchy of speech-driven spectrotemporal receptive fields in human auditory cortex. Neuroimage 2019; 186:647-666. PMID: 30500424. DOI: 10.1016/j.neuroimage.2018.11.049.
Abstract
Existing data indicate that cortical speech processing is hierarchically organized. Numerous studies have shown that early auditory areas encode fine acoustic details while later areas encode abstracted speech patterns. However, it remains unclear precisely what speech information is encoded across these hierarchical levels. Estimation of speech-driven spectrotemporal receptive fields (STRFs) provides a means to explore cortical speech processing in terms of acoustic or linguistic information associated with characteristic spectrotemporal patterns. Here, we estimate STRFs from cortical responses to continuous speech in fMRI. Using a novel approach based on filtering randomly-selected spectrotemporal modulations (STMs) from aurally-presented sentences, STRFs were estimated for a group of listeners and categorized using a data-driven clustering algorithm. 'Behavioral STRFs' highlighting STMs crucial for speech recognition were derived from intelligibility judgments. Clustering revealed that STRFs in the supratemporal plane represented a broad range of STMs, while STRFs in the lateral temporal lobe represented circumscribed STM patterns important to intelligibility. Detailed analysis recovered a bilateral organization with posterior-lateral regions preferentially processing STMs associated with phonological information and anterior-lateral regions preferentially processing STMs associated with word- and phrase-level information. Regions in lateral Heschl's gyrus preferentially processed STMs associated with vocalic information (pitch).
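The generic linear STRF model being estimated treats the response as the stimulus spectrogram filtered over frequency and time lag; in LaTeX:

```latex
% r(t): response time course; S(f, t): stimulus spectrogram;
% STRF(f, tau): weight at frequency f and time lag tau.
r(t) = \sum_{f} \sum_{\tau} \mathrm{STRF}(f, \tau)\, S(f, t - \tau) + \varepsilon(t)
```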
Affiliation(s)
- Jonathan H Venezia
- VA Loma Linda Healthcare System, Loma Linda, CA, USA; Dept. of Otolaryngology, School of Medicine, Loma Linda University, Loma Linda, CA, USA.
- Virginia M Richards
- Depts. of Cognitive Sciences and Language Science, University of California, Irvine, Irvine, CA, USA
- Gregory Hickok
- Depts. of Cognitive Sciences and Language Science, University of California, Irvine, Irvine, CA, USA
69
Harczos T, Klefenz FM. Modeling Pitch Perception With an Active Auditory Model Extended by Octopus Cells. Front Neurosci 2018; 12:660. PMID: 30319340. PMCID: PMC6167605. DOI: 10.3389/fnins.2018.00660.
Abstract
Pitch is an essential category for musical sensations, and models of pitch perception are still vividly debated. Most rely on mathematical methods defined in the spectral or temporal domain. Our proposed pitch perception model is composed of an active auditory model extended by octopus cells. The active auditory model is the same as that used in Stimulation based on Auditory Modeling (SAM), a successful cochlear implant sound processing strategy, extended here by modeling the functional behavior of the octopus cells in the ventral cochlear nucleus and their connections to the auditory nerve fibers (ANFs). The neurophysiological parameterization of the extended model is fully described in the time domain. The model is based on latency-phase encoding and decoding, as octopus cells are latency-phase rectifiers in their local receptive fields. Pitch is ubiquitously represented by cascaded firing sweeps of octopus cells. Based on the firing patterns of the octopus cells, inter-spike interval histograms can be aggregated, in which the location of the global maximum is assumed to encode the pitch.
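The final read-out, pitch as the reciprocal of the inter-spike interval histogram's global maximum, can be sketched directly; the spike trains and jitter below are synthetic:

```python
import numpy as np

def pitch_from_spikes(spike_trains_s, bin_ms=0.1, max_ms=20.0):
    """Pitch estimate: reciprocal of the peak of the aggregated ISI histogram."""
    isis = np.concatenate([np.diff(np.sort(st)) for st in spike_trains_s]) * 1000.0
    edges = np.arange(0.0, max_ms + bin_ms, bin_ms)
    counts, _ = np.histogram(isis, bins=edges)
    peak_ms = edges[np.argmax(counts)] + bin_ms / 2   # center of the winning bin
    return 1000.0 / peak_ms                           # period (ms) -> frequency (Hz)

# toy demo: ten model cells firing at a 5 ms period (200 Hz) with slight jitter
rng = np.random.default_rng(6)
trains = [np.arange(0.0, 0.5, 0.005) + rng.normal(0, 2e-4, 100) for _ in range(10)]
print(pitch_from_spikes(trains))                      # close to 200 Hz
```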
Affiliation(s)
- Tamas Harczos
- Fraunhofer Institute for Digital Media Technology, Ilmenau, Germany
- Auditory Neuroscience and Optogenetics Laboratory, German Primate Center, Goettingen, Germany
- Institut für Mikroelektronik- und Mechatronik-Systeme gGmbH, Ilmenau, Germany
70
Breshears JD, Hamilton LS, Chang EF. Spontaneous Neural Activity in the Superior Temporal Gyrus Recapitulates Tuning for Speech Features. Front Hum Neurosci 2018; 12:360. PMID: 30279650. PMCID: PMC6153351. DOI: 10.3389/fnhum.2018.00360.
Abstract
Background: Numerous studies have demonstrated that individuals exhibit structured neural activity in many brain regions during rest that is also observed during different tasks; however, it is still not clear whether and how resting-state activity patterns relate to underlying tuning for specific stimuli. In the posterior superior temporal gyrus (STG), distinct neural activity patterns are observed during the perception of specific linguistic speech features. We hypothesized that the spontaneous resting-state neural dynamics of the STG would be structured to reflect its role in speech perception, exhibiting an organization along speech features as seen during speech perception.
Methods: Human cortical local field potentials were recorded from the superior temporal gyrus (STG) in 8 patients undergoing surgical treatment of epilepsy. Signals were recorded during speech perception and rest. Patterns of neural activity (high gamma power: 70–150 Hz) during rest, extracted with spatiotemporal principal component analysis, were compared to spatiotemporal neural responses to speech features during perception. Hierarchical clustering was applied to look for patterns in rest that corresponded to speech feature tuning.
Results: Significant correlations were found between neural responses to speech features (sentence onsets, consonants, and vowels) and the spontaneous neural activity in the STG. Across subjects, these correlations clustered into five groups, demonstrating tuning for speech features, most robustly for acoustic onsets. These correlations were not seen in other brain areas, or during motor and spectrally rotated speech control tasks.
Conclusions: We present evidence that the resting-state structure of STG activity robustly recapitulates its stimulus-evoked response to acoustic onsets. Further, secondary patterns in resting-state activity appear to correlate with stimulus-evoked responses to speech features. The role of these spontaneous spatiotemporal activity patterns remains to be elucidated.
Affiliation(s)
- Jonathan D. Breshears
- Department of Neurosurgery, University of California, San Francisco, San Francisco, CA, United States
- Liberty S. Hamilton
- Department of Neurosurgery, University of California, San Francisco, San Francisco, CA, United States
- Department of Communication Sciences and Disorders, University of Texas at Austin, Austin, TX, United States
- Department of Neurology, Dell Medical School, University of Texas at Austin, Austin, TX, United States
- Edward F. Chang
- Department of Neurosurgery, University of California, San Francisco, San Francisco, CA, United States
71
Sammler D, Cunitz K, Gierhan SME, Anwander A, Adermann J, Meixensberger J, Friederici AD. White matter pathways for prosodic structure building: A case study. Brain Lang 2018; 183:1-10. PMID: 29758365. DOI: 10.1016/j.bandl.2018.05.001.
Abstract
The relevance of left dorsal and ventral fiber pathways for syntactic and semantic comprehension is well established, while pathways for prosody are little explored. The present study examined linguistic prosodic structure building in a patient whose right arcuate/superior longitudinal fascicles and posterior corpus callosum were transiently compromised by a vasogenic peritumoral edema. Compared to ten matched healthy controls, the patient's ability to detect irregular prosodic structure significantly improved between pre- and post-surgical assessment. This recovery was accompanied by an increase in average fractional anisotropy (FA) in right dorsal and posterior transcallosal fiber tracts. Neither general cognitive abilities nor (non-prosodic) syntactic comprehension nor FA in right ventral and left dorsal fiber tracts showed a similar pre-post increase. Together, these findings suggest a contribution of right dorsal and inter-hemispheric pathways to prosody perception, including the right-dorsal tracking and structuring of prosodic pitch contours that is transcallosally informed by concurrent syntactic information.
Affiliation(s)
- Daniela Sammler
- Otto Hahn Group "Neural Bases of Intonation in Speech and Music", Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany; Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany.
- Katrin Cunitz
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany; Department of Child and Adolescent Psychiatry and Psychotherapy, University Hospital Ulm, Steinhövelstraße 5, 89075 Ulm, Germany
- Sarah M E Gierhan
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany; Berlin School of Mind and Brain, Humboldt University Berlin, Unter den Linden 6, 10099 Berlin, Germany
- Alfred Anwander
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany
- Jens Adermann
- University Hospital Leipzig, Clinic and Policlinic for Neurosurgery, Liebigstraße 20, 04103 Leipzig, Germany
- Jürgen Meixensberger
- University Hospital Leipzig, Clinic and Policlinic for Neurosurgery, Liebigstraße 20, 04103 Leipzig, Germany
- Angela D Friederici
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany; Berlin School of Mind and Brain, Humboldt University Berlin, Unter den Linden 6, 10099 Berlin, Germany
72
Hamilton LS, Huth AG. The revolution will not be controlled: natural stimuli in speech neuroscience. Lang Cogn Neurosci 2020; 35:573-582. PMID: 32656294. PMCID: PMC7324135. DOI: 10.1080/23273798.2018.1499946.
Abstract
Humans have a unique ability to produce and consume rich, complex, and varied language in order to communicate ideas to one another. Still, outside of natural reading, the most common methods for studying how our brains process speech or understand language use only isolated words or simple sentences. Recent studies have upset this status quo by employing complex natural stimuli and measuring how the brain responds to language as it is used. In this article we argue that natural stimuli offer many advantages over simplified, controlled stimuli for studying how language is processed by the brain. Furthermore, the downsides of using natural language stimuli can be mitigated using modern statistical and computational techniques.
Affiliation(s)
- Liberty S. Hamilton
- Communication Sciences & Disorders, Moody College of Communication, The University of Texas at Austin, Austin, USA
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, USA
- Alexander G. Huth
- Department of Neuroscience, The University of Texas at Austin, Austin, USA
- Department of Computer Science, The University of Texas at Austin, Austin, USA
73
Dichter BK, Breshears JD, Leonard MK, Chang EF. The Control of Vocal Pitch in Human Laryngeal Motor Cortex. Cell 2018; 174:21-31.e9. PMID: 29958109. PMCID: PMC6084806. DOI: 10.1016/j.cell.2018.05.016.
Abstract
In speech, the highly flexible modulation of vocal pitch creates intonation patterns that speakers use to convey linguistic meaning. This human ability is unique among primates. Here, we used high-density cortical recordings directly from the human brain to determine the encoding of vocal pitch during natural speech. We found neural populations in bilateral dorsal laryngeal motor cortex (dLMC) that selectively encoded produced pitch but not non-laryngeal articulatory movements. This neural population controlled short pitch accents to express prosodic emphasis on a word in a sentence. Other larynx cortical representations controlling voicing and longer pitch phrase contours were found at separate sites. dLMC sites also encoded vocal pitch during a non-speech singing task. Finally, direct focal stimulation of dLMC evoked laryngeal movements and involuntary vocalization, confirming its causal role in feedforward control. Together, these results reveal the neural basis for the voluntary control of vocal pitch in human speech.
Affiliation(s)
- Benjamin K Dichter
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA; UC Berkeley and UCSF Joint Program in Bioengineering, Berkeley, CA 94720, USA
- Jonathan D Breshears
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
- Matthew K Leonard
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
- Edward F Chang
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA; UC Berkeley and UCSF Joint Program in Bioengineering, Berkeley, CA 94720, USA.
74
Castro N, Mendoza JM, Tampke EC, Vitevitch MS. An account of the Speech-to-Song Illusion using Node Structure Theory. PLoS One 2018; 13:e0198656. PMID: 29883451. PMCID: PMC5993277. DOI: 10.1371/journal.pone.0198656.
Abstract
In the Speech-to-Song Illusion, repetition of a spoken phrase results in its being perceived as if it were sung. Although a number of previous studies have examined which characteristics of the stimulus produce the illusion, there has until now been no description of the cognitive mechanism that underlies it. We suggest that the processes in Node Structure Theory that are used to explain normal language processing, as well as other auditory illusions, might also account for the Speech-to-Song Illusion. In six experiments we tested whether satiation of lexical nodes, combined with continued priming of syllable nodes, may lead to the Speech-to-Song Illusion. The results of these experiments provide evidence for the role of priming, activation, and satiation, as described in Node Structure Theory, in explaining the Speech-to-Song Illusion.
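As a toy illustration of the proposed mechanism (lexical nodes satiating while syllable nodes stay primed), the following Python sketch uses made-up activation and satiation rates; it is an editorial illustration of Node Structure Theory's dynamics, not the authors' model.

import numpy as np

def activation(n_repeats, gain, satiation_rate):
    # Activation transmitted on each repetition decays geometrically as
    # the node satiates with use; the rates below are illustrative only.
    use = np.arange(n_repeats)
    return gain * (1.0 - satiation_rate) ** use

lexical = activation(10, gain=1.0, satiation_rate=0.30)   # satiates quickly
syllable = activation(10, gain=1.0, satiation_rate=0.05)  # keeps being primed
for rep, (lex, syl) in enumerate(zip(lexical, syllable), start=1):
    note = "<- syllable level dominates (song-like percept)" if syl > 2 * lex else ""
    print(f"repetition {rep:2d}: lexical={lex:.2f} syllable={syl:.2f} {note}")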
Affiliation(s)
- Nichol Castro
- Spoken Language Laboratory, Department of Psychology, University of Kansas, Lawrence, Kansas, United States of America
- Joshua M. Mendoza
- Spoken Language Laboratory, Department of Psychology, University of Kansas, Lawrence, Kansas, United States of America
- Elizabeth C. Tampke
- Spoken Language Laboratory, Department of Psychology, University of Kansas, Lawrence, Kansas, United States of America
- Michael S. Vitevitch
- Spoken Language Laboratory, Department of Psychology, University of Kansas, Lawrence, Kansas, United States of America
75
Ozker M, Yoshor D, Beauchamp MS. Converging Evidence From Electrocorticography and BOLD fMRI for a Sharp Functional Boundary in Superior Temporal Gyrus Related to Multisensory Speech Processing. Front Hum Neurosci 2018; 12:141. [PMID: 29740294 PMCID: PMC5928751 DOI: 10.3389/fnhum.2018.00141] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2018] [Accepted: 03/28/2018] [Indexed: 01/15/2023] Open
Abstract
Although humans can understand speech using the auditory modality alone, in noisy environments visual speech information from the talker’s mouth can rescue otherwise unintelligible auditory speech. To investigate the neural substrates of multisensory speech perception, we compared neural activity from the human superior temporal gyrus (STG) in two datasets. One dataset consisted of direct neural recordings (electrocorticography, ECoG) from surface electrodes implanted in epilepsy patients (this dataset has been previously published). The second dataset consisted of indirect measures of neural activity using blood oxygen level dependent functional magnetic resonance imaging (BOLD fMRI). Both ECoG and fMRI participants viewed the same clear and noisy audiovisual speech stimuli and performed the same speech recognition task. Both techniques demonstrated a sharp functional boundary in the STG, spatially coincident with an anatomical boundary defined by the posterior edge of Heschl’s gyrus. Cortex on the anterior side of the boundary responded more strongly to clear audiovisual speech than to noisy audiovisual speech while cortex on the posterior side of the boundary did not. For both ECoG and fMRI measurements, the transition between the functionally distinct regions happened within 10 mm of anterior-to-posterior distance along the STG. We relate this boundary to the multisensory neural code underlying speech perception and propose that it represents an important functional division within the human speech perception network.
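The sharpness of such a boundary can be illustrated with a simple change-point fit along the anterior-to-posterior axis; the sketch below, run on simulated positions and response differences, is an assumption about the style of analysis, not the paper's code.

import numpy as np

# Given site positions (mm along STG) and each site's "clear minus noisy"
# response difference, fit a one-step model: try every split point and
# keep the one minimizing the residual sum of squares of the two groups.
def best_boundary(pos_mm, effect):
    order = np.argsort(pos_mm)
    pos, eff = pos_mm[order], effect[order]
    best_sse, boundary = np.inf, None
    for i in range(1, len(pos)):
        sse = (((eff[:i] - eff[:i].mean()) ** 2).sum()
               + ((eff[i:] - eff[i:].mean()) ** 2).sum())
        if sse < best_sse:
            best_sse, boundary = sse, (pos[i - 1] + pos[i]) / 2
    return boundary

rng = np.random.default_rng(0)
pos = rng.uniform(-30, 30, 60)                              # simulated sites
effect = np.where(pos < 5, 1.0, 0.0) + 0.3 * rng.standard_normal(60)
print(f"estimated boundary: {best_boundary(pos, effect):.1f} mm")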
Affiliation(s)
- Muge Ozker
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
- Daniel Yoshor
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States; Michael E. DeBakey Veterans Affairs Medical Center, Houston, TX, United States
- Michael S Beauchamp
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
76
Wang N, Wu H, Xu M, Yang Y, Chang C, Zeng W, Yan H. Occupational functional plasticity revealed by brain entropy: A resting-state fMRI study of seafarers. Hum Brain Mapp 2018; 39:2997-3004. [PMID: 29676512 DOI: 10.1002/hbm.24055] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2017] [Revised: 02/12/2018] [Accepted: 03/12/2018] [Indexed: 11/09/2022] Open
Abstract
Recently, functional magnetic resonance imaging (fMRI) has been increasingly used to assess brain function. Brain entropy is an effective model for evaluating alterations in brain complexity, and the sample entropy (SampEn) in particular provides a feasible way to quantify it. Occupation is one key factor affecting the brain's activity, but the underlying neuropsychological mechanisms are still unclear. Thus, in this article, using fMRI and a brain-entropy model, we explored the changes in functional complexity engendered by occupational factors, taking seafarers as an example. The whole-brain entropy values of two groups (seafarers and nonseafarers) were first calculated with SampEn and then compared with a two-sample t test with AlphaSim correction (p < .05). We found that the entropy of the orbital-frontal gyrus (OFG) and superior temporal gyrus (STG) in the seafarers was significantly higher than in the nonseafarers, whereas the entropy of the cerebellum in the seafarers was lower. We conclude that (1) the lower entropy in the cerebellum implies that the seafarers' cerebellar activity had strong regularity and consistency, suggesting that it was possibly more specialized by long-term career training; and (2) the higher entropy in the OFG and STG possibly indicates that the seafarers had a relatively decreased capability for emotion control and auditory information processing. These results imply that the seafarer occupation indeed affects the brain's complexity, and they provide new neuropsychological evidence of functional plasticity related to one's career.
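For readers unfamiliar with SampEn, a minimal NumPy sketch follows; the parameter choices m = 2 and r = 0.2 * SD are common defaults and an assumption here, not necessarily the study's settings.

import numpy as np

def sample_entropy(x, m=2, r=None):
    # SampEn(m, r) = -ln(A/B), where B counts pairs of length-m templates
    # matching within tolerance r (Chebyshev distance) and A counts the
    # same for length m+1; self-matches are excluded since i < j.
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()          # common default: 20% of the signal SD

    def count_matches(m):
        templates = np.lib.stride_tricks.sliding_window_view(x, m)
        count = 0
        for i in range(len(templates) - 1):
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(d < r)
        return count

    B = count_matches(m)
    A = count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

# Toy check: white noise is less regular (higher SampEn) than a sine wave.
rng = np.random.default_rng(0)
print(sample_entropy(np.sin(np.linspace(0, 20 * np.pi, 500))))
print(sample_entropy(rng.standard_normal(500)))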
Affiliation(s)
- Nizhuan Wang
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China; Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen University, Shenzhen, 518060, China
- Huijun Wu
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China; Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen University, Shenzhen, 518060, China
- Min Xu
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China; Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen University, Shenzhen, 518060, China; Center for Neuroimaging, Shenzhen Institute of Neuroscience, Shenzhen, 518057, China
- Yang Yang
- Center for Neuroimaging, Shenzhen Institute of Neuroscience, Shenzhen, 518057, China
- Chunqi Chang
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518060, China; Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen University, Shenzhen, 518060, China; Center for Neuroimaging, Shenzhen Institute of Neuroscience, Shenzhen, 518057, China
- Weiming Zeng
- Digital Image and Intelligent Computation Laboratory, College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China
- Hongjie Yan
- Department of Neurology, Affiliated Lianyungang Hospital of Xuzhou Medical University, Lianyungang, 222002, China
77
Ponsot E, Burred JJ, Belin P, Aucouturier JJ. Cracking the social code of speech prosody using reverse correlation. Proc Natl Acad Sci U S A 2018; 115:3972-3977.
Abstract
In speech, social evaluations of a speaker’s dominance or trustworthiness are conveyed by distinguishing, but little-understood, pitch variations. This work describes how to combine state-of-the-art vocal pitch transformations with the psychophysical technique of reverse correlation and uses this methodology to uncover the prosodic prototypes that govern such social judgments in speech. This finding is of great significance, because the exact shape of these prototypes, and how they vary with sex, age, and culture, is virtually unknown, and because prototypes derived with the method can then be reapplied to arbitrary spoken utterances, thus providing a principled way to modulate personality impressions in speech. Human listeners excel at forming high-level social representations about each other, even from the briefest of utterances. In particular, pitch is widely recognized as the auditory dimension that conveys most of the information about a speaker’s traits, emotional states, and attitudes. While past research has primarily looked at the influence of mean pitch, almost nothing is known about how intonation patterns, i.e., finely tuned pitch trajectories around the mean, may determine social judgments in speech. Here, we introduce an experimental paradigm that combines state-of-the-art voice transformation algorithms with psychophysical reverse correlation and show that two of the most important dimensions of social judgments, a speaker’s perceived dominance and trustworthiness, are driven by robust and distinguishing pitch trajectories in short utterances like the word “Hello,” which remained remarkably stable whether male or female listeners judged male or female speakers. These findings reveal a unique communicative adaptation that enables listeners to infer social traits regardless of speakers’ physical characteristics, such as sex and mean pitch. By characterizing how any given individual’s mental representations may differ from this generic code, the method introduced here opens avenues to explore dysprosody and social-cognitive deficits in disorders like autism spectrum and schizophrenia. In addition, once derived experimentally, these prototypes can be applied to novel utterances, thus providing a principled way to modulate personality impressions in arbitrary speech signals.
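The core of the reverse-correlation logic can be sketched in a few lines of Python; the trial count, number of pitch breakpoints, perturbation size, and simulated observer below are illustrative assumptions, not the study's parameters.

import numpy as np

# Simulated 2-interval reverse-correlation experiment: each trial plays two
# versions of an utterance whose pitch contours carry random Gaussian
# perturbations (in cents) at a few breakpoints; the listener picks the one
# that sounds, say, more dominant. The "prototype" is the mean contour of
# chosen minus non-chosen stimuli (a first-order classification image).
rng = np.random.default_rng(0)
n_trials, n_points = 700, 6
true_kernel = np.array([20, 5, -5, -10, -15, -20])  # pretend internal template

chosen, rejected = [], []
for _ in range(n_trials):
    a = rng.normal(0, 70, n_points)      # pitch perturbations, in cents
    b = rng.normal(0, 70, n_points)
    pick_a = a @ true_kernel > b @ true_kernel   # simulated observer
    chosen.append(a if pick_a else b)
    rejected.append(b if pick_a else a)

prototype = np.mean(chosen, axis=0) - np.mean(rejected, axis=0)
print(np.round(prototype, 1))   # recovers the template's falling shape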
78
McPherson MJ, McDermott JH. Diversity in pitch perception revealed by task dependence. Nat Hum Behav 2018; 2:52-66. [PMID: 30221202 PMCID: PMC6136452 DOI: 10.1038/s41562-017-0261-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2017] [Accepted: 11/08/2017] [Indexed: 01/12/2023]
Abstract
Pitch conveys critical information in speech, music, and other natural sounds, and is conventionally defined as the perceptual correlate of a sound's fundamental frequency (F0). Although pitch is widely assumed to be subserved by a single F0 estimation process, real-world pitch tasks vary enormously, raising the possibility of underlying mechanistic diversity. To probe pitch mechanisms we conducted a battery of pitch-related music and speech tasks using conventional harmonic sounds and inharmonic sounds whose frequencies lack a common F0. Some pitch-related abilities - those relying on musical interval or voice recognition - were strongly impaired by inharmonicity, suggesting a reliance on F0. However, other tasks, including those dependent on pitch contours in speech and music, were unaffected by inharmonicity, suggesting a mechanism that tracks the frequency spectrum rather than the F0. The results suggest that pitch perception is mediated by several different mechanisms, only some of which conform to traditional notions of pitch.
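The harmonic-versus-inharmonic manipulation at the heart of the battery can be sketched as follows; the component count, duration, and jitter convention are assumptions for illustration rather than the paper's stimulus code.

import numpy as np

def complex_tone(f0, n_harmonics=10, dur=0.5, fs=44100, jitter=0.0, rng=None):
    # Synthesize a complex tone; jitter > 0 makes it inharmonic. Each
    # component sits at k*f0, optionally displaced by a random proportion
    # of f0, which removes the common fundamental while keeping the
    # spectrum similarly dense; this is the basic manipulation behind
    # inharmonic stimuli.
    rng = rng or np.random.default_rng()
    t = np.arange(int(dur * fs)) / fs
    tone = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        f = k * f0 + jitter * f0 * rng.uniform(-0.5, 0.5)
        tone += np.sin(2 * np.pi * f * t)
    return tone / n_harmonics

harmonic = complex_tone(200.0)                # components at exact multiples of F0
inharmonic = complex_tone(200.0, jitter=1.0)  # jittered components, no common F0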
Affiliation(s)
- Malinda J McPherson
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA.
- Josh H McDermott
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA
79
Mikulan E, Hesse E, Sedeño L, Bekinschtein T, Sigman M, García MDC, Silva W, Ciraolo C, García AM, Ibáñez A. Intracranial high-γ connectivity distinguishes wakefulness from sleep. Neuroimage 2017; 169:265-277. [PMID: 29225064 DOI: 10.1016/j.neuroimage.2017.12.015] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2017] [Revised: 11/21/2017] [Accepted: 12/06/2017] [Indexed: 12/27/2022] Open
Abstract
Neural synchrony in the γ-band is considered a fundamental process in cortical computation and communication and it has also been proposed as a crucial correlate of consciousness. However, the latter claim remains inconclusive, mainly due to methodological limitations, such as the spectral constraints of scalp-level electroencephalographic recordings or volume-conduction confounds. Here, we circumvented these caveats by comparing γ-band connectivity between two global states of consciousness via intracranial electroencephalography (iEEG), which provides the most reliable measurements of high-frequency activity in the human brain. Non-REM sleep recordings were compared to passive-wakefulness recordings of the same duration in three subjects with surgically implanted electrodes. Signals were analyzed through the weighted phase lag index connectivity measure and relevant graph theory metrics. We found that connectivity in the high-γ range (90-120 Hz), as well as relevant graph theory properties, were higher during wakefulness than during sleep and discriminated between conditions better than any other canonical frequency band. Our results constitute the first report of iEEG differences between wakefulness and sleep in the high-γ range at both local and distant sites, highlighting the utility of this technique in the search for the neural correlates of global states of consciousness.
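A compact Python sketch of the weighted phase lag index (wPLI; Vinck et al., 2011) on which the analysis rests; the filter order and toy signals are assumptions, with the band set to the reported 90-120 Hz high-γ range.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def wpli(x, y, fs, band=(90.0, 120.0)):
    # wPLI = |E[Im(Sxy)]| / E[|Im(Sxy)|], where Sxy is the cross-spectrum
    # of the band-limited analytic signals; it is insensitive to the
    # zero-lag interactions produced by volume conduction.
    b, a = butter(4, np.array(band) / (fs / 2), btype="bandpass")
    zx = hilbert(filtfilt(b, a, x))
    zy = hilbert(filtfilt(b, a, y))
    im = np.imag(zx * np.conj(zy))      # imaginary part of the cross-spectrum
    return np.abs(im.mean()) / np.abs(im).mean()

# Toy usage: two noisy signals sharing a phase-lagged 100 Hz component.
fs = 1000
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(1)
x = np.sin(2 * np.pi * 100 * t) + rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 100 * t - 0.5) + rng.standard_normal(t.size)
print(wpli(x, y, fs))   # close to 1 for a consistent nonzero lag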
Affiliation(s)
- Ezequiel Mikulan
- Laboratory of Experimental Psychology and Neuroscience (LPEN), Institute of Cognitive and Translational Neuroscience (INCYT), INECO Foundation, Favaloro University, Buenos Aires, Argentina; National Scientific and Technical Research Council (CONICET), Buenos Aires, Argentina; Consciousness and Cognition Lab, Department of Psychology, University of Cambridge, UK.
- Eugenia Hesse
- Laboratory of Experimental Psychology and Neuroscience (LPEN), Institute of Cognitive and Translational Neuroscience (INCYT), INECO Foundation, Favaloro University, Buenos Aires, Argentina; National Scientific and Technical Research Council (CONICET), Buenos Aires, Argentina; Instituto de Ingeniería Biomédica, Facultad de Ingeniería, Universidad de Buenos Aires, Argentina
- Lucas Sedeño
- Laboratory of Experimental Psychology and Neuroscience (LPEN), Institute of Cognitive and Translational Neuroscience (INCYT), INECO Foundation, Favaloro University, Buenos Aires, Argentina; National Scientific and Technical Research Council (CONICET), Buenos Aires, Argentina
- Tristán Bekinschtein
- Consciousness and Cognition Lab, Department of Psychology, University of Cambridge, UK
- Mariano Sigman
- María Del Carmen García
- Programa de Cirugía de Epilepsia, Hospital Italiano de Buenos Aires, Buenos Aires, Argentina
- Walter Silva
- Programa de Cirugía de Epilepsia, Hospital Italiano de Buenos Aires, Buenos Aires, Argentina
- Carlos Ciraolo
- Programa de Cirugía de Epilepsia, Hospital Italiano de Buenos Aires, Buenos Aires, Argentina
- Adolfo M García
- Laboratory of Experimental Psychology and Neuroscience (LPEN), Institute of Cognitive and Translational Neuroscience (INCYT), INECO Foundation, Favaloro University, Buenos Aires, Argentina; National Scientific and Technical Research Council (CONICET), Buenos Aires, Argentina; Faculty of Education, National University of Cuyo (UNCuyo), Mendoza, Argentina
- Agustín Ibáñez
- Laboratory of Experimental Psychology and Neuroscience (LPEN), Institute of Cognitive and Translational Neuroscience (INCYT), INECO Foundation, Favaloro University, Buenos Aires, Argentina; National Scientific and Technical Research Council (CONICET), Buenos Aires, Argentina; Universidad Autónoma del Caribe, Barranquilla, Colombia; Center for Social and Cognitive Neuroscience (CSCN), School of Psychology, Universidad Adolfo Ibañez, Santiago de Chile, Chile; Australian Research Council Centre of Excellence in Cognition and its Disorders, Sydney, Australia.
80
Hamilton LS, Chang DL, Lee MB, Chang EF. Semi-automated Anatomical Labeling and Inter-subject Warping of High-Density Intracranial Recording Electrodes in Electrocorticography. Front Neuroinform 2017; 11:62. [PMID: 29163118 PMCID: PMC5671481 DOI: 10.3389/fninf.2017.00062] [Citation(s) in RCA: 74] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2017] [Accepted: 10/05/2017] [Indexed: 11/13/2022] Open
Abstract
In this article, we introduce img_pipe, our open source python package for preprocessing of imaging data for use in intracranial electrocorticography (ECoG) and intracranial stereo-EEG analyses. The process of electrode localization, labeling, and warping for use in ECoG currently varies widely across laboratories, and it is usually performed with custom, lab-specific code. This python package aims to provide a standardized interface for these procedures, as well as code to plot and display results on 3D cortical surface meshes. It gives the user an easy interface to create anatomically labeled electrodes that can also be warped to an atlas brain, starting with only a preoperative T1 MRI scan and a postoperative CT scan. We describe the full capabilities of our imaging pipeline and present a step-by-step protocol for users.
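A sketch of the intended workflow, reconstructed from the description above; the class and method names follow my recollection of the package's README and should be treated as assumptions to check against the repository, not guaranteed API.

# Assumed img_pipe workflow (names may differ across package versions).
from img_pipe import img_pipe

patient = img_pipe.freeCoG(subj='S1', hem='lh')  # 'S1' is a hypothetical subject ID
patient.prep_recon()          # run the FreeSurfer reconstruction on the T1 MRI
patient.get_recon()           # import the resulting cortical surface meshes
patient.mark_electrodes()     # mark contacts on the CT co-registered to the MRI
patient.label_elecs()         # assign anatomical labels to each contact
patient.warp_all()            # warp electrode positions to the atlas brain
patient.plot_recon_anatomy()  # 3-D mesh with anatomically labeled electrodes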
Affiliation(s)
- Liberty S Hamilton
- Department of Neurosurgery, University of California, San Francisco, San Francisco, CA, United States; Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA, United States
- David L Chang
- Department of Neurosurgery, University of California, San Francisco, San Francisco, CA, United States; Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA, United States
- Morgan B Lee
- Department of Neurosurgery, University of California, San Francisco, San Francisco, CA, United States; Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA, United States
- Edward F Chang
- Department of Neurosurgery, University of California, San Francisco, San Francisco, CA, United States; Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA, United States.