1
Zhao 赵隽元 J, Gao 高睿敏 R, Brennan JR. Decoding the Neural Dynamics of Headed Syntactic Structure Building. J Neurosci 2025; 45:e2126242025. [PMID: 40050114 PMCID: PMC12019108 DOI: 10.1523/jneurosci.2126-24.2025] [Received: 11/07/2024] [Revised: 12/29/2024] [Accepted: 01/22/2025] [Indexed: 04/25/2025] Open
Abstract
The brain builds hierarchical phrases during language comprehension; however, the details and dynamics of the phrase-building process remain underspecified. This study directly probes whether the neural code of verb phrases involves reactivating the syntactic property of a key subcomponent (the "head" verb). To this end, we train a part-of-speech sliding-window verb/adverb decoder on EEG signals recorded while 30 participants read sentences in a controlled experiment. The decoder reaches above-chance performance that is spatiotemporally consistent and generalizes to unseen data across sentence positions. Applying the decoder to held-out data yields predicted activation levels for the verbal "head" of a verb phrase at a distant nonhead word (adverb); the critical adverb appeared either at the end of a verb phrase or at a sequentially and lexically matched position with no verb phrase boundary. There is stronger verb activation beginning at ∼600 milliseconds at the critical adverb when it appears at a verb phrase boundary; this effect is not modulated by the strength of conceptual association nor does it reflect word predictability. Time-locked analyses additionally reveal a negativity waveform component and increased beta-delta inter-trial phase coherence, both previously linked to linguistic composition, in a similar time window. With a novel application of neural decoding, our findings delineate the dynamics by which the brain encodes phrasal representations by, in part, reactivating the representation of key subcomponents. We thus establish a link between cognitive accounts of phrasal representations and electrophysiological dynamics.
Affiliation(s)
- Junyuan Zhao 赵隽元
  - Department of Linguistics, University of Michigan, Ann Arbor, Michigan 48109
- Ruimin Gao 高睿敏
  - School of Psychology, Georgia Institute of Technology, Atlanta, Georgia 30332
- Jonathan R Brennan
  - Department of Linguistics, University of Michigan, Ann Arbor, Michigan 48109
2
Zada Z, Nastase SA, Aubrey B, Jalon I, Michelmann S, Wang H, Hasenfratz L, Doyle W, Friedman D, Dugan P, Melloni L, Devore S, Flinker A, Devinsky O, Goldstein A, Hasson U. The "Podcast" ECoG dataset for modeling neural activity during natural language comprehension. bioRxiv 2025:2025.02.14.638352. [PMID: 39990398 PMCID: PMC11844552 DOI: 10.1101/2025.02.14.638352] [Indexed: 02/25/2025]
Abstract
Naturalistic electrocorticography (ECoG) data are a rare but essential resource for studying the brain's linguistic capabilities. ECoG offers a high temporal resolution suitable for investigating processes at multiple temporal timescales and frequency bands. It also provides broad spatial coverage, often along critical language areas. Here, we share a dataset of nine ECoG participants with 1,330 electrodes listening to a 30-minute audio podcast. The richness of this naturalistic stimulus can be used for various research endeavors, from auditory perception to semantic integration. In addition to the neural data, we extract linguistic features of the stimulus ranging from phonetic information to large language model word embeddings. We use these linguistic features in encoding models that relate stimulus properties to neural activity. Finally, we provide detailed tutorials for preprocessing raw data, extracting stimulus features, and running encoding analyses that can serve as a pedagogical resource or a springboard for new research.
Affiliation(s)
- Zaid Zada
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Samuel A. Nastase
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Bobbi Aubrey
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Itamar Jalon
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Sebastian Michelmann
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Haocheng Wang
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Liat Hasenfratz
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Werner Doyle
  - Grossman School of Medicine, New York University; New York, 10016, USA
- Daniel Friedman
  - Grossman School of Medicine, New York University; New York, 10016, USA
- Patricia Dugan
  - Grossman School of Medicine, New York University; New York, 10016, USA
- Lucia Melloni
  - Grossman School of Medicine, New York University; New York, 10016, USA
- Sasha Devore
  - Grossman School of Medicine, New York University; New York, 10016, USA
- Adeen Flinker
  - Grossman School of Medicine, New York University; New York, 10016, USA
  - Tandon School of Engineering, New York University; New York, 10016, USA
- Orrin Devinsky
  - Grossman School of Medicine, New York University; New York, 10016, USA
- Ariel Goldstein
  - Department of Cognitive and Brain Sciences and Business School, Hebrew University; Jerusalem, 9190501, Israel
- Uri Hasson
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
3
Coopmans CW, de Hoop H, Tezcan F, Hagoort P, Martin AE. Language-specific neural dynamics extend syntax into the time domain. PLoS Biol 2025; 23:e3002968. [PMID: 39836653 PMCID: PMC11750093 DOI: 10.1371/journal.pbio.3002968] [Received: 03/12/2024] [Accepted: 12/05/2024] [Indexed: 01/23/2025] Open
Abstract
Studies of perception have long shown that the brain adds information to its sensory analysis of the physical environment. A touchstone example for humans is language use: to comprehend a physical signal like speech, the brain must add linguistic knowledge, including syntax. Yet, syntactic rules and representations are widely assumed to be atemporal (i.e., abstract and not bound by time), so they must be translated into time-varying signals for speech comprehension and production. Here, we test 3 different models of the temporal spell-out of syntactic structure against brain activity of people listening to Dutch stories: an integratory bottom-up parser, a predictive top-down parser, and a mildly predictive left-corner parser. These models build exactly the same structure but differ in when syntactic information is added by the brain; this difference is captured in the (temporal distribution of the) complexity metric "incremental node count." Using temporal response function models with both acoustic and information-theoretic control predictors, node counts were regressed against source-reconstructed delta-band activity acquired with magnetoencephalography. Neural dynamics in left frontal and temporal regions most strongly reflect node counts derived by the top-down method, which postulates syntax early in time, suggesting that predictive structure building is an important component of Dutch sentence comprehension. The absence of strong effects of the left-corner model further suggests that its mildly predictive strategy does not represent Dutch language comprehension well, in contrast to what has been found for English. Understanding when the brain projects its knowledge of syntax onto speech, and whether this is done in language-specific ways, will inform and constrain the development of mechanistic models of syntactic structure building in the brain.
Affiliation(s)
- Cas W. Coopmans
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
  - Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
  - Centre for Language Studies, Radboud University, Nijmegen, the Netherlands
- Helen de Hoop
  - Centre for Language Studies, Radboud University, Nijmegen, the Netherlands
- Filiz Tezcan
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
  - Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
- Peter Hagoort
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
  - Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
- Andrea E. Martin
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
  - Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
4
Wang S, Kim S, Binder JR, Pylkkänen L. Unlocking the complexity of phrasal composition: An interplay between semantic features and linguistic relations. Cognition 2024; 254:105986. [PMID: 39426327 DOI: 10.1016/j.cognition.2024.105986] [Received: 03/01/2024] [Revised: 07/05/2024] [Accepted: 10/11/2024] [Indexed: 10/21/2024]
Abstract
Understanding the computational operations involved in conceptual composition is fundamental for theories of language. However, the existing literature on this topic remains fragmented, comprising disconnected theories from various fields. For instance, while formal semantic theories in Linguistics rely on type-driven interpretation without explicitly representing the conceptual content of lexical items, neurolinguistic research suggests that the brain is sensitive to conceptual factors during word composition. What is the relationship between these two types of theories? Do they describe two distinct aspects of composition, operating independently, or do they connect in some way during interpretation by our brain? To probe this, we explored how the mathematical operations explaining the combination of two words into a phrase are affected by the semantic content of items and the formal linguistic relations between the combining items. For six phrase types that varied properties relevant to type-driven interpretation such as modification vs. argument-saturation and modifier context sensitivity, we collected human ratings of experiential semantic features both for the phrases and for all the individual words within the phrases. We then compared the ability of different computational combination rules to explain the phrase ratings based on the individual word ratings. Our results indicate that composition operations are not one-size-fits-all but rather depend on both feature type and linguistic relation. For example, in intersective Adjective-Noun phrases, addition is used to merge attention-related features, while color features are predominantly determined by the first word's ratings. In the case of social features, the verb chiefly guides interpretation in Verb-Noun phrases, whereas in Noun-Noun phrases, the model employs multiplication to combine the social features of the nouns.
Affiliation(s)
- Shaonan Wang
  - State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Department of Psychology, New York University, New York, NY, USA
- Songhee Kim
  - Department of Neurology, Medical College of Wisconsin, USA
- Liina Pylkkänen
  - Department of Linguistics, New York University, New York, NY, USA; NYUAD Research Institute, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
5
Weissbart H, Martin AE. The structure and statistics of language jointly shape cross-frequency neural dynamics during spoken language comprehension. Nat Commun 2024; 15:8850. [PMID: 39397036 PMCID: PMC11471778 DOI: 10.1038/s41467-024-53128-1] [Received: 10/18/2023] [Accepted: 09/30/2024] [Indexed: 10/15/2024] Open
Abstract
Humans excel at extracting structurally-determined meaning from speech despite inherent physical variability. This study explores the brain's ability to predict and understand spoken language robustly. It investigates the relationship between structural and statistical language knowledge in brain dynamics, focusing on phase and amplitude modulation. Using syntactic features from constituent hierarchies and surface statistics from a transformer model as predictors of forward encoding models, we reconstructed cross-frequency neural dynamics from MEG data during audiobook listening. Our findings challenge a strict separation of linguistic structure and statistics in the brain, with both aiding neural signal reconstruction. Syntactic features have a more temporally spread impact, and both word entropy and the number of closing syntactic constituents are linked to the phase-amplitude coupling of neural dynamics, implying a role in temporal prediction and cortical oscillation alignment during speech processing. Our results indicate that structured and statistical information jointly shape neural dynamics during spoken language comprehension and suggest an integration process via a cross-frequency coupling mechanism.
Affiliation(s)
- Hugo Weissbart
  - Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, The Netherlands
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Andrea E Martin
  - Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, The Netherlands
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
6
Slaats S, Meyer AS, Martin AE. Lexical Surprisal Shapes the Time Course of Syntactic Structure Building. Neurobiol Lang (Camb) 2024; 5:942-980. [PMID: 39534445 PMCID: PMC11556436 DOI: 10.1162/nol_a_00155] [Received: 01/25/2024] [Accepted: 07/24/2024] [Indexed: 11/16/2024]
Abstract
When we understand language, we recognize words and combine them into sentences. In this article, we explore the hypothesis that listeners use probabilistic information about words to build syntactic structure. Recent work has shown that lexical probability and syntactic structure both modulate the delta-band (<4 Hz) neural signal. Here, we investigated whether the neural encoding of syntactic structure changes as a function of the distributional properties of a word. To this end, we analyzed MEG data of 24 native speakers of Dutch who listened to three fairytales with a total duration of 49 min. Using temporal response functions and a cumulative model-comparison approach, we evaluated the contributions of syntactic and distributional features to the variance in the delta-band neural signal. This revealed that lexical surprisal values (a distributional feature), as well as bottom-up node counts (a syntactic feature) positively contributed to the model of the delta-band neural signal. Subsequently, we compared responses to the syntactic feature between words with high- and low-surprisal values. This revealed a delay in the response to the syntactic feature as a consequence of the surprisal value of the word: high-surprisal values were associated with a delayed response to the syntactic feature by 150-190 ms. The delay was not affected by word duration, and did not have a lexical origin. These findings suggest that the brain uses probabilistic information to infer syntactic structure, and highlight an importance for the role of time in this process.
Affiliation(s)
- Sophie Slaats
  - Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
  - Department of Basic Neurosciences, University of Geneva, Geneva, Switzerland
- Antje S. Meyer
  - Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Andrea E. Martin
  - Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
7
Schuler W, Yue S. Evaluation of an Algorithmic-Level Left-Corner Parsing Account of Surprisal Effects. Cogn Sci 2024; 48:e13500. [PMID: 39400979 DOI: 10.1111/cogs.13500] [Received: 02/05/2024] [Revised: 09/02/2024] [Accepted: 09/11/2024] [Indexed: 10/15/2024]
Abstract
This article evaluates the predictions of an algorithmic-level distributed associative memory model as it introduces, propagates, and resolves ambiguity, and compares it to the predictions of computational-level parallel parsing models in which ambiguous analyses are accounted separately in discrete distributions. By superposing activation patterns that serve as cues to other activation patterns, the model is able to maintain multiple syntactically complex analyses superposed in a finite working memory, propagate this ambiguity through multiple intervening words, then resolve this ambiguity in a way that produces a measurable predictor that is proportional to the log conditional probability of the disambiguating word given its context, marginalizing over all remaining analyses. The results are indeed consistent in cases of complex structural ambiguity with computational-level parallel parsing models producing this same probability as a predictor, which have been shown reliably to predict human reading times.
Affiliation(s)
- Shizen Yue
  - School of Foreign Languages, Shanghai Jiao Tong University
8
Townsend PH, Jones A, Patel AD, Race E. Rhythmic Temporal Cues Coordinate Cross-frequency Phase-amplitude Coupling during Memory Encoding. J Cogn Neurosci 2024; 36:2100-2116. [PMID: 38991125 DOI: 10.1162/jocn_a_02217] [Indexed: 07/13/2024]
Abstract
Accumulating evidence suggests that rhythmic temporal cues in the environment influence the encoding of information into long-term memory. Here, we test the hypothesis that these mnemonic effects of rhythm reflect the coupling of high-frequency (gamma) oscillations to entrained lower-frequency oscillations synchronized to the beat of the rhythm. In Study 1, we first test this hypothesis in the context of global effects of rhythm on memory, when memory is superior for visual stimuli presented in rhythmic compared with arrhythmic patterns at encoding [Jones, A., & Ward, E. V. Rhythmic temporal structure at encoding enhances recognition memory, Journal of Cognitive Neuroscience, 31, 1549-1562, 2019]. We found that rhythmic presentation of visual stimuli during encoding was associated with greater phase-amplitude coupling (PAC) between entrained low-frequency (delta) oscillations and higher-frequency (gamma) oscillations. In Study 2, we next investigated cross-frequency PAC in the context of local effects of rhythm on memory encoding, when memory is superior for visual stimuli presented in-synchrony compared with out-of-synchrony with a background auditory beat [Hickey, P., Merseal, H., Patel, A. D., & Race, E. Memory in time: Neural tracking of low-frequency rhythm dynamically modulates memory formation. Neuroimage, 213, 116693, 2020]. We found that the mnemonic effect of rhythm in this context was again associated with increased cross-frequency PAC between entrained low-frequency (delta) oscillations and higher-frequency (gamma) oscillations. Furthermore, the magnitude of gamma power modulations positively scaled with the subsequent memory benefit for in- versus out-of-synchrony stimuli. Together, these results suggest that the influence of rhythm on memory encoding may reflect the temporal coordination of higher-frequency gamma activity by entrained low-frequency oscillations.
Affiliation(s)
- Paige Hickey Townsend
  - Massachusetts General Hospital, Charlestown, MA
  - Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA
- Aniruddh D Patel
  - Tufts University, Medford, MA
  - Canadian Institute for Advanced Research
9
Teng X, Larrouy-Maestri P, Poeppel D. Segmenting and Predicting Musical Phrase Structure Exploits Neural Gain Modulation and Phase Precession. J Neurosci 2024; 44:e1331232024. [PMID: 38926087 PMCID: PMC11270514 DOI: 10.1523/jneurosci.1331-23.2024] [Received: 07/17/2023] [Revised: 05/29/2024] [Accepted: 06/11/2024] [Indexed: 06/28/2024] Open
Abstract
Music, like spoken language, is often characterized by hierarchically organized structure. Previous experiments have shown neural tracking of notes and beats, but little work touches on the more abstract question: how does the brain establish high-level musical structures in real time? We presented Bach chorales to participants (20 females and 9 males) undergoing electroencephalogram (EEG) recording to investigate how the brain tracks musical phrases. We removed the main temporal cues to phrasal structures, so that listeners could only rely on harmonic information to parse a continuous musical stream. Phrasal structures were disrupted by locally or globally reversing the harmonic progression, so that our observations on the original music could be controlled and compared. We first replicated the findings on neural tracking of musical notes and beats, substantiating the positive correlation between musical training and neural tracking. Critically, we discovered a neural signature in the frequency range ∼0.1 Hz (modulations of EEG power) that reliably tracks musical phrasal structure. Next, we developed an approach to quantify the phrasal phase precession of the EEG power, revealing that phrase tracking is indeed an operation of active segmentation involving predictive processes. We demonstrate that the brain establishes complex musical structures online over long timescales (>5 s) and actively segments continuous music streams in a manner comparable to language processing. These two neural signatures, phrase tracking and phrasal phase precession, provide new conceptual and technical tools to study the processes underpinning high-level structure building using noninvasive recording techniques.
Affiliation(s)
- Xiangbin Teng
  - Department of Psychology, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Pauline Larrouy-Maestri
  - Music Department, Max-Planck-Institute for Empirical Aesthetics, Frankfurt 60322, Germany
  - Center for Language, Music, and Emotion (CLaME), New York, New York 10003
- David Poeppel
  - Center for Language, Music, and Emotion (CLaME), New York, New York 10003
  - Department of Psychology, New York University, New York, New York 10003
  - Ernst Struengmann Institute for Neuroscience, Frankfurt 60528, Germany
  - Music and Audio Research Laboratory (MARL), New York, New York 11201
10
Pérez-Navarro J, Klimovich-Gray A, Lizarazu M, Piazza G, Molinaro N, Lallier M. Early language experience modulates the tradeoff between acoustic-temporal and lexico-semantic cortical tracking of speech. iScience 2024; 27:110247. [PMID: 39006483 PMCID: PMC11246002 DOI: 10.1016/j.isci.2024.110247] [Received: 11/03/2023] [Revised: 03/14/2024] [Accepted: 06/07/2024] [Indexed: 07/16/2024] Open
Abstract
Cortical tracking of speech is relevant for the development of speech perception skills. However, no study to date has explored whether and how cortical tracking of speech is shaped by accumulated language experience, the central question of this study. In 35 bilingual children (6 years old) with considerably more experience in one language, we collected electroencephalography data while they listened to continuous speech in their two languages. Cortical tracking of speech was assessed at acoustic-temporal and lexico-semantic levels. Children showed more robust acoustic-temporal tracking in the least experienced language, and more sensitive cortical tracking of semantic information in the most experienced language. Additionally, and only for the most experienced language, acoustic-temporal tracking was specifically linked to phonological abilities, and lexico-semantic tracking to vocabulary knowledge. Our results indicate that accumulated linguistic experience is a relevant maturational factor for the cortical tracking of speech at different levels during early language acquisition.
Affiliation(s)
- Jose Pérez-Navarro
  - Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
- Mikel Lizarazu
  - Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
- Giorgio Piazza
  - Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
- Nicola Molinaro
  - Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
  - Ikerbasque, Basque Foundation for Science, 48009 Bilbao, Spain
- Marie Lallier
  - Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
11
Zhao J, Martin AE, Coopmans CW. Structural and sequential regularities modulate phrase-rate neural tracking. Sci Rep 2024; 14:16603. [PMID: 39025957 PMCID: PMC11258220 DOI: 10.1038/s41598-024-67153-z] [Received: 01/12/2024] [Accepted: 07/08/2024] [Indexed: 07/20/2024] Open
Abstract
Electrophysiological brain activity has been shown to synchronize with the quasi-regular repetition of grammatical phrases in connected speech, so-called phrase-rate neural tracking. Current debate centers around whether this phenomenon is best explained in terms of the syntactic properties of phrases or in terms of syntax-external information, such as the sequential repetition of parts of speech. As these two factors were confounded in previous studies, much of the literature is compatible with both accounts. Here, we used electroencephalography (EEG) to determine if and when the brain is sensitive to both types of information. Twenty native speakers of Mandarin Chinese listened to isochronously presented streams of monosyllabic words, which contained either grammatical two-word phrases (e.g., catch fish, sell house) or non-grammatical word combinations (e.g., full lend, bread far). Within the grammatical conditions, we varied two structural factors: the position of the head of each phrase and the type of attachment. Within the non-grammatical conditions, we varied the consistency with which parts of speech were repeated. Tracking was quantified through evoked power and inter-trial phase coherence, both derived from the frequency-domain representation of EEG responses. As expected, neural tracking at the phrase rate was stronger in grammatical sequences than in non-grammatical sequences without syntactic structure. Moreover, it was modulated by both attachment type and head position, revealing the structure-sensitivity of phrase-rate tracking. We additionally found that the brain tracks the repetition of parts of speech in non-grammatical sequences. These data provide an integrative perspective on the current debate about neural tracking effects, revealing that the brain utilizes regularities computed over multiple levels of linguistic representation in guiding rhythmic computation.
Affiliation(s)
- Junyuan Zhao
  - Department of Linguistics, University of Michigan, Ann Arbor, MI, USA
- Andrea E Martin
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Cas W Coopmans
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
12
Kumar S, Sumers TR, Yamakoshi T, Goldstein A, Hasson U, Norman KA, Griffiths TL, Hawkins RD, Nastase SA. Shared functional specialization in transformer-based language models and the human brain. Nat Commun 2024; 15:5523. [PMID: 38951520 PMCID: PMC11217339 DOI: 10.1038/s41467-024-49173-5] [Received: 07/21/2023] [Accepted: 05/24/2024] [Indexed: 07/03/2024] Open
Abstract
When processing language, the brain is thought to deploy specialized computations to construct meaning from complex linguistic structures. Recently, artificial neural networks based on the Transformer architecture have revolutionized the field of natural language processing. Transformers integrate contextual information across words via structured circuit computations. Prior work has focused on the internal representations ("embeddings") generated by these circuits. In this paper, we instead analyze the circuit computations directly: we deconstruct these computations into the functionally-specialized "transformations" that integrate contextual information across words. Using functional MRI data acquired while participants listened to naturalistic stories, we first verify that the transformations account for considerable variance in brain activity across the cortical language network. We then demonstrate that the emergent computations performed by individual, functionally-specialized "attention heads" differentially predict brain activity in specific cortical regions. These heads fall along gradients corresponding to different layers and context lengths in a low-dimensional cortical space.
Affiliation(s)
- Sreejan Kumar
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA.
| | - Theodore R Sumers
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA.
| | - Takateru Yamakoshi
- Faculty of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, 113-0033, Japan
| | - Ariel Goldstein
- Department of Cognitive and Brain Sciences and Business School, Hebrew University, Jerusalem, 9190401, Israel
| | - Uri Hasson
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
- Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
| | - Kenneth A Norman
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
- Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
| | - Thomas L Griffiths
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA
- Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
| | - Robert D Hawkins
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
- Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
| | - Samuel A Nastase
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA.
| |
|
13
|
Ten Oever S, Titone L, te Rietmolen N, Martin AE. Phase-dependent word perception emerges from region-specific sensitivity to the statistics of language. Proc Natl Acad Sci U S A 2024; 121:e2320489121. [PMID: 38805278 PMCID: PMC11161766 DOI: 10.1073/pnas.2320489121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 04/22/2024] [Indexed: 05/30/2024] Open
Abstract
Neural oscillations reflect fluctuations in excitability, which biases the percept of ambiguous sensory input. Why this bias occurs is still not fully understood. We hypothesized that neural populations representing likely events are more sensitive, and thereby become active on earlier oscillatory phases, when the ensemble itself is less excitable. Perception of ambiguous input presented during less-excitable phases should therefore be biased toward frequent or predictable stimuli that have lower activation thresholds. Here, we show such a frequency bias in spoken word recognition using psychophysics, magnetoencephalography (MEG), and computational modelling. With MEG, we found a double dissociation, where the phase of oscillations in the superior temporal gyrus and medial temporal gyrus biased word-identification behavior based on phoneme and lexical frequencies, respectively. This finding was reproduced in a computational model. These results demonstrate that oscillations provide a temporal ordering of neural activity based on the sensitivity of separable neural populations.
Affiliation(s)
- Sanne Ten Oever
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- Language and Computation in Neural Systems group, Donders Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 EV Maastricht, The Netherlands
| | - Lorenzo Titone
- Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, D-04303 Leipzig, Germany
| | - Noémie te Rietmolen
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- Language and Computation in Neural Systems group, Donders Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
| | - Andrea E. Martin
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- Language and Computation in Neural Systems group, Donders Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
| |
|
14
|
Ding R, Ten Oever S, Martin AE. Delta-band Activity Underlies Referential Meaning Representation during Pronoun Resolution. J Cogn Neurosci 2024; 36:1472-1492. [PMID: 38652108 DOI: 10.1162/jocn_a_02163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2024]
Abstract
Human language offers a variety of ways to create meaning, one of which is referring to entities, objects, or events in the world. One such meaning maker is understanding to whom or to what a pronoun in a discourse refers to. To understand a pronoun, the brain must access matching entities or concepts that have been encoded in memory from previous linguistic context. Models of language processing propose that internally stored linguistic concepts, accessed via exogenous cues such as phonological input of a word, are represented as (a)synchronous activities across a population of neurons active at specific frequency bands. Converging evidence suggests that delta band activity (1-3 Hz) is involved in temporal and representational integration during sentence processing. Moreover, recent advances in the neurobiology of memory suggest that recollection engages neural dynamics similar to those which occurred during memory encoding. Integrating from these two research lines, we here tested the hypothesis that neural dynamic patterns, especially in delta frequency range, underlying referential meaning representation, would be reinstated during pronoun resolution. By leveraging neural decoding techniques (i.e., representational similarity analysis) on a magnetoencephalogram data set acquired during a naturalistic story-listening task, we provide evidence that delta-band activity underlies referential meaning representation. Our findings suggest that, during spoken language comprehension, endogenous linguistic representations such as referential concepts may be proactively retrieved and represented via activation of their underlying dynamic neural patterns.
Affiliation(s)
- Rong Ding
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| | - Sanne Ten Oever
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Radboud University Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
| | - Andrea E Martin
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Radboud University Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
| |
|
15
|
Orepic P, Truccolo W, Halgren E, Cash SS, Giraud AL, Proix T. Neural manifolds carry reactivation of phonetic representations during semantic processing. bioRxiv 2024:2023.10.30.564638. [PMID: 37961305 PMCID: PMC10634964 DOI: 10.1101/2023.10.30.564638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Traditional models of speech perception posit that neural activity encodes speech through a hierarchy of cognitive processes, from low-level representations of acoustic and phonetic features to high-level semantic encoding. Yet it remains unknown how neural representations are transformed across levels of the speech hierarchy. Here, we analyzed unique microelectrode array recordings of neuronal spiking activity from the human left anterior superior temporal gyrus, a brain region at the interface between phonetic and semantic speech processing, during a semantic categorization task and natural speech perception. We identified distinct neural manifolds for semantic and phonetic features, with a functional separation of the corresponding low-dimensional trajectories. Moreover, phonetic and semantic representations were encoded concurrently and reflected in power increases in the beta and low-gamma local field potentials, suggesting top-down predictive and bottom-up cumulative processes. Our results are the first to demonstrate mechanisms for hierarchical speech transformations that are specific to neuronal population dynamics.
Affiliation(s)
- Pavo Orepic
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Wilson Truccolo
- Department of Neuroscience, Brown University, Providence, Rhode Island, United States of America
- Carney Institute for Brain Science, Brown University, Providence, Rhode Island, United States of America
| | - Eric Halgren
- Department of Neuroscience & Radiology, University of California San Diego, La Jolla, California, United States of America
| | - Sydney S. Cash
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Anne-Lise Giraud
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Institut Pasteur, Université Paris Cité, Hearing Institute, Paris, France
| | - Timothée Proix
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| |
|
16
|
Ten Oever S, Martin AE. Interdependence of "What" and "When" in the Brain. J Cogn Neurosci 2024; 36:167-186. [PMID: 37847823 DOI: 10.1162/jocn_a_02067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2023]
Abstract
From a brain's-eye-view, when a stimulus occurs and what it is are interrelated aspects of interpreting the perceptual world. Yet in practice, the putative perceptual inferences about sensory content and timing are often dichotomized and not investigated as an integrated process. We here argue that neural temporal dynamics can influence what is perceived, and in turn, stimulus content can influence the time at which perception is achieved. This computational principle results from the highly interdependent relationship of what and when in the environment. Both brain processes and perceptual events display strong temporal variability that is not always modeled; we argue that understanding-and, minimally, modeling-this temporal variability is key for theories of how the brain generates unified and consistent neural representations and that we ignore temporal variability in our analysis practice at the peril of both data interpretation and theory-building. Here, we review what and when interactions in the brain, demonstrate via simulations how temporal variability can result in misguided interpretations and conclusions, and outline how to integrate and synthesize what and when in theories and models of brain computation.
Affiliation(s)
- Sanne Ten Oever
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
- Maastricht University, The Netherlands
| | - Andrea E Martin
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
| |
|
17
|
van der Burght CL, Friederici AD, Maran M, Papitto G, Pyatigorskaya E, Schroën JAM, Trettenbrein PC, Zaccarella E. Cleaning up the Brickyard: How Theory and Methodology Shape Experiments in Cognitive Neuroscience of Language. J Cogn Neurosci 2023; 35:2067-2088. [PMID: 37713672 DOI: 10.1162/jocn_a_02058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/17/2023]
Abstract
The capacity for language is a defining property of our species, yet despite decades of research, evidence on its neural basis is still mixed and a generalized consensus is difficult to achieve. We suggest that this is partly caused by researchers defining "language" in different ways, with focus on a wide range of phenomena, properties, and levels of investigation. Accordingly, there is very little agreement among cognitive neuroscientists of language on the operationalization of fundamental concepts to be investigated in neuroscientific experiments. Here, we review chains of derivation in the cognitive neuroscience of language, focusing on how the hypothesis under consideration is defined by a combination of theoretical and methodological assumptions. We first attempt to disentangle the complex relationship between linguistics, psychology, and neuroscience in the field. Next, we focus on how conclusions that can be drawn from any experiment are inherently constrained by auxiliary assumptions, both theoretical and methodological, on which the validity of conclusions drawn rests. These issues are discussed in the context of classical experimental manipulations as well as study designs that employ novel approaches such as naturalistic stimuli and computational modeling. We conclude by proposing that a highly interdisciplinary field such as the cognitive neuroscience of language requires researchers to form explicit statements concerning the theoretical definitions, methodological choices, and other constraining factors involved in their work.
Affiliation(s)
| | - Angela D Friederici
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Matteo Maran
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
| | - Giorgio Papitto
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
| | - Elena Pyatigorskaya
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
| | - Joëlle A M Schroën
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
| | - Patrick C Trettenbrein
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
- University of Göttingen, Göttingen, Germany
| | - Emiliano Zaccarella
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
|
18
|
Mai G, Wang WSY. Distinct roles of delta- and theta-band neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing. Hum Brain Mapp 2023; 44:6149-6172. [PMID: 37818940 PMCID: PMC10619373 DOI: 10.1002/hbm.26503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 08/17/2023] [Accepted: 09/13/2023] [Indexed: 10/13/2023] Open
Abstract
The brain tracks and encodes multi-level speech features during spoken language processing. It is evident that this speech tracking is dominant at low frequencies (<8 Hz) including delta and theta bands. Recent research has demonstrated distinctions between delta- and theta-band tracking but has not elucidated how they differentially encode speech across linguistic levels. Here, we hypothesised that delta-band tracking encodes prediction errors (enhanced processing of unexpected features) while theta-band tracking encodes neural sharpening (enhanced processing of expected features) when people perceive speech with different linguistic contents. EEG responses were recorded when normal-hearing participants attended to continuous auditory stimuli that contained different phonological/morphological and semantic contents: (1) real-words, (2) pseudo-words and (3) time-reversed speech. We employed multivariate temporal response functions to measure EEG reconstruction accuracies in response to acoustic (spectrogram), phonetic and phonemic features with the partialling procedure that singles out unique contributions of individual features. We found higher delta-band accuracies for pseudo-words than real-words and time-reversed speech, especially during encoding of phonetic features. Notably, individual time-lag analyses showed that significantly higher accuracies for pseudo-words than real-words started at early processing stages for phonetic encoding (<100 ms post-feature) and later stages for acoustic and phonemic encoding (>200 and 400 ms post-feature, respectively). Theta-band accuracies, on the other hand, were higher when stimuli had richer linguistic content (real-words > pseudo-words > time-reversed speech). Such effects also started at early stages (<100 ms post-feature) during encoding of all individual features or when all features were combined. 
We argue these results indicate that delta-band tracking may play a role in predictive coding leading to greater tracking of pseudo-words due to the presence of unexpected/unpredicted semantic information, while theta-band tracking encodes sharpened signals caused by more expected phonological/morphological and semantic contents. Early presence of these effects reflects rapid computations of sharpening and prediction errors. Moreover, by measuring changes in EEG alpha power, we did not find evidence that the observed effects can be solitarily explained by attentional demands or listening efforts. Finally, we used directed information analyses to illustrate feedforward and feedback information transfers between prediction errors and sharpening across linguistic levels, showcasing how our results fit with the hierarchical Predictive Coding framework. Together, we suggest the distinct roles of delta and theta neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing.
Affiliation(s)
- Guangting Mai
- Hearing Theme, National Institute for Health Research Nottingham Biomedical Research Centre, Nottingham, UK
- Academic Unit of Mental Health and Clinical Neurosciences, School of Medicine, The University of Nottingham, Nottingham, UK
- Division of Psychology and Language Sciences, Faculty of Brain Sciences, University College London, London, UK
| | - William S-Y Wang
- Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University, Hung Hom, Hong Kong
- Language Engineering Laboratory, The Chinese University of Hong Kong, Hong Kong, China
| |
|
19
|
Stephen EP, Li Y, Metzger S, Oganian Y, Chang EF. Latent neural dynamics encode temporal context in speech. Hear Res 2023; 437:108838. [PMID: 37441880 PMCID: PMC11182421 DOI: 10.1016/j.heares.2023.108838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Revised: 06/15/2023] [Accepted: 07/03/2023] [Indexed: 07/15/2023]
Abstract
Direct neural recordings from human auditory cortex have demonstrated encoding for acoustic-phonetic features of consonants and vowels. Neural responses also encode distinct acoustic amplitude cues related to timing, such as those that occur at the onset of a sentence after a silent period or the onset of the vowel in each syllable. Here, we used a group reduced rank regression model to show that distributed cortical responses support a low-dimensional latent state representation of temporal context in speech. The timing cues each capture more unique variance than all other phonetic features and exhibit rotational or cyclical dynamics in latent space from activity that is widespread over the superior temporal gyrus. We propose that these spatially distributed timing signals could serve to provide temporal context for, and possibly bind across time, the concurrent processing of individual phonetic features, to compose higher-order phonological (e.g. word-level) representations.
Affiliation(s)
- Emily P Stephen
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States; Department of Mathematics and Statistics, Boston University, Boston, MA 02215, United States
| | - Yuanning Li
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States; School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
| | - Sean Metzger
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States
| | - Yulia Oganian
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States; Center for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
| | - Edward F Chang
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States.
| |
|
20
|
Tezcan F, Weissbart H, Martin AE. A tradeoff between acoustic and linguistic feature encoding in spoken language comprehension. eLife 2023; 12:e82386. [PMID: 37417736 PMCID: PMC10328533 DOI: 10.7554/elife.82386] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/02/2022] [Accepted: 06/18/2023] [Indexed: 07/08/2023] Open
Abstract
When we comprehend language from speech, the phase of the neural response aligns with particular features of the speech input, resulting in a phenomenon referred to as neural tracking. In recent years, a large body of work has demonstrated the tracking of the acoustic envelope and abstract linguistic units at the phoneme and word levels, and beyond. However, the degree to which speech tracking is driven by acoustic edges of the signal, or by internally-generated linguistic units, or by the interplay of both, remains contentious. In this study, we used naturalistic story-listening to investigate (1) whether phoneme-level features are tracked over and above acoustic edges, (2) whether word entropy, which can reflect sentence- and discourse-level constraints, impacted the encoding of acoustic and phoneme-level features, and (3) whether the tracking of acoustic edges was enhanced or suppressed during comprehension of a first language (Dutch) compared to a statistically familiar but uncomprehended language (French). We first show that encoding models with phoneme-level linguistic features, in addition to acoustic features, uncovered an increased neural tracking response; this signal was further amplified in a comprehended language, putatively reflecting the transformation of acoustic features into internally generated phoneme-level representations. Phonemes were tracked more strongly in a comprehended language, suggesting that language comprehension functions as a neural filter over acoustic edges of the speech signal as it transforms sensory signals into abstract linguistic units. We then show that word entropy enhances neural tracking of both acoustic and phonemic features when sentence- and discourse-context are less constraining. When language was not comprehended, acoustic features, but not phonemic ones, were more strongly modulated, but in contrast, when a native language is comprehended, phoneme features are more strongly modulated. 
Taken together, our findings highlight the flexible modulation of acoustic, and phonemic features by sentence and discourse-level constraint in language comprehension, and document the neural transformation from speech perception to language comprehension, consistent with an account of language processing as a neural filter from sensory to abstract representations.
Affiliation(s)
- Filiz Tezcan
- Language and Computation in Neural Systems Group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
| | - Hugo Weissbart
- Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
| | - Andrea E Martin
- Language and Computation in Neural Systems Group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
| |
|
21
|
Slaats S, Weissbart H, Schoffelen JM, Meyer AS, Martin AE. Delta-Band Neural Responses to Individual Words Are Modulated by Sentence Processing. J Neurosci 2023; 43:4867-4883. [PMID: 37221093 PMCID: PMC10312058 DOI: 10.1523/jneurosci.0964-22.2023] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 05/06/2022] [Revised: 04/17/2023] [Accepted: 04/27/2023] [Indexed: 05/25/2023] Open
Abstract
To understand language, we need to recognize words and combine them into phrases and sentences. During this process, responses to the words themselves are changed. In a step toward understanding how the brain builds sentence structure, the present study concerns the neural readout of this adaptation. We ask whether low-frequency neural readouts associated with words change as a function of being in a sentence. To this end, we analyzed an MEG dataset by Schoffelen et al. (2019) of 102 human participants (51 women) listening to sentences and word lists, the latter lacking any syntactic structure and combinatorial meaning. Using temporal response functions and a cumulative model-fitting approach, we disentangled delta- and theta-band responses to lexical information (word frequency), from responses to sensory and distributional variables. The results suggest that delta-band responses to words are affected by sentence context in time and space, over and above entropy and surprisal. In both conditions, the word frequency response spanned left temporal and posterior frontal areas; however, the response appeared later in word lists than in sentences. In addition, sentence context determined whether inferior frontal areas were responsive to lexical information. In the theta band, the amplitude was larger in the word list condition ∼100 milliseconds in right frontal areas. We conclude that low-frequency responses to words are changed by sentential context. The results of this study show how the neural representation of words is affected by structural context and as such provide insight into how the brain instantiates compositionality in language. SIGNIFICANCE STATEMENT: Human language is unprecedented in its combinatorial capacity: we are capable of producing and understanding sentences we have never heard before. 
Although the mechanisms underlying this capacity have been described in formal linguistics and cognitive science, how they are implemented in the brain remains to a large extent unknown. A large body of earlier work from the cognitive neuroscientific literature implies a role for delta-band neural activity in the representation of linguistic structure and meaning. In this work, we combine these insights and techniques with findings from psycholinguistics to show that meaning is more than the sum of its parts; the delta-band MEG signal differentially reflects lexical information inside and outside sentence structures.
Affiliation(s)
- Sophie Slaats
- Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- The International Max Planck Research School for Language Sciences, 6525 XD Nijmegen, The Netherlands
| | - Hugo Weissbart
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
| | - Jan-Mathijs Schoffelen
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
| | - Antje S Meyer
- Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
| | - Andrea E Martin
- Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
| |
|
22
|
Giroud J, Lerousseau JP, Pellegrino F, Morillon B. The channel capacity of multilevel linguistic features constrains speech comprehension. Cognition 2023; 232:105345. [PMID: 36462227 DOI: 10.1016/j.cognition.2022.105345] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/15/2022] [Revised: 09/28/2022] [Accepted: 11/22/2022] [Indexed: 12/05/2022]
Abstract
Humans are expert at processing speech but how this feat is accomplished remains a major question in cognitive neuroscience. Capitalizing on the concept of channel capacity, we developed a unified measurement framework to investigate the respective influence of seven acoustic and linguistic features on speech comprehension, encompassing acoustic, sub-lexical, lexical and supra-lexical levels of description. We show that comprehension is independently impacted by all these features, but at varying degrees and with a clear dominance of the syllabic rate. Comparing comprehension of French words and sentences further reveals that when supra-lexical contextual information is present, the impact of all other features is dramatically reduced. Finally, we estimated the channel capacity associated with each linguistic feature and compared them with their generic distribution in natural speech. Our data reveal that while acoustic modulation, syllabic and phonemic rates unfold respectively at 5, 5, and 12 Hz in natural speech, they are associated with independent processing bottlenecks whose channel capacity are of 15, 15 and 35 Hz, respectively, as suggested by neurophysiological theories. They moreover point towards supra-lexical contextual information as the feature limiting the flow of natural speech. Overall, this study reveals how multilevel linguistic features constrain speech comprehension.
Affiliation(s)
- Jérémy Giroud
- Aix Marseille Univ, Inserm, INS, Inst Neurosci Syst, Marseille, France.
| | | | - François Pellegrino
- Laboratoire Dynamique du Langage UMR 5596, CNRS, University of Lyon, 14 Avenue Berthelot, 69007 Lyon, France
| | - Benjamin Morillon
- Aix Marseille Univ, Inserm, INS, Inst Neurosci Syst, Marseille, France
| |
|
23
|
Su Y, MacGregor LJ, Olasagasti I, Giraud AL. A deep hierarchy of predictions enables online meaning extraction in a computational model of human speech comprehension. PLoS Biol 2023; 21:e3002046. [PMID: 36947552 PMCID: PMC10079236 DOI: 10.1371/journal.pbio.3002046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 04/06/2023] [Accepted: 02/22/2023] [Indexed: 03/23/2023] Open
Abstract
Understanding speech requires mapping fleeting and often ambiguous soundwaves to meaning. While humans are known to exploit their capacity to contextualize to facilitate this process, how internal knowledge is deployed online remains an open question. Here, we present a model that extracts multiple levels of information from continuous speech online. The model applies linguistic and nonlinguistic knowledge to speech processing, by periodically generating top-down predictions and incorporating bottom-up incoming evidence in a nested temporal hierarchy. We show that a nonlinguistic context level provides semantic predictions informed by sensory inputs, which are crucial for disambiguating among multiple meanings of the same word. The explicit knowledge hierarchy of the model enables a more holistic account of the neurophysiological responses to speech compared to using lexical predictions generated by a neural network language model (GPT-2). We also show that hierarchical predictions reduce peripheral processing via minimizing uncertainty and prediction error. With this proof-of-concept model, we demonstrate that the deployment of hierarchical predictions is a possible strategy for the brain to dynamically utilize structured knowledge and make sense of the speech input.
Affiliation(s)
- Yaqing Su
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Swiss National Centre of Competence in Research “Evolving Language” (NCCR EvolvingLanguage), Geneva, Switzerland
| | - Lucy J. MacGregor
- Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
| | - Itsaso Olasagasti
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Swiss National Centre of Competence in Research “Evolving Language” (NCCR EvolvingLanguage), Geneva, Switzerland
| | - Anne-Lise Giraud
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Swiss National Centre of Competence in Research “Evolving Language” (NCCR EvolvingLanguage), Geneva, Switzerland
- Institut Pasteur, Université Paris Cité, Inserm, Institut de l’Audition, Paris, France
|
24
|
Kazanina N, Tavano A. What neural oscillations can and cannot do for syntactic structure building. Nat Rev Neurosci 2023; 24:113-128. [PMID: 36460920 DOI: 10.1038/s41583-022-00659-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Accepted: 11/02/2022] [Indexed: 12/04/2022]
Abstract
Understanding what someone says requires relating words in a sentence to one another as instructed by the grammatical rules of a language. In recent years, the neurophysiological basis for this process has become a prominent topic of discussion in cognitive neuroscience. Current proposals about the neural mechanisms of syntactic structure building converge on a key role for neural oscillations in this process, but they differ in terms of the exact function that is assigned to them. In this Perspective, we discuss two proposed functions for neural oscillations - chunking and multiscale information integration - and evaluate their merits and limitations taking into account a fundamentally hierarchical nature of syntactic representations in natural languages. We highlight insights that provide a tangible starting point for a neurocognitive model of syntactic structure building.
Affiliation(s)
- Nina Kazanina
- University of Bristol, Bristol, UK.
- Higher School of Economics, Moscow, Russia.
|
25
|
Lo CW, Tung TY, Ke AH, Brennan JR. Hierarchy, Not Lexical Regularity, Modulates Low-Frequency Neural Synchrony During Language Comprehension. Neurobiology of Language (Cambridge, Mass.) 2022; 3:538-555. [PMID: 37215342 PMCID: PMC10158645 DOI: 10.1162/nol_a_00077] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/02/2022] [Accepted: 06/20/2022] [Indexed: 05/24/2023]
Abstract
Neural responses appear to synchronize with sentence structure. However, researchers have debated whether this response in the delta band (0.5-3 Hz) really reflects hierarchical information or simply lexical regularities. Computational simulations in which sentences are represented simply as sequences of high-dimensional numeric vectors that encode lexical information seem to give rise to power spectra similar to those observed for sentence synchronization, suggesting that sentence-level cortical tracking findings may reflect sequential lexical or part-of-speech information, and not necessarily hierarchical syntactic information. Using electroencephalography (EEG) data and the frequency-tagging paradigm, we develop a novel experimental condition to tease apart the predictions of the lexical and the hierarchical accounts of the attested low-frequency synchronization. Under a lexical model, synchronization should be observed even when words are reversed within their phrases (e.g., "sheep white grass eat" instead of "white sheep eat grass"), because the same lexical items are preserved at the same regular intervals. Critically, such stimuli are not syntactically well-formed; thus a hierarchical model does not predict synchronization of phrase- and sentence-level structure in the reversed phrase condition. Computational simulations confirm these diverging predictions. EEG data from N = 31 native speakers of Mandarin show robust delta synchronization to syntactically well-formed isochronous speech. Importantly, no such pattern is observed for reversed phrases, consistent with the hierarchical, but not the lexical, accounts.
Affiliation(s)
- Chia-Wen Lo
- Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Department of Linguistics, University of Michigan, Ann Arbor, MI, USA
| | - Tzu-Yun Tung
- Department of Linguistics, University of Michigan, Ann Arbor, MI, USA
| | - Alan Hezao Ke
- Department of Linguistics, University of Michigan, Ann Arbor, MI, USA
- Department of Linguistics, Languages and Cultures, Michigan State University, East Lansing, MI, USA
|
26
|
ten Oever S, Carta S, Kaufeld G, Martin AE. Neural tracking of phrases in spoken language comprehension is automatic and task-dependent. eLife 2022; 11:e77468. [PMID: 35833919 PMCID: PMC9282854 DOI: 10.7554/elife.77468] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/31/2022] [Accepted: 06/25/2022] [Indexed: 12/02/2022] Open
Abstract
Linguistic phrases are tracked in sentences even though there is no one-to-one acoustic phrase marker in the physical signal. This phenomenon suggests an automatic tracking of abstract linguistic structure that is endogenously generated by the brain. However, all studies investigating linguistic tracking compare conditions where either relevant information at linguistic timescales is available, or where this information is absent altogether (e.g., sentences versus word lists during passive listening). It is therefore unclear whether tracking at phrasal timescales is related to the content of language, or rather, results as a consequence of attending to the timescales that happen to match behaviourally relevant information. To investigate this question, we presented participants with sentences and word lists while recording their brain activity with magnetoencephalography (MEG). Participants performed passive, syllable, word, and word-combination tasks corresponding to attending to four different rates: one they would naturally attend to, syllable-rates, word-rates, and phrasal-rates, respectively. We replicated overall findings of stronger phrasal-rate tracking measured with mutual information for sentences compared to word lists across the classical language network. However, in the inferior frontal gyrus (IFG) we found a task effect suggesting stronger phrasal-rate tracking during the word-combination task independent of the presence of linguistic structure, as well as stronger delta-band connectivity during this task. These results suggest that extracting linguistic information at phrasal rates occurs automatically with or without the presence of an additional task, but also that IFG might be important for temporal integration across various perceptual domains.
Affiliation(s)
- Sanne ten Oever
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Language and Computation in Neural Systems group, Donders Centre for Cognitive Neuroimaging, Nijmegen, Netherlands
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
| | - Sara Carta
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- ADAPT Centre, School of Computer Science and Statistics, University of Dublin, Trinity College, Dublin, Ireland
- CIMeC - Center for Mind/Brain Sciences, University of Trento, Trento, Italy
| | - Greta Kaufeld
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
| | - Andrea E Martin
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Language and Computation in Neural Systems group, Donders Centre for Cognitive Neuroimaging, Nijmegen, Netherlands
|
27
|
Bai F, Meyer AS, Martin AE. Neural dynamics differentially encode phrases and sentences during spoken language comprehension. PLoS Biol 2022; 20:e3001713. [PMID: 35834569 PMCID: PMC9282610 DOI: 10.1371/journal.pbio.3001713] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/04/2022] [Accepted: 06/14/2022] [Indexed: 11/19/2022] Open
Abstract
Human language stands out in the natural world as a biological signal that uses a structured system to combine the meanings of small linguistic units (e.g., words) into larger constituents (e.g., phrases and sentences). However, the physical dynamics of speech (or sign) do not stand in a one-to-one relationship with the meanings listeners perceive. Instead, listeners infer meaning based on their knowledge of the language. The neural readouts of the perceptual and cognitive processes underlying these inferences are still poorly understood. In the present study, we used scalp electroencephalography (EEG) to compare the neural response to phrases (e.g., the red vase) and sentences (e.g., the vase is red), which were close in semantic meaning and had been synthesized to be physically indistinguishable. Differences in structure were well captured in the reorganization of neural phase responses in delta (approximately <2 Hz) and theta bands (approximately 2 to 7 Hz), and in power and power connectivity changes in the alpha band (approximately 7.5 to 13.5 Hz). Consistent with predictions from a computational model, sentences showed more power, more power connectivity, and more phase synchronization than phrases did. Theta-gamma phase-amplitude coupling occurred, but did not differ between the syntactic structures. Spectral-temporal response function (STRF) modeling revealed different encoding states for phrases and sentences, over and above the acoustically driven neural response. Our findings provide a comprehensive description of how the brain encodes and separates linguistic structures in the dynamics of neural responses. They imply that phase synchronization and strength of connectivity are readouts for the constituent structure of language. The results provide a novel basis for future neurophysiological research on linguistic structure representation in the brain, and, together with our simulations, support time-based binding as a mechanism of structure encoding in neural dynamics.
Affiliation(s)
- Fan Bai
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
| | - Antje S. Meyer
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
| | - Andrea E. Martin
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
|
28
|
Coopmans CW, de Hoop H, Hagoort P, Martin AE. Effects of Structure and Meaning on Cortical Tracking of Linguistic Units in Naturalistic Speech. Neurobiology of Language (Cambridge, Mass.) 2022; 3:386-412. [PMID: 37216060 PMCID: PMC10158633 DOI: 10.1162/nol_a_00070] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/14/2021] [Accepted: 03/02/2022] [Indexed: 05/24/2023]
Abstract
Recent research has established that cortical activity "tracks" the presentation rate of syntactic phrases in continuous speech, even though phrases are abstract units that do not have direct correlates in the acoustic signal. We investigated whether cortical tracking of phrase structures is modulated by the extent to which these structures compositionally determine meaning. To this end, we recorded electroencephalography (EEG) of 38 native speakers who listened to naturally spoken Dutch stimuli in different conditions, which parametrically modulated the degree to which syntactic structure and lexical semantics determine sentence meaning. Tracking was quantified through mutual information between the EEG data and either the speech envelopes or abstract annotations of syntax, all of which were filtered in the frequency band corresponding to the presentation rate of phrases (1.1-2.1 Hz). Overall, these mutual information analyses showed stronger tracking of phrases in regular sentences than in stimuli whose lexical-syntactic content is reduced, but no consistent differences in tracking between sentences and stimuli that contain a combination of syntactic structure and lexical content. While there were no effects of compositional meaning on the degree of phrase-structure tracking, analyses of event-related potentials elicited by sentence-final words did reveal meaning-induced differences between conditions. Our findings suggest that cortical tracking of structure in sentences indexes the internal generation of this structure, a process that is modulated by the properties of its input, but not by the compositional interpretation of its output.
Affiliation(s)
- Cas W. Coopmans
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Centre for Language Studies, Radboud University, Nijmegen, The Netherlands
| | - Helen de Hoop
- Centre for Language Studies, Radboud University, Nijmegen, The Netherlands
| | - Peter Hagoort
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Andrea E. Martin
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
|
29
|
Missing links: The functional unification of language and memory (L∪M). Neurosci Biobehav Rev 2021; 133:104489. [PMID: 34929226 DOI: 10.1016/j.neubiorev.2021.12.012] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 04/30/2021] [Revised: 11/14/2021] [Accepted: 12/07/2021] [Indexed: 10/19/2022]
Abstract
The field of neurocognition is currently undergoing a significant change of perspective. Traditional neurocognitive models evolved into an integrative and dynamic vision of cognitive functioning. Dynamic integration assumes an interaction between cognitive domains traditionally considered to be distinct. Language and declarative memory are regarded as separate functions supported by different neural systems. However, they also share anatomical structures (notably, the inferior frontal gyrus, the supplementary motor area, the superior and middle temporal gyrus, and the hippocampal complex) and cognitive processes (such as semantic and working memory) that merge to endorse our quintessential daily lives. We propose a new model, "L∪M" (i.e., Language/union/Memory), that considers these two functions interactively. We fractionated language and declarative memory into three fundamental dimensions or systems ("Receiver-Transmitter", "Controller-Manager" and "Transformer-Associative" Systems), that communicate reciprocally. We formalized their interactions at the brain level with a connectivity-based approach. This new taxonomy overcomes the modular view of cognitive functioning and reconciles functional specialization with plasticity in neurological disorders.
|
30
|
Palaniyappan L. Dissecting the neurobiology of linguistic disorganisation and impoverishment in schizophrenia. Semin Cell Dev Biol 2021; 129:47-60. [PMID: 34507903 DOI: 10.1016/j.semcdb.2021.08.015] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 03/25/2021] [Revised: 08/13/2021] [Accepted: 05/06/2021] [Indexed: 12/16/2022]
Abstract
Schizophrenia provides a quintessential disease model of how disturbances in the molecular mechanisms of neurodevelopment lead to disruptions in the emergence of cognition. The central and often persistent feature of this illness is the disorganisation and impoverishment of language and related expressive behaviours. Though clinically more prominent, the periodic perceptual distortions characterised as psychosis are non-specific and often episodic. While several insights into psychosis have been gained based on study of the dopaminergic system, the mechanistic basis of linguistic disorganisation and impoverishment is still elusive. Key findings from cellular to systems-level studies highlight the role of ubiquitous, inhibitory processes in language production. Dysregulation of these processes at critical time periods, in key brain areas, provides a surprisingly parsimonious account of linguistic disorganisation and impoverishment in schizophrenia. This review links the notion of excitatory/inhibitory (E/I) imbalance at cortical microcircuits to the expression of language behaviour characteristic of schizophrenia, through the building blocks of neurochemistry, neurophysiology, and neurocognition.
Affiliation(s)
- Lena Palaniyappan
- Department of Psychiatry, University of Western Ontario, London, Ontario, Canada; Robarts Research Institute, University of Western Ontario, London, Ontario, Canada; Lawson Health Research Institute, London, Ontario, Canada.
|
31
|
Ten Oever S, Martin AE. An oscillating computational model can track pseudo-rhythmic speech by using linguistic predictions. eLife 2021; 10:68066. [PMID: 34338196 PMCID: PMC8328513 DOI: 10.7554/elife.68066] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/03/2021] [Accepted: 07/16/2021] [Indexed: 11/19/2022] Open
Abstract
Neuronal oscillations putatively track speech in order to optimize sensory processing. However, it is unclear how isochronous brain oscillations can track pseudo-rhythmic speech input. Here we propose that oscillations can track pseudo-rhythmic speech when considering that speech time is dependent on content-based predictions flowing from internal language models. We show that temporal dynamics of speech are dependent on the predictability of words in a sentence. A computational model including oscillations, feedback, and inhibition is able to track pseudo-rhythmic speech input. As the model processes, it generates temporal phase codes, which are a candidate mechanism for carrying information forward in time. The model is optimally sensitive to the natural temporal speech dynamics and can explain empirical data on temporal speech illusions. Our results suggest that speech tracking does not have to rely only on the acoustics but could also exploit ongoing interactions between oscillations and constraints flowing from internal language models.
Affiliation(s)
- Sanne Ten Oever
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
| | - Andrea E Martin
- Language and Computation in Neural Systems group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
|
32
|
Li J, Pylkkänen L. Disentangling Semantic Composition and Semantic Association in the Left Temporal Lobe. J Neurosci 2021; 41:6526-6538. [PMID: 34131034 PMCID: PMC8318083 DOI: 10.1523/jneurosci.2317-20.2021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/04/2020] [Revised: 05/04/2021] [Accepted: 05/05/2021] [Indexed: 11/21/2022] Open
Abstract
Although composing two words into a complex representation (e.g., "coffee cake") is conceptually different from forming associations between a pair of words (e.g., "coffee, cake"), the brain regions supporting semantic composition have also been implicated for associative encoding. Here, we adopted a two-word magnetoencephalography (MEG) paradigm which varies compositionality ("French/Korean cheese" vs "France/Korea cheese") and strength of association ("France/French cheese" vs "Korea/Korean cheese") between the two words. We collected MEG data while 42 English speakers (24 females) viewed the two words successively in the scanner, and we applied both univariate regression analyses and multivariate pattern classification to the source estimates of the two words. We show that the left anterior temporal lobe (LATL) and left middle temporal lobe (LMTL) are distinctively modulated by semantic composition and semantic association. Specifically, the LATL is mostly sensitive to high-association compositional phrases, while the LMTL responds more to low-association compositional phrases. Pattern-based directed connectivity analyses further revealed a continuous information flow from the anterior to the middle temporal region, suggesting that the integration of adjective and noun properties originated earlier in the LATL is consistently delivered to the LMTL when the complex meaning is newly encountered. Taken together, our findings shed light into a functional dissociation within the left temporal lobe for compositional and distributional semantic processing.
SIGNIFICANCE STATEMENT Prior studies on semantic composition and associative encoding have been conducted independently within the subfields of language and memory, and they typically adopt similar two-word experimental paradigms. However, no direct comparison has been made on the neural substrates of the two processes. The current study relates the two streams of literature, and appeals to audiences in both subfields within cognitive neuroscience. Disentangling the neural computations for semantic composition and association also offers insight into modeling compositional and distributional semantics, which has been the subject of much discussion in natural language processing and cognitive science.
Affiliation(s)
- Jixing Li
- NYUAD Institute, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
| | - Liina Pylkkänen
- NYUAD Institute, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
- Department of Linguistics, New York University, New York, New York 10003
- Department of Psychology, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
|
33
|
van Rooij I, Baggio G. Theory Before the Test: How to Build High-Verisimilitude Explanatory Theories in Psychological Science. Perspectives on Psychological Science 2021; 16:682-697. [PMID: 33404356 PMCID: PMC8273840 DOI: 10.1177/1745691620970604] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Indexed: 01/21/2023]
Abstract
Drawing on the philosophy of psychological explanation, we suggest that psychological science, by focusing on effects, may lose sight of its primary explananda: psychological capacities. We revisit Marr's levels-of-analysis framework, which has been remarkably productive and useful for cognitive psychological explanation. We discuss ways in which Marr's framework may be extended to other areas of psychology, such as social, developmental, and evolutionary psychology, bringing new benefits to these fields. We then show how theoretical analyses can endow a theory with minimal plausibility even before contact with empirical data: We call this the theoretical cycle. Finally, we explain how our proposal may contribute to addressing critical issues in psychological science, including how to leverage effects to understand capacities better.
Affiliation(s)
- Iris van Rooij
- Donders Institute for Brain, Cognition and Behaviour, Radboud University
| | - Giosuè Baggio
- Department of Language and Literature, Norwegian University of Science and Technology
|
34
|
Chen C, Lu Q, Beukers A, Baldassano C, Norman KA. Learning to perform role-filler binding with schematic knowledge. PeerJ 2021; 9:e11046. [PMID: 33850650 PMCID: PMC8019313 DOI: 10.7717/peerj.11046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/05/2019] [Accepted: 02/10/2021] [Indexed: 11/20/2022] Open
Abstract
Through specific experiences, humans learn the relationships that underlie the structure of events in the world. Schema theory suggests that we organize this information in mental frameworks called "schemata," which represent our knowledge of the structure of the world. Generalizing knowledge of structural relationships to new situations requires role-filler binding, the ability to associate specific "fillers" with abstract "roles." For instance, when we hear the sentence Alice ordered a tea from Bob, the role-filler bindings customer:Alice, drink:tea and barista:Bob allow us to understand and make inferences about the sentence. We can perform these bindings for arbitrary fillers; we understand this sentence even if we have never heard the names Alice, tea, or Bob before. In this work, we define a model as capable of performing role-filler binding if it can recall arbitrary fillers corresponding to a specified role, even when these pairings violate correlations seen during training. Previous work found that models can learn this ability when explicitly told what the roles and fillers are, or when given fillers seen during training. We show that networks with external memory learn to bind roles to arbitrary fillers, without explicitly labeled role-filler pairs. We further show that they can perform these bindings on role-filler pairs that violate correlations seen during training, while retaining knowledge of training correlations. We apply analyses inspired by neural decoding to interpret what the networks have learned.
Affiliation(s)
- Catherine Chen
- Department of Computer Science, Princeton University, Princeton, NJ, USA
| | - Qihong Lu
- Department of Psychology, Princeton University, Princeton, NJ, USA
| | - Andre Beukers
- Department of Psychology, Princeton University, Princeton, NJ, USA
| | - Kenneth A Norman
- Department of Psychology, Princeton University, Princeton, NJ, USA
|
35
|
Brown M, Tanenhaus MK, Dilley L. Syllable Inference as a Mechanism for Spoken Language Understanding. Top Cogn Sci 2021; 13:351-398. [PMID: 33780156 DOI: 10.1111/tops.12529] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/05/2019] [Revised: 02/24/2021] [Accepted: 02/25/2021] [Indexed: 01/25/2023]
Abstract
A classic problem in spoken language comprehension is how listeners perceive speech as being composed of discrete words, given the variable time-course of information in continuous signals. We propose a syllable inference account of spoken word recognition and segmentation, according to which alternative hierarchical models of syllables, words, and phonemes are dynamically posited, which are expected to maximally predict incoming sensory input. Generative models are combined with current estimates of context speech rate drawn from neural oscillatory dynamics, which are sensitive to amplitude rises. Over time, models which result in local minima in error between predicted and recently experienced signals give rise to perceptions of hearing words. Three experiments using the visual world eye-tracking paradigm with a picture-selection task tested hypotheses motivated by this framework. Materials were sentences that were acoustically ambiguous in numbers of syllables, words, and phonemes they contained (cf. English plural constructions, such as "saw (a) raccoon(s) swimming," which have two loci of grammatical information). Time-compressing, or expanding, speech materials permitted determination of how temporal information at, or in the context of, each locus affected looks to, and selection of, pictures with a singular or plural referent (e.g., one or more than one raccoon). Supporting our account, listeners probabilistically interpreted identical chunks of speech as consistent with a singular or plural referent to a degree that was based on the chunk's gradient rate in relation to its context. We interpret these results as evidence that arriving temporal information, judged in relation to language model predictions generated from context speech rate evaluated on a continuous scale, informs inferences about syllables, thereby giving rise to perceptual experiences of understanding spoken language as words separated in time.
Affiliation(s)
- Meredith Brown
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA
- Department of Psychiatry and Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Massachusetts, USA
- Department of Psychology, Tufts University, Medford, Massachusetts, USA
| | - Michael K Tanenhaus
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA
- School of Psychology, Nanjing Normal University, Nanjing, China
| | - Laura Dilley
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan, USA
|
36
|
Meyer L, Lakatos P, He Y. Language Dysfunction in Schizophrenia: Assessing Neural Tracking to Characterize the Underlying Disorder(s)? Front Neurosci 2021; 15:640502. [PMID: 33692672 PMCID: PMC7937925 DOI: 10.3389/fnins.2021.640502] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/11/2020] [Accepted: 02/03/2021] [Indexed: 12/19/2022] Open
Abstract
Deficits in language production and comprehension are characteristic of schizophrenia. To date, it remains unclear whether these deficits arise from dysfunctional linguistic knowledge, or dysfunctional predictions derived from the linguistic context. Alternatively, the deficits could be a result of dysfunctional neural tracking of auditory information resulting in decreased auditory information fidelity and even distorted information. Here, we discuss possible ways for clinical neuroscientists to employ neural tracking methodology to independently characterize deficiencies on the auditory-sensory and abstract linguistic levels. This might lead to a mechanistic understanding of the deficits underlying language related disorder(s) in schizophrenia. We propose to combine naturalistic stimulation, measures of speech-brain synchronization, and computational modeling of abstract linguistic knowledge and predictions. These independent but likely interacting assessments may be exploited for an objective and differential diagnosis of schizophrenia, as well as a better understanding of the disorder on the functional level, illustrating the potential of neural tracking methodology as translational tool in a range of psychotic populations.
Collapse
Affiliation(s)
- Lars Meyer
- Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Clinic for Phoniatrics and Pedaudiology, University Hospital Münster, Münster, Germany
- Peter Lakatos
- Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, United States
- Yifei He
- Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany
|
37
|
Chen Y, Jin P, Ding N. The influence of linguistic information on cortical tracking of words. Neuropsychologia 2020; 148:107640. [DOI: 10.1016/j.neuropsychologia.2020.107640] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/01/2020] [Revised: 08/08/2020] [Accepted: 09/28/2020] [Indexed: 10/23/2022]
|
38
|
Linguistic Structure and Meaning Organize Neural Oscillations into a Content-Specific Hierarchy. J Neurosci 2020; 40:9467-9475. [PMID: 33097640 DOI: 10.1523/jneurosci.0302-20.2020] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 02/11/2020] [Revised: 09/25/2020] [Accepted: 10/03/2020] [Indexed: 11/21/2022] Open
Abstract
Neural oscillations track linguistic information during speech comprehension (Ding et al., 2016; Keitel et al., 2018), and are known to be modulated by acoustic landmarks and speech intelligibility (Doelling et al., 2014; Zoefel and VanRullen, 2015). However, studies investigating linguistic tracking have either relied on non-naturalistic isochronous stimuli or failed to fully control for prosody. Therefore, it is still unclear whether low-frequency activity tracks linguistic structure during natural speech, where linguistic structure does not follow such a palpable temporal pattern. Here, we measured electroencephalography (EEG) and manipulated the presence of semantic and syntactic information apart from the timescale of their occurrence, while carefully controlling for the acoustic-prosodic and lexical-semantic information in the signal. EEG was recorded while 29 adult native speakers (22 women, 7 men) listened to naturally spoken Dutch sentences, jabberwocky controls with morphemes and sentential prosody, word lists with lexical content but no phrase structure, and backward acoustically matched controls. Mutual information (MI) analysis revealed sensitivity to linguistic content: MI was highest for sentences at the phrasal (0.8-1.1 Hz) and lexical (1.9-2.8 Hz) timescales, suggesting that the delta-band is modulated by lexically driven combinatorial processing beyond prosody, and that linguistic content (i.e., structure and meaning) organizes neural oscillations beyond the timescale and rhythmicity of the stimulus. 
This pattern is consistent with neurophysiologically inspired models of language comprehension (Martin, 2016, 2020; Martin and Doumas, 2017) where oscillations encode endogenously generated linguistic content over and above exogenous or stimulus-driven timing and rhythm information. SIGNIFICANCE STATEMENT: Biological systems like the brain encode their environment not only by reacting in a series of stimulus-driven responses, but by combining stimulus-driven information with endogenous, internally generated, inferential knowledge and meaning. Understanding language from speech is the human benchmark for this. Much research focuses on the purely stimulus-driven response, but here, we focus on the goal of language behavior: conveying structure and meaning. To that end, we use naturalistic stimuli that contrast acoustic-prosodic and lexical-semantic information to show that, during spoken language comprehension, oscillatory modulations reflect computations related to inferring structure and meaning from the acoustic signal. Our experiment provides the first evidence to date that compositional structure and meaning organize the oscillatory response, above and beyond prosodic and lexical controls.
|
39
|
Murphy E. Commentary: A Compositional Neural Architecture for Language. Front Psychol 2020; 11:2101. [PMID: 32982860 PMCID: PMC7492643 DOI: 10.3389/fpsyg.2020.02101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2020] [Accepted: 07/28/2020] [Indexed: 11/20/2022] Open
Affiliation(s)
- Elliot Murphy
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center, Houston, TX, United States
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center, Houston, TX, United States
- *Correspondence: Elliot Murphy
|
40
|
Haegens S. Entrainment revisited: a commentary on. Language, Cognition and Neuroscience 2020; 35:1119-1123. [PMID: 33718510 PMCID: PMC7954236 DOI: 10.1080/23273798.2020.1758335] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 02/27/2020] [Accepted: 04/14/2020] [Indexed: 06/12/2023]
Affiliation(s)
- Saskia Haegens
- Department of Psychiatry, Division of Systems Neuroscience, Columbia University and the Research Foundation for Mental Hygiene, New York State Psychiatric Institute, New York, NY, USA
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University Nijmegen, Nijmegen, The Netherlands
|