1
Endress AD, de Seyssel M. The specificity of sequential statistical learning: Statistical learning accumulates predictive information from unstructured input but is dissociable from (declarative) memory for words. Cognition 2025; 261:106130. PMID: 40250103. DOI: 10.1016/j.cognition.2025.106130.
Abstract
Learning statistical regularities from the environment is ubiquitous across domains and species. It might support the earliest stages of language acquisition, especially identifying and learning words from fluent speech (i.e., word-segmentation). But how do the statistical learning mechanisms involved in word-segmentation interact with the memory mechanisms needed to remember words - and with the learning situations where words need to be learned? Through computational modeling, we first show that earlier results purportedly supporting memory-based theories of statistical learning can be reproduced by memory-less Hebbian learning mechanisms. We then show that, in a memory recall task after exposure to continuous, statistically structured speech sequences, participants track the statistical structure of the speech sequences and are thus sensitive to probable syllable transitions. However, they hardly remember any items at all, with 82% producing no high-probability items. Among the 30% of participants producing (correct) high- or (incorrect) low-probability items, half produced high-probability items and half low-probability items - even while preferring high-probability items in a recognition test. Only discrete familiarization sequences with isolated words yield memories of actual items. Turning to how specific learning situations affect statistical learning, we show that it predominantly operates in continuous speech sequences like those used in earlier experiments, but not in discrete chunk sequences likely more characteristic of early language acquisition. Taken together, these results suggest that statistical learning might be specialized to accumulate distributional information, but that it is dissociable from the (declarative) memory mechanisms needed to acquire words and does not allow learners to identify probable word boundaries.
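For readers unfamiliar with the paradigm, the transitional-probability computation at the heart of word-segmentation studies can be sketched as follows. This is a minimal illustration with a hypothetical three-word lexicon, not the authors' materials or model: within-word syllable transitions are deterministic, while transitions across word boundaries are diluted across the possible following words.

```python
import random
from collections import Counter

random.seed(0)
# Hypothetical three-syllable words, as in typical segmentation streams
words = [("tu", "pi", "ro"), ("go", "la", "bu"), ("bi", "da", "ku")]

stream = []
for _ in range(300):          # continuous stream: no pauses between words
    stream.extend(random.choice(words))

# Forward transitional probability: P(next | current)
pairs = Counter(zip(stream, stream[1:]))
firsts = Counter(stream[:-1])

def tp(a, b):
    return pairs[(a, b)] / firsts[a]

within = tp("tu", "pi")       # within-word transition: deterministic
across = tp("ro", "go")       # across a word boundary: roughly 1 in 3
```

A learner tracking these statistics can tell that "tu-pi" is a more probable transition than "ro-go", which is the distributional information the abstract argues statistical learning accumulates.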
Affiliation(s)
- Ansgar D Endress
- Department of Psychology, City St George's, University of London, UK.
- Maureen de Seyssel
- Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'Etudes Cognitives, ENS, EHESS, CNRS, PSL University, Paris, France; Laboratoire de Linguistique Formelle, Université Paris Cité, CNRS, Paris, France

2
|
Verosky NJ, Morgan E. Temporal dependencies in event onsets and event content contain redundant information about musical meter. Cognition 2025; 263:106179. PMID: 40414145. DOI: 10.1016/j.cognition.2025.106179.
Abstract
Musical stimuli present listeners with complex temporal information and rich periodic structure. Periodic patterns in music typically involve multiple hierarchical levels: a basic-level repeating pulse known as the "beat," and a higher-order grouping of beats into the "meter." Previous work has found that a musical stimulus's meter is predicted by recurring temporal patterns of note event onsets, measured by profiles of autocorrelation over time lags. Traditionally, that work has emphasized periodic structure in the timing of event onsets (i.e., repeating rhythms). Here, we suggest that musical meter is in fact a more general perceptual phenomenon, instantiating complex profiles of temporal dependencies across both event onsets and multiple feature dimensions in the actual content of events. We use classification techniques to test whether profiles of temporal dependencies in event onsets and in multiple types of event content predict musical meter. Applying random forest models to three musical corpora, we reproduce findings that profiles of temporal dependencies in note event onsets contain information about meter, but we find that profiles of temporal dependencies in pitch height, interval size, and tonal expectancy also contain such information, with high redundancy among temporal dependencies in event onsets and event content as predictors of meter. Moreover, information about meter is distributed across temporal dependencies at multiple time lags, as indicated by the baseline performance of an unsupervised classifier that selects the single time lag with maximum autocorrelation. Redundant profiles of temporal dependencies across multiple stimulus features may provide strong constraints on musical structure that inform listeners' predictive processes.
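The unsupervised baseline mentioned above, which selects the single time lag with maximum autocorrelation of the event-onset sequence, can be sketched on a toy rhythm. The pattern and lag range here are illustrative assumptions, not drawn from the corpora:

```python
import numpy as np

def autocorr_profile(onsets, max_lag):
    """Autocorrelation of a mean-centered onset vector at lags 1..max_lag."""
    x = onsets - onsets.mean()
    denom = float(x @ x)
    return np.array([(x[:-k] @ x[k:]) / denom for k in range(1, max_lag + 1)])

# Toy rhythm with a period of 4 ticks (onsets on beats 1, 3, and 4 of each "bar")
rhythm = np.tile([1.0, 0.0, 1.0, 1.0], 32)

profile = autocorr_profile(rhythm, max_lag=8)
best_lag = int(np.argmax(profile)) + 1   # unsupervised estimate of the metrical period
```

For this pattern the autocorrelation profile peaks at the 4-tick period, recovering the "bar" length without supervision; the paper's point is that such profiles exist not only for onsets but also for pitch height, interval size, and tonal expectancy.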
Affiliation(s)
- Niels J Verosky
- Department of Psychology, Yale University, 100 College St., New Haven, CT 06510, United States.
- Emily Morgan
- Department of Linguistics, University of California, Davis, United States

3
|
Verosky NJ. Associative Learning of an Unnormalized Successor Representation. Neural Comput 2024; 36:1410-1423. PMID: 38776964. DOI: 10.1162/neco_a_01675.
Abstract
The successor representation is known to relate to temporal associations learned in the temporal context model (Gershman et al., 2012), and subsequent work suggests a wide relevance of the successor representation across spatial, visual, and abstract relational tasks. I demonstrate that the successor representation and purely associative learning have an even deeper relationship than initially indicated: Hebbian temporal associations are an unnormalized form of the successor representation, such that the two converge on an identical representation whenever all states are equally frequent and can correlate highly in practice even when the state distribution is nonuniform.
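The claimed relationship can be illustrated numerically: under a doubly stochastic transition matrix (so all states are equally frequent), Hebbian associations accumulated through an exponentially decaying trace are proportional to the analytic successor representation M = (I - gamma*T)^(-1). This is a minimal sketch with illustrative parameters, not the paper's simulations:

```python
import numpy as np

rng = np.random.default_rng(0)
n, gamma = 4, 0.5

# Doubly stochastic transition matrix, so the stationary distribution is uniform
T = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.5, 0.0, 0.0, 0.5],
              [0.5, 0.0, 0.0, 0.5],
              [0.0, 0.5, 0.5, 0.0]])

# Analytic successor representation
M = np.linalg.inv(np.eye(n) - gamma * T)

# Hebbian temporal associations via an exponentially decaying trace
W = np.zeros((n, n))
trace = np.zeros(n)
s = 0
for _ in range(100_000):
    trace *= gamma
    trace[s] += 1.0        # the current state enters the trace (the k = 0 term)
    W[:, s] += trace       # bind all (decayed) past states to the current state
    s = rng.choice(n, p=T[s])
```

After normalization the Hebbian matrix W closely matches M; with a nonuniform state distribution the two would differ by row scaling but can still correlate highly, as the abstract notes.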
Affiliation(s)
- Niels J Verosky
- Department of Psychology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

4
Zhao C, Ong JH, Veic A, Patel AD, Jiang C, Fogel AR, Wang L, Hou Q, Das D, Crasto C, Chakrabarti B, Williams TI, Loutrari A, Liu F. Predictive processing of music and language in autism: Evidence from Mandarin and English speakers. Autism Res 2024; 17:1230-1257. PMID: 38651566. DOI: 10.1002/aur.3133.
Abstract
Atypical predictive processing has been associated with autism across multiple domains, based mainly on artificial antecedents and consequents. As structured sequences where expectations derive from implicit learning of combinatorial principles, language and music provide naturalistic stimuli for investigating predictive processing. In this study, we matched melodic and sentence stimuli in cloze probabilities and examined musical and linguistic prediction in Mandarin- (Experiment 1) and English-speaking (Experiment 2) autistic and non-autistic individuals using both production and perception tasks. In the production tasks, participants listened to unfinished melodies/sentences and then produced the final notes/words to complete these items. In the perception tasks, participants provided expectedness ratings of the completed melodies/sentences based on the most frequent notes/words in the norms. While Experiment 1 showed intact musical prediction but atypical linguistic prediction in autism in the Mandarin sample that demonstrated imbalanced musical training experience and receptive vocabulary skills between groups, the group difference disappeared in a more closely matched sample of English speakers in Experiment 2. These findings suggest the importance of taking an individual differences approach when investigating predictive processing in music and language in autism, as the difficulty in prediction in autism may not be due to generalized problems with prediction in any type of complex sequence processing.
Affiliation(s)
- Chen Zhao
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Jia Hoong Ong
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Anamarija Veic
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Aniruddh D Patel
- Department of Psychology, Tufts University, Medford, Massachusetts, USA
- Program in Brain, Mind, and Consciousness, Canadian Institute for Advanced Research (CIFAR), Toronto, Canada
- Cunmei Jiang
- Music College, Shanghai Normal University, Shanghai, China
- Allison R Fogel
- Department of Psychology, Tufts University, Medford, Massachusetts, USA
- Li Wang
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Qingqi Hou
- Department of Music and Dance, Nanjing Normal University of Special Education, Nanjing, China
- Dipsikha Das
- School of Psychology, Keele University, Staffordshire, UK
- Cara Crasto
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Bhismadev Chakrabarti
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Tim I Williams
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Ariadne Loutrari
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Fang Liu
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK

5
|
Endress AD. Hebbian learning can explain rhythmic neural entrainment to statistical regularities. Dev Sci 2024:e13487. PMID: 38372153. DOI: 10.1111/desc.13487.
Abstract
In many domains, learners extract recurring units from continuous sequences. For example, in unknown languages, fluent speech is perceived as a continuous signal. Learners need to extract the underlying words from this continuous signal and then memorize them. One prominent candidate mechanism is statistical learning, whereby learners track how predictive syllables (or other items) are of one another. Syllables within the same word predict each other better than syllables straddling word boundaries. But does statistical learning lead to memories of the underlying words-or just to pairwise associations among syllables? Electrophysiological results provide the strongest evidence for the memory view. Electrophysiological responses can be time-locked to statistical word boundaries (e.g., N400s) and show rhythmic activity with a periodicity of word durations. Here, I reproduce such results with a simple Hebbian network. When exposed to statistically structured syllable sequences (and when the underlying words are not excessively long), the network activation is rhythmic with the periodicity of a word duration and activation maxima on word-final syllables. This is because word-final syllables receive more excitation from earlier syllables with which they are associated than less predictable syllables that occur earlier in words. The network is also sensitive to information whose electrophysiological correlates were used to support the encoding of ordinal positions within words. Hebbian learning can thus explain rhythmic neural activity in statistical learning tasks without any memory representations of words. Learners might thus need to rely on cues beyond statistical associations to learn the words of their native language.
RESEARCH HIGHLIGHTS:
- Statistical learning may be utilized to identify recurring units in continuous sequences (e.g., words in fluent speech) but may not generate explicit memory for words.
- Exposure to statistically structured sequences leads to rhythmic activity with a period of the duration of the underlying units (e.g., words).
- I show that a memory-less Hebbian network model can reproduce this rhythmic neural activity as well as putative encodings of ordinal positions observed in earlier research.
- Direct tests are needed to establish whether statistical learning leads to declarative memories for words.
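A minimal sketch of the mechanism described above, assuming one-hot syllables, an exponentially decaying activation trace, and outer-product Hebbian updates. This is a simplification, not the paper's exact network, but it shows the key effect: word-final syllables receive the most learned excitation, so activation rises within each word and peaks at the word period.

```python
import numpy as np

rng = np.random.default_rng(1)
n_words, word_len = 4, 3
n_syll = n_words * word_len
words = np.arange(n_syll).reshape(n_words, word_len)  # word 0 = syllables 0,1,2; etc.

def stream(n_tokens):
    """Concatenate randomly chosen words into a continuous syllable sequence."""
    seq = []
    for _ in range(n_tokens):
        seq.extend(words[rng.integers(n_words)])
    return seq

decay, lr = 0.5, 0.01
W = np.zeros((n_syll, n_syll))   # associative weights: past syllable -> current syllable
trace = np.zeros(n_syll)         # exponentially decaying activation trace

# Familiarization: Hebbian outer-product learning on the decaying trace
for s in stream(2000):
    trace *= decay
    W[:, s] += lr * trace        # bind recently active syllables to the current one
    trace[s] += 1.0

# Test phase: each syllable's activation = bottom-up input + learned excitation
trace = np.zeros(n_syll)
activation = [[], [], []]        # grouped by within-word position (0, 1, 2)
for s in stream(200):
    trace *= decay
    activation[s % word_len].append(1.0 + trace @ W[:, s])
    trace[s] += 1.0

pos_means = [float(np.mean(a)) for a in activation]
```

With these parameters the mean activation rises toward the word-final position, because later syllables within a word are reliably predicted by (and associated with) their predecessors, while word-initial syllables receive only weak cross-boundary excitation.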
Affiliation(s)
- Ansgar D Endress
- Department of Psychology, City, University of London, London, UK

6
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. Sci Adv 2024; 10:eadk0010. PMID: 38363839. PMCID: PMC10871532. DOI: 10.1126/sciadv.adk0010.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception varies along several pitch-based dimensions: (i) the absolute pitch of notes, (ii) the difference in pitch between successive notes, and (iii) the statistical expectation of each note given prior context. How the brain represents these dimensions and whether their encoding is specialized for music remains unknown. We recorded high-density neurophysiological activity directly from the human auditory cortex while participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial map for representing distinct melodic dimensions. The same participants listened to spoken English, and we compared responses to music and speech. Cortical sites selective for music encoded expectation, while sites that encoded pitch and pitch-change in music used the same neural code to represent equivalent properties of speech. Findings reveal how the perception of melody recruits both music-specific and general-purpose sound representations.
Affiliation(s)
- Narayan Sankaran
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K. Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Frederic Theunissen
- Department of Psychology, University of California, Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
- Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA

7
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. bioRxiv 2023:2023.10.17.562771. Preprint. PMID: 37905047. PMCID: PMC10614915. DOI: 10.1101/2023.10.17.562771.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception of melody varies along several pitch-based dimensions: (1) the absolute pitch of notes, (2) the difference in pitch between successive notes, and (3) the higher-order statistical expectation of each note conditioned on its prior context. While humans readily perceive melody, how these dimensions are collectively represented in the brain and whether their encoding is specialized for music remains unknown. Here, we recorded high-density neurophysiological activity directly from the surface of human auditory cortex while Western participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial code for representing distinct dimensions of melody. The same participants listened to spoken English, and we compared evoked responses to music and speech. Cortical sites selective for music were systematically driven by the encoding of expectation. In contrast, sites that encoded pitch and pitch-change used the same neural code to represent equivalent properties of speech. These findings reveal the multidimensional nature of melody encoding, consisting of both music-specific and domain-general sound representations in auditory cortex.
Teaser: The human brain contains both general-purpose and music-specific neural populations for processing distinct attributes of melody.

8
Endress AD, Johnson SP. Hebbian, correlational learning provides a memory-less mechanism for Statistical Learning irrespective of implementational choices: Reply to Tovar and Westermann (2022). Cognition 2023; 230:105290. PMID: 36240613. DOI: 10.1016/j.cognition.2022.105290.
Abstract
Statistical learning relies on detecting the frequency of co-occurrences of items and has been proposed to be crucial for a variety of learning problems, notably to learn and memorize words from fluent speech. Endress and Johnson (2021) (hereafter EJ) recently showed that such results can be explained based on simple memory-less correlational learning mechanisms such as Hebbian Learning. Tovar and Westermann (2022) (hereafter TW) reproduced these results with a different Hebbian model. We show that the main differences between the models are whether temporal decay acts on both the connection weights and the activations (in TW) or only on the activations (in EJ), and whether interference affects weights (in TW) or activations (in EJ). Given that weights and activations are linked through the Hebbian learning rule, the networks behave similarly. However, in contrast to TW, we do not believe that neurophysiological data are relevant to adjudicate between abstract psychological models with little biological detail. Taken together, both models show that different memory-less correlational learning mechanisms provide a parsimonious account of Statistical Learning results. They are consistent with evidence that Statistical Learning might not allow learners to learn and retain words, and Statistical Learning might support predictive processing instead.
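The implementational contrast at issue (decay acting on weights as well as activations, versus on activations only) can be sketched in a toy Hebbian learner. Both variants below are simplified illustrations, not the actual EJ or TW models; the point is that under either choice, within-word associations come out stronger than across-boundary ones, so the networks behave similarly.

```python
import numpy as np

rng = np.random.default_rng(2)

def hebbian(seq, n, act_decay, w_decay):
    """Toy Hebbian pair learner: activations always decay; weights optionally do."""
    W = np.zeros((n, n))
    trace = np.zeros(n)
    for s in seq:
        W *= w_decay          # weight decay (w_decay = 1.0 switches it off)
        trace *= act_decay
        W[:, s] += trace      # bind decayed past items to the current item
        trace[s] += 1.0
    return W

# Continuous sequence of two hypothetical two-syllable "words": (0,1) and (2,3)
pairs = [(0, 1), (2, 3)]
seq = [s for _ in range(300) for s in pairs[rng.integers(2)]]

ej_style = hebbian(seq, 4, act_decay=0.5, w_decay=1.0)    # decay on activations only
tw_style = hebbian(seq, 4, act_decay=0.5, w_decay=0.999)  # decay on weights too
```

In both variants the within-word association (0 to 1) exceeds the across-boundary association (1 to 2): weight decay rescales the learned matrix toward recent history but preserves the ordering that statistical-learning results depend on.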
Affiliation(s)
- Scott P Johnson
- Department of Psychology, University of California, Los Angeles, United States of America