1. Slugocki C, Kuk F, Korhonen P. Using the Mismatch Negativity to Evaluate Hearing Aid Directional Enhancement Based on Multistream Architecture. Ear Hear 2025; 46:747-757. PMID: 39699127; PMCID: PMC11984554; DOI: 10.1097/aud.0000000000001619.
Abstract
OBJECTIVES To evaluate whether hearing aid directivity based on multistream architecture (MSA) might enhance the mismatch negativity (MMN) evoked by phonemic contrasts in noise. DESIGN Single-blind within-subjects design. Fifteen older adults (mean age = 72.7 years, range = 40 to 88 years, 8 females) with a moderate-to-severe degree of sensorineural hearing loss participated. Participants first performed an adaptive two-alternative forced choice phonemic discrimination task to determine the speech level-that is, signal to noise ratio (SNR)-required to reliably discriminate between two monosyllabic stimuli (/ba/ and /da/) presented in the presence of ongoing fixed-level background noise. Participants were then presented with a phonemic oddball sequence alternating on each trial between two loudspeakers located in the front at 0° and -30° azimuth. This sequence presented the same monosyllabic stimuli in the same background noise at individualized SNRs determined by the phonemic discrimination task. The MMN was measured as participants passively listened to the oddball sequence in two hearing aid conditions: MSA-ON and MSA-OFF. RESULTS The magnitude of the MMN component was significantly enhanced when evoked in MSA-ON relative to MSA-OFF conditions. Unexpectedly, MMN magnitudes were also positively related to degrees of hearing loss. Neither MSA nor the participant's hearing loss was found to independently affect MMN latency. However, MMN latency was significantly affected by the interaction of hearing aid condition and individualized SNRs, where a negative relationship between individualized SNR and MMN latency was observed only in the MSA-OFF condition. CONCLUSIONS Hearing aid directivity based on the MSA approach was found to improve preattentive detection of phonemic contrasts in a simulated multi-talker situation as indexed by larger MMN component magnitudes. The MMN may generally be useful for exploring the underlying nature of speech-in-noise benefits conferred by some hearing aid features.
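The individualized-SNR procedure described above can be pictured with a generic adaptive two-alternative forced-choice staircase. The 2-down/1-up rule, step size, and simulated psychometric function below are illustrative assumptions, not the authors' exact protocol.

```python
import math
import random
import statistics

def simulate_listener(snr_db, threshold_db=-4.0, slope=1.0):
    """Probability of a correct /ba/-/da/ discrimination at a given SNR
    (logistic psychometric function; parameters are illustrative)."""
    p = 1 / (1 + math.exp(-(snr_db - threshold_db) / slope))
    return 0.5 + 0.5 * p  # 2AFC: performance floors at chance (50%)

def adaptive_2afc(start_snr=10.0, step_db=2.0, n_reversals=8):
    """Simple 2-down/1-up staircase converging near 70.7% correct."""
    snr, correct_in_a_row, direction = start_snr, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        correct = random.random() < simulate_listener(snr)
        if correct:
            correct_in_a_row += 1
            if correct_in_a_row == 2:      # two correct in a row -> harder
                correct_in_a_row = 0
                new_direction = -1
                snr -= step_db
            else:
                continue
        else:                              # one error -> easier
            correct_in_a_row = 0
            new_direction = +1
            snr += step_db
        if direction and new_direction != direction:
            reversals.append(snr)          # record SNR at each reversal
        direction = new_direction
    return statistics.mean(reversals)      # individualized SNR estimate (dB)

print(f"Estimated individualized SNR: {adaptive_2afc():.1f} dB")
```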
Affiliation(s)
- Christopher Slugocki: Office of Research in Clinical Amplification, WS Audiology, Lisle, Illinois, USA
- Francis Kuk: Office of Research in Clinical Amplification, WS Audiology, Lisle, Illinois, USA
- Petri Korhonen: Office of Research in Clinical Amplification, WS Audiology, Lisle, Illinois, USA
2. Usler E. An active inference account of stuttering behavior. Front Hum Neurosci 2025; 19:1498423. PMID: 40247916; PMCID: PMC12003396; DOI: 10.3389/fnhum.2025.1498423.
Abstract
This paper presents an interpretation of stuttering behavior, based on the principles of the active inference framework. Stuttering is a neurodevelopmental disorder characterized by speech disfluencies such as repetitions, prolongations, and blocks. The principles of active inference, a theory of predictive processing and sentient behavior, can be used to conceptualize stuttering as a disruption in perception-action cycling underlying speech production. The theory proposed here posits that stuttering arises from aberrant sensory precision and prediction error dynamics, inhibiting syllable initiation. Relevant to this theory, two hypothesized mechanisms are proposed: (1) a mistiming in precision dynamics, and (2) excessive attentional focus. Both highlight the role of neural oscillations, prediction error, and hierarchical integration in speech production. This framework also explains the contextual variability of stuttering behaviors, including adaptation effects and fluency-inducing conditions. Reframing stuttering as a synaptopathy integrates neurobiological, psychological, and behavioral dimensions, suggesting disruptions in precision-weighting mediated by neuromodulatory systems. This active inference perspective provides a unified account of stuttering and sets the stage for innovative research and therapeutic approaches.
Affiliation(s)
- Evan Usler: Department of Communication Sciences and Disorders, University of Delaware, Newark, DE, United States
3. Parr T, Oswal A, Manohar SG. Inferring when to move. Neurosci Biobehav Rev 2025; 169:105984. PMID: 39694432; DOI: 10.1016/j.neubiorev.2024.105984.
Abstract
Most of our movement consists of sequences of discrete actions at regular intervals-including speech, walking, playing music, or even chewing. Despite this, few models of the motor system address how the brain determines the interval at which to trigger actions. This paper offers a theoretical analysis of the problem of timing movements. We consider a scenario in which we must align an alternating movement with a regular external (auditory) stimulus. We assume that our brains employ generative world models that include internal clocks of various speeds. These allow us to associate a temporally regular sensory input with an internal clock, and actions with parts of that clock cycle. We treat this as a process of inferring which clock best explains sensory input. This offers a way in which temporally discrete choices might emerge from a continuous process. This is not straightforward, particularly if each of those choices unfolds during a time that has a (possibly unknown) duration. We develop a route for translation to neurology, in the context of Parkinson's disease-a disorder that characteristically slows down movements. The effects are often elicited in clinic by alternating movements. We find that it is possible to reproduce behavioural and electrophysiological features associated with parkinsonism by disrupting specific parameters-that determine the priors for inferences made by the brain. We observe three core features of Parkinson's disease: amplitude decrement, festination, and breakdown of repetitive movements. Our simulations provide a mechanistic interpretation of how pathology and therapeutics might influence behaviour and neural activity.
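The clock-selection idea can be sketched as Bayesian model comparison over candidate clock rates given observed auditory event times. The Gaussian timing-noise model, the candidate rates, and the flat prior below are illustrative assumptions, not the paper's generative model.

```python
import numpy as np

def clock_log_evidence(event_times, rate_hz, sigma=0.05):
    """Log likelihood of observed event times under a clock ticking at
    rate_hz, assuming Gaussian timing noise (sigma, in seconds)."""
    period = 1.0 / rate_hz
    # distance of each event from the nearest clock tick
    phase_error = np.remainder(event_times + period / 2, period) - period / 2
    return np.sum(-0.5 * (phase_error / sigma) ** 2
                  - np.log(sigma * np.sqrt(2 * np.pi)))

# Regular auditory stimulus at 2 Hz with a little jitter
rng = np.random.default_rng(0)
events = np.arange(0, 10, 0.5) + rng.normal(0, 0.02, 20)

candidate_rates = [1.0, 1.5, 2.0, 2.5, 3.0]     # internal clocks (Hz)
log_ev = np.array([clock_log_evidence(events, r) for r in candidate_rates])
posterior = np.exp(log_ev - log_ev.max())
posterior /= posterior.sum()                     # flat prior over clocks

for r, p in zip(candidate_rates, posterior):
    print(f"clock {r:.1f} Hz: posterior {p:.3f}")
```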
Affiliation(s)
- Thomas Parr: Nuffield Department of Clinical Neurosciences, University of Oxford, UK
- Ashwini Oswal: Nuffield Department of Clinical Neurosciences, University of Oxford, UK
- Sanjay G Manohar: Nuffield Department of Clinical Neurosciences, University of Oxford, UK; Department of Experimental Psychology, University of Oxford, UK
4. Bradshaw AR, Wheeler ED, McGettigan C, Lametti DR. Sensorimotor learning during synchronous speech is modulated by the acoustics of the other voice. Psychon Bull Rev 2025; 32:306-316. PMID: 38955989; PMCID: PMC11836077; DOI: 10.3758/s13423-024-02536-x.
Abstract
This study tested the hypothesis that speaking with other voices can influence sensorimotor predictions of one's own voice. Real-time manipulations of auditory feedback were used to drive sensorimotor adaptation in speech, while participants spoke sentences in synchrony with another voice, a task known to induce implicit imitation (phonetic convergence). The acoustic-phonetic properties of the other voice were manipulated between groups, such that convergence with it would either oppose (incongruent group, n = 15) or align with (congruent group, n = 16) speech motor adaptation. As predicted, significantly greater adaptation was seen in the congruent compared to the incongruent group. This suggests the use of shared sensory targets in speech for predicting the sensory outcomes of both the actions of others (speech perception) and the actions of the self (speech production). This finding has important implications for wider theories of shared predictive mechanisms across perception and action, such as active inference.
Affiliation(s)
- Abigail R Bradshaw: MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Emma D Wheeler: Department of Psychology, Acadia University, Wolfville, Nova Scotia, Canada
- Carolyn McGettigan: Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Daniel R Lametti: Department of Psychology, Acadia University, Wolfville, Nova Scotia, Canada
5. Chen YP, Neff P, Leske S, Wong DDE, Peter N, Obleser J, Kleinjung T, Dimitrijevic A, Dalal SS, Weisz N. Cochlear implantation in adults with acquired single-sided deafness improves cortical processing and comprehension of speech presented to the non-implanted ears: a longitudinal EEG study. Brain Commun 2025; 7:fcaf001. PMID: 39816191; PMCID: PMC11733687; DOI: 10.1093/braincomms/fcaf001.
Abstract
Former studies have established that individuals with a cochlear implant (CI) for treating single-sided deafness experience improved speech processing after implantation. However, it is not clear how each ear contributes separately to improve speech perception over time at the behavioural and neural level. In this longitudinal EEG study with four different time points, we measured neural activity in response to various temporally and spectrally degraded spoken words presented monaurally to the CI and non-CI ears (5 left and 5 right ears) in 10 single-sided CI users and 10 age- and sex-matched individuals with normal hearing. Subjective comprehension ratings for each word were also recorded. Data from single-sided CI participants were collected pre-CI implantation, and at 3, 6 and 12 months after implantation. We conducted a time-resolved representational similarity analysis on the EEG data to quantify whether and how neural patterns became more similar to those of normal hearing individuals. At 6 months after implantation, the speech comprehension ratings for the degraded words improved in both ears. Notably, the improvement was more pronounced for the non-CI ears than the CI ears. Furthermore, the enhancement in the non-CI ears was paralleled by increased similarity to neural representational patterns of the normal hearing control group. The maximum of this effect coincided with peak decoding accuracy for spoken-word comprehension (600-1200 ms after stimulus onset). The present data demonstrate that cortical processing gradually normalizes within months after CI implantation for speech presented to the non-CI ear. CI enables the deaf ear to provide afferent input, which, according to our results, complements the input of the non-CI ear, gradually improving its function. These novel findings underscore the feasibility of tracking neural recovery after auditory input restoration using advanced multivariate analysis methods, such as representational similarity analysis.
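A minimal sketch of time-resolved representational similarity analysis of the kind described above: at each time point, a representational dissimilarity matrix (RDM) over stimuli is computed from the multichannel pattern and compared (Spearman) with a reference RDM. The array shapes, distance metric, and toy data are assumptions for illustration, not the study's pipeline.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def time_resolved_rsa(patient_eeg, reference_eeg):
    """patient_eeg, reference_eeg: arrays of shape (n_stimuli, n_channels, n_times).
    Returns, per time point, the Spearman correlation between the two
    representational dissimilarity matrices (correlation distance across channels)."""
    n_times = patient_eeg.shape[-1]
    similarity = np.empty(n_times)
    for t in range(n_times):
        rdm_patient = pdist(patient_eeg[:, :, t], metric="correlation")
        rdm_reference = pdist(reference_eeg[:, :, t], metric="correlation")
        similarity[t], _ = spearmanr(rdm_patient, rdm_reference)
    return similarity

# Toy data: 8 degraded words x 32 channels x 50 time points
rng = np.random.default_rng(1)
patient = rng.normal(size=(8, 32, 50))
control = rng.normal(size=(8, 32, 50))
print(time_resolved_rsa(patient, control)[:5])
```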
Affiliation(s)
- Ya-Ping Chen: Centre for Cognitive Neuroscience, University of Salzburg, 5020 Salzburg, Austria; Department of Psychology, University of Salzburg, 5020 Salzburg, Austria
- Patrick Neff: Centre for Cognitive Neuroscience, University of Salzburg, 5020 Salzburg, Austria; Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland; Department of Psychiatry and Psychotherapy, University of Regensburg, 93053 Regensburg, Germany; Neuro-X Institute, Ecole Polytechnique Fédérale de Lausanne (EPFL), Campus Biotech, 1202 Geneva, Switzerland
- Sabine Leske: RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, 0313 Oslo, Norway; Department of Musicology, University of Oslo, 0313 Oslo, Norway; Department of Neuropsychology, Helgeland Hospital, 8657 Mosjøen, Norway; Department of Psychology, Universität Konstanz, 78457 Konstanz, Germany
- Daniel D E Wong: Department of Psychology, Universität Konstanz, 78457 Konstanz, Germany
- Nicole Peter: Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
- Jonas Obleser: Center of Brain, Behavior, and Metabolism, University of Lübeck, 23562 Lübeck, Germany; Department of Psychology, University of Lübeck, 23562 Lübeck, Germany
- Tobias Kleinjung: Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, 8091 Zurich, Switzerland
- Andrew Dimitrijevic: Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, ON M4N 3M5, Canada; Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, ON M4N 3M5, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, ON M5S 3H2, Canada
- Sarang S Dalal: Department of Psychology, Universität Konstanz, 78457 Konstanz, Germany; Department of Clinical Medicine, Center of Functionally Integrative Neuroscience, Aarhus University, 8200 Aarhus, Denmark
- Nathan Weisz: Centre for Cognitive Neuroscience, University of Salzburg, 5020 Salzburg, Austria; Department of Psychology, University of Salzburg, 5020 Salzburg, Austria; Neuroscience Institute, Christian Doppler University Hospital, Paracelsus Medical University, 5020 Salzburg, Austria
6. Parr T, Friston K, Pezzulo G. Generative models for sequential dynamics in active inference. Cogn Neurodyn 2024; 18:3259-3272. PMID: 39712086; PMCID: PMC11655747; DOI: 10.1007/s11571-023-09963-x.
Abstract
A central theme of theoretical neurobiology is that most of our cognitive operations require processing of discrete sequences of items. This processing in turn emerges from continuous neuronal dynamics. Notable examples are sequences of words during linguistic communication or sequences of locations during navigation. In this perspective, we address the problem of sequential brain processing from the perspective of active inference, which inherits from a Helmholtzian view of the predictive (Bayesian) brain. Underneath the active inference lies a generative model; namely, a probabilistic description of how (observable) consequences are generated by (unobservable) causes. We show that one can account for many aspects of sequential brain processing by assuming the brain entails a generative model of the sensed world that comprises central pattern generators, narratives, or well-defined sequences. We provide examples in the domains of motor control (e.g., handwriting), perception (e.g., birdsong recognition) through to planning and understanding (e.g., language). The solutions to these problems include the use of sequences of attracting points to direct complex movements-and the move from continuous representations of auditory speech signals to the discrete words that generate those signals.
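The "sequences of attracting points" idea mentioned at the end of the abstract can be pictured with a toy two-dimensional state that is drawn toward one target after another. The gain, dwell time, and target list below are arbitrary illustrations, not parameters from the paper.

```python
import numpy as np

def follow_attractor_sequence(targets, dt=0.01, gain=4.0, dwell=1.0):
    """Integrate x' = gain * (target - x), switching the attracting point
    every `dwell` seconds; returns the resulting 2-D trajectory."""
    x = np.array(targets[0], dtype=float)
    trajectory = []
    for target in targets:
        for _ in range(int(dwell / dt)):
            x += dt * gain * (np.asarray(target) - x)   # flow toward current attractor
            trajectory.append(x.copy())
    return np.array(trajectory)

# A crude "stroke sequence": four attracting points traced in order
strokes = [(0, 0), (1, 2), (2, 0), (3, 2)]
path = follow_attractor_sequence(strokes)
print(path.shape, path[-1])   # final state sits near the last attractor
```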
Affiliation(s)
- Thomas Parr: Wellcome Centre for Human Neuroimaging, Queen Square Institute of Neurology, University College London, London, UK
- Karl Friston: Wellcome Centre for Human Neuroimaging, Queen Square Institute of Neurology, University College London, London, UK
- Giovanni Pezzulo: Institute of Cognitive Sciences and Technologies, National Research Council, Via S. Martino Della Battaglia, 44, 00185 Rome, Italy
7. Priorelli M, Stoianov IP. Slow but flexible or fast but rigid? Discrete and continuous processes compared. Heliyon 2024; 10:e39129. PMID: 39497980; PMCID: PMC11532823; DOI: 10.1016/j.heliyon.2024.e39129.
Abstract
A tradeoff exists when dealing with complex tasks composed of multiple steps. High-level cognitive processes can find the best sequence of actions to achieve a goal in uncertain environments, but they are slow and require significant computational demand. In contrast, lower-level processing allows reacting to environmental stimuli rapidly, but with limited capacity to determine optimal actions or to replan when expectations are not met. Through reiteration of the same task, biological organisms find the optimal tradeoff: from action primitives, composite trajectories gradually emerge by creating task-specific neural structures. The two frameworks of active inference - a recent brain paradigm that views action and perception as subject to the same free energy minimization imperative - well capture high-level and low-level processes of human behavior, but how task specialization occurs in these terms is still unclear. In this study, we compare two strategies on a dynamic pick-and-place task: a hybrid (discrete-continuous) model with planning capabilities and a continuous-only model with fixed transitions. Both models rely on a hierarchical (intrinsic and extrinsic) structure, well suited for defining reaching and grasping movements, respectively. Our results show that continuous-only models perform better and with minimal resource expenditure but at the cost of less flexibility. Finally, we propose how discrete actions might lead to continuous attractors and compare the two frameworks with different motor learning phases, laying the foundations for further studies on bio-inspired task adaptation.
Affiliation(s)
- Matteo Priorelli: Institute of Cognitive Sciences and Technologies, National Research Council of Italy, Padova, Italy
- Ivilin Peev Stoianov: Institute of Cognitive Sciences and Technologies, National Research Council of Italy, Padova, Italy
8. Medrano J, Sajid N. A Broken Duet: Multistable Dynamics in Dyadic Interactions. Entropy (Basel) 2024; 26:731. PMID: 39330066; PMCID: PMC11431444; DOI: 10.3390/e26090731.
Abstract
Misunderstandings in dyadic interactions often persist despite our best efforts, particularly between native and non-native speakers, resembling a broken duet that refuses to harmonise. This paper delves into the computational mechanisms underpinning these misunderstandings through the lens of the broken Lorenz system-a continuous dynamical model. By manipulating a specific parameter regime, we induce bistability within the Lorenz equations, thereby confining trajectories to distinct attractors based on initial conditions. This mirrors the persistence of divergent interpretations that often result in misunderstandings. Our simulations reveal that differing prior beliefs between interlocutors result in misaligned generative models, leading to stable yet divergent states of understanding when exposed to the same percept. Specifically, native speakers equipped with precise (i.e., overconfident) priors expect inputs to align closely with their internal models, thus struggling with unexpected variations. Conversely, non-native speakers with imprecise (i.e., less confident) priors exhibit a greater capacity to adjust and accommodate unforeseen inputs. Our results underscore the important role of generative models in facilitating mutual understanding (i.e., establishing a shared narrative) and highlight the necessity of accounting for multistable dynamics in dyadic interactions.
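The authors' specific "broken" modification of the Lorenz equations is not reproduced here. As a stand-in for bistability, the sketch below places the standard Lorenz system in a parameter regime (rho below the value at which the two symmetric fixed points lose stability) where different initial conditions settle onto different attractors, mirroring two stable but divergent interpretations.

```python
import numpy as np

def lorenz_trajectory(x0, sigma=10.0, rho=10.0, beta=8.0 / 3.0,
                      dt=0.002, steps=100_000):
    """Euler-integrate the Lorenz system; for this sigma/beta, rho below
    ~24.74 keeps the fixed points C+/C- stable, so the system is bistable."""
    x, y, z = x0
    for _ in range(steps):
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
    return np.array([x, y, z])

# The two stable fixed points for rho = 10: x = +/- sqrt(beta * (rho - 1))
c = np.sqrt((8.0 / 3.0) * 9.0)
print("fixed points at x =", c, "and", -c)

# Two "interlocutors" with different initial conditions end up on
# different attractors, despite obeying the same dynamics.
print(lorenz_trajectory((1.0, 1.0, 1.0)))    # settles near +c
print(lorenz_trajectory((-1.0, -1.0, 1.0)))  # settles near -c
```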
Affiliation(s)
- Johan Medrano: Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3AR, UK
- Noor Sajid: Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany
9. Murphy E, Holmes E, Friston K. Natural language syntax complies with the free-energy principle. Synthese 2024; 203:154. PMID: 38706520; PMCID: PMC11068586; DOI: 10.1007/s11229-024-04566-3.
Abstract
Natural language syntax yields an unbounded array of hierarchically structured expressions. We claim that these are used in the service of active inference in accord with the free-energy principle (FEP). While conceptual advances alongside modelling and simulation work have attempted to connect speech segmentation and linguistic communication with the FEP, we extend this program to the underlying computations responsible for generating syntactic objects. We argue that recently proposed principles of economy in language design-such as "minimal search" criteria from theoretical syntax-adhere to the FEP. This affords a greater degree of explanatory power to the FEP-with respect to higher language functions-and offers linguistics a grounding in first principles with respect to computability. While we mostly focus on building new principled conceptual relations between syntax and the FEP, we also show through a sample of preliminary examples how both tree-geometric depth and a Kolmogorov complexity estimate (recruiting a Lempel-Ziv compression algorithm) can be used to accurately predict legal operations on syntactic workspaces, directly in line with formulations of variational free energy minimization. This is used to motivate a general principle of language design that we term Turing-Chomsky Compression (TCC). We use TCC to align concerns of linguists with the normative account of self-organization furnished by the FEP, by marshalling evidence from theoretical linguistics and psycholinguistics to ground core principles of efficient syntactic computation within active inference.
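The Kolmogorov complexity estimate mentioned above is obtained with a Lempel-Ziv compression algorithm. The snippet below shows the general idea using zlib's DEFLATE (an LZ77-family compressor) as a stand-in, comparing the compressed sizes of two bracketed strings; both strings are invented for illustration and are not the paper's stimuli.

```python
import zlib

def lz_complexity_proxy(s: str) -> int:
    """Compressed size in bytes as a crude upper bound on Kolmogorov
    complexity (DEFLATE is an LZ77-family compressor)."""
    return len(zlib.compress(s.encode("utf-8"), 9))

# Two bracketed workspace descriptions (invented examples): a flat,
# repetitive structure versus a more deeply nested one.
flat = "[the cat][the dog][the cat][the dog][the cat][the dog]"
nested = "[the [cat [that [chased [the [dog [that barked]]]]]]]"

for label, s in [("flat", flat), ("nested", nested)]:
    print(f"{label:>6}: {len(s)} chars -> {lz_complexity_proxy(s)} bytes compressed")
```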
Affiliation(s)
- Elliot Murphy: Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center, Houston, TX 77030, USA; Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center, Houston, TX 77030, USA
- Emma Holmes: Department of Speech Hearing and Phonetic Sciences, University College London, London WC1N 1PF, UK; The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London WC1N 3AR, UK
- Karl Friston: The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London WC1N 3AR, UK
10. Pezzulo G, Parr T, Friston K. Active inference as a theory of sentient behavior. Biol Psychol 2024; 186:108741. PMID: 38182015; DOI: 10.1016/j.biopsycho.2023.108741.
Abstract
This review paper offers an overview of the history and future of active inference-a unifying perspective on action and perception. Active inference is based upon the idea that sentient behavior depends upon our brains' implicit use of internal models to predict, infer, and direct action. Our focus is upon the conceptual roots and development of this theory of (basic) sentience and does not follow a rigid chronological narrative. We trace the evolution from Helmholtzian ideas on unconscious inference, through to a contemporary understanding of action and perception. In doing so, we touch upon related perspectives, the neural underpinnings of active inference, and the opportunities for future development. Key steps in this development include the formulation of predictive coding models and related theories of neuronal message passing, the use of sequential models for planning and policy optimization, and the importance of hierarchical (temporally) deep internal (i.e., generative or world) models. Active inference has been used to account for aspects of anatomy and neurophysiology, to offer theories of psychopathology in terms of aberrant precision control, and to unify extant psychological theories. We anticipate further development in all these areas and note the exciting early work applying active inference beyond neuroscience. This suggests a future not just in biology, but in robotics, machine learning, and artificial intelligence.
Affiliation(s)
- Giovanni Pezzulo: Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Thomas Parr: Nuffield Department of Clinical Neurosciences, University of Oxford, UK
- Karl Friston: Wellcome Centre for Human Neuroimaging, Queen Square Institute of Neurology, University College London, London, UK; VERSES AI Research Lab, Los Angeles, CA 90016, USA
11. Friston KJ, Parr T, Heins C, Constant A, Friedman D, Isomura T, Fields C, Verbelen T, Ramstead M, Clippinger J, Frith CD. Federated inference and belief sharing. Neurosci Biobehav Rev 2024; 156:105500. PMID: 38056542; PMCID: PMC11139662; DOI: 10.1016/j.neubiorev.2023.105500.
Abstract
This paper concerns the distributed intelligence or federated inference that emerges under belief-sharing among agents who share a common world-and world model. Imagine, for example, several animals keeping a lookout for predators. Their collective surveillance rests upon being able to communicate their beliefs-about what they see-among themselves. But, how is this possible? Here, we show how all the necessary components arise from minimising free energy. We use numerical studies to simulate the generation, acquisition and emergence of language in synthetic agents. Specifically, we consider inference, learning and selection as minimising the variational free energy of posterior (i.e., Bayesian) beliefs about the states, parameters and structure of generative models, respectively. The common theme-that attends these optimisation processes-is the selection of actions that minimise expected free energy, leading to active inference, learning and model selection (a.k.a., structure learning). We first illustrate the role of communication in resolving uncertainty about the latent states of a partially observed world, on which agents have complementary perspectives. We then consider the acquisition of the requisite language-entailed by a likelihood mapping from an agent's beliefs to their overt expression (e.g., speech)-showing that language can be transmitted across generations by active learning. Finally, we show that language is an emergent property of free energy minimisation, when agents operate within the same econiche. We conclude with a discussion of various perspectives on these phenomena; ranging from cultural niche construction, through federated learning, to the emergence of complexity in ensembles of self-organising systems.
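A toy numerical illustration of how sharing beliefs resolves uncertainty when two agents hold complementary, partial views of the same latent state. The four-location scenario, the likelihoods, and the normalised product rule are illustrative assumptions, not the paper's generative model.

```python
import numpy as np

def normalise(p):
    return p / p.sum()

def entropy(p):
    return float(-(p * np.log(p + 1e-12)).sum())

# Latent state: which of four locations hides a predator.
prior = np.full(4, 0.25)

# Each agent sees only part of the scene, so its likelihood is
# informative about some locations and flat over the others.
likelihood_a = np.array([0.7, 0.1, 0.1, 0.1])      # agent A watches location 0
likelihood_b = np.array([0.25, 0.25, 0.05, 0.45])  # agent B watches locations 2-3

posterior_a = normalise(prior * likelihood_a)
posterior_b = normalise(prior * likelihood_b)

# Belief sharing: both observations are combined under a common world model
shared = normalise(prior * likelihood_a * likelihood_b)

print("A alone :", posterior_a.round(2), "entropy", round(entropy(posterior_a), 2))
print("B alone :", posterior_b.round(2), "entropy", round(entropy(posterior_b), 2))
print("shared  :", shared.round(2), "entropy", round(entropy(shared), 2))
```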
Affiliation(s)
- Karl J Friston: Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, UK; VERSES AI Research Lab, Los Angeles, CA 90016, USA
- Thomas Parr: Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, UK
- Conor Heins: VERSES AI Research Lab, Los Angeles, CA 90016, USA; Department of Collective Behaviour, Max Planck Institute of Animal Behavior, 78457 Konstanz, Germany; Centre for the Advanced Study of Collective Behaviour, 78457 Konstanz, Germany; Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Axel Constant: VERSES AI Research Lab, Los Angeles, CA 90016, USA; School of Engineering and Informatics, The University of Sussex, Brighton, UK
- Daniel Friedman: Department of Entomology and Nematology, University of California, Davis, Davis, CA, USA; Active Inference Institute, Davis, CA 95616, USA
- Takuya Isomura: Brain Intelligence Theory Unit, RIKEN Center for Brain Science, Wako, Saitama 351-0198, Japan
- Chris Fields: Allen Discovery Center at Tufts University, Medford, MA 02155, USA
- Tim Verbelen: VERSES AI Research Lab, Los Angeles, CA 90016, USA
- Maxwell Ramstead: Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, UK; VERSES AI Research Lab, Los Angeles, CA 90016, USA
- Christopher D Frith: Institute of Philosophy, School of Advanced Studies, University of London, UK
12. Schilling A, Sedley W, Gerum R, Metzner C, Tziridis K, Maier A, Schulze H, Zeng FG, Friston KJ, Krauss P. Predictive coding and stochastic resonance as fundamental principles of auditory phantom perception. Brain 2023; 146:4809-4825. PMID: 37503725; PMCID: PMC10690027; DOI: 10.1093/brain/awad255.
Abstract
Mechanistic insight is achieved only when experiments are employed to test formal or computational models. Furthermore, in analogy to lesion studies, phantom perception may serve as a vehicle to understand the fundamental processing principles underlying healthy auditory perception. With a special focus on tinnitus-as the prime example of auditory phantom perception-we review recent work at the intersection of artificial intelligence, psychology and neuroscience. In particular, we discuss why everyone with tinnitus suffers from (at least hidden) hearing loss, but not everyone with hearing loss suffers from tinnitus. We argue that intrinsic neural noise is generated and amplified along the auditory pathway as a compensatory mechanism to restore normal hearing based on adaptive stochastic resonance. The neural noise increase can then be misinterpreted as auditory input and perceived as tinnitus. This mechanism can be formalized in the Bayesian brain framework, where the percept (posterior) assimilates a prior prediction (brain's expectations) and likelihood (bottom-up neural signal). A higher mean and lower variance (i.e. enhanced precision) of the likelihood shifts the posterior, evincing a misinterpretation of sensory evidence, which may be further confounded by plastic changes in the brain that underwrite prior predictions. Hence, two fundamental processing principles provide the most explanatory power for the emergence of auditory phantom perceptions: predictive coding as a top-down and adaptive stochastic resonance as a complementary bottom-up mechanism. We conclude that both principles also play a crucial role in healthy auditory perception. Finally, in the context of neuroscience-inspired artificial intelligence, both processing principles may serve to improve contemporary machine learning techniques.
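The Gaussian belief update described above (the posterior as a precision-weighted blend of prior and likelihood) can be written in a few lines. The numbers are arbitrary and only illustrate how raising the mean and precision of the "evidence" pulls the posterior away from a silence prior toward a phantom percept.

```python
def gaussian_posterior(prior_mean, prior_precision, like_mean, like_precision):
    """Precision-weighted fusion of a Gaussian prior and likelihood."""
    post_precision = prior_precision + like_precision
    post_mean = (prior_precision * prior_mean
                 + like_precision * like_mean) / post_precision
    return post_mean, post_precision

# Prior expectation: silence (perceived loudness 0, in arbitrary units)
prior_mean, prior_precision = 0.0, 1.0

# Normal case: intrinsic neural noise with low mean and low precision
print(gaussian_posterior(prior_mean, prior_precision, like_mean=0.2, like_precision=0.5))

# Amplified intrinsic noise: higher mean *and* higher precision of the
# likelihood -> posterior shifts toward a non-zero (phantom) percept
print(gaussian_posterior(prior_mean, prior_precision, like_mean=1.0, like_precision=4.0))
```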
Affiliation(s)
- Achim Schilling: Neuroscience Lab, University Hospital Erlangen, 91054 Erlangen, Germany; Cognitive Computational Neuroscience Group, University Erlangen-Nürnberg, 91058 Erlangen, Germany
- William Sedley: Translational and Clinical Research Institute, Newcastle University Medical School, Newcastle upon Tyne NE2 4HH, UK
- Richard Gerum: Cognitive Computational Neuroscience Group, University Erlangen-Nürnberg, 91058 Erlangen, Germany; Department of Physics and Astronomy and Center for Vision Research, York University, Toronto, ON M3J 1P3, Canada
- Claus Metzner: Neuroscience Lab, University Hospital Erlangen, 91054 Erlangen, Germany
- Andreas Maier: Pattern Recognition Lab, University Erlangen-Nürnberg, 91058 Erlangen, Germany
- Holger Schulze: Neuroscience Lab, University Hospital Erlangen, 91054 Erlangen, Germany
- Fan-Gang Zeng: Center for Hearing Research, Departments of Anatomy and Neurobiology, Biomedical Engineering, Cognitive Sciences, Otolaryngology–Head and Neck Surgery, University of California Irvine, Irvine, CA 92697, USA
- Karl J Friston: Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London WC1N 3AR, UK
- Patrick Krauss: Neuroscience Lab, University Hospital Erlangen, 91054 Erlangen, Germany; Cognitive Computational Neuroscience Group, University Erlangen-Nürnberg, 91058 Erlangen, Germany; Pattern Recognition Lab, University Erlangen-Nürnberg, 91058 Erlangen, Germany
13. Parr T, Limanowski J. Synchronising our internal clocks: Comment on: "An active inference model of hierarchical action understanding, learning and imitation" by Proietti et al. Phys Life Rev 2023; 46:258-260. PMID: 37544051; DOI: 10.1016/j.plrev.2023.07.013.
Affiliation(s)
- Thomas Parr: Wellcome Centre for Human Neuroimaging, Queen Square Institute of Neurology, University College London, United Kingdom
14. Schubert J, Schmidt F, Gehmacher Q, Bresgen A, Weisz N. Cortical speech tracking is related to individual prediction tendencies. Cereb Cortex 2023; 33:6608-6619. PMID: 36617790; PMCID: PMC10233232; DOI: 10.1093/cercor/bhac528.
Abstract
Listening can be conceptualized as a process of active inference, in which the brain forms internal models to integrate auditory information in a complex interaction of bottom-up and top-down processes. We propose that individuals vary in their "prediction tendency" and that this variation contributes to experiential differences in everyday listening situations and shapes the cortical processing of acoustic input such as speech. Here, we presented tone sequences of varying entropy level, to independently quantify auditory prediction tendency (as the tendency to anticipate low-level acoustic features) for each individual. This measure was then used to predict cortical speech tracking in a multi speaker listening task, where participants listened to audiobooks narrated by a target speaker in isolation or interfered by 1 or 2 distractors. Furthermore, semantic violations were introduced into the story, to also examine effects of word surprisal during speech processing. Our results show that cortical speech tracking is related to prediction tendency. In addition, we find interactions between prediction tendency and background noise as well as word surprisal in disparate brain regions. Our findings suggest that individual prediction tendencies are generalizable across different listening situations and may serve as a valuable element to explain interindividual differences in natural listening situations.
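The "varying entropy level" of the tone sequences can be illustrated by describing sequences with first-order Markov transition matrices and computing their entropy rate. The four-tone alphabet and the two example matrices below are illustrative, not the study's stimulus parameters.

```python
import numpy as np

def entropy_rate(transition_matrix):
    """Entropy rate (bits/tone) of a first-order Markov chain, weighting each
    row's entropy by the chain's stationary distribution."""
    T = np.asarray(transition_matrix, dtype=float)
    eigvals, eigvecs = np.linalg.eig(T.T)
    stationary = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
    stationary = stationary / stationary.sum()
    row_entropy = -np.sum(np.where(T > 0, T * np.log2(T), 0.0), axis=1)
    return float(stationary @ row_entropy)

ordered = np.array([[0.85, 0.05, 0.05, 0.05],   # each tone mostly repeats itself
                    [0.05, 0.85, 0.05, 0.05],
                    [0.05, 0.05, 0.85, 0.05],
                    [0.05, 0.05, 0.05, 0.85]])
random_seq = np.full((4, 4), 0.25)              # every tone equally likely next

print(f"ordered sequence : {entropy_rate(ordered):.2f} bits/tone")
print(f"random  sequence : {entropy_rate(random_seq):.2f} bits/tone")
```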
Affiliation(s)
- Juliane Schubert: Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Austria
- Fabian Schmidt: Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Austria
- Quirin Gehmacher: Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Austria
- Annika Bresgen: Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Austria
- Nathan Weisz: Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Austria; Neuroscience Institute, Christian Doppler University Hospital, Paracelsus Medical University, Salzburg, Austria
15. Malh A, Wood S, Chung KC. The Art of Listening. Plast Reconstr Surg 2023; 151:921-926. PMID: 37185376; DOI: 10.1097/prs.0000000000010065.
Affiliation(s)
- Kevin C Chung: Section of Plastic Surgery, Department of Surgery, University of Michigan Medical School
16. Floegel M, Kasper J, Perrier P, Kell CA. How the conception of control influences our understanding of actions. Nat Rev Neurosci 2023; 24:313-329. PMID: 36997716; DOI: 10.1038/s41583-023-00691-z.
Abstract
Wilful movement requires neural control. Commonly, neural computations are thought to generate motor commands that bring the musculoskeletal system - that is, the plant - from its current physical state into a desired physical state. The current state can be estimated from past motor commands and from sensory information. Modelling movement on the basis of this concept of plant control strives to explain behaviour by identifying the computational principles for control signals that can reproduce the observed features of movements. From an alternative perspective, movements emerge in a dynamically coupled agent-environment system from the pursuit of subjective perceptual goals. Modelling movement on the basis of this concept of perceptual control aims to identify the controlled percepts and their coupling rules that can give rise to the observed characteristics of behaviour. In this Perspective, we discuss a broad spectrum of approaches to modelling human motor control and their notions of control signals, internal models, handling of sensory feedback delays and learning. We focus on the influence that the plant control and the perceptual control perspective may have on decisions when modelling empirical data, which may in turn shape our understanding of actions.
Affiliation(s)
- Mareike Floegel: Department of Neurology and Brain Imaging Center, Goethe University Frankfurt, Frankfurt, Germany
- Johannes Kasper: Department of Neurology and Brain Imaging Center, Goethe University Frankfurt, Frankfurt, Germany
- Pascal Perrier: Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Christian A Kell: Department of Neurology and Brain Imaging Center, Goethe University Frankfurt, Frankfurt, Germany
17. Su Y, MacGregor LJ, Olasagasti I, Giraud AL. A deep hierarchy of predictions enables online meaning extraction in a computational model of human speech comprehension. PLoS Biol 2023; 21:e3002046. PMID: 36947552; PMCID: PMC10079236; DOI: 10.1371/journal.pbio.3002046.
Abstract
Understanding speech requires mapping fleeting and often ambiguous soundwaves to meaning. While humans are known to exploit their capacity to contextualize to facilitate this process, how internal knowledge is deployed online remains an open question. Here, we present a model that extracts multiple levels of information from continuous speech online. The model applies linguistic and nonlinguistic knowledge to speech processing, by periodically generating top-down predictions and incorporating bottom-up incoming evidence in a nested temporal hierarchy. We show that a nonlinguistic context level provides semantic predictions informed by sensory inputs, which are crucial for disambiguating among multiple meanings of the same word. The explicit knowledge hierarchy of the model enables a more holistic account of the neurophysiological responses to speech compared to using lexical predictions generated by a neural network language model (GPT-2). We also show that hierarchical predictions reduce peripheral processing via minimizing uncertainty and prediction error. With this proof-of-concept model, we demonstrate that the deployment of hierarchical predictions is a possible strategy for the brain to dynamically utilize structured knowledge and make sense of the speech input.
Affiliation(s)
- Yaqing Su: Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Swiss National Centre of Competence in Research “Evolving Language” (NCCR EvolvingLanguage), Geneva, Switzerland
- Lucy J. MacGregor: Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Itsaso Olasagasti: Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Swiss National Centre of Competence in Research “Evolving Language” (NCCR EvolvingLanguage), Geneva, Switzerland
- Anne-Lise Giraud: Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Swiss National Centre of Competence in Research “Evolving Language” (NCCR EvolvingLanguage), Geneva, Switzerland; Institut Pasteur, Université Paris Cité, Inserm, Institut de l’Audition, Paris, France
18. Holmes E, Johnsrude IS. Intelligibility benefit for familiar voices is not accompanied by better discrimination of fundamental frequency or vocal tract length. Hear Res 2023; 429:108704. PMID: 36701896; DOI: 10.1016/j.heares.2023.108704.
Abstract
Speech is more intelligible when it is spoken by familiar than unfamiliar people. If this benefit arises because key voice characteristics like perceptual correlates of fundamental frequency or vocal tract length (VTL) are more accurately represented for familiar voices, listeners may be able to discriminate smaller manipulations to such characteristics for familiar than unfamiliar voices. We measured participants' (N = 17) thresholds for discriminating pitch (correlate of fundamental frequency, or glottal pulse rate) and formant spacing (correlate of VTL; 'VTL-timbre') for voices that were familiar (participants' friends) and unfamiliar (other participants' friends). As expected, familiar voices were more intelligible. However, discrimination thresholds were no smaller for the same familiar voices. The size of the intelligibility benefit for a familiar over an unfamiliar voice did not relate to the difference in discrimination thresholds for the same voices. Also, the familiar-voice intelligibility benefit was just as large following perceptible manipulations to pitch and VTL-timbre. These results are more consistent with cognitive accounts of speech perception than traditional accounts that predict better discrimination.
Affiliation(s)
- Emma Holmes: Department of Speech Hearing and Phonetic Sciences, UCL, London WC1N 1PF, UK; Brain and Mind Institute, University of Western Ontario, London, Ontario N6A 3K7, Canada
- Ingrid S Johnsrude: Brain and Mind Institute, University of Western Ontario, London, Ontario N6A 3K7, Canada; School of Communication Sciences and Disorders, University of Western Ontario, London, Ontario N6G 1H1, Canada
19. Adolfi F, Bowers JS, Poeppel D. Successes and critical failures of neural networks in capturing human-like speech recognition. Neural Netw 2023; 162:199-211. PMID: 36913820; DOI: 10.1016/j.neunet.2023.02.032.
Abstract
Natural and artificial audition can in principle acquire different solutions to a given problem. The constraints of the task, however, can nudge the cognitive science and engineering of audition to qualitatively converge, suggesting that a closer mutual examination would potentially enrich artificial hearing systems and process models of the mind and brain. Speech recognition - an area ripe for such exploration - is inherently robust in humans to a number of transformations at various spectrotemporal granularities. To what extent are these robustness profiles accounted for by high-performing neural network systems? We bring together experiments in speech recognition under a single synthesis framework to evaluate state-of-the-art neural networks as stimulus-computable, optimized observers. In a series of experiments, we (1) clarify how influential speech manipulations in the literature relate to each other and to natural speech, (2) show the granularities at which machines exhibit out-of-distribution robustness, reproducing classical perceptual phenomena in humans, (3) identify the specific conditions where model predictions of human performance differ, and (4) demonstrate a crucial failure of all artificial systems to perceptually recover where humans do, suggesting alternative directions for theory and model building. These findings encourage a tighter synergy between the cognitive science and engineering of audition.
Affiliation(s)
- Federico Adolfi: Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; University of Bristol, School of Psychological Science, Bristol, United Kingdom
- Jeffrey S Bowers: University of Bristol, School of Psychological Science, Bristol, United Kingdom
- David Poeppel: Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; Department of Psychology, New York University, NY, United States; Max Planck NYU Center for Language, Music, and Emotion, Frankfurt, Germany, New York, NY, United States
20. Söderström P, Cutler A. Early neuro-electric indication of lexical match in English spoken-word recognition. PLoS One 2023; 18:e0285286. PMID: 37200324; DOI: 10.1371/journal.pone.0285286.
Abstract
We investigated early electrophysiological responses to spoken English words embedded in neutral sentence frames, using a lexical decision paradigm. As words unfold in time, similar-sounding lexical items compete for recognition within 200 milliseconds after word onset. A small number of studies have previously investigated event-related potentials in this time window in English and French, with results differing in direction of effects as well as component scalp distribution. Investigations of spoken-word recognition in Swedish have reported an early left-frontally distributed event-related potential that increases in amplitude as a function of the probability of a successful lexical match as the word unfolds. Results from the present study indicate that the same process may occur in English: we propose that increased certainty of a 'word' response in a lexical decision task is reflected in the amplitude of an early left-anterior brain potential beginning around 150 milliseconds after word onset. This in turn is proposed to be connected to the probabilistically driven activation of possible upcoming word forms.
Affiliation(s)
- Pelle Söderström: Centre for Languages and Literature, Lund University, Lund, Sweden; MARCS Institute for Brain, Behaviour & Development, Western Sydney University, Penrith, Australia; ARC Centre of Excellence for the Dynamics of Language, St Lucia, Australia
- Anne Cutler: MARCS Institute for Brain, Behaviour & Development, Western Sydney University, Penrith, Australia; ARC Centre of Excellence for the Dynamics of Language, St Lucia, Australia; Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
21. Adolfi F, Wareham T, van Rooij I. A Computational Complexity Perspective on Segmentation as a Cognitive Subcomputation. Top Cogn Sci 2022; 15:255-273. PMID: 36453947; DOI: 10.1111/tops.12629.
Abstract
Computational feasibility is a widespread concern that guides the framing and modeling of natural and artificial intelligence. The specification of cognitive system capacities is often shaped by unexamined intuitive assumptions about the search space and complexity of a subcomputation. However, a mistaken intuition might make such initial conceptualizations misleading for what empirical questions appear relevant later on. We undertake here computational-level modeling and complexity analyses of segmentation - a widely hypothesized subcomputation that plays a requisite role in explanations of capacities across domains, such as speech recognition, music cognition, active sensing, event memory, action parsing, and statistical learning - as a case study to show how crucial it is to formally assess these assumptions. We mathematically prove two sets of results regarding computational hardness and search space size that may run counter to intuition, and position their implications with respect to existing views on the subcapacity.
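One intuitive assumption the paper asks us to examine is the size of the segmentation search space. The snippet below is only a worked illustration of that point, not the paper's formal analysis: it counts exhaustively that a sequence of n items admits 2^(n-1) contiguous segmentations, which rules out brute-force search for realistic input lengths.

```python
from itertools import combinations

def all_segmentations(seq):
    """Enumerate every way of cutting a sequence into contiguous segments."""
    n = len(seq)
    for k in range(n):                              # number of cut points
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [seq[a:b] for a, b in zip(bounds, bounds[1:])]

sequence = list("speechsounds")
count = sum(1 for _ in all_segmentations(sequence))
print(count, 2 ** (len(sequence) - 1))   # both 2048 for a 12-item sequence

# Exponential growth: brute force is hopeless for realistic input lengths
for n in (10, 20, 40, 80):
    print(f"n = {n:>2}: {2 ** (n - 1):.2e} candidate segmentations")
```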
Affiliation(s)
- Federico Adolfi: Ernst Strüngmann Institute for Neuroscience in Cooperation with Max Planck Society; School of Psychological Science, University of Bristol
- Todd Wareham: Department of Computer Science, Memorial University of Newfoundland
- Iris van Rooij: Donders Institute for Brain, Cognition, and Behaviour, Radboud University; School of Artificial Intelligence, Radboud University; Department of Linguistics, Cognitive Science, and Semiotics & Interacting Minds Centre, Aarhus University
22. Bornkessel-Schlesewsky I, Sharrad I, Howlett CA, Alday PM, Corcoran AW, Bellan V, Wilkinson E, Kliegl R, Lewis RL, Small SL, Schlesewsky M. Rapid adaptation of predictive models during language comprehension: Aperiodic EEG slope, individual alpha frequency and idea density modulate individual differences in real-time model updating. Front Psychol 2022; 13:817516. PMID: 36092106; PMCID: PMC9461998; DOI: 10.3389/fpsyg.2022.817516.
Abstract
Predictive coding provides a compelling, unified theory of neural information processing, including for language. However, there is insufficient understanding of how predictive models adapt to changing contextual and environmental demands and the extent to which such adaptive processes differ between individuals. Here, we used electroencephalography (EEG) to track prediction error responses during a naturalistic language processing paradigm. In Experiment 1, 45 native speakers of English listened to a series of short passages. Via a speaker manipulation, we introduced changing intra-experimental adjective order probabilities for two-adjective noun phrases embedded within the passages and investigated whether prediction error responses adapt to reflect these intra-experimental predictive contingencies. To this end, we calculated a novel measure of speaker-based, intra-experimental surprisal (“speaker-based surprisal”) as defined on a trial-by-trial basis and by clustering together adjectives with a similar meaning. N400 amplitude at the position of the critical second adjective was used as an outcome measure of prediction error. Results showed that N400 responses attuned to speaker-based surprisal over the course of the experiment, thus indicating that listeners rapidly adapt their predictive models to reflect local environmental contingencies (here: the probability of one type of adjective following another when uttered by a particular speaker). Strikingly, this occurs in spite of the wealth of prior linguistic experience that participants bring to the laboratory. Model adaptation effects were strongest for participants with a steep aperiodic (1/f) slope in resting EEG and low individual alpha frequency (IAF), with idea density (ID) showing a more complex pattern. These results were replicated in a separate sample of 40 participants in Experiment 2, which employed a highly similar design to Experiment 1. Overall, our results suggest that individuals with a steep aperiodic slope adapt their predictive models most strongly to context-specific probabilistic information. Steep aperiodic slope is thought to reflect low neural noise, which in turn may be associated with higher neural gain control and better cognitive control. Individuals with a steep aperiodic slope may thus be able to more effectively and dynamically reconfigure their prediction-related neural networks to meet current task demands. We conclude that predictive mechanisms in language are highly malleable and dynamic, reflecting both the affordances of the present environment as well as intrinsic information processing capabilities of the individual.
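The trial-by-trial, speaker-based surprisal logic can be sketched as follows: keep running counts of which adjective class follows which, separately per speaker, and score each new trial by -log2 of the current conditional probability estimate. The adjective classes, add-one smoothing, and toy trial list are illustrative assumptions, not the study's stimuli or exact surprisal definition.

```python
import math
from collections import defaultdict

class SpeakerSurprisal:
    """Running estimate of P(second adjective class | first class, speaker),
    with add-one smoothing over a fixed set of classes."""
    def __init__(self, classes):
        self.classes = classes
        self.counts = defaultdict(lambda: defaultdict(int))

    def surprisal(self, speaker, first, second):
        context = self.counts[(speaker, first)]
        total = sum(context.values()) + len(self.classes)   # add-one smoothing
        p = (context[second] + 1) / total
        return -math.log2(p)

    def update(self, speaker, first, second):
        self.counts[(speaker, first)][second] += 1

model = SpeakerSurprisal(classes=["size", "colour", "quality"])
trials = [("A", "size", "colour"), ("A", "size", "colour"),
          ("A", "size", "colour"), ("A", "size", "quality")]

for speaker, first, second in trials:
    s = model.surprisal(speaker, first, second)   # score the trial before updating
    model.update(speaker, first, second)
    print(f"{speaker}: {first}->{second}  surprisal = {s:.2f} bits")
```

Surprisal falls as the same adjective order recurs and spikes when the pattern is violated, which is the contingency the N400 is taken to index.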
Affiliation(s)
- Ina Bornkessel-Schlesewsky: Cognitive Neuroscience Laboratory, Australian Research Centre for Interactive and Virtual Environments, University of South Australia, Adelaide, SA, Australia
- Isabella Sharrad: Cognitive Neuroscience Laboratory, Australian Research Centre for Interactive and Virtual Environments, University of South Australia, Adelaide, SA, Australia
- Caitlin A. Howlett: Innovation, Implementation and Clinical Translation (IIMPACT) in Health, University of South Australia, Adelaide, SA, Australia
- Andrew W. Corcoran: Cognition and Philosophy Laboratory, Monash University, Melbourne, VIC, Australia; Monash Centre for Consciousness and Contemplative Studies, Monash University, Melbourne, VIC, Australia
- Valeria Bellan: Cognitive Neuroscience Laboratory, Australian Research Centre for Interactive and Virtual Environments, University of South Australia, Adelaide, SA, Australia; Innovation, Implementation and Clinical Translation (IIMPACT) in Health, University of South Australia, Adelaide, SA, Australia
- Erica Wilkinson: Innovation, Implementation and Clinical Translation (IIMPACT) in Health, University of South Australia, Adelaide, SA, Australia
- Reinhold Kliegl: Division of Training and Movement Science, University of Potsdam, Potsdam, Germany
- Richard L. Lewis: Department of Psychology, University of Michigan, Ann Arbor, MI, United States; Weinberg Institute for Cognitive Science, University of Michigan, Ann Arbor, MI, United States
- Steven L. Small: School of Behavioral and Brain Sciences, University of Texas at Dallas, Dallas, TX, United States
- Matthias Schlesewsky: Cognitive Neuroscience Laboratory, Australian Research Centre for Interactive and Virtual Environments, University of South Australia, Adelaide, SA, Australia
Collapse
|
23
|
Abstract
Swedish lexical word accents have repeatedly been said to carry a low functional load. Even so, the language has retained these tones ever since they emerged, probably more than a thousand years ago. This article proposes that the primary function of word accents is not lexical distinctiveness but rather to let listeners predict upcoming morphological structure and narrow down the lexical competition. Psycho- and neurophysiological evidence for the predictive function of word accents is discussed. A novel analysis shows that word accents play a facilitative role in word processing. Specifically, a correlation emerges between how much incorrect word accents hinder listeners' processing and how much correct word accents speed up responses. Finally, a dual-route model of the predictive use of word accents, with distinct neural substrates, is put forward.
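The correlation described in the abstract (between the processing cost of an incorrect word accent and the response-time benefit of a correct one) can be sketched as follows. This is not the article's analysis; the per-condition response times, condition names, and baseline definition are invented purely for illustration.

import numpy as np
from scipy.stats import pearsonr

# Hypothetical mean response times (ms) per suffix type in three accent conditions:
# "valid" = correct word accent, "invalid" = incorrect accent, "neutral" = no accent cue.
rt = {
    "suffix_1": {"valid": 610, "invalid": 700, "neutral": 650},
    "suffix_2": {"valid": 590, "invalid": 665, "neutral": 640},
    "suffix_3": {"valid": 630, "invalid": 720, "neutral": 660},
    "suffix_4": {"valid": 605, "invalid": 680, "neutral": 645},
}

# Facilitation: how much a correct accent speeds responses relative to the neutral baseline.
# Interference: how much an incorrect accent slows responses relative to the same baseline.
facilitation = np.array([v["neutral"] - v["valid"] for v in rt.values()])
interference = np.array([v["invalid"] - v["neutral"] for v in rt.values()])

r, p = pearsonr(facilitation, interference)
print(f"facilitation-interference correlation: r = {r:.2f}, p = {p:.3f}")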
Collapse
Affiliation(s)
- Mikael Roll
- Centre for Languages and Literature, Lund University, Lund, Sweden
| |
Collapse
|
24
|
Corcoran AW, Perera R, Koroma M, Kouider S, Hohwy J, Andrillon T. Expectations boost the reconstruction of auditory features from electrophysiological responses to noisy speech. Cereb Cortex 2022; 33:691-708. [PMID: 35253871 PMCID: PMC9890472 DOI: 10.1093/cercor/bhac094] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2021] [Revised: 02/11/2022] [Accepted: 02/12/2022] [Indexed: 02/04/2023] Open
Abstract
Online speech processing imposes significant computational demands on the listening brain, the underlying mechanisms of which remain poorly understood. Here, we exploit the perceptual "pop-out" phenomenon (i.e. the dramatic improvement of speech intelligibility after receiving information about speech content) to investigate the neurophysiological effects of prior expectations on degraded speech comprehension. We recorded electroencephalography (EEG) and pupillometry from 21 adults while they rated the clarity of noise-vocoded and sine-wave synthesized sentences. Pop-out was reliably elicited following visual presentation of the corresponding written sentence, but not following incongruent or neutral text. Pop-out was associated with improved reconstruction of the acoustic stimulus envelope from low-frequency EEG activity, implying that improvements in perceptual clarity were mediated via top-down signals that enhanced the quality of cortical speech representations. Spectral analysis further revealed that pop-out was accompanied by a reduction in theta-band power, consistent with predictive coding accounts of acoustic filling-in and incremental sentence processing. Moreover, delta-band power, alpha-band power, and pupil diameter were all increased following the provision of any written sentence information, irrespective of content. Together, these findings reveal distinctive profiles of neurophysiological activity that differentiate the content-specific processes associated with degraded speech comprehension from the context-specific processes invoked under adverse listening conditions.
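The envelope-reconstruction (backward-model) analysis mentioned above can be sketched with plain ridge regression over time-lagged EEG, in the spirit of standard stimulus-reconstruction approaches. This is not the authors' pipeline; the sampling rate, channel count, lag window, regularisation value, and the synthetic data are all placeholders for illustration.

import numpy as np

def lagged_design(eeg, lags):
    # Stack time-lagged copies of every EEG channel: (samples x channels*len(lags)).
    n_samp, n_chan = eeg.shape
    X = np.zeros((n_samp, n_chan * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(eeg, lag, axis=0)
        if lag > 0:
            shifted[:lag] = 0
        elif lag < 0:
            shifted[lag:] = 0
        X[:, i * n_chan:(i + 1) * n_chan] = shifted
    return X

def fit_backward_model(eeg, envelope, lags, lam=1e2):
    # Ridge regression mapping lagged multichannel EEG onto the speech envelope.
    X = lagged_design(eeg, lags)
    XtX = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ envelope)

# Toy data: 60 s at 64 Hz, 32 channels; the "EEG" contains a noisy copy of the envelope.
rng = np.random.default_rng(0)
fs, n_chan = 64, 32
envelope = rng.standard_normal(60 * fs)
eeg = np.outer(envelope, rng.standard_normal(n_chan)) + rng.standard_normal((60 * fs, n_chan))

lags = range(-fs // 4, 1)                  # decode from EEG 0-250 ms after the stimulus
w = fit_backward_model(eeg[:40 * fs], envelope[:40 * fs], lags)
recon = lagged_design(eeg[40 * fs:], lags) @ w
print(f"held-out reconstruction accuracy: r = {np.corrcoef(recon, envelope[40 * fs:])[0, 1]:.2f}")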
Collapse
Affiliation(s)
- Andrew W Corcoran
- Corresponding author: Room E672, 20 Chancellors Walk, Clayton, VIC 3800, Australia.
| | - Ricardo Perera
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800 Australia
| | - Matthieu Koroma
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
| | - Sid Kouider
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
| | - Jakob Hohwy
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800 Australia,Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800 Australia
| | - Thomas Andrillon
- Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800, Australia; Paris Brain Institute, Sorbonne Université, Inserm-CNRS, Paris 75013, France
| |
Collapse
|
25
|
Quiroga-Martinez DR, Hansen NC, Højlund A, Pearce M, Brattico E, Holmes E, Friston K, Vuust P. Musicianship and melodic predictability enhance neural gain in auditory cortex during pitch deviance detection. Hum Brain Mapp 2021; 42:5595-5608. [PMID: 34459062 PMCID: PMC8559476 DOI: 10.1002/hbm.25638] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2021] [Revised: 07/22/2021] [Accepted: 08/07/2021] [Indexed: 11/10/2022] Open
Abstract
When listening to music, pitch deviations are more salient and elicit stronger prediction error responses when the melodic context is predictable and when the listener is a musician. Yet, the neuronal dynamics and changes in connectivity underlying such effects remain unclear. Here, we employed dynamic causal modeling (DCM) to investigate whether the magnetic mismatch negativity response (MMNm), and its modulation by context predictability and musical expertise, are associated with enhanced neural gain of auditory areas, as a plausible mechanism for encoding precision-weighted prediction errors. Using Bayesian model comparison, we asked whether magnetoencephalography responses were better explained by models with intrinsic connections within primary auditory cortex (A1) and superior temporal gyrus (STG), typically related to gain control, or by models with extrinsic connections between A1 and STG, typically related to the propagation of prediction and error signals. We found that, compared to regular sounds, out-of-tune pitch deviations were associated with lower intrinsic (inhibitory) connectivity in A1 and STG, and lower backward (inhibitory) connectivity from STG to A1, consistent with disinhibition and enhanced neural gain in these auditory areas. More predictable melodies were associated with disinhibition in right A1, while musicianship was associated with disinhibition in left A1 and reduced connectivity from STG to left A1. These results indicate that musicianship and melodic predictability, as well as pitch deviations themselves, enhance neural gain in auditory cortex during deviance detection. Our findings are consistent with predictive processing theories suggesting that precise and informative error signals are selected by the brain for subsequent hierarchical processing.
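Fitting the DCMs themselves requires the authors' modelling software, but the Bayesian model comparison step can be illustrated in a few lines: under a fixed-effects scheme, free energies (approximations to log model evidence) are summed over participants and converted into posterior model probabilities. The free-energy values below are invented for illustration only.

import numpy as np

# Hypothetical variational free energies for two DCMs ("intrinsic/gain" vs "extrinsic")
# fitted to each of 10 participants; higher (less negative) free energy = more evidence.
rng = np.random.default_rng(1)
F_intrinsic = rng.normal(loc=-1000.0, scale=5.0, size=10)
F_extrinsic = F_intrinsic - rng.normal(loc=3.0, scale=1.0, size=10)   # ~3 nats worse per subject

# Fixed-effects comparison: the log group Bayes factor is the summed free-energy difference.
log_group_bf = np.sum(F_intrinsic - F_extrinsic)

# Posterior model probabilities under a uniform prior over the two models.
logF = np.array([np.sum(F_intrinsic), np.sum(F_extrinsic)])
post = np.exp(logF - logF.max())
post /= post.sum()

print(f"log group Bayes factor (intrinsic vs extrinsic): {log_group_bf:.1f}")
print(f"posterior probabilities: intrinsic = {post[0]:.3f}, extrinsic = {post[1]:.3f}")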
Collapse
Affiliation(s)
- David R Quiroga-Martinez
- Center for Music in the Brain, Aarhus University & Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
| | - Niels Christian Hansen
- Center for Music in the Brain, Aarhus University & Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark; Aarhus Institute of Advanced Studies, Aarhus University, Aarhus, Denmark
| | - Andreas Højlund
- Center for Functionally Integrative Neuroscience, Aarhus University, Aarhus, Denmark
| | - Marcus Pearce
- School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK
| | - Elvira Brattico
- Center for Music in the Brain, Aarhus University & Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark; Department of Education, Psychology and Communication, University of Bari Aldo Moro, Bari, Italy
| | - Emma Holmes
- The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Karl Friston
- The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Peter Vuust
- Center for Music in the Brain, Aarhus University & Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
| |
Collapse
|
26
|
Holmes E, Parr T, Griffiths TD, Friston KJ. Active inference, selective attention, and the cocktail party problem. Neurosci Biobehav Rev 2021; 131:1288-1304. [PMID: 34687699 DOI: 10.1016/j.neubiorev.2021.09.038] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2021] [Revised: 08/27/2021] [Accepted: 09/17/2021] [Indexed: 11/25/2022]
Abstract
In this paper, we introduce a new generative model for an active inference account of preparatory and selective attention, in the context of a classic 'cocktail party' paradigm. In this setup, pairs of words are presented simultaneously to the left and right ears and an instructive spatial cue directs attention to the left or right. We use this generative model to test competing hypotheses about the way that human listeners direct preparatory and selective attention. We show that assigning low precision to words at attended locations, relative to unattended locations, can explain why a listener reports words from a competing sentence. Under this model, temporal changes in sensory precision were not needed to account for faster reaction times with longer cue-target intervals, but were necessary to explain ramping effects on event-related potentials (ERPs), resembling the contingent negative variation (CNV), during the preparatory interval. These simulations reveal that different processes are likely to underlie the improvement in reaction times and the ramping of ERPs that are associated with spatial cueing.
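The core precision argument can be conveyed with a toy Bayesian sketch, without reproducing the paper's full generative model. Below, the posterior over which word a listener reports is a softmax of location-specific evidence, with sensory precision scaling the evidence at each location; all log-likelihood and precision values are invented for illustration.

import numpy as np

def report_distribution(llr_attended, llr_unattended, prec_attended, prec_unattended):
    # Softmax over precision-weighted evidence for the word at each location.
    evidence = np.array([prec_attended * llr_attended, prec_unattended * llr_unattended])
    p = np.exp(evidence - evidence.max())
    return p / p.sum()

# Toy log-likelihood ratios: the input supports the target word at the attended ear
# somewhat more strongly than the competing word at the unattended ear.
llr_target, llr_competitor = 2.0, 1.5

for prec_attended in (1.5, 0.4):        # high vs abnormally low precision at the attended side
    p = report_distribution(llr_target, llr_competitor, prec_attended, prec_unattended=1.0)
    print(f"attended precision {prec_attended}: "
          f"P(report target) = {p[0]:.2f}, P(report competing word) = {p[1]:.2f}")

With low precision assigned to the attended location, the competing word's evidence dominates and the model 'reports' from the other sentence, mirroring the qualitative claim above.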
Collapse
Affiliation(s)
- Emma Holmes
- Department of Speech, Hearing and Phonetic Sciences, UCL, London, WC1N 1PF, UK; Wellcome Centre for Human Neuroimaging, UCL, London, WC1N 3AR, UK.
| | - Thomas Parr
- Wellcome Centre for Human Neuroimaging, UCL, London, WC1N 3AR, UK
| | - Timothy D Griffiths
- Wellcome Centre for Human Neuroimaging, UCL, London, WC1N 3AR, UK; Biosciences Institute, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
| | - Karl J Friston
- Wellcome Centre for Human Neuroimaging, UCL, London, WC1N 3AR, UK
| |
Collapse
|
27
|
Sajid N, Holmes E, Hope TM, Fountas Z, Price CJ, Friston KJ. Simulating lesion-dependent functional recovery mechanisms. Sci Rep 2021; 11:7475. [PMID: 33811259 PMCID: PMC8018968 DOI: 10.1038/s41598-021-87005-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2020] [Accepted: 03/22/2021] [Indexed: 01/13/2023] Open
Abstract
Functional recovery after brain damage varies widely and depends on many factors, including lesion site and extent. When a neuronal system is damaged, recovery may occur by engaging residual (e.g., perilesional) components. When damage is extensive, recovery depends on the availability of other intact neural structures that can reproduce the same functional output (i.e., degeneracy). A system's response to damage may occur rapidly, require learning, or both. Here, we simulate functional recovery from four different types of lesions, using a generative model of word repetition that comprised a default premorbid system and a less used alternative system. The synthetic lesions (i) completely disengaged the premorbid system, leaving the alternative system intact, (ii) partially damaged both premorbid and alternative systems, and (iii) limited the experience-dependent plasticity of both. The results, across 1000 trials, demonstrate (i) that a complete disconnection of the premorbid system naturally invoked the engagement of the other, (ii) that incomplete damage to both systems had a much more devastating long-term effect on model performance, and (iii) the effect of reducing learning capacity within each system. These findings contribute to formal frameworks for interpreting the effect of different types of lesions.
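The lesion logic can be mimicked with a toy two-pathway simulation: word repetition succeeds if the engaged pathway transmits the word, pathway choice is weighted by learned reliability, and lesions are modelled by reducing a pathway's gain or the learning rate. This is only a caricature of the generative model described above; every parameter value is invented.

import numpy as np

rng = np.random.default_rng(2)

def simulate(premorbid_gain, alternative_gain, learning_rate, n_trials=1000):
    # Two redundant pathways; a gain of 0 models a complete disconnection.
    reliability = np.array([0.9, 0.1])     # the premorbid pathway starts strongly preferred
    gains = np.array([premorbid_gain, alternative_gain])
    correct = np.zeros(n_trials, dtype=bool)
    for t in range(n_trials):
        pathway = rng.choice(2, p=reliability / reliability.sum())   # engage one pathway
        success = rng.random() < gains[pathway]
        correct[t] = success
        # Experience-dependent update of the engaged pathway's estimated reliability.
        reliability[pathway] += learning_rate * (float(success) - reliability[pathway])
        reliability = np.clip(reliability, 0.01, 1.0)
    return correct.mean()

print("intact model:                        ", simulate(0.95, 0.80, 0.05))
print("premorbid disconnected (i):          ", simulate(0.00, 0.80, 0.05))
print("both partially damaged (ii):         ", simulate(0.50, 0.40, 0.05))
print("disconnection + low plasticity (iii):", simulate(0.00, 0.80, 0.005))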
Collapse
Affiliation(s)
- Noor Sajid
- Wellcome Centre for Human Neuroimaging, University College London, UCL Queen Square Institute of Neurology, 12 Queen Square, London, WC1N 3AR, UK.
| | - Emma Holmes
- Wellcome Centre for Human Neuroimaging, University College London, UCL Queen Square Institute of Neurology, 12 Queen Square, London, WC1N 3AR, UK
| | - Thomas M Hope
- Wellcome Centre for Human Neuroimaging, University College London, UCL Queen Square Institute of Neurology, 12 Queen Square, London, WC1N 3AR, UK
| | - Zafeirios Fountas
- Wellcome Centre for Human Neuroimaging, University College London, UCL Queen Square Institute of Neurology, 12 Queen Square, London, WC1N 3AR, UK
- Huawei 2012 Laboratories, London, UK
| | - Cathy J Price
- Wellcome Centre for Human Neuroimaging, University College London, UCL Queen Square Institute of Neurology, 12 Queen Square, London, WC1N 3AR, UK
| | - Karl J Friston
- Wellcome Centre for Human Neuroimaging, University College London, UCL Queen Square Institute of Neurology, 12 Queen Square, London, WC1N 3AR, UK
| |
Collapse
|
28
|
Sajid N, Friston KJ, Ekert JO, Price CJ, Green DW. Neuromodulatory Control and Language Recovery in Bilingual Aphasia: An Active Inference Approach. Behav Sci (Basel) 2020; 10:E161. [PMID: 33096824 PMCID: PMC7588909 DOI: 10.3390/bs10100161] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2020] [Revised: 10/17/2020] [Accepted: 10/19/2020] [Indexed: 11/17/2022] Open
Abstract
Understanding the aetiology of the diverse recovery patterns in bilingual aphasia is a theoretical challenge with implications for treatment. Loss of control over intact language networks provides a parsimonious starting point that can be tested using in-silico lesions. We simulated a complex recovery pattern (alternate antagonism and paradoxical translation) to test the hypothesis, drawn from an established hierarchical control model, that loss of control was mediated by constraints on neuromodulatory resources. We used active (Bayesian) inference to simulate a selective loss of sensory precision; i.e., confidence in the causes of sensations. This in-silico lesion altered the precision of beliefs about task-relevant states, including appropriate actions, and reproduced exactly the recovery pattern of interest. As sensory precision has been linked to acetylcholine release, these simulations endorse the conjecture that loss of neuromodulatory control can explain this atypical recovery pattern. We discuss the relevance of this finding for other recovery patterns.
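The notion of reduced sensory precision can be illustrated with a few lines in which precision acts as an inverse temperature on beliefs about which language the current context calls for. This is not the authors' Markov decision process implementation; the cue log likelihoods and precision values are invented for illustration.

import numpy as np

def language_belief(obs_loglik, sensory_precision):
    # Posterior belief over the appropriate language, with precision scaling the
    # (log) evidence carried by the current sensory cue.
    z = sensory_precision * np.asarray(obs_loglik)
    b = np.exp(z - z.max())
    return b / b.sum()

# Toy cue: the interlocutor speaks language L1, so the observation favours L1 over L2.
obs_loglik = [0.0, -2.0]                   # [L1, L2]

for precision in (4.0, 1.0, 0.25):         # intact vs progressively reduced sensory precision
    p_l1, p_l2 = language_belief(obs_loglik, precision)
    print(f"precision {precision}: P(respond in L1) = {p_l1:.2f}, P(respond in L2) = {p_l2:.2f}")

As precision falls, beliefs about the task-relevant state (which language to use) become increasingly indiscriminate, giving a simple intuition for how a selective loss of sensory precision could manifest as loss of control over intact language networks.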
Collapse
Affiliation(s)
- Noor Sajid
- Wellcome Centre for Human Neuroimaging, University College London, 12 Queen Square, London WC1N 3AR, UK; (K.J.F.); (J.O.E.); (C.J.P.)
| | - Karl J. Friston
- Wellcome Centre for Human Neuroimaging, University College London, 12 Queen Square, London WC1N 3AR, UK; (K.J.F.); (J.O.E.); (C.J.P.)
| | - Justyna O. Ekert
- Wellcome Centre for Human Neuroimaging, University College London, 12 Queen Square, London WC1N 3AR, UK; (K.J.F.); (J.O.E.); (C.J.P.)
| | - Cathy J. Price
- Wellcome Centre for Human Neuroimaging, University College London, 12 Queen Square, London WC1N 3AR, UK; (K.J.F.); (J.O.E.); (C.J.P.)
| | - David W. Green
- Experimental Psychology, University College London, Gower Street, London WC1E 6BT, UK;
| |
Collapse
|