1.
Heins C, Millidge B, Da Costa L, Mann RP, Friston KJ, Couzin ID. Collective behavior from surprise minimization. Proc Natl Acad Sci U S A 2024; 121:e2320239121. PMID: 38630721; PMCID: PMC11046639; DOI: 10.1073/pnas.2320239121.
Abstract
Collective motion is ubiquitous in nature; groups of animals, such as fish, birds, and ungulates, appear to move as a whole, exhibiting a rich behavioral repertoire that ranges from directed movement to milling to disordered swarming. Typically, such macroscopic patterns arise from decentralized, local interactions among constituent components (e.g., individual fish in a school). Preeminent models of this process describe individuals as self-propelled particles, subject to self-generated motion and "social forces" such as short-range repulsion and long-range attraction or alignment. However, organisms are not particles; they are probabilistic decision-makers. Here, we introduce an approach to modeling collective behavior based on active inference. This cognitive framework casts behavior as the consequence of a single imperative: to minimize surprise. We demonstrate that many empirically observed collective phenomena, including cohesion, milling, and directed motion, emerge naturally when considering behavior as driven by active Bayesian inference, without explicitly building behavioral rules or goals into individual agents. Furthermore, we show that active inference can recover and generalize the classical notion of social forces as agents attempt to suppress prediction errors that conflict with their expectations. By exploring the parameter space of the belief-based model, we reveal nontrivial relationships between individual beliefs and group properties like polarization and the tendency to visit different collective states. We also explore how individual beliefs about uncertainty determine collective decision-making accuracy. Finally, we show how agents can update their generative model over time, resulting in groups that are collectively more sensitive to external fluctuations and encode information more robustly.
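The core mechanism the abstract describes, movement driven by suppressing prediction errors rather than by an explicit attraction rule, can be caricatured in a few lines. The sketch below is a toy of my own, not the paper's active-inference model: each agent simply descends the gradient of a squared prediction error between an expected and an observed distance to the group centroid, and cohesion emerges.

```python
import numpy as np

rng = np.random.default_rng(0)

def step(pos, preferred_dist=1.0, lr=0.1):
    """Move each agent down the gradient of its squared prediction error
    between expected (preferred_dist) and observed distance to the group
    centroid -- a toy stand-in for suppressing sensory surprise."""
    centroid = pos.mean(axis=0)
    offsets = pos - centroid
    dists = np.linalg.norm(offsets, axis=1, keepdims=True)
    error = dists - preferred_dist                  # prediction error
    grad = error * offsets / np.maximum(dists, 1e-9)
    return pos - lr * grad

pos = rng.normal(scale=5.0, size=(20, 2))           # scattered agents
for _ in range(200):
    pos = step(pos)

# residual error: how far each agent is from its expected distance
final_err = np.abs(np.linalg.norm(pos - pos.mean(axis=0), axis=1) - 1.0)
```

No attraction or repulsion force is coded anywhere; the cohesive ring is a side effect of each agent minimizing its own prediction error, which is the paper's central point in miniature.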
Affiliation(s)
- Conor Heins
- Department of Collective Behaviour, Max Planck Institute of Animal Behavior, Konstanz D-78457, Germany
- Centre for the Advanced Study of Collective Behaviour, University of Konstanz, Konstanz D-78457, Germany
- Department of Biology, University of Konstanz, Konstanz D-78457, Germany
- VERSES Research Lab, Los Angeles, CA 90016
- Beren Millidge
- Medical Research Council Brain Network Dynamics Unit, University of Oxford, Oxford OX1 3TH, United Kingdom
- Lancelot Da Costa
- VERSES Research Lab, Los Angeles, CA 90016
- Department of Mathematics, Imperial College London, London SW7 2AZ, United Kingdom
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3AR, United Kingdom
- Richard P. Mann
- Department of Statistics, School of Mathematics, University of Leeds, Leeds LS2 9JT, United Kingdom
- Karl J. Friston
- VERSES Research Lab, Los Angeles, CA 90016
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3AR, United Kingdom
- Iain D. Couzin
- Department of Collective Behaviour, Max Planck Institute of Animal Behavior, Konstanz D-78457, Germany
- Centre for the Advanced Study of Collective Behaviour, University of Konstanz, Konstanz D-78457, Germany
- Department of Biology, University of Konstanz, Konstanz D-78457, Germany
2.
Millidge B, Tang M, Osanlouy M, Harper NS, Bogacz R. Predictive coding networks for temporal prediction. PLoS Comput Biol 2024; 20:e1011183. PMID: 38557984; PMCID: PMC11008833; DOI: 10.1371/journal.pcbi.1011183.
Abstract
One of the key problems the brain faces is inferring the state of the world from a sequence of dynamically changing stimuli, and it is not yet clear how the sensory system achieves this task. A well-established computational framework for describing perceptual processes in the brain is provided by the theory of predictive coding. Although the original proposals of predictive coding have discussed temporal prediction, later work developing this theory mostly focused on static stimuli, and key questions on neural implementation and computational properties of temporal predictive coding networks remain open. Here, we address these questions and present a formulation of the temporal predictive coding model that can be naturally implemented in recurrent networks, in which activity dynamics rely only on local inputs to the neurons, and learning only utilises local Hebbian plasticity. Additionally, we show that temporal predictive coding networks can approximate the performance of the Kalman filter in predicting the behaviour of linear systems, and behave as a variant of a Kalman filter that does not track its own subjective posterior variance. Importantly, temporal predictive coding networks can achieve accuracy similar to the Kalman filter without performing complex mathematical operations, employing only simple computations that can be implemented by biological networks. Moreover, when trained with natural dynamic inputs, we found that temporal predictive coding can produce Gabor-like, motion-sensitive receptive fields resembling those observed in real neurons in visual areas. In addition, we demonstrate how the model can be effectively generalized to nonlinear systems. Overall, the models presented in this paper show how biologically plausible circuits can predict future stimuli and may guide research on understanding specific neural circuits in brain areas involved in temporal prediction.
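As a rough illustration of the Kalman-filter claim (my own toy, not the paper's network): a scalar linear-Gaussian system is tracked purely by iterated gradient descent on two precision-weighted prediction errors, with no covariance bookkeeping, and the settled estimates still beat the raw observations.

```python
import numpy as np

rng = np.random.default_rng(1)
a, c = 0.95, 1.0            # latent transition and observation gains
q, r = 0.01, 0.25           # process and observation noise variances

# simulate a 1-D linear-Gaussian system
T = 400
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(scale=np.sqrt(q))
    y[t] = c * x[t] + rng.normal(scale=np.sqrt(r))

# temporal predictive coding: at each step, iteratively settle the state
# estimate by descending the sum of two precision-weighted squared errors
mu = np.zeros(T)
for t in range(1, T):
    m = a * mu[t - 1]                     # start from the prior prediction
    for _ in range(50):                   # local iterative inference
        eps_y = (y[t] - c * m) / r        # observation prediction error
        eps_x = (m - a * mu[t - 1]) / q   # transition prediction error
        m += 0.005 * (c * eps_y - eps_x)  # gradient step on the free energy
    mu[t] = m

mse_pc = np.mean((mu - x) ** 2)
mse_obs = np.mean((y - x) ** 2)
```

At each step the inner loop settles to a fixed-gain filter, which is the "does not track its own posterior variance" behaviour the abstract mentions; the paper analyses when and how closely such networks approximate the optimal Kalman filter.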
Affiliation(s)
- Beren Millidge
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Mufeng Tang
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Mahyar Osanlouy
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Nicol S. Harper
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Rafal Bogacz
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
3.
Song Y, Millidge B, Salvatori T, Lukasiewicz T, Xu Z, Bogacz R. Inferring neural activity before plasticity as a foundation for learning beyond backpropagation. Nat Neurosci 2024; 27:348-358. PMID: 38172438; PMCID: PMC7615830; DOI: 10.1038/s41593-023-01514-1.
Abstract
For both humans and machines, the essence of learning is to pinpoint which components in the information-processing pipeline are responsible for an error in its output, a challenge known as 'credit assignment'. It has long been assumed that credit assignment is best solved by backpropagation, which is also the foundation of modern machine learning. Here, we set out a fundamentally different principle of credit assignment called 'prospective configuration'. In prospective configuration, the network first infers the pattern of neural activity that should result from learning, and then the synaptic weights are modified to consolidate the change in neural activity. We demonstrate that this distinct mechanism, in contrast to backpropagation, (1) underlies learning in a well-established family of models of cortical circuits, (2) enables learning that is more efficient and effective in many contexts faced by biological organisms and (3) reproduces surprising patterns of neural activity and behavior observed in diverse human and rat learning experiments.
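The "infer activity before plasticity" loop can be sketched on a two-layer linear network (an assumption-laden toy, not the paper's implementation): clamp the output to the target, relax the hidden activity to minimise the total prediction error, and only then consolidate the settled activities with Hebbian-style weight updates.

```python
import numpy as np

rng = np.random.default_rng(2)
W1 = rng.normal(scale=0.5, size=(4, 3))   # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(2, 4))   # hidden -> output weights
x = rng.normal(size=3)
target = rng.normal(size=2)

def loss():
    return float(np.sum((target - W2 @ (W1 @ x)) ** 2))

losses = [loss()]
for _ in range(100):
    # 1) infer the activity pattern that should follow from learning:
    #    clamp the output to the target and relax the hidden activity
    #    to minimise the summed squared prediction errors (the energy)
    h = W1 @ x
    for _ in range(200):
        e1 = h - W1 @ x                   # hidden-layer prediction error
        e2 = target - W2 @ h              # output-layer error (output clamped)
        h += 0.05 * (-e1 + W2.T @ e2)
    # 2) only then modify weights to consolidate the settled activities
    e1 = h - W1 @ x
    e2 = target - W2 @ h
    W1 += 0.05 * np.outer(e1, x)
    W2 += 0.05 * np.outer(e2, h)
    losses.append(loss())
```

The contrast with backpropagation is that the weight change here is computed from the relaxed (prospective) activities, not from the unmodified forward-pass activities.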
Affiliation(s)
- Yuhang Song
- Department of Computer Science, University of Oxford, Oxford, UK
- Medical Research Council Brain Network Dynamics Unit, University of Oxford, Oxford, UK
- Fractile, Ltd., London, UK
- Beren Millidge
- Medical Research Council Brain Network Dynamics Unit, University of Oxford, Oxford, UK
- Tommaso Salvatori
- Department of Computer Science, University of Oxford, Oxford, UK
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- VERSES AI Research Lab, Los Angeles, CA, USA
- Thomas Lukasiewicz
- Department of Computer Science, University of Oxford, Oxford, UK
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- Zhenghua Xu
- Department of Computer Science, University of Oxford, Oxford, UK
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Rafal Bogacz
- Medical Research Council Brain Network Dynamics Unit, University of Oxford, Oxford, UK
4.
Abstract
Predictive coding is an influential model of cortical neural activity. It proposes that perceptual beliefs are furnished by sequentially minimising "prediction errors"-the differences between predicted and observed data. Implicit in this proposal is the idea that successful perception requires multiple cycles of neural activity. This is at odds with evidence that several aspects of visual perception-including complex forms of object recognition-arise from an initial "feedforward sweep" that occurs on fast timescales which preclude substantial recurrent activity. Here, we propose that the feedforward sweep can be understood as performing amortized inference (applying a learned function that maps directly from data to beliefs) and recurrent processing can be understood as performing iterative inference (sequentially updating neural activity in order to improve the accuracy of beliefs). We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner by describing both in terms of a dual optimization of a single objective function. We show that the resulting scheme can be implemented in a biologically plausible neural architecture that approximates Bayesian inference utilising local Hebbian update rules. We demonstrate that our hybrid predictive coding model combines the benefits of both amortized and iterative inference-obtaining rapid and computationally cheap perceptual inference for familiar data while maintaining the context-sensitivity, precision, and sample efficiency of iterative inference schemes. Moreover, we show how our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs using minimum computational expense. 
Hybrid predictive coding offers a new perspective on the functional relevance of the feedforward and recurrent activity observed during visual perception and offers novel insights into distinct aspects of visual phenomenology.
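A linear toy of the amortized/iterative split (the least-squares regression below stands in for the learned feedforward network; the paper's model is a full predictive coding network with a principled shared objective): amortized inference inverts the generative model in one pass, while a few steps of iterative inference from scratch leave a larger residual.

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(10, 4))          # generative weights: latent -> data

def iterative_infer(x, z0, steps, lr=0.05):
    """Refine a latent estimate by descending 0.5 * ||x - W z||^2."""
    z = z0.copy()
    for _ in range(steps):
        z += lr * W.T @ (x - W @ z)   # gradient step on the prediction error
    return z

# "amortized" inference: a direct data -> latent regression fitted on
# samples from the generative model (a stand-in for the feedforward sweep)
Z = rng.normal(size=(500, 4))
X = Z @ W.T
A, *_ = np.linalg.lstsq(X, Z, rcond=None)

z_true = rng.normal(size=4)
x = W @ z_true
err_amortized = np.linalg.norm(W @ (x @ A) - x)   # one feedforward pass
err_iterative = np.linalg.norm(W @ iterative_infer(x, np.zeros(4), steps=5) - x)
```

The hybrid scheme in the paper uses the amortized pass to initialise iterative inference, so familiar inputs need almost no recurrent refinement while novel ones still benefit from it.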
Affiliation(s)
- Alexander Tschantz
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- VERSES Research Lab, Los Angeles, California, United States of America
- Sussex Centre for Consciousness Science, University of Sussex, Brighton, United Kingdom
- Beren Millidge
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- VERSES Research Lab, Los Angeles, California, United States of America
- Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Anil K. Seth
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- Sussex Centre for Consciousness Science, University of Sussex, Brighton, United Kingdom
- Christopher L. Buckley
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- VERSES Research Lab, Los Angeles, California, United States of America
5.
Ramstead MJD, Sakthivadivel DAR, Heins C, Koudahl M, Millidge B, Da Costa L, Klein B, Friston KJ. On Bayesian mechanics: a physics of and by beliefs. Interface Focus 2023; 13:20220029. PMID: 37213925; PMCID: PMC10198254; DOI: 10.1098/rsfs.2022.0029.
Abstract
The aim of this paper is to introduce a field of study that has emerged over the last decade, called Bayesian mechanics. Bayesian mechanics is a probabilistic mechanics, comprising tools that enable us to model systems endowed with a particular partition (i.e. into particles), where the internal states (or the trajectories of internal states) of a particular system encode the parameters of beliefs about external states (or their trajectories). These tools allow us to write down mechanical theories for systems that look as if they are estimating posterior probability distributions over the causes of their sensory states. This provides a formal language for modelling the constraints, forces, potentials and other quantities determining the dynamics of such systems, especially as they entail dynamics on a space of beliefs (i.e. on a statistical manifold). Here, we will review the state of the art in the literature on the free energy principle, distinguishing between three ways in which Bayesian mechanics has been applied to particular systems (i.e. path-tracking, mode-tracking and mode-matching). We go on to examine a duality between the free energy principle and the constrained maximum entropy principle, both of which lie at the heart of Bayesian mechanics, and discuss its implications.
Affiliation(s)
- Maxwell J. D. Ramstead
- VERSES Research Lab, Los Angeles, CA 90016, USA
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3AR, UK
- Dalton A. R. Sakthivadivel
- VERSES Research Lab, Los Angeles, CA 90016, USA
- Department of Mathematics, Stony Brook University, Stony Brook, NY, USA
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, USA
- Department of Biomedical Engineering, Stony Brook University, Stony Brook, NY, USA
- Conor Heins
- VERSES Research Lab, Los Angeles, CA 90016, USA
- Department of Collective Behaviour, Max Planck Institute of Animal Behavior, 78464 Konstanz, Germany
- Department of Biology, University of Konstanz, 78464 Konstanz, Germany
- Centre for the Advanced Study of Collective Behaviour, University of Konstanz, 78464 Konstanz, Germany
- Magnus Koudahl
- VERSES Research Lab, Los Angeles, CA 90016, USA
- Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Beren Millidge
- VERSES Research Lab, Los Angeles, CA 90016, USA
- Brain Network Dynamics Unit, University of Oxford, Oxford, UK
- Lancelot Da Costa
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3AR, UK
- Department of Mathematics, Imperial College London, London SW7 2AZ, UK
- Brennan Klein
- VERSES Research Lab, Los Angeles, CA 90016, USA
- Network Science Institute, Northeastern University, Boston, MA, USA
- Karl J. Friston
- VERSES Research Lab, Los Angeles, CA 90016, USA
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3AR, UK
6.
Tang M, Salvatori T, Millidge B, Song Y, Lukasiewicz T, Bogacz R. Recurrent predictive coding models for associative memory employing covariance learning. PLoS Comput Biol 2023; 19:e1010719. PMID: 37058541; PMCID: PMC10132551; DOI: 10.1371/journal.pcbi.1010719.
Abstract
The computational principles adopted by the hippocampus in associative memory (AM) tasks have been one of the most studied topics in computational and theoretical neuroscience. Recent theories suggested that AM and the predictive activities of the hippocampus could be described within a unitary account, and that predictive coding (PC) underlies the computations supporting AM in the hippocampus. Following this theory, a computational model based on classical hierarchical predictive networks was proposed and was shown to perform well in various AM tasks. However, this fully hierarchical model did not incorporate recurrent connections, an architectural component of the CA3 region of the hippocampus that is crucial for AM. This makes the structure of the model inconsistent with the known connectivity of CA3 and classical recurrent models such as Hopfield Networks, which learn the covariance of inputs through their recurrent connections to perform AM. Earlier PC models that learn the covariance information of inputs explicitly via recurrent connections seem to be a solution to these issues. Here, we show that although these models can perform AM, they do it in an implausible and numerically unstable way. Instead, we propose alternatives to these earlier covariance-learning predictive coding networks, which learn the covariance information implicitly and plausibly, and can use dendritic structures to encode prediction errors. We show analytically that our proposed models are perfectly equivalent to the earlier predictive coding model learning covariance explicitly, and encounter no numerical issues when performing AM tasks in practice. We further show that our models can be combined with hierarchical predictive coding networks to model the hippocampo-neocortical interactions.
Our models provide a biologically plausible approach to modelling the hippocampal network, pointing to a potential computational mechanism during hippocampal memory formation and recall, which employs both predictive coding and covariance learning based on the recurrent network structure of the hippocampus.
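For orientation, the covariance-based retrieval this line of work builds on can be shown with a classical Hopfield network (this is the explicit covariance scheme, not the implicit dendritic variant the authors propose): Hebbian weights store the pattern covariance, and the recurrent dynamics clean up a corrupted probe.

```python
import numpy as np

rng = np.random.default_rng(4)
N, P = 100, 3
patterns = rng.choice([-1.0, 1.0], size=(P, N))     # stored binary memories

# Hebbian (covariance) weights with zero self-connections
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)

# corrupt a stored pattern, then let the recurrent dynamics clean it up
probe = patterns[0].copy()
flip = rng.choice(N, size=10, replace=False)
probe[flip] *= -1

s = probe.copy()
for _ in range(10):
    s = np.where(W @ s >= 0, 1.0, -1.0)             # recurrent update

overlap_before = float(probe @ patterns[0]) / N     # 0.8 after 10 flips
overlap_after = float(s @ patterns[0]) / N
```

The paper's contribution is to obtain this covariance information implicitly, within a plausible predictive coding circuit, rather than by the explicit (and numerically fragile) construction above.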
Affiliation(s)
- Mufeng Tang
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Tommaso Salvatori
- Department of Computer Science, University of Oxford, Oxford, United Kingdom
- Beren Millidge
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Yuhang Song
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Department of Computer Science, University of Oxford, Oxford, United Kingdom
- Thomas Lukasiewicz
- Department of Computer Science, University of Oxford, Oxford, United Kingdom
- Institute of Logic and Computation, TU Wien, Vienna, Austria
- Rafal Bogacz
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
7.
Aguilera M, Millidge B, Tschantz A, Buckley CL. From the free energy principle to a confederation of Bayesian mechanics: Reply to comments on "How particular is the physics of the free energy principle?". Phys Life Rev 2023; 44:270-275. PMID: 36821891; DOI: 10.1016/j.plrev.2023.01.018.
Affiliation(s)
- Miguel Aguilera
- BCAM - Basque Center for Applied Mathematics, Bilbao, 48009, Spain; IKERBASQUE, Basque Foundation for Science, Bilbao, 48009, Spain; School of Engineering and Informatics, University of Sussex, Falmer, Brighton, BN1 9QJ, United Kingdom
- Beren Millidge
- VERSES Research Lab, Los Angeles, CA, USA; MRC Brain Network Dynamics Unit, University of Oxford, Oxford, OX1 3TH, United Kingdom
- Christopher L Buckley
- School of Engineering and Informatics, University of Sussex, Falmer, Brighton, BN1 9QJ, United Kingdom; VERSES Research Lab, Los Angeles, CA, USA
8.
Salvatori T, Pinchetti L, Millidge B, Song Y, Bao T, Bogacz R, Lukasiewicz T. Learning on Arbitrary Graph Topologies via Predictive Coding. Adv Neural Inf Process Syst 2022; 35:38232-38244. PMID: 37090087; PMCID: PMC7614467.
Abstract
Training with backpropagation (BP) in standard deep learning consists of two main steps: a forward pass that maps a data point to its prediction, and a backward pass that propagates the error of this prediction back through the network. This process is highly effective when the goal is to minimize a specific objective function. However, it does not allow training on networks with cyclic or backward connections. This is an obstacle to reaching brain-like capabilities, as the highly complex heterarchical structure of the neural connections in the neocortex is potentially fundamental for its effectiveness. In this paper, we show how predictive coding (PC), a theory of information processing in the cortex, can be used to perform inference and learning on arbitrary graph topologies. We experimentally show how this formulation, called PC graphs, can be used to flexibly perform different tasks with the same network by simply stimulating specific neurons. This enables the model to be queried on stimuli with different structures, such as partial images, images with labels, or images without labels. We conclude by investigating how the topology of the graph influences the final performance, and comparing against simple baselines trained with BP.
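To make the "arbitrary topology" idea concrete, here is a minimal linear PC graph containing a cycle (my own sketch; the paper's PC graphs use nonlinearities and learned weights). Inference is just gradient descent on the summed squared prediction errors, and which nodes are clamped defines the query.

```python
import numpy as np

# W[i, j] is the weight with which node j predicts node i; the graph
# below contains a cycle (0 -> 2 -> 1 -> 0), which plain backprop
# cannot train directly but predictive coding handles uniformly
W = np.array([[0.0, 0.5, 0.0],
              [0.0, 0.0, 0.8],
              [0.3, 0.0, 0.0]])

def settle(v, clamped, steps=3000, lr=0.05):
    """Relax unclamped activities by gradient descent on the energy
    E = 0.5 * sum_i (v_i - sum_j W_ij v_j)^2; clamped nodes are the query."""
    v = v.astype(float).copy()
    free = np.array([i for i in range(len(v)) if i not in clamped])
    for _ in range(steps):
        err = v - W @ v                   # per-node prediction errors
        grad = err - W.T @ err            # dE/dv
        v[free] -= lr * grad[free]
    return v

# the same weights answer different queries, depending on what is clamped
v_a = settle(np.array([1.0, 0.0, 0.0]), clamped={0})   # observe node 0
v_b = settle(np.array([0.0, 1.0, 0.0]), clamped={1})   # observe node 1
```

The same weight matrix serves both queries; with BP, each choice of inputs and outputs would require a differently unrolled network.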
Affiliation(s)
- Luca Pinchetti
- Department of Computer Science, University of Oxford, UK
- Beren Millidge
- MRC Brain Network Dynamics Unit, University of Oxford, UK
- Yuhang Song
- Department of Computer Science, University of Oxford, UK
- MRC Brain Network Dynamics Unit, University of Oxford, UK
- Corresponding author
- Tianyi Bao
- Department of Computer Science, University of Oxford, UK
- Rafal Bogacz
- MRC Brain Network Dynamics Unit, University of Oxford, UK
- Thomas Lukasiewicz
- Department of Computer Science, University of Oxford, UK
- Institute of Logic and Computation, TU Wien, Austria
9.
Millidge B, Salvatori T, Song Y, Lukasiewicz T, Bogacz R. Universal Hopfield Networks: A General Framework for Single-Shot Associative Memory Models. Proc Mach Learn Res 2022; 162:15561-15583. PMID: 36751405; PMCID: PMC7614148.
Abstract
A large number of neural network models of associative memory have been proposed in the literature. These include the classical Hopfield networks (HNs), sparse distributed memories (SDMs), and more recently the modern continuous Hopfield networks (MCHNs), which possess close links with self-attention in machine learning. In this paper, we propose a general framework for understanding the operation of such memory networks as a sequence of three operations: similarity, separation, and projection. We derive all these memory models as instances of our general framework with differing similarity and separation functions. We extend the mathematical framework of Krotov & Hopfield (2020) to express general associative memory models using neural network dynamics with local computation, and derive a general energy function that is a Lyapunov function of the dynamics. Finally, using our framework, we empirically investigate the capacity of using different similarity functions for these associative memory models, beyond the dot product similarity measure, and demonstrate empirically that Euclidean or Manhattan distance similarity metrics perform substantially better in practice on many tasks, enabling a more robust retrieval and higher memory capacity than existing models.
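The similarity-separation-projection pipeline is easy to state directly in code. Below is a minimal sketch under the paper's framing (softmax separation; Euclidean and dot-product similarity are two of the choices it compares); the parameter names are mine.

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.normal(size=(8, 32))                # rows are stored memories

def retrieve(q, beta=8.0, similarity="euclidean"):
    """Single-shot retrieval as similarity -> separation -> projection."""
    if similarity == "dot":
        scores = M @ q                      # dot-product similarity (MCHN-like)
    else:
        scores = -np.linalg.norm(M - q, axis=1)   # negative Euclidean distance
    w = np.exp(beta * (scores - scores.max()))
    w /= w.sum()                            # separation: softmax sharpening
    return w @ M                            # projection back into pattern space

target = M[3]
query = target + 0.2 * rng.normal(size=32)  # corrupted copy of memory 3
out = retrieve(query)
```

Swapping the similarity function, or replacing the softmax with a harder separation, recovers the different models in the paper's taxonomy, which is what licenses the empirical comparison of Euclidean and Manhattan metrics.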
Affiliation(s)
- Beren Millidge
- MRC Brain Network Dynamics Unit, University of Oxford, UK
- Yuhang Song
- MRC Brain Network Dynamics Unit, University of Oxford, UK
- Department of Computer Science, University of Oxford, UK
- Thomas Lukasiewicz
- Department of Computer Science, University of Oxford, UK
- Institute of Logic and Computation, TU Wien, Austria
- Rafal Bogacz
- MRC Brain Network Dynamics Unit, University of Oxford, UK
10.
Millidge B, Tschantz A, Buckley CL. Predictive Coding Approximates Backprop along Arbitrary Computation Graphs. Neural Comput 2022; 34:1329-1368. PMID: 35534010; DOI: 10.1162/neco_a_01497.
Abstract
Backpropagation of error (backprop) is a powerful algorithm for training machine learning architectures through end-to-end differentiation. Recently it has been shown that backprop in multilayer perceptrons (MLPs) can be approximated using predictive coding, a biologically plausible process theory of cortical computation that relies solely on local and Hebbian updates. The power of backprop, however, lies not in its instantiation in MLPs but in the concept of automatic differentiation, which allows for the optimization of any differentiable program expressed as a computation graph. Here, we demonstrate that predictive coding converges asymptotically (and in practice, rapidly) to exact backprop gradients on arbitrary computation graphs using only local learning rules. We apply this result to develop a straightforward strategy to translate core machine learning architectures into their predictive coding equivalents. We construct predictive coding convolutional neural networks, recurrent neural networks, and the more complex long short-term memory, which include a nonlayer-like branching internal graph structure and multiplicative interactions. Our models perform equivalently to backprop on challenging machine learning benchmarks while using only local and (mostly) Hebbian plasticity. Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry and may also contribute to the development of completely distributed neuromorphic architectures.
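The correspondence can be checked numerically in a few lines. The sketch below uses a two-layer linear network and the paper's fixed-prediction idea (errors computed against the feedforward predictions); with nonlinearities the same recursion holds with an extra derivative factor.

```python
import numpy as np

rng = np.random.default_rng(6)
W1 = rng.normal(size=(5, 3))
W2 = rng.normal(size=(2, 5))
x = rng.normal(size=3)
t = rng.normal(size=2)

# --- backprop gradients of the squared-error loss 0.5 * ||t - y||^2 ---
h = W1 @ x
y = W2 @ h
d2 = t - y                      # output delta (negative error gradient)
d1 = W2.T @ d2                  # hidden delta
gW2_bp = np.outer(d2, h)
gW1_bp = np.outer(d1, x)

# --- predictive coding with predictions held at feedforward values ---
e2 = t - W2 @ h                 # output error (prediction fixed)
a1 = h.copy()                   # hidden activity, initialised at prediction
for _ in range(500):            # relax to the energy minimum
    e1 = a1 - W1 @ x
    a1 += 0.1 * (-e1 + W2.T @ e2)
e1 = a1 - W1 @ x                # equilibrium hidden prediction error
gW2_pc = np.outer(e2, h)
gW1_pc = np.outer(e1, x)
```

At equilibrium the prediction errors coincide with the backprop deltas, so the local Hebbian products equal the backprop gradients, which is the convergence result the abstract states for general computation graphs.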
Affiliation(s)
- Beren Millidge
- School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, U.K.
- Alexander Tschantz
- Sackler Center for Consciousness Science, School of Engineering and Informatics, University of Sussex, Brighton BN1 9QJ, U.K.
- Christopher L Buckley
- Evolutionary and Adaptive Systems Research Group, School of Engineering and Informatics, University of Sussex, Brighton BN1 9QJ, U.K.
11.
Constant A, Tschantz ADD, Millidge B, Criado-Boado F, Martinez LM, Müller J, Clark A. The Acquisition of Culturally Patterned Attention Styles Under Active Inference. Front Neurorobot 2021; 15:729665. PMID: 34675792; PMCID: PMC8525546; DOI: 10.3389/fnbot.2021.729665.
Abstract
This paper presents an active inference based simulation study of visual foraging. The goal of the simulation is to show the effect of the acquisition of culturally patterned attention styles on cognitive task performance, under active inference. We show how cultural artefacts like antique vase decorations drive cognitive functions such as perception, action and learning, as well as task performance in a simple visual discrimination task. We thus describe a new active inference based research pipeline that future work may employ to inquire into the deep guiding principles determining the manner in which material culture drives human thought, by building and rebuilding our patterns of attention.
Affiliation(s)
- Axel Constant
- Theory and Method in Biosciences, University of Sydney, Sydney, NSW, Australia
- Alexander Daniel Dunsmoir Tschantz
- Department of Informatics, The University of Sussex, Sussex, United Kingdom
- Sackler Centre for Consciousness Science, University of Sussex, Brighton, United Kingdom
- Beren Millidge
- Department of Informatics, The University of Sussex, Sussex, United Kingdom
- School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
- Felipe Criado-Boado
- Institute of Heritage Sciences, Spanish National Research Council, Santiago de Compostela, Spain
- Luis M Martinez
- Institute of Neurosciences, Spanish National Research Council, Universidad Miguel Hernández, Alicante, Spain
- Johannes Müller
- Institute of Prehistoric and Protohistoric Archaeology, Kiel University, Kiel, Germany
- Andy Clark
- Department of Informatics, The University of Sussex, Sussex, United Kingdom
- Department of Philosophy, The University of Sussex, Sussex, United Kingdom
- Department of Philosophy, Macquarie University, Sydney, NSW, Australia
12.
Abstract
The expected free energy (EFE) is a central quantity in the theory of active inference. It is the quantity that all active inference agents are mandated to minimize through action, and its decomposition into extrinsic and intrinsic value terms is key to the balance of exploration and exploitation that active inference agents evince. Despite its importance, the mathematical origins of this quantity and its relation to the variational free energy (VFE) remain unclear. In this letter, we investigate the origins of the EFE in detail and show that it is not simply "the free energy in the future." We present a functional that we argue is the natural extension of the VFE but actively discourages exploratory behavior, thus demonstrating that exploration does not directly follow from free energy minimization into the future. We then develop a novel objective, the free energy of the expected future (FEEF), which possesses both the epistemic component of the EFE and an intuitive mathematical grounding as the divergence between predicted and desired futures.
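For reference, the extrinsic/intrinsic decomposition the abstract alludes to is commonly written as follows (standard active-inference notation with policy $\pi$, hidden states $s$, observations $o$, and prior preferences $C$; the letter's own notation may differ):

```latex
G(\pi) \;=\;
\underbrace{-\,\mathbb{E}_{Q(o,s\mid\pi)}\big[\ln P(o \mid C)\big]}_{\text{extrinsic (pragmatic) value}}
\;-\;
\underbrace{\mathbb{E}_{Q(o\mid\pi)}\Big[D_{\mathrm{KL}}\big[\,Q(s \mid o,\pi)\,\big\|\,Q(s\mid\pi)\,\big]\Big]}_{\text{intrinsic (epistemic) value}}
```

The letter's argument is that this epistemic term is not inherited from the variational free energy itself, motivating the FEEF as an alternative objective with the same exploratory component.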
Affiliation(s)
- Beren Millidge
- School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, U.K.
- Alexander Tschantz
- Sackler Center for Consciousness Science, School of Engineering and Informatics, University of Sussex, Falmer, Brighton, BN1 9RH, U.K.
- Christopher L Buckley
- Evolutionary and Adaptive Systems Research Group, School of Engineering and Informatics, University of Sussex, Falmer, Brighton, BN1 9RH, U.K.
13.
Seth AK, Millidge B, Buckley CL, Tschantz A. Curious Inferences: Reply to Sun and Firestone on the Dark Room Problem. Trends Cogn Sci 2020; 24:681-683. DOI: 10.1016/j.tics.2020.05.011.