1. Jedlicka P, Tomko M, Robins A, Abraham WC. Contributions by metaplasticity to solving the Catastrophic Forgetting Problem. Trends Neurosci 2022; 45:656-666. PMID: 35798611. DOI: 10.1016/j.tins.2022.06.002.
Abstract
Catastrophic forgetting (CF) refers to the sudden and severe loss of prior information in learning systems when acquiring new information. CF has been an Achilles heel of standard artificial neural networks (ANNs) when learning multiple tasks sequentially. The brain, by contrast, has solved this problem during evolution. Modellers now use a variety of strategies to overcome CF, many of which have parallels to cellular and circuit functions in the brain. One common strategy, based on metaplasticity phenomena, controls the future rate of change at key connections to help retain previously learned information. However, the metaplasticity properties so far used are only a subset of those existing in neurobiology. We propose that as models become more sophisticated, there could be value in drawing on a richer set of metaplasticity rules, especially when promoting continual learning in agents moving about the environment.
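The core strategy named here — letting a slowly varying metaplasticity state control how fast each connection can change in the future — can be made concrete with a short sketch. The following Python fragment is an illustrative reduction of the general idea, not the authors' model; the quadratic importance rule and all parameter values are assumptions:

```python
import numpy as np

def metaplastic_update(w, grad, importance, base_lr=0.1):
    """Scale each synapse's learning rate down by its accumulated
    importance, so heavily used weights resist being overwritten by
    new tasks (toy illustration of metaplasticity-based CF defenses)."""
    effective_lr = base_lr / (1.0 + importance)
    w_new = w - effective_lr * grad
    # Importance accumulates with how strongly each weight has been
    # driven, making future changes at those synapses harder.
    importance_new = importance + grad**2
    return w_new, importance_new

rng = np.random.default_rng(0)
w, importance = rng.normal(size=10), np.zeros(10)
for _ in range(3):
    grad = rng.normal(size=10)  # stand-in for a task gradient
    w, importance = metaplastic_update(w, grad, importance)
```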
Affiliation(s)
- Peter Jedlicka
- ICAR3R - Interdisciplinary Centre for 3Rs in Animal Research, Faculty of Medicine, Justus Liebig University, Giessen, Germany; Institute of Clinical Neuroanatomy, Neuroscience Center, Goethe University Frankfurt, Frankfurt/Main, Germany; Frankfurt Institute for Advanced Studies, Frankfurt 60438, Germany.
- Matus Tomko
- ICAR3R - Interdisciplinary Centre for 3Rs in Animal Research, Faculty of Medicine, Justus Liebig University, Giessen, Germany; Institute of Molecular Physiology and Genetics, Centre of Biosciences, Slovak Academy of Sciences, Bratislava, Slovakia.
- Anthony Robins
- Department of Computer Science, University of Otago, Dunedin 9016, New Zealand.
- Wickliffe C Abraham
- Department of Psychology, Brain Health Research Centre, University of Otago, Dunedin 9054, New Zealand.
2. Berthet P, Lindahl M, Tully PJ, Hellgren-Kotaleski J, Lansner A. Functional Relevance of Different Basal Ganglia Pathways Investigated in a Spiking Model with Reward Dependent Plasticity. Front Neural Circuits 2016; 10:53. PMID: 27493625. PMCID: PMC4954853. DOI: 10.3389/fncir.2016.00053.
Abstract
The brain enables animals to adapt behaviorally in order to survive in a complex and dynamic environment, but how reward-oriented behaviors are achieved and computed by the underlying neural circuitry is an open question. To address this question, we have developed a spiking model of the basal ganglia (BG) that learns to disinhibit the action leading to a reward despite ongoing changes in the reward schedule. The architecture of the network features the two pathways commonly described in BG, the direct (denoted D1) and the indirect (denoted D2) pathway, as well as a loop involving the striatum and the dopaminergic system. The activity of the dopaminergic neurons conveys the reward prediction error (RPE), which determines the magnitude of synaptic plasticity within the different pathways. All plastic connections implement a versatile four-factor learning rule derived from Bayesian inference that depends upon pre- and post-synaptic activity, receptor type, and dopamine level. Synaptic weight updates occur in the D1 or D2 pathway depending on the sign of the RPE, and an efference copy informs upstream nuclei about the action selected. We demonstrate successful performance of the system in a multiple-choice learning task with a transiently changing reward schedule. We simulate lesioning of the various pathways and show that a condition without the D2 pathway fares worse than one without the D1 pathway. Additionally, we simulate the degeneration observed in Parkinson's disease (PD) by decreasing the number of dopaminergic neurons during learning. The results suggest that D1 pathway impairment in PD may have been overlooked. Furthermore, an analysis of the alterations in the synaptic weights shows that using the absolute reward value instead of the RPE leads to a larger change in D1.
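The sign-dependent routing of plasticity described above lends itself to a compact sketch. This Python fragment illustrates only the four-factor gating logic (pre, post, receptor type, RPE); it is a simplified reading of the abstract, not the paper's Bayesian-derived rule, and all names and constants are assumptions:

```python
import numpy as np

def four_factor_update(w, pre, post, rpe, receptor, lr=0.01):
    """Hebbian pre*post term gated by receptor type and by the sign of
    the reward prediction error: positive RPE drives the D1 (direct)
    pathway, negative RPE drives the D2 (indirect) pathway."""
    hebbian = np.outer(post, pre)
    if receptor == "D1":
        gate = max(rpe, 0.0)   # D1 weights move on positive RPE
    elif receptor == "D2":
        gate = max(-rpe, 0.0)  # D2 weights move on negative RPE
    else:
        raise ValueError("receptor must be 'D1' or 'D2'")
    return w + lr * gate * hebbian

rng = np.random.default_rng(0)
w = np.zeros((3, 4))
w = four_factor_update(w, rng.random(4), rng.random(3), rpe=0.5, receptor="D1")
```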
Affiliation(s)
- Pierre Berthet
- Numerical Analysis and Computer Science, Stockholm University, Stockholm, Sweden
- Department of Computational Biology, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
- Stockholm Brain Institute, Karolinska Institute, Stockholm, Sweden
- Mikael Lindahl
- Department of Computational Biology, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
- Stockholm Brain Institute, Karolinska Institute, Stockholm, Sweden
- Philip J. Tully
- Department of Computational Biology, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
- Stockholm Brain Institute, Karolinska Institute, Stockholm, Sweden
- Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Edinburgh, UK
- Jeanette Hellgren-Kotaleski
- Department of Computational Biology, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
- Stockholm Brain Institute, Karolinska Institute, Stockholm, Sweden
- Department of Neuroscience, Karolinska Institute, Stockholm, Sweden
- Anders Lansner
- Numerical Analysis and Computer Science, Stockholm University, Stockholm, Sweden
- Department of Computational Biology, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
- Stockholm Brain Institute, Karolinska Institute, Stockholm, Sweden
3. Vogginger B, Schüffny R, Lansner A, Cederström L, Partzsch J, Höppner S. Reducing the computational footprint for real-time BCPNN learning. Front Neurosci 2015; 9:2. PMID: 25657618. PMCID: PMC4302947. DOI: 10.3389/fnins.2015.00002.
Abstract
The implementation of synaptic plasticity in neural simulation or neuromorphic hardware is usually very resource-intensive, often requiring a compromise between efficiency and flexibility. A versatile but computationally expensive plasticity mechanism is provided by the Bayesian Confidence Propagation Neural Network (BCPNN) paradigm. Built upon Bayesian statistics, and with clear links to biological plasticity processes, the BCPNN learning rule has been applied in many fields, ranging from data classification, associative memory, reward-based learning, and probabilistic inference to cortical attractor memory networks. In the spike-based version of this learning rule, pre-synaptic, post-synaptic, and coincident activity are traced in three low-pass-filtering stages, requiring a total of eight state variables whose dynamics are typically simulated with the fixed-step-size Euler method. We derive analytic solutions that allow an efficient event-driven implementation of this learning rule. Further speedup is achieved, first, by rewriting the model so that the number of basic arithmetic operations per update is halved, and second, by using look-up tables for the frequently calculated exponential decay. Ultimately, in a typical use case, simulation with our approach is more than one order of magnitude faster than with the fixed-step-size Euler method. Aiming for a small memory footprint per BCPNN synapse, we also evaluate the use of fixed-point numbers for the state variables and assess the number of bits required to achieve the same or better accuracy than the conventional explicit Euler method. Together, these improvements enable real-time simulation of a reduced cortex model based on BCPNN in high-performance computing. More importantly, with the analytic solution at hand, and thanks to the reduced memory bandwidth, the learning rule can be efficiently implemented in dedicated or existing digital neuromorphic hardware.
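The headline optimization — replacing many fixed-step Euler updates of an exponentially decaying trace with a single analytic update per event — is easy to demonstrate. A minimal sketch, with an assumed time constant and step size (the actual BCPNN synapse tracks eight such state variables across three filtering stages):

```python
import numpy as np

TAU = 0.05  # trace time constant in seconds (illustrative value)

def euler_decay(z, dt_step, n_steps):
    """Fixed-step explicit Euler: many small updates of dz/dt = -z/tau."""
    for _ in range(n_steps):
        z += -z / TAU * dt_step
    return z

def event_driven_decay(z, dt_total):
    """Analytic solution of dz/dt = -z/tau applied once per event:
    a single multiply replaces an arbitrary number of Euler steps,
    and exp(-dt/tau) can come from a look-up table."""
    return z * np.exp(-dt_total / TAU)

# Decay a trace across a 10 ms inter-event interval both ways
print(euler_decay(1.0, dt_step=1e-4, n_steps=100))  # ~0.8186
print(event_driven_decay(1.0, dt_total=0.01))       # ~0.8187
```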
Affiliation(s)
- Bernhard Vogginger
- Department of Electrical Engineering and Information Technology, Technische Universität Dresden, Germany
- René Schüffny
- Department of Electrical Engineering and Information Technology, Technische Universität Dresden, Germany
- Anders Lansner
- Department of Computational Biology, School of Computer Science and Communication, Royal Institute of Technology (KTH), Stockholm, Sweden; Department of Numerical Analysis and Computer Science, Stockholm University, Stockholm, Sweden
- Love Cederström
- Department of Electrical Engineering and Information Technology, Technische Universität Dresden, Germany
- Johannes Partzsch
- Department of Electrical Engineering and Information Technology, Technische Universität Dresden, Germany
- Sebastian Höppner
- Department of Electrical Engineering and Information Technology, Technische Universität Dresden, Germany
4. Knoblauch A, Körner E, Körner U, Sommer FT. Structural synaptic plasticity has high memory capacity and can explain graded amnesia, catastrophic forgetting, and the spacing effect. PLoS One 2014; 9:e96485. PMID: 24858841. PMCID: PMC4032253. DOI: 10.1371/journal.pone.0096485.
Abstract
Although William James and, more explicitly, Donald Hebb's theory of cell assemblies suggested long ago that activity-dependent rewiring of neuronal networks is the substrate of learning and memory, over the last six decades most theoretical work on memory has focused on the plasticity of existing synapses in prewired networks. Research in the last decade has emphasized that structural modification of synaptic connectivity is common in the adult brain and tightly correlated with learning and memory. Here we present a parsimonious computational model of learning by structural plasticity. The basic modeling units are "potential synapses", defined as locations in the network where synapses can potentially grow to connect two neurons. This model generalizes well-known previous models of associative learning based on weight plasticity, so existing theory can be applied to analyze how many memories and how much information structural plasticity can store in a synapse. Surprisingly, we find that structural plasticity largely outperforms weight plasticity and can achieve a much higher storage capacity per synapse. The effect of structural plasticity on sparsely connected networks is quite intuitive: it increases the "effectual network connectivity", that is, the network wiring that specifically supports storage and recall of the memories. Further, this model of structural plasticity produces gradients of effectual connectivity in the course of learning, thereby explaining various cognitive phenomena including graded amnesia, catastrophic forgetting, and the spacing effect.
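The "potential synapse" idea admits a very small sketch: learning realizes synapses at pre-designated candidate locations rather than adjusting weights. The fragment below is a toy one-shot reduction assuming Willshaw-like clipped learning on the wiring itself; densities and sizes are arbitrary, and the paper's growth/pruning dynamics are richer:

```python
import numpy as np

def grow_synapses(actual, potential, pre, post):
    """Realize a synapse at every *potential* location whose pre- and
    post-neuron are coactive in the stored pattern (one-shot structural
    learning on the wiring rather than on weights)."""
    coactive = np.outer(post, pre).astype(bool)
    return actual | (coactive & potential)

rng = np.random.default_rng(0)
n = 100
potential = rng.random((n, n)) < 0.3    # candidate wiring sites
actual = np.zeros((n, n), dtype=bool)   # realized synapses
pre = rng.random(n) < 0.05              # sparse pattern pair
post = rng.random(n) < 0.05
actual = grow_synapses(actual, potential, pre, post)
```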
Affiliation(s)
- Andreas Knoblauch
- Engineering Faculty, Albstadt-Sigmaringen University, Albstadt, Germany
- Honda Research Institute Europe, Offenbach am Main, Germany
- Edgar Körner
- Honda Research Institute Europe, Offenbach am Main, Germany
- Ursula Körner
- Honda Research Institute Europe, Offenbach am Main, Germany
- Friedrich T. Sommer
- Redwood Center for Theoretical Neuroscience, University of California, Berkeley, California, United States of America
5. Knoblauch A. Neural associative memory with optimal Bayesian learning. Neural Comput 2011; 23:1393-1451.
Abstract
Neural associative memories are perceptron-like single-layer networks with fast synaptic learning typically storing discrete associations between pairs of neural activity patterns. Previous work optimized the memory capacity for various models of synaptic learning: linear Hopfield-type rules, the Willshaw model employing binary synapses, or the BCPNN rule of Lansner and Ekeberg, for example. Here I show that all of these previous models are limit cases of a general optimal model where synaptic learning is determined by probabilistic Bayesian considerations. Asymptotically, for large networks and very sparse neuron activity, the Bayesian model becomes identical to an inhibitory implementation of the Willshaw and BCPNN-type models. For less sparse patterns, the Bayesian model becomes identical to Hopfield-type networks employing the covariance rule. For intermediate sparseness or finite networks, the optimal Bayesian learning rule differs from the previous models and can significantly improve memory performance. I also provide a unified analytical framework to determine memory capacity at a given output noise level that links approaches based on mutual information, Hamming distance, and signal-to-noise ratio.
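Of the limit cases named here, the Willshaw model with binary synapses is the simplest to write down. A minimal sketch of clipped Hebbian storage and threshold recall for one sparse pattern pair (sizes and activity levels are illustrative):

```python
import numpy as np

def willshaw_store(W, pre, post):
    """Clipped (binary) Hebbian storage: a synapse is on if its pre-
    and post-neuron were ever coactive in a stored pattern pair."""
    return W | np.outer(post, pre)

def willshaw_recall(W, cue, theta):
    """Threshold the dendritic sums, typically at the number of
    active units in a noiseless cue."""
    return (W.astype(int) @ cue.astype(int)) >= theta

rng = np.random.default_rng(1)
n, k = 200, 10                                   # units, active units
pre = np.zeros(n, dtype=bool)
pre[rng.choice(n, k, replace=False)] = True
post = np.zeros(n, dtype=bool)
post[rng.choice(n, k, replace=False)] = True
W = willshaw_store(np.zeros((n, n), dtype=bool), pre, post)
assert np.array_equal(willshaw_recall(W, pre, theta=k), post)
```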
6. Johansson C, Ekeberg O, Lansner A. Clustering of stored memories in an attractor network with local competition. Int J Neural Syst 2007; 16:393-403. PMID: 17285686. DOI: 10.1142/s0129065706000809.
Abstract
In this paper we study an attractor network whose units compete locally for activation, and we prove that a reduced version of it has fixed-point dynamics. Through analysis complemented by simulation experiments, we examine the local characteristics of the network's attractors as a function of a parameter that controls the intensity of the local competition. We find that the attractors become hierarchically clustered as this parameter is changed.
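Local competition of this kind is often modeled as a softmax within each group of units, with a gain parameter setting how winner-take-all the competition is. A sketch under that assumption (the paper's exact formulation may differ):

```python
import numpy as np

def local_competition(support, n_groups, G):
    """Softmax competition within each local group: large G approaches
    winner-take-all, small G approaches uniform activity."""
    s = support.reshape(n_groups, -1)
    e = np.exp(G * (s - s.max(axis=1, keepdims=True)))  # stable softmax
    return (e / e.sum(axis=1, keepdims=True)).ravel()

x = np.array([1.0, 2.0, 3.0, 1.0, 0.0, 2.0])
print(local_competition(x, n_groups=2, G=5.0))
```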
Affiliation(s)
- Christopher Johansson
- School of Computer Science and Communication, Royal Institute of Technology, Roslagstullsbacken 35, Stockholm, 100 44, Sweden.
7. Johansson C, Lansner A. Towards cortex sized artificial neural systems. Neural Netw 2006; 20:48-61. PMID: 16860539. DOI: 10.1016/j.neunet.2006.05.029.
Abstract
We propose, implement, and discuss an abstract model of the mammalian neocortex. This model is instantiated as a sparse, recurrently connected neural network with spiking leaky-integrator units and continuous Hebbian learning. First we study the structure, modularization, and size of neocortex; then we describe a generic computational model of the cortical circuitry. A characterizing feature of the model is that it is based on the modularization of neocortex into hypercolumns and minicolumns. Both floating-point and fixed-point arithmetic implementations of the model are presented, along with simulation results. We conclude that an implementation on a cluster computer is bounded not by communication but by computation. Mouse- and rat-cortex-sized versions of our model execute in 44% and 23% of real-time, respectively. Further, an instance of the model with 1.6 × 10^6 units and 2 × 10^11 connections performed noise reduction and pattern completion. These implementations represent the current frontier of large-scale abstract neural network simulations in terms of network size and running speed.
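The two ingredients named in the abstract — leaky-integrator units and hypercolumn modularity — combine into a short update step. This sketch assumes Euler integration and a softmax-style normalization within each hypercolumn; all constants are illustrative, not the paper's parameters:

```python
import numpy as np

def cortical_step(m, inp, n_hyper, dt=1e-3, tau_m=0.01):
    """One network update: leaky integration of input support per unit,
    then normalization across the minicolumns of each hypercolumn."""
    m = m + dt * (inp - m) / tau_m           # leaky-integrator units
    a = np.exp(m.reshape(n_hyper, -1))
    a = a / a.sum(axis=1, keepdims=True)     # hypercolumn competition
    return m, a.ravel()

rng = np.random.default_rng(0)
m = np.zeros(12)                             # 3 hypercolumns x 4 minicolumns
m, a = cortical_step(m, rng.random(12), n_hyper=3)
```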
Affiliation(s)
- Christopher Johansson
- Department of Numerical Analysis and Computer Science, Royal Institute of Technology, Stockholm, Sweden.
8. Sandberg A, Lansner A. Synaptic depression as an intrinsic driver of reinstatement dynamics in an attractor network. Neurocomputing 2002. DOI: 10.1016/s0925-2312(02)00448-4.