1. Warr K, Hare J, Thomas D. Improving Recall in Sparse Associative Memories That Use Neurogenesis. Neural Comput 2025; 37:437-480. [PMID: 39787425] [DOI: 10.1162/neco_a_01732]
Abstract
The creation of future low-power neuromorphic solutions requires specialist spiking neural network (SNN) algorithms that are optimized for neuromorphic settings. One such algorithmic challenge is the ability to recall learned patterns from their noisy variants. Solutions to this problem may be required to memorize vast numbers of patterns based on limited training data and subsequently recall the patterns in the presence of noise. To solve this problem, previous work has explored sparse associative memory (SAM): associative memory neural models that exploit the principle of sparse neural coding observed in the brain. Research into a subcategory of SAM has been inspired by the biological process of adult neurogenesis, whereby new neurons are generated to facilitate adaptive and effective lifelong learning. Although these neurogenesis models have been demonstrated in previous research, they have limitations in terms of recall memory capacity and robustness to noise. In this article, we provide a unifying framework for characterizing a type of SAM network that has been pretrained using a learning strategy that incorporated a simple neurogenesis model. Using this characterization, we formally define network topology and threshold optimization methods to empirically demonstrate a greater than 10⁴ times improvement in memory capacity compared to previous work. We show that these optimizations can facilitate the development of networks that have reduced interneuron connectivity while maintaining high recall efficacy. This paves the way for ongoing research into fast, effective, low-power realizations of associative memory on neuromorphic platforms.
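The neurogenesis-based training strategy and the topology and threshold optimizations above are specific to the paper, but the general class of model it builds on can be illustrated with a minimal, self-contained sketch: a Willshaw-style binary sparse associative memory with clipped Hebbian storage and winners-take-all recall from a noisy cue. All sizes and names below are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

N, P, K = 256, 50, 8          # neurons, stored patterns, active bits per pattern (sparse coding)

# Generate sparse binary patterns: exactly K of N units active in each.
patterns = np.zeros((P, N), dtype=int)
for mu in range(P):
    patterns[mu, rng.choice(N, size=K, replace=False)] = 1

# Willshaw-style storage: a synapse is switched on if its pre and post units
# are ever co-active in any stored pattern (clipped Hebbian learning).
W = (patterns.T @ patterns > 0).astype(int)
np.fill_diagonal(W, 0)

def recall(cue, k_active=K):
    """One-step recall: sum evidence through W and keep the k most active units."""
    drive = W @ cue
    out = np.zeros_like(cue)
    out[np.argsort(drive)[-k_active:]] = 1   # simple winners-take-all threshold
    return out

# Probe with a noisy cue: drop half of the active bits of pattern 0.
cue = patterns[0].copy()
active = np.flatnonzero(cue)
cue[rng.choice(active, size=K // 2, replace=False)] = 0

recovered = recall(cue)
overlap = recovered @ patterns[0] / K
print(f"overlap with stored pattern after recall: {overlap:.2f}")
```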
Affiliation(s)
- Katy Warr
- Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K.
- Jonathon Hare
- Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K.
- David Thomas
- Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K.
2. Jannesar N, Akbarzadeh-Sherbaf K, Safari S, Vahabie AH. SSTE: Syllable-Specific Temporal Encoding to FORCE-learn audio sequences with an associative memory approach. Neural Netw 2024; 177:106368. [PMID: 38761415] [DOI: 10.1016/j.neunet.2024.106368]
Abstract
The circuitry and pathways in the brains of humans and other species have long inspired researchers and system designers to develop accurate and efficient systems capable of solving real-world problems and responding in real time. We propose the Syllable-Specific Temporal Encoding (SSTE) to learn vocal sequences in a reservoir of Izhikevich neurons, by forming associations between exclusive input activities and their corresponding syllables in the sequence. Our model converts the audio signals to cochleograms using the CAR-FAC model to simulate a brain-like auditory learning and memorization process. The reservoir is trained using a hardware-friendly approach to FORCE learning. Reservoir computing can yield associative memory dynamics with far lower computational complexity than conventional RNNs. SSTE-based learning enables competent accuracy and stable recall of spatiotemporal sequences with fewer reservoir inputs than existing encodings in the literature for a similar purpose, offering resource savings. The encoding points to syllable onsets and allows recall from a desired point in the sequence, making it particularly suitable for recalling subsets of long vocal sequences. The SSTE demonstrates the capability of learning new signals without forgetting previously memorized sequences and is robust against occasional noise, a characteristic of real-world scenarios. The components of this model are configured to reduce resource consumption and computational intensity, addressing some of the cost-efficiency issues that might arise in future implementations aiming for compactness and real-time, low-power operation. Overall, this model proposes a brain-inspired pattern-generation network for vocal sequences that can be extended with other bio-inspired computations to explore their potential for brain-like auditory perception. Future designs could draw on this model to implement embedded devices that learn vocal sequences and recall them as needed in real time. Such systems could acquire language and speech, operate as artificial assistants, and transcribe text to speech in the presence of natural noise and corruption in audio data.
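The paper's pipeline (CAR-FAC cochleograms, Izhikevich spiking neurons, the SSTE input encoding) is not reproduced here, but the training rule it relies on, FORCE learning, can be sketched briefly: a recursive-least-squares (RLS) update of a linear readout on a simple rate-based reservoir driven to reproduce a toy target signal. Every size, constant, and variable name below is an illustrative assumption, not the authors' setup.

```python
import numpy as np

rng = np.random.default_rng(1)

N, dt, T = 300, 1e-3, 2.0                  # reservoir size, time step, duration in seconds
steps = int(T / dt)
g, tau, alpha = 1.5, 10e-3, 1.0            # chaos gain, neuron time constant, RLS regularizer

J = g * rng.normal(0, 1 / np.sqrt(N), (N, N))   # recurrent weights
w_fb = rng.uniform(-1, 1, N)                    # feedback weights from readout to reservoir
w = np.zeros(N)                                 # readout weights (trained by FORCE/RLS)
P = np.eye(N) / alpha                           # running estimate of the inverse correlation matrix

t = np.arange(steps) * dt
target = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)   # toy "vocal" signal

x = 0.5 * rng.normal(size=N)               # reservoir state
z = 0.0                                    # readout
for i in range(steps):
    r = np.tanh(x)
    z = w @ r
    x += dt / tau * (-x + J @ r + w_fb * z)

    if i % 2 == 0:                         # RLS (FORCE) update every other step
        Pr = P @ r
        c = 1.0 / (1.0 + r @ Pr)
        P -= c * np.outer(Pr, Pr)
        w -= c * (z - target[i]) * Pr      # reduce the instantaneous readout error

print("final readout error:", abs(z - target[-1]))
```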
Affiliation(s)
- Nastaran Jannesar
- High Performance Embedded Architecture Lab., School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.
- Saeed Safari
- High Performance Embedded Architecture Lab., School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.
- Abdol-Hossein Vahabie
- Department of Psychology, Faculty of Psychology and Education, University of Tehran, Tehran, Iran; Cognitive Systems Laboratory, Control and Intelligent Processing Center of Excellence (CIPCE), School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.
3. Agliari E, Alemanno F, Aquaro M, Barra A, Durante F, Kanter I. Hebbian dreaming for small datasets. Neural Netw 2024; 173:106174. [PMID: 38359641] [DOI: 10.1016/j.neunet.2024.106174]
Abstract
The dreaming Hopfield model constitutes a generalization of the Hebbian paradigm for neural networks that is able to perform on-line learning when "awake" and also to account for off-line "sleeping" mechanisms. The latter have been shown to enhance storage in such a way that, in the long sleep-time limit, this model can reach the maximal storage capacity achievable by networks equipped with symmetric pairwise interactions. In this paper, we inspect the minimal amount of information that must be supplied to such a network to guarantee successful generalization, and we test it both on random synthetic data and on standard structured datasets (i.e., MNIST, Fashion-MNIST, and Olivetti). By comparing these minimal thresholds of information with those required by the standard (i.e., always "awake") Hopfield model, we prove that the present network can save up to ∼90% of the dataset size while preserving the same performance as its standard counterpart. This suggests that sleep may play a pivotal role in explaining the gap between the large volumes of data required to train artificial neural networks and the relatively small volumes needed by their biological counterparts. Further, we prove that the model's cost function (typically used in statistical mechanics) admits a representation in terms of a standard loss function (typically used in machine learning), which allows us to analyze its emergent computational skills both theoretically and computationally: a quantitative picture of its capabilities as a function of its control parameters is achieved, and consistency between the two approaches is highlighted. The resulting network is an associative memory for pattern recognition tasks that learns from examples on-line, generalizes correctly (in suitable regions of its control parameters), and optimizes its storage capacity by off-line sleeping: such a reduction in training cost is encouraging for sustainable AI and for situations where data are relatively sparse.
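As a rough illustration of the mechanism, the sketch below builds the coupling matrix commonly used in the dreaming-Hopfield literature, J(t) = (1/N) ξᵀ (1+t)(I + tC)⁻¹ ξ with pattern overlap matrix C = ξξᵀ/N, which reduces to the Hebb rule at sleep time t = 0 and approaches the projection rule as t → ∞, and then runs zero-temperature asynchronous recall. This is an assumed formulation for illustration only; the paper's dataset experiments and statistical-mechanics analysis are not reproduced, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

N, P, t_sleep = 200, 20, 10.0          # neurons, patterns, "sleep time" parameter

xi = rng.choice([-1, 1], size=(P, N))  # random binary patterns

# Dreaming coupling matrix J(t) = (1/N) xi^T (1+t)(I + t C)^{-1} xi:
# Hebb rule at t = 0, projection (pseudo-inverse) rule as t -> infinity.
C = xi @ xi.T / N
J = xi.T @ ((1 + t_sleep) * np.linalg.inv(np.eye(P) + t_sleep * C)) @ xi / N
np.fill_diagonal(J, 0)

def recall(state, sweeps=10):
    """Asynchronous zero-temperature dynamics: s_i <- sign(sum_j J_ij s_j)."""
    s = state.copy()
    for _ in range(sweeps):
        for i in rng.permutation(N):
            s[i] = 1 if J[i] @ s >= 0 else -1
    return s

# Corrupt 20% of the bits of pattern 0 and check retrieval.
probe = xi[0].copy()
flip = rng.choice(N, size=N // 5, replace=False)
probe[flip] *= -1

m = recall(probe) @ xi[0] / N
print(f"retrieval overlap (Mattis magnetization): {m:.2f}")
```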
Affiliation(s)
- Elena Agliari
- Department of Mathematics of Sapienza Università di Roma, Rome, Italy.
- Francesco Alemanno
- Department of Mathematics and Physics of Università del Salento, Lecce, Italy.
- Miriam Aquaro
- Department of Mathematics of Sapienza Università di Roma, Rome, Italy.
- Adriano Barra
- Department of Mathematics and Physics of Università del Salento, Lecce, Italy.
- Fabrizio Durante
- Department of Economic Sciences of Università del Salento, Lecce, Italy.
- Ido Kanter
- Department of Physics of Bar-Ilan University, Ramat Gan, Israel.
4. Gao Y, Hu J, Yu H, Du J, Jia C. Variance-Constrained Resilient H∞ State Estimation for Time-Varying Neural Networks with Random Saturation Observation Under Uncertain Occurrence Probability. Neural Process Lett 2023. [DOI: 10.1007/s11063-022-11078-z]
5. Wang S, Chen Z, Du S, Lin Z. Learning Deep Sparse Regularizers With Applications to Multi-View Clustering and Semi-Supervised Classification. IEEE Trans Pattern Anal Mach Intell 2022; 44:5042-5055. [PMID: 34018930] [DOI: 10.1109/tpami.2021.3082632]
Abstract
Sparsity-constrained optimization problems are common in machine learning, such as sparse coding, low-rank minimization, and compressive sensing. However, most previous studies focused on constructing various hand-crafted sparse regularizers, while little work was devoted to learning adaptive sparse regularizers from given input data for specific tasks. In this paper, we propose a deep sparse regularizer learning model that learns data-driven sparse regularizers adaptively. Via the proximal gradient algorithm, we find that sparse regularizer learning is equivalent to learning a parameterized activation function. This encourages us to learn sparse regularizers in the deep learning framework. Therefore, we build a neural network composed of multiple blocks, each being differentiable and reusable. All blocks contain learnable piecewise linear activation functions, which correspond to the sparse regularizer to be learned. Furthermore, the proposed model is trained with backpropagation, and all parameters in this model are learned end-to-end. We apply our framework to multi-view clustering and semi-supervised classification tasks to learn a latent compact representation. Experimental results demonstrate the superiority of the proposed framework over state-of-the-art multi-view learning models.
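For context, the fixed-regularizer baseline that this model generalizes can be sketched directly: proximal gradient descent (ISTA) on an ℓ1-regularized least-squares problem, where the proximal step is the soft-threshold, i.e., exactly the kind of piecewise linear activation that the proposed framework replaces with a learned one. The problem sizes and names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Sparse coding toy problem: recover a sparse x from y = A x + noise
# by minimizing 0.5*||A x - y||^2 + lam*||x||_1 with proximal gradient (ISTA).
m, n, k = 60, 120, 8
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
y = A @ x_true + 0.01 * rng.normal(size=m)

lam = 0.05
step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / Lipschitz constant of the smooth term's gradient

def soft_threshold(v, thr):
    """Proximal operator of thr*||.||_1: the fixed piecewise linear 'activation'
    that a learned deep sparse regularizer would generalize."""
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

x = np.zeros(n)
for _ in range(300):
    grad = A.T @ (A @ x - y)                          # gradient of the data-fit term
    x = soft_threshold(x - step * grad, step * lam)   # proximal (shrinkage) step

print("relative reconstruction error:",
      np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```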
6. Rodgers N, Tiňo P, Johnson S. Network hierarchy and pattern recovery in directed sparse Hopfield networks. Phys Rev E 2022; 105:064304. [PMID: 35854620] [DOI: 10.1103/physreve.105.064304]
Abstract
Many real-world networks are directed, sparse, and hierarchical, with a mixture of feedforward and feedback connections with respect to the hierarchy. Moreover, a small number of master nodes are often able to drive the whole system. We study the dynamics of pattern presentation and recovery on sparse, directed, Hopfield-like neural networks using trophic analysis to characterize their hierarchical structure. This is a recent method which quantifies the local position of each node in a hierarchy (trophic level) as well as the global directionality of the network (trophic coherence). We show that even in a recurrent network, the state of the system can be controlled by a small subset of neurons, which can be identified by their low trophic levels. We also find that performance on the pattern recovery task can be significantly improved by tuning the trophic coherence and other topological properties of the network. This may explain the relatively sparse and coherent structures observed in the animal brain and provide insights for improving the architectures of artificial neural networks. Moreover, we expect that the principles we demonstrate here, through numerical analysis, will be relevant for a broad class of systems whose underlying network structure is directed and sparse, such as biological, social, or financial networks.
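As a sketch of the trophic analysis the abstract refers to (assuming the "improved" trophic levels of MacKay, Johnson, and Sansom), the snippet below computes node trophic levels and the network's trophic incoherence from a directed adjacency matrix, from which low-level "master" nodes can be read off. The Hopfield dynamics and tuning experiments of the paper are not reproduced, and the toy graph is an illustrative assumption.

```python
import numpy as np

def trophic_levels(W):
    """Improved trophic levels h for a directed weighted adjacency matrix W
    (W[i, j] = weight of edge i -> j), defined up to an additive constant."""
    k_in, k_out = W.sum(axis=0), W.sum(axis=1)
    u, v = k_in + k_out, k_in - k_out
    L = np.diag(u) - W - W.T                    # singular "trophic Laplacian"
    h = np.linalg.lstsq(L, v, rcond=None)[0]    # least squares handles the null space
    return h - h.min()                          # convention: lowest level at 0

def trophic_incoherence(W, h):
    """F = sum_ij w_ij (h_j - h_i - 1)^2 / sum_ij w_ij; coherence is 1 - F."""
    i, j = np.nonzero(W)
    return np.sum(W[i, j] * (h[j] - h[i] - 1.0) ** 2) / W.sum()

# Toy directed network: a feedforward chain 0 -> 1 -> 2 -> 3 plus one feedback edge.
W = np.zeros((4, 4))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 1)]:
    W[a, b] = 1.0

h = trophic_levels(W)
print("trophic levels:", np.round(h, 2))        # node 0 sits at the bottom (a "master" node)
print("trophic incoherence F:", round(trophic_incoherence(W, h), 3))
```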
Affiliation(s)
- Niall Rodgers
- School of Mathematics, University of Birmingham, Birmingham B15 2TT, United Kingdom and Topological Design Centre for Doctoral Training, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Peter Tiňo
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Samuel Johnson
- School of Mathematics, University of Birmingham, Birmingham B15 2TT, United Kingdom and The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, United Kingdom
8. A 100 MHz 0.41 fJ/(Bit∙Search) 28 nm CMOS-Bulk Content Addressable Memory for HEP Experiments. J Low Power Electron Appl 2020. [DOI: 10.3390/jlpea10040035]
Abstract
This paper presents a transistor-level design, with extensive experimental validation, of a Content Addressable Memory (CAM) based on an eXclusive OR (XOR) single-bit cell. The design exploits a dedicated architecture and a fully custom approach (in both the schematic and the layout phase) in order to achieve very low-power and high-speed performance. The proposed architecture does not require an internal clock or a pre-charge phase, which usually increase the power request and slow down data searches. In addition, dedicated solutions are exploited to minimize parasitic layout-induced capacitances in the single-bit cell, further reducing the power consumption. The prototype device, named CAM-28CB, is integrated in the deeply downscaled 28 nm Complementary Metal-Oxide-Semiconductor (CMOS) Bulk (28CB) technology. In this way, the high transistor transition frequency and the intrinsically lower parasitic capacitances allow the system speed to be improved. Furthermore, the high radiation hardness of this technology node (up to 1 Grad TID), together with the CAM-28CB's high-speed and low-power performance, makes this device suitable for High-Energy Physics experiments such as ATLAS (A Toroidal LHC ApparatuS) at the Large Hadron Collider (LHC). The prototype operates at a frequency of up to 100 MHz and consumes 46.86 µW. The total area occupancy is 1702 µm² for 1.152 kb of memory bit cells. The device operates with a single supply voltage of 1 V and achieves a 0.41 fJ/bit/search figure of merit.
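The transistor-level and layout work is the substance of the paper and cannot be captured in software, but the logical function of an XOR-cell CAM is simple enough to sketch behaviorally: each stored word is XORed bit-wise with the search key, and a match line asserts only when every bit agrees. The word width and contents below are illustrative assumptions, not the chip's actual organization.

```python
# Behavioral model of an XOR-cell content addressable memory (CAM):
# each stored word is compared bit-for-bit with the search key;
# a match line asserts only when every XOR output is 0.

WORD_BITS = 18                      # illustrative word width, not the device geometry

def cam_search(stored_words, key):
    """Return the addresses of all stored words that match the key exactly."""
    matches = []
    for addr, word in enumerate(stored_words):
        if (word ^ key) == 0:       # XOR of stored word and key; 0 means full match
            matches.append(addr)
    return matches

memory = [0b101010101010101010, 0b111100001111000011, 0b000011110000111100]
key = 0b111100001111000011

print("matching addresses:", cam_search(memory, key))   # -> [1]
```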
9. Chu Y, Fei J, Hou S. Adaptive Global Sliding-Mode Control for Dynamic Systems Using Double Hidden Layer Recurrent Neural Network Structure. IEEE Trans Neural Netw Learn Syst 2020; 31:1297-1309. [PMID: 31247575] [DOI: 10.1109/tnnls.2019.2919676]
Abstract
In this paper, a fully regulated neural network (NN) with a double hidden layer recurrent neural network (DHLRNN) structure is designed, and an adaptive global sliding-mode controller based on the DHLRNN is proposed for a class of dynamic systems. Theoretical guidance and an adaptive adjustment mechanism are established to set the base width and central vector of the Gaussian function in the DHLRNN structure, where six sets of parameters can be adaptively stabilized to their best values according to different inputs. The new DHLRNN can improve the accuracy and generalization ability of the network, reduce the number of network weights, and accelerate network training, owing to the strong fitting and representation ability of two layers of activation functions compared with a general NN with a single hidden layer. Since the neurons of the input layer receive signals fed back from the neurons of the output layer in this output-feedback structure, the network possesses associative memory and rapid convergence, achieving better approximation and superior dynamic capability. Simulation and experiment on an active power filter are carried out to demonstrate the excellent static and dynamic performance of the proposed DHLRNN-based adaptive global sliding-mode controller, verifying its superior approximation performance and more stable internal state compared with other schemes.
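Purely as a structural sketch (the adaptive laws, stability analysis, and sliding-mode control design are not reproduced), the snippet below shows the kind of double-hidden-layer Gaussian network with output feedback into the input layer that the abstract describes; all layer sizes, parameter names, and initializations are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

class DoubleHiddenLayerRNNSketch:
    """Structural sketch only: two Gaussian (RBF-style) hidden layers with the
    previous output fed back into the input layer. In the actual controller the
    centers, base widths, and weights would be adapted on-line."""

    def __init__(self, n_in=2, n_h1=6, n_h2=6):
        self.c1 = rng.normal(size=(n_h1, n_in + 1))   # +1 input slot for the fed-back output
        self.b1 = np.ones(n_h1)                       # base widths of layer-1 Gaussians
        self.c2 = rng.normal(size=(n_h2, n_h1))
        self.b2 = np.ones(n_h2)
        self.w = rng.normal(size=n_h2) * 0.1          # output weights
        self.y_prev = 0.0

    def step(self, x):
        z = np.append(x, self.y_prev)                 # output feedback into the input layer
        h1 = np.exp(-np.sum((z - self.c1) ** 2, axis=1) / self.b1 ** 2)
        h2 = np.exp(-np.sum((h1 - self.c2) ** 2, axis=1) / self.b2 ** 2)
        self.y_prev = float(self.w @ h2)
        return self.y_prev

net = DoubleHiddenLayerRNNSketch()
for k in range(5):
    x = np.array([np.sin(0.1 * k), np.cos(0.1 * k)])  # illustrative tracking-error style input
    print(f"step {k}: y = {net.step(x):+.4f}")
```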
10. Zoppo G, Marrone F, Corinto F. Equilibrium Propagation for Memristor-Based Recurrent Neural Networks. Front Neurosci 2020; 14:240. [PMID: 32265641] [PMCID: PMC7105894] [DOI: 10.3389/fnins.2020.00240]
Abstract
Among recent innovative technologies, the memristor (memory resistor) has attracted researchers' attention as a fundamental computational element. It has been experimentally shown that memristive elements can emulate synaptic dynamics and are even capable of supporting spike-timing-dependent plasticity (STDP), an important adaptation rule that is gaining particular interest because of its simplicity and biological plausibility. The overall goal of this work is to provide a novel (theoretical) analog computing platform, based on memristor devices and recurrent neural networks, that exploits the memristor device physics to implement two variations of the backpropagation algorithm: recurrent backpropagation and equilibrium propagation. In the first learning technique, the use of memristor-based synaptic weights permits the error signals to be propagated through the network by means of the nonlinear dynamics of an analog side network. This makes the processing non-digital and different from current procedures. However, the need for a side analog network to propagate error derivatives still makes this technique highly biologically implausible. To overcome this limitation, an alternative to the side network is proposed by introducing a learning technique used for energy-based models: equilibrium propagation. Experimental results show that both approaches significantly outperform conventional architectures used for pattern reconstruction. Furthermore, owing to the high suitability of the equilibrium propagation learning rule for VLSI implementation, additional results on the classification of the MNIST dataset are reported here.
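Setting the memristor device physics aside, the equilibrium propagation rule itself can be sketched in plain NumPy: a free relaxation of a symmetric energy-based network, a weakly nudged relaxation toward the target, and a weight update proportional to the difference of co-activations between the two fixed points. The network size, task, and hyperparameters below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(5)

# Tiny equilibrium-propagation sketch: units 0-1 are clamped inputs,
# units 2-4 are hidden, unit 5 is the output. Weights must stay symmetric.
n, idx_in, idx_out = 6, np.arange(2), 5
free = np.arange(2, 6)

W = rng.normal(0, 0.1, (n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)

rho = lambda s: np.clip(s, 0.0, 1.0)                 # hard-sigmoid activation
drho = lambda s: ((s >= 0) & (s <= 1)).astype(float)

def relax(x, y=None, beta=0.0, steps=60, eps=0.2, s=None):
    """Gradient descent on E = 0.5*||s||^2 - 0.5*rho(s)^T W rho(s),
    plus beta * 0.5*(s_out - y)^2 during the nudged phase."""
    s = np.full(n, 0.5) if s is None else s.copy()
    s[idx_in] = x                                    # inputs stay clamped
    for _ in range(steps):
        grad = s - drho(s) * (W @ rho(s))
        if beta:
            grad[idx_out] += beta * (s[idx_out] - y)
        s[free] -= eps * grad[free]
    return s

def train_step(x, y, lr=0.2, beta=0.2):
    """One EP update: free phase, nudged phase, contrastive Hebbian-style change."""
    global W
    s_free = relax(x)
    s_nudge = relax(x, y=y, beta=beta, s=s_free)
    dW = (np.outer(rho(s_nudge), rho(s_nudge)) - np.outer(rho(s_free), rho(s_free))) / beta
    W += lr * dW
    np.fill_diagonal(W, 0)

# Toy task: the output unit should settle at the mean of the two inputs.
for _ in range(300):
    x = rng.uniform(0, 1, 2)
    train_step(x, y=x.mean())

x = np.array([0.2, 0.8])
print("prediction:", round(float(relax(x)[idx_out]), 3), "target:", x.mean())
```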
Affiliation(s)
- Gianluca Zoppo
- Department of Electronics, Politecnico di Torino, Turin, Italy