1
Tang S, Zhao Y, Lv H, Sun M, Feng Y, Zhang Z. Adaptive Optimization and Dynamic Representation Method for Asynchronous Data Based on Regional Correlation Degree. Sensors (Basel) 2024; 24:7430. [PMID: 39685963] [DOI: 10.3390/s24237430]
Abstract
Event cameras, as bio-inspired visual sensors, offer significant advantages for visual tasks owing to their high dynamic range and high temporal resolution. These capabilities enable efficient and reliable motion estimation even in the most complex scenes. However, these advantages come with certain trade-offs. For instance, current event-based vision sensors have low spatial resolution, and the process of event representation can introduce varying degrees of data redundancy and incompleteness. Additionally, owing to the inherent characteristics of event stream data, the streams cannot be used directly; pre-processing steps such as slicing and frame compression are required. Various pre-processing algorithms exist for slicing and compressing event streams, but these methods fall short when multiple subjects move at different and varying speeds within the event stream, potentially exacerbating the inherent deficiencies of the event information flow. To address this longstanding issue, we propose a novel and efficient Asynchronous Spike Dynamic Metric and Slicing algorithm (ASDMS). ASDMS adaptively segments the event stream into fragments of varying lengths based on the spatiotemporal structure and polarity attributes of the events. Moreover, we introduce a new Adaptive Spatiotemporal Subject Surface Compensation algorithm (ASSSC). ASSSC compensates for missing motion information in the event stream and removes redundant information, thereby segmenting event streams more effectively than existing event representation algorithms. Additionally, when the processed results are compressed into frame images, imaging quality is significantly improved. Finally, we propose a new evaluation metric, the Actual Performance Efficiency Discrepancy (APED), which combines actual distortion rate and event information entropy to quantify and compare the effectiveness of our method against existing event representation methods. Experimental results demonstrate that our event representation method outperforms existing approaches and addresses the shortcomings of current methods in handling event streams with multiple entities moving at varying speeds simultaneously.
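Editor's note: although the paper's ASDMS algorithm is considerably more sophisticated, the core idea of adaptive slicing is easy to illustrate. The sketch below is a generic density-based slicer, not ASDMS itself; the thresholds (min_dt, max_dt, target_count) and the synthetic stream are assumptions for illustration. Fast motion produces dense events and therefore short slices; slow motion, the opposite.

```python
# Generic density-based event-stream slicer -- a simplified stand-in for
# ASDMS, shown only to illustrate adaptive slicing. All thresholds are
# illustrative assumptions, not values from the paper.
import numpy as np

def adaptive_slices(timestamps, min_dt=1e-3, max_dt=50e-3, target_count=2000):
    """Split an ascending timestamp array (seconds) into variable-length slices."""
    slices, start = [], 0
    for i in range(1, len(timestamps)):
        dt = timestamps[i] - timestamps[start]
        # close a slice when enough events arrived (fast motion -> short slice)
        # or when the maximum duration elapsed (slow motion -> long slice)
        if (i - start >= target_count and dt >= min_dt) or dt >= max_dt:
            slices.append((start, i))
            start = i
    slices.append((start, len(timestamps)))
    return slices

# synthetic stream: a dense burst (fast motion) followed by a sparse tail
fast = np.cumsum(np.random.exponential(2e-5, 50_000))
slow = fast[-1] + np.cumsum(np.random.exponential(1e-3, 500))
lengths = [b - a for a, b in adaptive_slices(np.concatenate([fast, slow]))]
print(max(lengths), "events in the densest slice,", min(lengths), "in the sparsest")
```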
Affiliation(s)
- Sichao Tang
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yuchen Zhao
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Hengyi Lv
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Ming Sun
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Yang Feng
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Zeshu Zhang
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
2
Yu Q, Gao J, Wei J, Li J, Tan KC, Huang T. Improving Multispike Learning With Plastic Synaptic Delays. IEEE Trans Neural Netw Learn Syst 2023; 34:10254-10265. [PMID: 35442893] [DOI: 10.1109/tnnls.2022.3165527]
Abstract
Emulating the spike-based processing of the brain, spiking neural networks (SNNs) have been developed as a promising candidate for a new generation of artificial neural networks that aim to match the brain's efficient cognition. Due to the complex dynamics and nonlinearity of SNNs, designing efficient learning algorithms remains a major difficulty and attracts great research attention. Most existing algorithms focus on the adjustment of synaptic weights. However, other components, such as synaptic delays, are found to be adaptive and important in modulating neural behavior. How plasticity in different components could cooperate to improve the learning of SNNs remains an open question. Advancing our previous multispike learning, we propose a new joint weight-delay plasticity rule, named TDP-DL, in this article. Plastic delays are integrated into the learning framework, and as a result, the performance of multispike learning is significantly improved. Simulation results highlight the effectiveness and efficiency of our TDP-DL rule compared to baselines. Moreover, we reveal the underlying principle of how synaptic weights and delays cooperate with each other through a synthetic task of interval selectivity and show that plastic delays can enhance the selectivity and flexibility of neurons by shifting information across time. Due to this capability, useful information distributed away in the time domain can be effectively integrated for better accuracy, as highlighted in our generalization tasks on image, speech, and event-based object recognition. Our work is thus valuable and significant for improving the performance of spike-based neuromorphic computing.
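Editor's note: the role of a plastic delay can be shown with a toy, gradient-free search. This is not the TDP-DL rule; the kernel, spike times, and search procedure are all assumptions. The point is that shifting a synapse's delay moves its post-synaptic potential in time, something a weight update alone cannot do.

```python
# Toy illustration of delay plasticity (NOT the TDP-DL rule): a crude
# coordinate search nudges per-synapse delays so both inputs contribute
# to the membrane potential at the desired output time.
import numpy as np

def psp(t, tau=5.0):
    return np.exp(-t / tau) * (t >= 0)          # causal exponential kernel

def v(t_out, spikes, w, d):
    """Membrane potential at t_out from weighted, delayed input spikes."""
    return sum(wi * psp(t_out - (ti + di)) for wi, ti, di in zip(w, spikes, d))

spikes = [2.0, 5.0]                # input spike times (ms)
w, d = [0.5, 0.5], [0.0, 0.0]      # fixed weights, plastic delays
t_target = 20.0                    # desired firing time
for _ in range(300):               # gradient-free hill climbing on the delays
    for j in range(len(d)):
        for step in (0.1, -0.1):
            trial = d[:]
            trial[j] = max(0.0, trial[j] + step)
            if v(t_target, spikes, w, trial) > v(t_target, spikes, w, d):
                d = trial
print("learned delays:", [round(x, 1) for x in d])  # inputs now arrive near t_target
```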
3
Sanaullah, Koravuna S, Rückert U, Jungeblut T. Evaluation of Spiking Neural Nets-Based Image Classification Using the Runtime Simulator RAVSim. Int J Neural Syst 2023; 33:2350044. [PMID: 37604777] [DOI: 10.1142/s0129065723500442]
Abstract
Spiking Neural Networks (SNNs) help achieve brain-like efficiency and functionality by building neurons and synapses that mimic the human brain's transmission of electrical signals. However, optimal SNN implementation requires a precise balance of parametric values. To design such ubiquitous neural networks, a graphical tool for visualizing, analyzing, and explaining the internal behavior of spikes is crucial. Although some popular SNN simulators are available, these tools do not allow users to interact with the neural network during simulation. To this end, we have introduced the first runtime interactive simulator, called Runtime Analyzing and Visualization Simulator (RAVSim), developed to analyze and dynamically visualize the behavior of SNNs, allowing end-users to interact, observe output concentration reactions, and make changes directly during the simulation. In this paper, we present RAVSim with the current implementation of runtime interaction using the LIF neural model with different connectivity schemes, an image classification model using SNNs, and a dataset creation feature. Our main objective is to primarily investigate binary classification using SNNs with RGB images. We created a feed-forward network using the LIF neural model for an image classification algorithm and evaluated it by using RAVSim. The algorithm classifies faces with and without masks, achieving an accuracy of 91.8% using 1000 neurons in a hidden layer, 0.0758 MSE, and an execution time of ∼10 min on the CPU. The experimental results show that using RAVSim not only increases network design speed but also accelerates user learning capability.
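Editor's note: the leaky integrate-and-fire (LIF) model that RAVSim simulates fits in a few lines. A minimal discrete-time sketch follows; the parameter values are illustrative, not RAVSim defaults.

```python
# Minimal discrete-time leaky integrate-and-fire (LIF) neuron; parameter
# values are illustrative assumptions, not RAVSim defaults.
import numpy as np

def lif(input_current, dt=1.0, tau=20.0, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Euler integration of dV/dt = (v_rest - V + I) / tau with spike-and-reset."""
    v, spikes, trace = v_rest, [], []
    for i_t in input_current:
        v += dt / tau * (v_rest - v + i_t)   # leaky integration
        if v >= v_thresh:                    # threshold crossing -> emit a spike
            spikes.append(1)
            v = v_reset                      # hard reset
        else:
            spikes.append(0)
        trace.append(v)
    return np.array(spikes), np.array(trace)

spk, vm = lif(np.full(200, 1.5))             # constant supra-threshold drive
print("firing rate:", spk.mean(), "spikes/step")
```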
Affiliation(s)
- Sanaullah
- Department of Engineering and Mathematics, Bielefeld University of Applied Science, Bielefeld, Germany
- Shamini Koravuna
- Department of Cognitive Interaction Technology Center, Bielefeld University, Bielefeld, Germany
- Ulrich Rückert
- Department of Cognitive Interaction Technology Center, Bielefeld University, Bielefeld, Germany
- Thorsten Jungeblut
- Department of Engineering and Mathematics, Bielefeld University of Applied Science, Bielefeld, Germany
4
Konar D, Sarma AD, Bhandary S, Bhattacharyya S, Cangi A, Aggarwal V. A shallow hybrid classical-quantum spiking feedforward neural network for noise-robust image classification. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110099]
5
Zhou W, Wen S, Liu Y, Liu L, Liu X, Chen L. Forgetting memristor based STDP learning circuit for neural networks. Neural Netw 2023; 158:293-304. [PMID: 36493532] [DOI: 10.1016/j.neunet.2022.11.023]
Abstract
The circuit implementation of STDP based on memristors is of great significance for neural network applications. However, purely circuit-level implementations of the forgetting memristor and STDP remain rare. This paper proposes a new STDP learning rule implementation circuit based on the forgetting memristor. This kind of forgetting memristive synapse gives the neural network a time-division multiplexing capability, but the instability of short-term memory affects the learning ability of the neural network. This paper analyzes and discusses the influence of synapses with long-term and short-term memory on the STDP learning characteristics of neural networks, which lays a foundation for the construction of time-division multiplexing neural networks with long- and short-term memory synapses. Through this circuit, it is found that the volatile memristor responds differently to the stimulus signal in different initial states, and the resulting LTP phenomenon is more consistent with the forgetting effect in biology. The circuit has multiple adjustable parameters, which can fit the STDP learning rules under different conditions, and a neural network application demonstrates the circuit's usability.
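Editor's note: the interplay between pair-based STDP and short-term forgetting can be sketched behaviorally in software. This models neither the memristor physics nor the paper's circuit; every constant is an illustrative assumption. Between pairings, the weight relaxes toward a baseline, mimicking volatile short-term memory.

```python
# Behavioral sketch of pair-based STDP plus a "forgetting" term (not the
# paper's circuit; all constants are illustrative assumptions).
import math

A_PLUS = A_MINUS = 0.05            # STDP amplitudes
TAU_STDP = 20.0                    # STDP time constant (ms)
TAU_FORGET, W_BASE = 200.0, 0.5    # forgetting time constant (ms) and baseline

def stdp_dw(dt_pre_post):
    """Pair-based STDP: potentiate when pre precedes post (dt > 0)."""
    if dt_pre_post > 0:
        return A_PLUS * math.exp(-dt_pre_post / TAU_STDP)
    return -A_MINUS * math.exp(dt_pre_post / TAU_STDP)

def forget(w, elapsed_ms):
    """Exponential relaxation of the weight toward its baseline."""
    return W_BASE + (w - W_BASE) * math.exp(-elapsed_ms / TAU_FORGET)

w = 0.9                                         # start from a potentiated state
for t_pre, t_post in [(10, 15), (50, 53), (120, 118)]:
    w = forget(w, 40.0)                         # decay since the last pairing
    w = min(max(w + stdp_dw(t_post - t_pre), 0.0), 1.0)   # clamp to bounds
    print(round(w, 4))
```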
Affiliation(s)
- Wenhao Zhou
- Electronic Information and Engineering, Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, Southwest University, 400715, China.
- Shiping Wen
- Centre for Artificial Intelligence, Faculty of Engineering and Information Technology, University of Technology Sydney, Australia
- Yi Liu
- Electronic Information and Engineering, Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, Southwest University, 400715, China
- Lu Liu
- Electronic Information and Engineering, Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, Southwest University, 400715, China
- Xin Liu
- Computer Vision and Pattern Recognition Laboratory, School of Engineering Science, Lappeenranta-Lahti University of Technology LUT, Finland
- Ling Chen
- Electronic Information and Engineering, Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, Southwest University, 400715, China; Computer Vision and Pattern Recognition Laboratory, School of Engineering Science, Lappeenranta-Lahti University of Technology LUT, Finland
6
Wu Z, Zhang H, Lin Y, Li G, Wang M, Tang Y. LIAF-Net: Leaky Integrate and Analog Fire Network for Lightweight and Efficient Spatiotemporal Information Processing. IEEE Trans Neural Netw Learn Syst 2022; 33:6249-6262. [PMID: 33979292] [DOI: 10.1109/tnnls.2021.3073016]
Abstract
Spiking neural networks (SNNs) based on the leaky integrate and fire (LIF) model have been applied to energy-efficient temporal and spatiotemporal processing tasks. Due to their bioplausible neuronal dynamics and simplicity, LIF-SNNs benefit from event-driven processing but usually face reduced performance. This may be because, in a LIF-SNN, the neurons transmit information via spikes. To address this issue, in this work, we propose a leaky integrate and analog fire (LIAF) neuron model so that analog values can be transmitted among neurons, and a deep network termed LIAF-Net is built on it for efficient spatiotemporal processing. In the temporal domain, LIAF follows the traditional LIF dynamics to maintain its temporal processing capability. In the spatial domain, LIAF is able to integrate spatial information through convolutional integration or fully connected integration. As a spatiotemporal layer, LIAF can also be used jointly with traditional artificial neural network (ANN) layers. In addition, the built network can be trained directly with backpropagation through time (BPTT), which avoids the performance loss caused by ANN-to-SNN conversion. Experimental results indicate that LIAF-Net achieves comparable performance to the gated recurrent unit (GRU) and long short-term memory (LSTM) on bAbI question answering (QA) tasks and achieves state-of-the-art performance on spatiotemporal dynamic vision sensor (DVS) datasets, including MNIST-DVS, CIFAR10-DVS, and DVS128 Gesture, with far fewer synaptic weights and much less computational overhead than traditional networks built with LSTM, GRU, convolutional LSTM (ConvLSTM), or 3-D convolution (Conv3D). Compared with the traditional LIF-SNN, LIAF-Net also shows dramatic accuracy gains on all these experiments. In conclusion, LIAF-Net provides a framework combining the advantages of both ANNs and SNNs for lightweight and efficient spatiotemporal information processing.
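Editor's note: the LIF/LIAF distinction is compact enough to sketch directly. The toy below, under assumed parameters rather than the paper's layer definition, shares one leaky membrane update but transmits either the binary threshold crossing (LIF) or an analog function of the membrane state (LIAF).

```python
# Toy sketch of the LIF/LIAF distinction: identical leaky membrane dynamics,
# different transmitted quantity. Parameters are illustrative assumptions.
import numpy as np

def layer(x, tau=2.0, v_th=0.8, analog=False):
    """Run one neuron layer over time; x has shape (T, n_features)."""
    v, out = np.zeros(x.shape[1]), []
    for x_t in x:
        v = v + (x_t - v) / tau                  # shared leaky integration
        fire = v >= v_th
        out.append(np.maximum(v, 0.0) if analog else fire.astype(float))
        v = np.where(fire, 0.0, v)               # reset where a spike occurred
    return np.stack(out)

x = 2 * np.random.rand(8, 4)
print("LIF output (binary):\n", layer(x))
print("LIAF output (analog, same dynamics):\n", layer(x, analog=True).round(2))
```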
7
Wang H, He Z, Wang T, He J, Zhou X, Wang Y, Liu L, Wu N, Tian M, Shi C. TripleBrain: A Compact Neuromorphic Hardware Core With Fast On-Chip Self-Organizing and Reinforcement Spike-Timing Dependent Plasticity. IEEE Trans Biomed Circuits Syst 2022; 16:636-650. [PMID: 35802542] [DOI: 10.1109/tbcas.2022.3189240]
Abstract
The human brain cortex acts as a rich source of inspiration for constructing efficient artificial cognitive systems. In this paper, we investigate incorporating multiple brain-inspired computing paradigms for compact, fast, and high-accuracy neuromorphic hardware implementation. We propose the TripleBrain hardware core that tightly combines three common brain-inspired factors: spike-based processing and plasticity, the self-organizing map (SOM) mechanism, and the reinforcement learning scheme, to improve object recognition accuracy and processing throughput while keeping resource costs low. The proposed hardware core is fully event-driven to mitigate unnecessary operations, and enables various on-chip learning rules (including the proposed SOM-STDP & R-STDP rule and the R-SOM-STDP rule, regarded as two variants of our TripleBrain learning rule) with different accuracy-latency tradeoffs to satisfy user requirements. An FPGA prototype of the neuromorphic core was implemented and elaborately tested. It realized high-speed learning (1349 frames/s) and inference (2698 frames/s), and obtained comparably high recognition accuracies of 95.10%, 80.89%, 100%, 94.94%, 82.32%, 100% and 97.93% on the MNIST, ETH-80, ORL-10, Yale-10, N-MNIST, Poker-DVS and Posture-DVS datasets, respectively, while consuming only 4146 (7.59%) slices, 32 (3.56%) DSPs and 131 (24.04%) Block RAMs on a Xilinx Zynq-7045 FPGA chip. Our neuromorphic core is very attractive for real-time resource-limited edge intelligent systems.
8
Dong J, Jiang R, Xiao R, Yan R, Tang H. Event stream learning using spatio-temporal event surface. Neural Netw 2022; 154:543-559. [DOI: 10.1016/j.neunet.2022.07.010]
9
Jang H, Simeone O. Multisample Online Learning for Probabilistic Spiking Neural Networks. IEEE Trans Neural Netw Learn Syst 2022; 33:2034-2044. [PMID: 35089867] [DOI: 10.1109/tnnls.2022.3144296]
Abstract
Spiking neural networks (SNNs) capture some of the efficiency of biological brains for inference and learning via the dynamic, online, and event-driven processing of binary time series. Most existing learning algorithms for SNNs are based on deterministic neuronal models, such as leaky integrate-and-fire, and rely on heuristic approximations of backpropagation through time that enforce constraints such as locality. In contrast, probabilistic SNN models can be trained directly via principled online local update rules that have proven to be particularly effective for resource-constrained systems. This article investigates another advantage of probabilistic SNNs, namely, their capacity to generate independent outputs when queried over the same input. It is shown that the multiple generated output samples can be used during inference to robustify decisions and to quantify uncertainty, a feature that deterministic SNN models cannot provide. Furthermore, they can be leveraged for training in order to obtain more accurate statistical estimates of the log-loss training criterion and its gradient. Specifically, this article introduces an online learning rule based on generalized expectation-maximization (GEM) that follows a three-factor form with global learning signals and is referred to as GEM-SNN. Experimental results on structured output memorization and classification on a standard neuromorphic dataset demonstrate significant improvements in terms of log-likelihood, accuracy, and calibration when increasing the number of samples used for inference and training.
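Editor's note: the multisample inference idea (though not the GEM-SNN learning rule itself) can be shown with a toy stochastic readout. The tiny logistic model below is an assumption for illustration: repeated Bernoulli forward passes give both a rate-based decision and a spread that quantifies uncertainty.

```python
# Toy multisample inference with a stochastic (Bernoulli-spiking) readout;
# the model is made up for illustration and is not the paper's GEM-SNN.
import numpy as np

rng = np.random.default_rng(0)

def stochastic_forward(x, W):
    """One pass: per-class spike probabilities, then Bernoulli sampling."""
    p = 1.0 / (1.0 + np.exp(-(W @ x)))
    return (rng.random(p.shape) < p).astype(float)

x = np.array([0.5, -1.0, 2.0])
W = rng.normal(size=(4, 3))                       # 4 classes, 3 inputs
votes = np.stack([stochastic_forward(x, W) for _ in range(100)])
rates = votes.mean(axis=0)                        # per-class firing rates
print("decision:", rates.argmax())
print("per-class uncertainty (std):", votes.std(axis=0).round(3))
```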
10
Milde MB, Afshar S, Xu Y, Marcireau A, Joubert D, Ramesh B, Bethi Y, Ralph NO, El Arja S, Dennler N, van Schaik A, Cohen G. Neuromorphic Engineering Needs Closed-Loop Benchmarks. Front Neurosci 2022; 16:813555. [PMID: 35237122] [PMCID: PMC8884247] [DOI: 10.3389/fnins.2022.813555]
Abstract
Neuromorphic engineering aims to build (autonomous) systems by mimicking biological systems. It is motivated by the observation that biological organisms—from algae to primates—excel in sensing their environment, reacting promptly to their perils and opportunities. Furthermore, they do so more resiliently than our most advanced machines, at a fraction of the power consumption. It follows that the performance of neuromorphic systems should be evaluated in terms of real-time operation, power consumption, and resiliency to real-world perturbations and noise using task-relevant evaluation metrics. Yet, following in the footsteps of conventional machine learning, most neuromorphic benchmarks rely on recorded datasets that foster sensing accuracy as the primary measure for performance. Sensing accuracy is but an arbitrary proxy for the actual system's goal—making a good decision in a timely manner. Moreover, static datasets hinder our ability to study and compare closed-loop sensing and control strategies that are central to survival for biological organisms. This article makes the case for a renewed focus on closed-loop benchmarks involving real-world tasks. Such benchmarks will be crucial in developing and advancing neuromorphic intelligence. The shift towards dynamic real-world benchmarking tasks should usher in richer, more resilient, and robust artificially intelligent systems in the future.
11
Niu LY, Wei Y, Long JY, Liu WB. High-Accuracy Spiking Neural Network for Objective Recognition Based on Proportional Attenuating Neuron. Neural Process Lett 2021. [DOI: 10.1007/s11063-021-10669-6]
12
Beck M, Maier G, Flitter M, Gruna R, Längle T, Heizmann M, Beyerer J. An Extended Modular Processing Pipeline for Event-Based Vision in Automatic Visual Inspection. Sensors (Basel) 2021; 21:6143. [PMID: 34577349] [PMCID: PMC8472878] [DOI: 10.3390/s21186143]
Abstract
Dynamic Vision Sensors differ from conventional cameras in that only intensity changes of individual pixels are perceived and transmitted as an asynchronous stream instead of an entire frame. The technology promises, among other things, high temporal resolution, low latency, and low data rates. While such sensors currently enjoy much scientific attention, there are only a few publications on practical applications. One field of application that has hardly been considered so far, yet potentially fits well with the sensor principle due to its special properties, is automatic visual inspection. In this paper, we evaluate current state-of-the-art processing algorithms in this new application domain. We further propose an algorithmic approach for the identification of ideal time windows within an event stream for object classification. For the evaluation of our method, we acquire two novel datasets that contain typical visual inspection scenarios, i.e., the inspection of objects on a conveyor belt and during free fall. The success of our algorithmic extension for data processing is demonstrated on the basis of these new datasets by showing that the classification accuracy of current algorithms is substantially increased. By making our new datasets publicly available, we intend to stimulate further research on the application of Dynamic Vision Sensors in machine vision.
Affiliation(s)
- Moritz Beck
- Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany
- Georg Maier
- Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany
- Merle Flitter
- Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany
- Robin Gruna
- Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany
- Thomas Längle
- Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany
- Michael Heizmann
- Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany
- Jürgen Beyerer
- Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany
- Vision and Fusion Laboratory (IES), Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany
13
A Cost-Efficient High-Speed VLSI Architecture for Spiking Convolutional Neural Network Inference Using Time-Step Binary Spike Maps. Sensors (Basel) 2021; 21:6006. [PMID: 34577214] [PMCID: PMC8471769] [DOI: 10.3390/s21186006]
Abstract
Neuromorphic hardware systems have been gaining ever-increasing focus in many embedded applications as they use a brain-inspired, energy-efficient spiking neural network (SNN) model that closely mimics the human cortex mechanism by communicating and processing sensory information via spatiotemporally sparse spikes. In this paper, we fully leverage the characteristics of the spiking convolutional neural network (SCNN) and propose a scalable, cost-efficient, and high-speed VLSI architecture to accelerate deep SCNN inference for real-time low-cost embedded scenarios. We use the snapshot of binary spike maps at each time step to decompose the SCNN operations into a series of regular and simple CNN-like time-step computations, reducing hardware resource consumption. Moreover, our hardware architecture achieves high throughput by employing a pixel stream processing mechanism and fine-grained data pipelines. Our Zynq-7045 FPGA prototype reached a high processing speed of 1250 frames/s and high recognition accuracies on the MNIST and Fashion-MNIST image datasets, demonstrating the plausibility of our SCNN hardware architecture for many embedded applications.
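Editor's note: the payoff of binary spike maps is that convolution degenerates to weight accumulation. The sketch below is a software illustration of this general principle, not the paper's pipelined VLSI design.

```python
# Per time step, "multiply-accumulate" over a binary spike map degenerates to
# accumulating kernel weights wherever a spike is present -- adds only.
import numpy as np

def binary_conv2d(spike_map, kernel):
    """Valid cross-correlation of a 0/1 map with a kernel using adds only."""
    H, W = spike_map.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for y, x in zip(*np.nonzero(spike_map)):      # visit spikes only
        for oy in range(max(0, y - kh + 1), min(out.shape[0] - 1, y) + 1):
            for ox in range(max(0, x - kw + 1), min(out.shape[1] - 1, x) + 1):
                out[oy, ox] += kernel[y - oy, x - ox]   # add, never multiply
    return out

s = (np.random.rand(6, 6) > 0.7).astype(np.uint8)     # one binary spike map
k = np.arange(9.0).reshape(3, 3)
dense = np.array([[(s[i:i+3, j:j+3] * k).sum() for j in range(4)] for i in range(4)])
assert np.allclose(binary_conv2d(s, k), dense)
print("spike-driven accumulation matches dense convolution")
```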
14
Wang T, Shi C, Zhou X, Lin Y, He J, Gan P, Li P, Wang Y, Liu L, Wu N, Luo G. CompSNN: A lightweight spiking neural network based on spatiotemporally compressive spike features. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.10.100]
15
16
Liu Q, Pan G, Ruan H, Xing D, Xu Q, Tang H. Unsupervised AER Object Recognition Based on Multiscale Spatio-Temporal Features and Spiking Neurons. IEEE Trans Neural Netw Learn Syst 2020; 31:5300-5311. [PMID: 32054587] [DOI: 10.1109/tnnls.2020.2966058]
Abstract
This article proposes an unsupervised address event representation (AER) object recognition approach. The proposed approach consists of a novel multiscale spatio-temporal feature (MuST) representation of input AER events and a spiking neural network (SNN) using spike-timing-dependent plasticity (STDP) for object recognition with MuST. MuST extracts the features contained in both the spatial and temporal information of the AER event flow and forms an informative and compact feature spike representation. We show not only how MuST exploits spikes to convey information more effectively, but also how it benefits recognition using the SNN. The recognition process is performed in an unsupervised manner, which does not need to specify the desired status of every single neuron of the SNN, and thus can be flexibly applied in real-world recognition tasks. The experiments are performed on five AER datasets, including a new one named GESTURE-DVS. Extensive experimental results show the effectiveness and advantages of the proposed approach.
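Editor's note: multiscale spatio-temporal features of this general flavor can be sketched with exponential time surfaces at several time constants. This is a generic construction in the spirit of MuST, not the paper's exact feature; grid size and time constants are assumptions.

```python
# Generic multiscale time-surface features: each pixel stores its latest
# event time, and exponential decays at several time constants turn that
# map into a stack of features. Not the paper's exact MuST definition.
import numpy as np

H = W = 8
last_ts = np.full((H, W), -np.inf)          # latest event time per pixel

def update(t, x, y):
    last_ts[y, x] = t

def time_surfaces(t_now, taus=(5e-3, 50e-3, 500e-3)):
    """One decayed surface per time constant; recent events give values near 1."""
    age = t_now - last_ts                   # +inf where no event was ever seen
    return np.stack([np.exp(-age / tau) for tau in taus])

for t, x, y in [(0.000, 1, 1), (0.010, 2, 1), (0.040, 5, 6)]:
    update(t, x, y)
feats = time_surfaces(t_now=0.050)
print(feats.shape)                          # (3, 8, 8): three time scales
print(feats[:, 1, 1].round(3))              # the same pixel at three scales
```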
17
Ramesh B, Yang H, Orchard G, Le Thi NA, Zhang S, Xiang C. DART: Distribution Aware Retinal Transform for Event-Based Cameras. IEEE Trans Pattern Anal Mach Intell 2020; 42:2767-2780. [PMID: 31144625] [DOI: 10.1109/tpami.2019.2919301]
Abstract
We introduce a generic visual descriptor, termed the distribution aware retinal transform (DART), that encodes the structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-words classification framework and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101); (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) Statistical bootstrapping is leveraged with online learning for overcoming the low-sample problem during the one-shot learning of the tracker, (ii) Cyclical shifts are induced in the log-polar domain of the DART descriptor to achieve robustness to object scale and rotation variations; (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker to result in a high intersection-over-union score with augmented ground truth annotations on the publicly available event camera dataset; (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain.
18
Bi Y, Chadha A, Abbas A, Bourtsoulatze E, Andreopoulos Y. Graph-based Spatio-Temporal Feature Learning for Neuromorphic Vision Sensing. IEEE Trans Image Process 2020; 29:9084-9098. [PMID: 32941136] [DOI: 10.1109/tip.2020.3023597]
Abstract
Neuromorphic vision sensing (NVS) devices represent visual information as sequences of asynchronous discrete events (a.k.a., "spikes") in response to changes in scene reflectance. Unlike conventional active pixel sensing (APS), NVS allows for significantly higher event sampling rates at substantially increased energy efficiency and robustness to illumination changes. However, feature representation for NVS is far behind its APS-based counterparts, resulting in lower performance in high-level computer vision tasks. To fully utilize its sparse and asynchronous nature, we propose a compact graph representation for NVS, which allows for end-to-end learning with graph convolution neural networks. We couple this with a novel end-to-end feature learning framework that accommodates both appearance-based and motion-based tasks. The core of our framework comprises a spatial feature learning module, which utilizes residual-graph convolutional neural networks (RG-CNN), for end-to-end learning of appearance-based features directly from graphs. We extend this with our proposed Graph2Grid block and temporal feature learning module for efficiently modelling temporal dependencies over multiple graphs and a long temporal extent. We show how our framework can be configured for object classification, action recognition and action similarity labeling. Importantly, our approach preserves the spatial and temporal coherence of spike events, while requiring less computation and memory. The experimental validation shows that our proposed framework outperforms all recent methods on standard datasets. Finally, to address the absence of large real-world NVS datasets for complex recognition tasks, we introduce, evaluate and make available the American Sign Language letters dataset (ASL-DVS), as well as human action datasets (UCF101-DVS, HMDB51-DVS and ASLAN-DVS).
19
Xiao R, Tang H, Ma Y, Yan R, Orchard G. An Event-Driven Categorization Model for AER Image Sensors Using Multispike Encoding and Learning. IEEE Trans Neural Netw Learn Syst 2020; 31:3649-3657. [PMID: 31714243] [DOI: 10.1109/tnnls.2019.2945630]
Abstract
In this article, we present a systematic computational model to explore brain-based computation for object recognition. The model extracts temporal features embedded in address-event representation (AER) data and discriminates different objects by using spiking neural networks (SNNs). We use multispike encoding to extract temporal features contained in the AER data. These temporal patterns are then learned through the tempotron learning rule. The presented model is consistently implemented in a temporal learning framework, where the precise timing of spikes is considered in the feature-encoding and learning process. A noise-reduction method is also proposed that calculates the correlation of an event with its surrounding spatial neighborhood based on the recently proposed time-surface technique. The model, evaluated on a wide spectrum of datasets (MNIST, N-MNIST, MNIST-DVS, AER Posture, and Poker Card), demonstrates superior recognition performance, especially for events with noise.
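Editor's note: neighborhood-correlation denoising of this general kind is easy to sketch. Below is a generic spatio-temporal support filter, not the paper's exact formulation; the 3x3 window and dt threshold are assumptions. An event survives only if a nearby pixel fired recently.

```python
# Generic spatio-temporal support filter: keep an event only if some pixel
# in its 3x3 neighborhood fired within the last dt seconds.
import numpy as np

def denoise(events, shape, dt=5e-3):
    """events: list of (t, x, y) with t ascending; returns the kept events."""
    last = np.full(shape, -np.inf)          # latest event time per pixel
    kept = []
    for t, x, y in events:
        window = last[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
        if (t - window).min() <= dt:        # recent support nearby?
            kept.append((t, x, y))
        last[y, x] = t
    return kept

events = [(0.000, 3, 3), (0.001, 4, 3), (0.002, 3, 4),   # correlated cluster
          (0.010, 20, 20)]                                # isolated -> noise
print(denoise(events, shape=(32, 32)))      # keeps the supported cluster events
```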
20
Zuo L, Chen Y, Zhang L, Chen C. A spiking neural network with probability information transmission. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.01.109]
21
A High-Speed Low-Cost VLSI System Capable of On-Chip Online Learning for Dynamic Vision Sensor Data Classification. Sensors (Basel) 2020; 20:4715. [PMID: 32825560] [PMCID: PMC7506740] [DOI: 10.3390/s20174715]
Abstract
This paper proposes a high-speed low-cost VLSI system capable of on-chip online learning for classifying address-event representation (AER) streams from dynamic vision sensor (DVS) retina chips. The proposed system executes a lightweight statistical algorithm based on simple binary features extracted from AER streams and a Random Ferns classifier to classify these features. The proposed system's multi-level pipelines and parallel processing circuits achieve a high throughput of up to 1 spike event per clock cycle for AER data processing. Thanks to the nature of the lightweight algorithm, our hardware system is realized in a low-cost memory-centric paradigm. In addition, the system is capable of on-chip online learning to flexibly adapt to different in-situ application scenarios. The extra overheads for on-chip learning in terms of time and resource consumption are quite low, as the training procedure of the Random Ferns is quite simple, requiring few auxiliary learning circuits. An FPGA prototype of the proposed VLSI system was implemented with 9.5-96.7% memory consumption and <11% computational and logic resources on a Xilinx Zynq-7045 chip platform. Running at a clock frequency of 100 MHz, it achieved a peak processing throughput of up to 100 Meps (mega events per second), with an estimated power consumption of 690 mW, leading to a high energy efficiency of 145 Meps/W or 145 events/μJ. We tested the prototype system on the MNIST-DVS, Poker-DVS, and Posture-DVS datasets, and obtained classification accuracies of 77.9%, 99.4% and 99.3%, respectively. Compared to prior works, our VLSI system achieves higher processing speeds, higher computing efficiency, comparable accuracy, and lower resource costs.
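Editor's note: Random Ferns, the classifier family named above, reduce training to counting, which is what makes them memory-centric and hardware-friendly. A toy software version follows; the fern count, depth, and data are assumptions, not the paper's configuration.

```python
# Toy Random Ferns over binary feature vectors: each fern reads a few random
# bits, treats them as a leaf index, and keeps per-class counts; prediction
# sums the ferns' log-posteriors. Training is pure counting.
import numpy as np

rng = np.random.default_rng(1)

class RandomFerns:
    def __init__(self, n_ferns=10, depth=4, n_feats=8, n_classes=3):
        self.bits = rng.integers(0, n_feats, size=(n_ferns, depth))
        self.counts = np.ones((n_ferns, 2 ** depth, n_classes))  # Laplace prior
        self.pows = 2 ** np.arange(depth)

    def _leaf(self, x):
        return (x[self.bits] * self.pows).sum(axis=1)   # one leaf index per fern

    def train(self, x, label):
        self.counts[np.arange(len(self.bits)), self._leaf(x), label] += 1

    def predict(self, x):
        p = self.counts / self.counts.sum(axis=2, keepdims=True)
        return np.log(p[np.arange(len(self.bits)), self._leaf(x)]).sum(axis=0).argmax()

X = rng.integers(0, 2, size=(300, 8))
y = X[:, 0] + X[:, 1]                 # toy labels 0..2 from two feature bits
ferns = RandomFerns()
for xi, yi in zip(X, y):
    ferns.train(xi, yi)
print("train accuracy:", np.mean([ferns.predict(xi) == yi for xi, yi in zip(X, y)]))
```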
22
Comparing SNNs and RNNs on neuromorphic vision datasets: Similarities and differences. Neural Netw 2020; 132:108-120. [PMID: 32866745] [DOI: 10.1016/j.neunet.2020.08.001]
Abstract
Neuromorphic data, recording frameless spike events, have attracted considerable attention for their spatiotemporal information components and event-driven processing fashion. Spiking neural networks (SNNs) represent a family of event-driven models with spatiotemporal dynamics for neuromorphic computing, which are widely benchmarked on neuromorphic data. Interestingly, researchers in the machine learning community can argue that recurrent (artificial) neural networks (RNNs) also have the capability to extract spatiotemporal features although they are not event-driven. Thus, the question of "what will happen if we benchmark these two kinds of models together on neuromorphic data" arises but remains unanswered. In this work, we make a systematic study to compare SNNs and RNNs on neuromorphic data, taking the vision datasets as a case study. First, we identify the similarities and differences between SNNs and RNNs (including the vanilla RNNs and LSTM) from the modeling and learning perspectives. To improve comparability and fairness, we unify the supervised learning algorithm based on backpropagation through time (BPTT), the loss function exploiting the outputs at all timesteps, the network structure with stacked fully-connected or convolutional layers, and the hyper-parameters during training. Especially, given the mainstream loss function used in RNNs, we modify it inspired by the rate coding scheme to approach that of SNNs. Furthermore, we tune the temporal resolution of datasets to test model robustness and generalization. Finally, a series of comparative experiments is conducted on two types of neuromorphic datasets: DVS-converted (N-MNIST) and DVS-captured (DVS Gesture). Extensive insights regarding recognition accuracy, feature extraction, temporal resolution and contrast, learning generalization, computational complexity and parameter volume are provided, which are beneficial for model selection on different workloads and even for the invention of novel neural models in the future.
23
Deng Y, Li Y, Chen H. AMAE: Adaptive Motion-Agnostic Encoder for Event-Based Object Classification. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.3002480]
24
Tapiador-Morales R, Maro JM, Jimenez-Fernandez A, Jimenez-Moreno G, Benosman R, Linares-Barranco A. Event-Based Gesture Recognition through a Hierarchy of Time-Surfaces for FPGA. Sensors (Basel) 2020; 20:3404. [PMID: 32560238] [PMCID: PMC7349403] [DOI: 10.3390/s20123404]
Abstract
Neuromorphic vision sensors, also known as Dynamic Vision Sensors (DVS), detect changes in luminosity, taking inspiration from the mammalian retina and providing a stream of events with high temporal resolution. This continuous stream of events can be used to extract spatio-temporal patterns from a scene. A time-surface represents a spatio-temporal context for a given spatial radius around an incoming event from a sensor at a specific time history. Time-surfaces can be organized in a hierarchical way to extract features from input events using the Hierarchy Of Time-Surfaces algorithm, hereinafter HOTS. HOTS can be organized in consecutive layers to extract combinations of features in a similar way as some deep-learning algorithms do. This work introduces a novel FPGA architecture for accelerating the HOTS network. This architecture is mainly based on block-RAM memory and the non-restoring square root algorithm, requiring only basic components and enabling low-power, low-latency embedded applications. The presented architecture has been tested on a Zynq 7100 platform at 100 MHz. The results show that the latencies are in the range of 1 μs to 6.7 μs, requiring a maximum dynamic power consumption of 77 mW. The system was tested with a gesture recognition dataset, obtaining an accuracy loss for 16-bit precision of only 1.2% with respect to the original software HOTS.
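Editor's note: the non-restoring square root algorithm mentioned above is worth seeing in code: it produces one result bit per iteration using only shifts, adds, and subtracts, which is exactly what makes it cheap in hardware. Below is a bit-level Python rendition of the standard algorithm; the 16-bit width is an assumption for the demo.

```python
# Bit-level non-restoring integer square root: one root bit per step, using
# only shifts, adds and subtracts (hence its popularity in FPGA designs).
import math

def nonrestoring_isqrt(n, bits=16):
    q, r = 0, 0                                  # partial root and remainder
    for i in range(bits // 2 - 1, -1, -1):
        d = (n >> (2 * i)) & 0b11                # bring down the next two bits
        if r >= 0:
            r = (r << 2) + d - ((q << 2) | 1)    # subtract 4q+1 ...
        else:
            r = (r << 2) + d + ((q << 2) | 3)    # ... or add 4q+3: no restore step
        q = (q << 1) | (1 if r >= 0 else 0)      # append the next root bit
    return q

assert all(nonrestoring_isqrt(n) == math.isqrt(n) for n in range(1 << 16))
print("non-restoring isqrt matches math.isqrt for all 16-bit inputs")
```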
Affiliation(s)
- Ricardo Tapiador-Morales
- Robotics and Technology of Computers Lab (ETSII-EPS), University of Seville, 41089 Sevilla, Spain
- aiCTX AG, 8092 Zurich, Switzerland
- Jean-Matthieu Maro
- Neuromorphic Vision and Natural Computation, Sorbonne Université, 75006 Paris, France
- Angel Jimenez-Fernandez
- Robotics and Technology of Computers Lab (ETSII-EPS), University of Seville, 41089 Sevilla, Spain
- SCORE Lab, Research Institute of Computer Engineering (I3US), University of Seville, 41089 Seville, Spain
- Gabriel Jimenez-Moreno
- Robotics and Technology of Computers Lab (ETSII-EPS), University of Seville, 41089 Sevilla, Spain
- SCORE Lab, Research Institute of Computer Engineering (I3US), University of Seville, 41089 Seville, Spain
- Ryad Benosman
- Neuromorphic Vision and Natural Computation, Sorbonne Université, 75006 Paris, France
- Alejandro Linares-Barranco
- Robotics and Technology of Computers Lab (ETSII-EPS), University of Seville, 41089 Sevilla, Spain
- SCORE Lab, Research Institute of Computer Engineering (I3US), University of Seville, 41089 Seville, Spain
25
Liu D, Bellotto N, Yue S. Deep Spiking Neural Network for Video-Based Disguise Face Recognition Based on Dynamic Facial Movements. IEEE Trans Neural Netw Learn Syst 2020; 31:1843-1855. [PMID: 31329135] [DOI: 10.1109/tnnls.2019.2927274]
Abstract
With the increasing popularity of social media and smart devices, the face, as one of the key biometrics, has become vital for person identification. Among face recognition algorithms, video-based methods can make use of both temporal and spatial information, just as humans do, to achieve better classification performance. However, they cannot identify individuals when certain key facial areas, such as the eyes or nose, are disguised by heavy makeup or rubber/digital masks. To this end, we propose a novel deep spiking neural network architecture in this paper. It takes dynamic facial movements, the facial muscle changes induced by speaking or other activities, as the sole input. An event-driven continuous spike-timing-dependent plasticity learning rule with adaptive thresholding is applied to train the synaptic weights. The experiments on our proposed video-based disguise face database (MakeFace DB) demonstrate that the proposed learning method performs very well, achieving 95% to 100% correct classification rates under various realistic experimental scenarios.
26
Lee C, Sarwar SS, Panda P, Srinivasan G, Roy K. Enabling Spike-Based Backpropagation for Training Deep Neural Network Architectures. Front Neurosci 2020; 14:119. [PMID: 32180697] [PMCID: PMC7059737] [DOI: 10.3389/fnins.2020.00119]
Abstract
Spiking Neural Networks (SNNs) have recently emerged as a prominent neural computing paradigm. However, the typical shallow SNN architectures have limited capacity for expressing complex representations, while training deep SNNs using input spikes has not been successful so far. Diverse methods have been proposed to get around this issue, such as converting off-the-shelf trained deep Artificial Neural Networks (ANNs) to SNNs. However, the ANN-SNN conversion scheme fails to capture the temporal dynamics of a spiking system. On the other hand, it is still a difficult problem to directly train deep SNNs using input spike events due to the discontinuous, non-differentiable nature of the spike generation function. To overcome this problem, we propose an approximate derivative method that accounts for the leaky behavior of LIF neurons. This method enables training deep convolutional SNNs directly (with input spike events) using spike-based backpropagation. Our experiments show the effectiveness of the proposed spike-based learning on deep networks (VGG and Residual architectures) by achieving the best classification accuracies on the MNIST, SVHN, and CIFAR-10 datasets compared to other SNNs trained with spike-based learning. Moreover, we analyze sparse event-based computations to demonstrate the efficacy of the proposed SNN training method for inference operation in the spiking domain.
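Editor's note: the core trick behind spike-based backpropagation is replacing the spike function's derivative, which is zero almost everywhere, with a smooth pseudo-derivative near the threshold. The sketch below uses a generic triangular surrogate on a single made-up neuron; it is not the paper's leak-dependent approximate derivative.

```python
# Generic surrogate-gradient sketch: the hard threshold has zero gradient
# almost everywhere, so backprop substitutes a smooth stand-in near V_TH.
import numpy as np

V_TH = 1.0

def spike(v):
    return (v >= V_TH).astype(float)           # forward pass: hard threshold

def surrogate_grad(v, width=0.5):
    """Triangular stand-in for the Heaviside derivative around V_TH."""
    return np.maximum(0.0, 1.0 - np.abs(v - V_TH) / width) / width

x = np.array([0.2, 0.8, 0.4])                  # inputs to one neuron
w = np.array([0.5, 0.9, 0.1])
v = w @ x                                      # membrane potential: 0.86
dL_ds = spike(v) - 1.0                         # error signal: we wanted a spike
grad_w = dL_ds * surrogate_grad(v) * x         # nonzero thanks to the surrogate
w = w - 0.5 * grad_w                           # one gradient step
print("potential before/after:", round(v, 2), round(float(w @ x), 2),
      "-> fires:", spike(w @ x))
```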
27
Ju X, Fang B, Yan R, Xu X, Tang H. An FPGA Implementation of Deep Spiking Neural Networks for Low-Power and Fast Classification. Neural Comput 2019; 32:182-204. [PMID: 31703174] [DOI: 10.1162/neco_a_01245]
Abstract
A spiking neural network (SNN) is a biologically plausible model that performs information processing based on spikes. Training a deep SNN effectively is challenging due to the non-differentiability of spike signals. Recent advances have shown that high-performance SNNs can be obtained by converting convolutional neural networks (CNNs). However, large-scale SNNs are poorly served by conventional architectures due to the dynamic nature of spiking neurons. In this letter, we propose a hardware architecture to enable efficient implementation of SNNs. All layers in the network are mapped onto one chip so that the computation of different time steps can be done in parallel to reduce latency. We propose a new spiking max-pooling method to reduce computational complexity. In addition, we apply approaches based on shift registers and coarse-grained parallelism to accelerate the convolution operation. We also investigate the effect of different encoding methods on SNN accuracy. Finally, we validate the hardware architecture on the Xilinx Zynq ZCU102. The experimental results on the MNIST dataset show that it can achieve an accuracy of 98.94% with eight-bit quantized weights. Furthermore, it achieves 164 frames per second (FPS) under a 150 MHz clock frequency and obtains a 41× speed-up compared to a CPU implementation and 22 times lower power than a GPU implementation.
Affiliation(s)
- Xiping Ju
- College of Computer Science, Sichuan University, Chengdu 610065, China
- Biao Fang
- College of Computer Science, Sichuan University, Chengdu 610065, China
- Rui Yan
- College of Computer Science, Sichuan University, Chengdu 610065, China
- Xiaoliang Xu
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Huajin Tang
- College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China, and College of Computer Science, Sichuan University, Chengdu 610065, China
28
Fast and robust learning in Spiking Feed-forward Neural Networks based on Intrinsic Plasticity mechanism. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.07.009]
29
Taherkhani A, Belatreche A, Li Y, Cosma G, Maguire LP, McGinnity TM. A review of learning in biologically plausible spiking neural networks. Neural Netw 2019; 122:253-272. [PMID: 31726331] [DOI: 10.1016/j.neunet.2019.09.036]
Abstract
Artificial neural networks have been used as a powerful processing tool in various areas such as pattern recognition, control, robotics, and bioinformatics. Their wide applicability has encouraged researchers to improve artificial neural networks by investigating the biological brain. Neurological research has significantly progressed in recent years and continues to reveal new characteristics of biological neurons. New technologies can now capture temporal changes in the internal activity of the brain in more detail and help clarify the relationship between brain activity and the perception of a given stimulus. This new knowledge has led to a new type of artificial neural network, the Spiking Neural Network (SNN), that draws more faithfully on biological properties to provide higher processing abilities. A review of recent developments in the learning of spiking neurons is presented in this paper. First, the biological background of SNN learning algorithms is reviewed. The important elements of a learning algorithm, such as the neuron model, synaptic plasticity, information encoding and SNN topologies, are then presented. Then, a critical review of the state-of-the-art learning algorithms for SNNs using single and multiple spikes is presented. Additionally, deep spiking neural networks are reviewed, and challenges and opportunities in the SNN field are discussed.
Affiliation(s)
- Aboozar Taherkhani
- School of Computer Science and Informatics, Faculty of Computing, Engineering and Media, De Montfort University, Leicester, UK.
- Ammar Belatreche
- Department of Computer and Information Sciences, Northumbria University, Newcastle upon Tyne, UK
- Yuhua Li
- School of Computer Science and Informatics, Cardiff University, Cardiff, UK
- Georgina Cosma
- Department of Computer Science, Loughborough University, Loughborough, UK
- Liam P Maguire
- Intelligent Systems Research Centre, Ulster University, Northern Ireland, Derry, UK
- T M McGinnity
- Intelligent Systems Research Centre, Ulster University, Northern Ireland, Derry, UK; School of Science and Technology, Nottingham Trent University, Nottingham, UK
30
Deep Spiking Convolutional Neural Network Trained With Unsupervised Spike-Timing-Dependent Plasticity. IEEE Trans Cogn Dev Syst 2019. [DOI: 10.1109/tcds.2018.2833071]
31
Hu R, Chang S, Wang H, He J, Huang Q. Efficient Multispike Learning for Spiking Neural Networks Using Probability-Modulated Timing Method. IEEE Trans Neural Netw Learn Syst 2019; 30:1984-1997. [PMID: 30418889] [DOI: 10.1109/tnnls.2018.2875471]
Abstract
Error functions are normally based on the distance between output spikes and target spikes in supervised learning algorithms for spiking neural networks (SNNs). Due to the discontinuous nature of the internal state of a spiking neuron, it is challenging to ensure that the numbers of output spikes and target spikes are kept identical in multispike learning. This problem is conventionally dealt with by using the smaller of the number of desired spikes and that of actual output spikes in learning. However, if this approach is used, information is lost as some spikes are neglected. In this paper, a probability-modulated timing mechanism is built on stochastic neurons, where the discontinuous spike patterns are converted to the likelihood of generating the desired output spike trains. By applying this mechanism to a probability-modulated spiking classifier, a probability-modulated SNN (PMSNN) is constructed. In its multilayer and multispike learning structure, more inputs are incorporated and mapped to the target spike trains. A clustering rule connection mechanism is also applied to a reservoir to improve the efficiency of information transmission among synapses, which can map highly correlated inputs to adjacent neurons. Results of comparisons between the proposed method and popular SNN algorithms showed that the PMSNN yields higher efficiency and requires fewer parameters.
32
Illing B, Gerstner W, Brea J. Biologically plausible deep learning - But how far can we go with shallow networks? Neural Netw 2019; 118:90-101. [PMID: 31254771] [DOI: 10.1016/j.neunet.2019.06.001]
Abstract
Training deep neural networks with the error backpropagation algorithm is considered implausible from a biological perspective. Numerous recent publications suggest elaborate models for biologically plausible variants of deep learning, typically defining success as reaching around 98% test accuracy on the MNIST data set. Here, we investigate how far we can go on digit (MNIST) and object (CIFAR10) classification with biologically plausible, local learning rules in a network with one hidden layer and a single readout layer. The hidden layer weights are either fixed (random or random Gabor filters) or trained with unsupervised methods (Principal/Independent Component Analysis or Sparse Coding) that can be implemented by local learning rules. The readout layer is trained with a supervised, local learning rule. We first implement these models with rate neurons. This comparison reveals, first, that unsupervised learning does not lead to better performance than fixed random projections or Gabor filters for large hidden layers. Second, networks with localized receptive fields perform significantly better than networks with all-to-all connectivity and can reach backpropagation performance on MNIST. We then implement two of the networks - fixed, localized, random & random Gabor filters in the hidden layer - with spiking leaky integrate-and-fire neurons and spike timing dependent plasticity to train the readout layer. These spiking models achieve >98.2% test accuracy on MNIST, which is close to the performance of rate networks with one hidden layer trained with backpropagation. The performance of our shallow network models is comparable to most current biologically plausible models of deep learning. Furthermore, our results with a shallow spiking network provide an important reference and suggest the use of data sets other than MNIST for testing the performance of future models of biologically plausible deep learning.
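Editor's note: the model family compared above, a fixed random hidden layer with a readout trained by a purely local rule, fits in a short sketch. Synthetic data stands in for MNIST/CIFAR10; the sizes, learning rate, and teacher labels are assumptions.

```python
# Fixed random hidden layer + readout trained with a local delta rule (no
# backprop through the hidden layer). Synthetic data stands in for MNIST.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_classes = 20, 100, 3

W_fix = rng.normal(size=(n_hidden, n_in)) / np.sqrt(n_in)  # fixed projection
W_out = np.zeros((n_classes, n_hidden))                    # trained readout

def hidden(x):
    return np.maximum(0.0, W_fix @ x)          # fixed random ReLU features

X = rng.normal(size=(500, n_in))
teacher = rng.normal(size=(n_classes, n_in))   # defines toy class labels
y = (teacher @ X.T).argmax(axis=0)

for _ in range(20):                            # epochs of local learning
    for x, label in zip(X, y):
        h = hidden(x)
        err = np.eye(n_classes)[label] - W_out @ h
        W_out += 0.01 * np.outer(err, h)       # local delta rule on the readout

pred = (W_out @ np.maximum(0.0, W_fix @ X.T)).argmax(axis=0)
print("train accuracy:", (pred == y).mean())
```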
Affiliation(s)
- Bernd Illing
- School of Computer and Communication Science & School of Life Science, EPFL, 1015 Lausanne, Switzerland.
- Wulfram Gerstner
- School of Computer and Communication Science & School of Life Science, EPFL, 1015 Lausanne, Switzerland
- Johanni Brea
- School of Computer and Communication Science & School of Life Science, EPFL, 1015 Lausanne, Switzerland
33
Liu D, Yue S. Event-Driven Continuous STDP Learning With Deep Structure for Visual Pattern Recognition. IEEE Trans Cybern 2019; 49:1377-1390. [PMID: 29994790] [DOI: 10.1109/tcyb.2018.2801476]
Abstract
Human beings achieve reliable and fast visual pattern recognition with limited time and few learning samples. Underlying this capability, the ventral stream plays an important role in object representation and form recognition. Modeling the ventral stream may shed light on the visual brain in humans and help build artificial vision systems for pattern recognition. Current methods for modeling the ventral stream fall short of the fast, continuous, event-driven learning of the human brain. To create a visual system with similarly fast learning, this paper proposes a new spiking neural system with an event-driven continuous spike-timing-dependent plasticity (STDP) learning method using specific spike timing sequences. Two novel continuous input mechanisms are used to obtain the continuous input spiking pattern sequence. With the event-driven STDP learning rule, the learning procedure is activated whenever a neuron receives a pre- or post-synaptic spike event. Experimental results on the MNIST database show that the proposed method outperforms all other methods in fast-learning scenarios and most current models in exhaustive-learning experiments.
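Trace-based, event-driven STDP of the kind the abstract describes can be written so that each spike event triggers exactly one weight update. A single-synapse sketch, with time constants and amplitudes as illustrative values rather than the paper's:

```python
import numpy as np

# Exponentially decaying eligibility traces let STDP run event-by-event:
# each pre or post spike triggers one update using the other side's trace.
tau_pre, tau_post = 20.0, 20.0     # trace time constants (ms)
a_plus, a_minus = 0.01, 0.012      # potentiation / depression amplitudes
w, w_min, w_max = 0.5, 0.0, 1.0

x_pre = x_post = 0.0               # synaptic traces
t_last = 0.0

def on_event(t, is_pre):
    """Apply the event-driven STDP rule for a single spike at time t."""
    global x_pre, x_post, w, t_last
    dt = t - t_last
    x_pre *= np.exp(-dt / tau_pre)       # decay traces to the current time
    x_post *= np.exp(-dt / tau_post)
    if is_pre:
        w = max(w_min, w - a_minus * x_post)  # pre after post -> depress
        x_pre += 1.0
    else:
        w = min(w_max, w + a_plus * x_pre)    # post after pre -> potentiate
        x_post += 1.0
    t_last = t

for t, kind in [(1.0, True), (4.0, False), (30.0, True), (31.0, False)]:
    on_event(t, is_pre=kind)
print(f"weight after four spikes: {w:.4f}")
```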
34. Sengupta A, Ye Y, Wang R, Liu C, Roy K. Going Deeper in Spiking Neural Networks: VGG and Residual Architectures. Front Neurosci 2019; 13:95. [PMID: 30899212] [PMCID: PMC6416793] [DOI: 10.3389/fnins.2019.00095]
Abstract
Over the past few years, Spiking Neural Networks (SNNs) have become popular as a possible pathway to enable low-power event-driven neuromorphic hardware. However, their application in machine learning has largely been limited to very shallow neural network architectures for simple problems. In this paper, we propose a novel algorithmic technique for generating an SNN with a deep architecture, and demonstrate its effectiveness on complex visual recognition problems such as CIFAR-10 and ImageNet. Our technique applies to both VGG and Residual network architectures, with significantly better accuracy than the state of the art. Finally, we present an analysis of the sparse event-driven computations to demonstrate reduced hardware overhead when operating in the spiking domain.
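The paper's own conversion relies on a spike-based threshold-balancing procedure; the sketch below shows a simpler, commonly used ingredient of ANN-to-SNN conversion, data-based weight normalization, which rescales each ReLU layer so activations fit within a spiking threshold of 1. Layer shapes and calibration data are placeholders:

```python
import numpy as np

rng = np.random.default_rng(2)

def normalize_layers(weights, X):
    """Rescale ReLU layers so activations stay within a firing threshold of 1.

    Equivalent to the classic rule W_l <- W_l * lambda_{l-1} / lambda_l,
    where lambda_l is the largest activation of layer l on calibration data.
    """
    scaled = []
    a = X
    for W in weights:
        a = np.maximum(a @ W, 0.0)
        lam = a.max()              # largest activation seen on calibration data
        scaled.append(W / lam)
        a = a / lam                # propagate the normalized activity
    return scaled

weights = [rng.normal(0, 0.5, (16, 32)), rng.normal(0, 0.5, (32, 10))]
X_calib = rng.random((100, 16))
weights_snn = normalize_layers(weights, X_calib)
print([W.max().round(3) for W in weights_snn])
```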
Affiliation(s)
- Abhronil Sengupta, Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, United States
- Yuting Ye, Facebook Reality Labs, Facebook Research, Redmond, WA, United States
- Robert Wang, Facebook Reality Labs, Facebook Research, Redmond, WA, United States
- Chiao Liu, Facebook Reality Labs, Facebook Research, Redmond, WA, United States
- Kaushik Roy, Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, United States
35. Li H, Li G, Shi L. Super-resolution of spatiotemporal event-stream image. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.12.048]
36. Afshar S, Hamilton TJ, Tapson J, van Schaik A, Cohen G. Investigation of Event-Based Surfaces for High-Speed Detection, Unsupervised Feature Extraction, and Object Recognition. Front Neurosci 2019; 12:1047. [PMID: 30705618] [PMCID: PMC6344467] [DOI: 10.3389/fnins.2018.01047]
Abstract
In this work, we investigate event-based feature extraction through a rigorous framework of testing. We test a hardware-efficient variant of Spike Timing Dependent Plasticity (STDP) on a range of spatio-temporal kernels with different surface decaying methods, decay functions, receptive field sizes, feature numbers, and back-end classifiers. This detailed investigation can provide helpful insights and rules of thumb for performance vs. complexity trade-offs in more generalized networks, especially in the context of hardware implementation, where design choices can incur significant resource costs. The investigation is performed using a new dataset consisting of model airplanes being dropped free-hand close to the sensor. The target objects exhibit a wide range of relative orientations and velocities. This range of target velocities, analyzed in multiple configurations, allows a rigorous comparison of time-based decaying surfaces (time surfaces) vs. event-index-based decaying surfaces (index surfaces), which are used to perform unsupervised feature extraction, followed by target detection and recognition. We examine each processing stage by comparison to the use of raw events, as well as a range of alternative layer structures, and the use of random features. By comparing results from a linear classifier and an ELM classifier, we evaluate how each element of the system affects accuracy. To generate time and index surfaces, the most commonly used kernels, namely event-binning kernels and linearly and exponentially decaying kernels, are investigated. Index surfaces were found to outperform time surfaces in recognition when invariance to target velocity was made a requirement. In the investigation of network structure, larger networks of neurons with large receptive field sizes were found to perform best. We find that a small number of event-based feature extractors can project the complex spatio-temporal event patterns of the dataset to an almost linearly separable representation in feature space, with the best-performing linear classifier achieving 98.75% recognition accuracy using only 25 feature-extracting neurons.
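The two surface families compared above differ only in their decay variable: wall-clock time versus event count. A compact sketch, with resolution, tau, and k as illustrative values:

```python
import numpy as np

# A time surface decays with the time since each pixel's last event, while an
# index surface decays with the number of events seen since then, which makes
# it less sensitive to target speed.
H, W, tau, k = 64, 64, 50e3, 200.0   # tau in microseconds, k in events

last_t = np.full((H, W), -np.inf)    # timestamp of last event per pixel
last_i = np.full((H, W), -np.inf)    # event index of last event per pixel

def update(event_idx, t, x, y):
    last_t[y, x] = t
    last_i[y, x] = event_idx

def time_surface(t_now):
    return np.exp(-(t_now - last_t) / tau)

def index_surface(i_now):
    return np.exp(-(i_now - last_i) / k)

rng = np.random.default_rng(3)
for i in range(1000):                # synthetic event stream
    update(i, t=i * 100.0, x=rng.integers(W), y=rng.integers(H))
print(time_surface(1000 * 100.0).max(), index_surface(1000).max())
```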
Affiliation(s)
- Saeed Afshar, Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- Tara Julia Hamilton, Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- Jonathan Tapson, Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- André van Schaik, Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- Gregory Cohen, Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
37. Seifozzakerini S, Yau WY, Mao K, Nejati H. Hough Transform Implementation For Event-Based Systems: Concepts and Challenges. Front Comput Neurosci 2018; 12:103. [PMID: 30622466] [PMCID: PMC6308381] [DOI: 10.3389/fncom.2018.00103]
Abstract
The Hough transform (HT) is one of the most well-known techniques in computer vision and has been the basis of many practical image processing algorithms. HT, however, is designed for frame-based systems such as conventional digital cameras. Recently, event-based systems such as Dynamic Vision Sensor (DVS) cameras have become popular among researchers. Event-based cameras have a significantly higher temporal resolution (1 μs), but each pixel can only detect change, not color. As such, conventional image processing algorithms cannot be readily applied to event-based output streams and must be adapted. This paper provides a systematic explanation, starting from extending the conventional HT to a 3D HT, its adaptation to event-based systems, and the implementation of the 3D HT using Spiking Neural Networks (SNNs). Using SNNs enables the proposed solution to be realized directly on FPGA hardware, without requiring a CPU or additional memory. We also discuss techniques for an optimal SNN-based implementation that uses an efficient number of neurons for the required accuracy and resolution along each dimension, without increasing the overall computational complexity. We hope that this will help to reduce the gap between event-based and frame-based systems.
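Event-based HT boils down to per-event voting into a persistent, slowly decaying accumulator instead of transforming whole frames. A plain-array sketch; bin counts and the decay factor are illustrative, and the paper realizes the accumulator with spiking neurons rather than an array:

```python
import numpy as np

# Per-event Hough voting: each incoming event votes for all (theta, rho) lines
# through its pixel, and the accumulator slowly decays so old evidence fades.
n_theta, n_rho, decay = 180, 128, 0.999
thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
cos_t, sin_t = np.cos(thetas), np.sin(thetas)
rho_max = np.hypot(128, 128)
acc = np.zeros((n_theta, n_rho))

def on_event(x, y):
    global acc
    acc *= decay                                  # forget stale votes
    rho = x * cos_t + y * sin_t                   # one rho per theta bin
    rho_bin = ((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
    acc[np.arange(n_theta), rho_bin] += 1.0       # vote

for x in range(0, 128, 2):                        # events along the line y = x
    on_event(x, x)
best = np.unravel_index(acc.argmax(), acc.shape)
print(f"strongest line: theta={np.degrees(thetas[best[0]]):.0f} deg")
```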
Affiliation(s)
- Sajjad Seifozzakerini, Institute for Infocomm Research, Agency for Science, Technology and Research (ASTAR), Singapore; School of Electrical and Electronic Engineering, Nanyang Technological University (NTU), Singapore
- Wei-Yun Yau, Institute for Infocomm Research, Agency for Science, Technology and Research (ASTAR), Singapore
- Kezhi Mao, School of Electrical and Electronic Engineering, Nanyang Technological University (NTU), Singapore
- Hossein Nejati, Information Systems Technology and Design (ISTD), Singapore University of Technology and Design (SUTD), Singapore
38. Tavanaei A, Ghodrati M, Kheradpisheh SR, Masquelier T, Maida A. Deep learning in spiking neural networks. Neural Netw 2018; 111:47-63. [PMID: 30682710] [DOI: 10.1016/j.neunet.2018.12.002]
Abstract
In recent years, deep learning has revolutionized the field of machine learning, for computer vision in particular. In this approach, a deep (multilayer) artificial neural network (ANN) is trained, most often in a supervised manner using backpropagation. Vast numbers of labeled training examples are required, but the resulting classification accuracy is truly impressive, sometimes outperforming humans. Neurons in an ANN are characterized by a single, static, continuous-valued activation. Yet biological neurons use discrete spikes to compute and transmit information, and the spike times, in addition to the spike rates, matter. Spiking neural networks (SNNs) are thus more biologically realistic than ANNs, and are arguably the only viable option if one wants to understand how the brain computes at the neuronal description level. The spikes of biological neurons are sparse in time and space, and event-driven. Combined with bio-plausible local learning rules, this makes it easier to build low-power neuromorphic hardware for SNNs. However, training deep SNNs remains a challenge: the spiking neurons' transfer function is usually non-differentiable, which prevents backpropagation. Here we review recent supervised and unsupervised methods to train deep SNNs and compare them in terms of accuracy and computational cost. The emerging picture is that SNNs still lag behind ANNs in terms of accuracy, but the gap is decreasing and can even vanish on some tasks, while SNNs typically require many fewer operations, making them the better candidates for processing spatio-temporal data.
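One workaround discussed in this literature for the non-differentiable spike function is the surrogate gradient: keep the hard threshold in the forward pass and substitute a smooth derivative in the backward pass. A sketch, with the fast-sigmoid surrogate and its sharpness beta as one common choice among several:

```python
import numpy as np

def spike_forward(v, theta=1.0):
    return (v >= theta).astype(float)          # non-differentiable Heaviside

def spike_surrogate_grad(v, theta=1.0, beta=5.0):
    # Derivative of a fast sigmoid, used in place of the true
    # (zero-or-undefined) derivative of the threshold function.
    return beta / (1.0 + beta * np.abs(v - theta)) ** 2

v = np.linspace(0.0, 2.0, 5)                   # membrane potentials
err = np.array([0.2, -0.1, 0.3, 0.0, -0.2])    # upstream error, illustrative
grad_v = err * spike_surrogate_grad(v)         # chain rule with the surrogate
print(spike_forward(v), grad_v.round(3))
```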
Affiliation(s)
- Amirhossein Tavanaei, School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette, LA 70504, USA
- Masoud Ghodrati, Department of Physiology, Monash University, Clayton, VIC, Australia
- Saeed Reza Kheradpisheh, Department of Computer Science, Faculty of Mathematical Sciences and Computer, Kharazmi University, Tehran, Iran
- Anthony Maida, School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette, LA 70504, USA
39. Mozafari M, Kheradpisheh SR, Masquelier T, Nowzari-Dalini A, Ganjtabesh M. First-Spike-Based Visual Categorization Using Reward-Modulated STDP. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:6178-6190. [PMID: 29993898] [DOI: 10.1109/tnnls.2018.2826721]
Abstract
Reinforcement learning (RL) has recently regained popularity with major achievements such as beating the European champion at the game of Go. Here, for the first time, we show that RL can be used efficiently to train a spiking neural network (SNN) to perform object recognition in natural images without using an external classifier. We used a feedforward convolutional SNN and a temporal coding scheme in which the most strongly activated neurons fire first, while less activated ones fire later or not at all. In the highest layers, each neuron was assigned to an object category, and the stimulus category was assumed to be that of the first neuron to fire. If this assumption was correct, the neuron was rewarded, i.e., spike-timing-dependent plasticity (STDP) was applied, which reinforced the neuron's selectivity. Otherwise, anti-STDP was applied, which encouraged the neuron to learn something else. As demonstrated on various image data sets (Caltech, ETH-80, and NORB), this reward-modulated STDP (R-STDP) approach extracts particularly discriminative visual features, whereas classic unsupervised STDP extracts any feature that consistently repeats. As a result, R-STDP outperformed STDP on these data sets. Furthermore, R-STDP is suitable for online learning and can adapt to drastic changes such as label permutations. Finally, both feature extraction and classification were done with spikes, using at most one spike per neuron, so the network is hardware friendly and energy efficient.
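A caricature of the R-STDP decision rule: the winner (a stand-in for the first neuron to fire) is updated with STDP when it matches the label and with anti-STDP otherwise. Sizes, rates, and the reward scaling are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

rng = np.random.default_rng(4)

n_in, n_out = 50, 3
w = rng.random((n_in, n_out)) * 0.5
a_plus = 0.02

def trial(x_pre, label):
    """x_pre: binary vector of input spikes; label: true class index."""
    v = x_pre @ w
    winner = int(v.argmax())              # earliest-to-fire stand-in
    reward = 1.0 if winner == label else -1.0
    # STDP restricted to the winner: potentiate synapses from active inputs,
    # depress those from silent ones; the sign of `reward` flips the rule
    # into anti-STDP on wrong guesses.
    dw = a_plus * reward * np.where(x_pre > 0, 1.0, -0.5)
    w[:, winner] = np.clip(w[:, winner] + dw, 0.0, 1.0)
    return winner == label

hits = sum(trial((rng.random(n_in) < 0.2).astype(float), rng.integers(3))
           for _ in range(500))
print(f"accuracy on random trials: {hits / 500:.2f}")
```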
40. Less Data Same Information for Event-Based Sensors: A Bioinspired Filtering and Data Reduction Algorithm. SENSORS 2018; 18:s18124122. [PMID: 30477237] [PMCID: PMC6308842] [DOI: 10.3390/s18124122]
Abstract
Sensors provide data which need to be processed after acquisition to remove noise and extract relevant information. When the sensor is a network node and acquired data are to be transmitted to other nodes (e.g., through Ethernet), the amount of data generated by multiple nodes can overload the communication channel. Reducing the generated data allows lower hardware requirements and less power consumption for the hardware devices. This work proposes a filtering algorithm (LDSI, Less Data Same Information) which reduces the data generated by event-based sensors without loss of relevant information. It is a bioinspired filter: event data are processed using a structure resembling biological neuronal information processing. The filter is fully configurable, from a "transparent mode" to a very restrictive mode. Based on an analysis of configuration parameters, three main configurations are given: weak, medium, and restrictive. Using data from a DVS event camera, results for a similarity detection algorithm show that event data can be reduced by up to 30% while maintaining the same similarity index as unfiltered data. Data reduction can reach 85% with a penalty of 15% in similarity index compared to the original data. An object tracking algorithm was also used to compare the proposed filter with an existing filter: the LDSI filter yields less error (4.86 ± 1.87) than the background activity filter (5.01 ± 1.93). The algorithm was tested on a PC using pre-recorded datasets, and an FPGA implementation was also carried out. A Xilinx Virtex6 FPGA received data from a 128 × 128 DVS camera, applied the LDSI algorithm, created an AER data flow, and sent the data to the PC for analysis and visualization. The FPGA ran at a 177 MHz clock speed with very low resource usage (671 LUTs and 40 Block RAMs for the whole system), demonstrating real-time operation capability. The results show that, with adequate filter parameter tuning, the relevant information in the scene is kept while fewer events are generated.
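For context, the baseline the LDSI filter is compared against can be sketched in a few lines: a background-activity filter passes an event only if a neighbouring pixel fired recently. Sensor size and the time window dt are illustrative values:

```python
import numpy as np

H, W, dt = 128, 128, 5000.0               # pixels; window in microseconds
last = np.full((H + 2, W + 2), -np.inf)   # padded last-timestamp map

def passes(t, x, y):
    """Keep the event at (x, y, t) only if an 8-neighbour fired within dt."""
    xx, yy = x + 1, y + 1                  # shift into the padded map
    neigh = last[yy - 1:yy + 2, xx - 1:xx + 2]
    recent = (t - neigh) < dt
    recent[1, 1] = False                   # ignore the pixel's own history
    last[yy, xx] = t                       # record this event either way
    return bool(recent.any())

events = [(0.0, 10, 10), (1000.0, 11, 10), (9000.0, 90, 40)]
kept = [e for e in events if passes(*e)]
print(f"kept {len(kept)} of {len(events)} events")  # isolated events are dropped
```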
41. Demin V, Nekhaev D. Recurrent Spiking Neural Network Learning Based on a Competitive Maximization of Neuronal Activity. Front Neuroinform 2018; 12:79. [PMID: 30498439] [PMCID: PMC6250118] [DOI: 10.3389/fninf.2018.00079]
Abstract
Spiking neural networks (SNNs) are believed to be highly computationally and energy efficient for specific neurochip hardware real-time solutions. However, there is a lack of learning algorithms for complex SNNs with recurrent connections that are comparable in efficiency with back-propagation techniques and capable of unsupervised training. Here we suppose that each neuron in a biological neural network tends to maximize its activity in competition with other neurons, and we put this principle at the basis of a new SNN learning algorithm. In this way, a spiking network with learned feed-forward, reciprocal, and intralayer inhibitory connections is applied to digit recognition on the MNIST database. We demonstrate that this SNN can be trained without a teacher after a short supervised initialization of weights by the same algorithm. We also show that neurons are grouped into families of hierarchical structures corresponding to different digit classes and their associations. This property is expected to be useful for reducing the number of layers in deep neural networks and for modeling the formation of various functional structures in a biological nervous system. Comparison of the learning properties of the suggested algorithm with those of the Sparse Distributed Representation approach shows similarity in coding but also some advantages of the former. The basic principle of the proposed algorithm is believed to be practically applicable to the construction of much more complicated and diverse task-solving SNNs. We refer to this new approach as "Family-Engaged Execution and Learning of Induced Neuron Groups," or FEELING.
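A caricature of the competitive principle behind this approach, not the published FEELING rule: hard winner-take-all standing in for lateral inhibition, with the winner nudging its weights toward the input, i.e., toward maximizing its own future activity. All sizes and the learning rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

n_in, n_neu, lr = 64, 10, 0.05
w = rng.random((n_neu, n_in))
w /= np.linalg.norm(w, axis=1, keepdims=True)

for _ in range(1000):
    x = rng.random(n_in)
    x /= np.linalg.norm(x)
    winner = int((w @ x).argmax())          # lateral inhibition as hard WTA
    w[winner] += lr * (x - w[winner])       # increase the winner's response
    w[winner] /= np.linalg.norm(w[winner])  # renormalize to keep competition fair

print((w @ x).round(2))                     # responses of all neurons to the last input
```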
Affiliation(s)
- Vyacheslav Demin, National Research Center "Kurchatov Institute", Moscow, Russia; Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Dmitry Nekhaev, National Research Center "Kurchatov Institute", Moscow, Russia
42. Movement Detection with Event-Based Cameras: Comparison with Frame-Based Cameras in Robot Object Tracking Using Powerlink Communication. ELECTRONICS 2018. [DOI: 10.3390/electronics7110304]
Abstract
Event-based cameras are not common in industrial applications despite the fact that they offer multiple advantages for applications with moving objects. In comparison with frame-based cameras, the amount of generated data is very low while keeping the main information in the scene. For an industrial environment with interconnected systems, data reduction becomes very important to avoid network congestion and provide faster response times. However, new sensors such as event-based cameras are rarely used because they do not usually provide connectivity to industrial buses. This work develops a network node based on a Field Programmable Gate Array (FPGA), including data acquisition and position tracking for an event-based camera, together with spurious-event reduction and filtering algorithms that keep the main features of the scene. The FPGA node also includes the network protocol stack to provide standard communication with other nodes. The Powerlink (IEC 61158) industrial network is used to connect the FPGA to a controller driving a self-developed two-axis servo-controlled robot, with the robot's inverse kinematics model included in the controller. To complete the system and provide a comparison, a traditional frame-based camera is also connected to the controller. Response time and robustness to lighting conditions are tested. Results show that, using the event-based camera, the robot can follow the object using fast image recognition, achieving up to 85% data reduction and providing, on average, 99 ms faster position detection with less dispersion (4.96 mm vs. 17.74 mm in the Y-axis position and 2.18 mm vs. 8.26 mm in the X-axis position) than the frame-based camera, showing that event-based cameras are more stable under light changes. Additionally, event-based cameras offer intrinsic advantages due to their low computational requirements: small size, low power, reduced data, and low cost. It is thus demonstrated how new equipment and algorithms can be efficiently integrated into an industrial system, merging commercial industrial equipment with new devices.
43. Kerr D, Coleman S, McGinnity MT. Biologically Inspired Intensity and Depth Image Edge Extraction. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:5356-5365. [PMID: 29994457] [DOI: 10.1109/tnnls.2018.2797994]
Abstract
In recent years, artificial vision research has moved from focusing solely on intensity images to including depth images, or RGB-D combinations, owing to the development of low-cost depth cameras. However, depth images impose heavy storage and processing requirements, and it is challenging to extract relevant features from them in real time. Researchers have sought inspiration from biology to overcome these challenges, resulting in biologically inspired feature extraction methods. By taking inspiration from nature, it may be possible to reduce redundancy, extract relevant features, and process an image efficiently by emulating biological visual processes. In this paper, we present a depth and intensity image feature extraction approach inspired by biological vision systems. Through the use of biologically inspired spiking neural networks, we emulate functional computational aspects of biological visual systems. The results demonstrate that the proposed bioinspired artificial vision system outperforms existing computer vision feature extraction approaches.
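The retina-inspired receptive field underlying this kind of feature extraction is the difference-of-Gaussians (DoG) centre-surround operator. A sketch on a synthetic depth step; the paper's spiking implementation goes further, and the kernel size and sigmas here are illustrative:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax**2) / (2 * sigma**2))
    k = np.outer(g, g)
    return k / k.sum()

def convolve2d(img, k):
    """Naive valid-mode 2-D convolution, enough for a demo."""
    kh, kw = k.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
    return out

depth = np.zeros((40, 40)); depth[:, 20:] = 1.0     # a single depth step edge
dog = gaussian_kernel(7, 1.0) - gaussian_kernel(7, 2.0)  # centre minus surround
edges = np.abs(convolve2d(depth, dog))
# Peak column in valid-convolution coordinates sits at the depth step.
print("edge response peaks at column:", edges.mean(0).argmax())
```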
44.
45. Zheng N, Mazumder P. Online Supervised Learning for Hardware-Based Multilayer Spiking Neural Networks Through the Modulation of Weight-Dependent Spike-Timing-Dependent Plasticity. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:4287-4302. [PMID: 29990088] [DOI: 10.1109/tnnls.2017.2761335]
Abstract
In this paper, we propose an online learning algorithm for supervised learning in multilayer spiking neural networks (SNNs). We find that the spike timings of neurons in an SNN can be exploited to estimate the gradient associated with each synapse. With this method of estimating gradients, learning similar to the stochastic gradient descent process employed in conventional artificial neural networks (ANNs) can be achieved. In addition to conventional layer-by-layer backpropagation, a one-pass direct backpropagation is possible with the proposed learning algorithm. Two neural networks, with one and two hidden layers, are employed as examples to demonstrate its effectiveness. Several techniques for more effective learning are discussed, including a random refractory period to avoid spike saturation, quantization noise injection and pseudorandom initial conditions to decorrelate spike timings, and leveraging the progressive precision of an SNN to reduce inference latency and energy. Extensive parametric simulations examine these techniques. The learning algorithm is developed with ease of hardware implementation and compatibility with classic ANN-based learning in mind; it therefore enjoys the high energy efficiency and good scalability of an SNN in specialized hardware while benefiting from the well-developed theory and techniques of conventional ANN-based learning. The MNIST (Modified National Institute of Standards and Technology) benchmark test verifies the proposed algorithm: classification accuracies of 97.2% and 97.8% are achieved for the one-hidden-layer and two-hidden-layer networks, respectively. A brief discussion of hardware implementations is presented for two mainstream architectures.
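A rate-based caricature of the recipe, assuming illustrative sizes and rates: the supervised error modulates a Hebbian-style update whose magnitude is weight-dependent, keeping weights bounded as hardware prefers. This is not the paper's exact spike-timing gradient estimator:

```python
import numpy as np

rng = np.random.default_rng(6)

n_in, n_out, lr = 30, 5, 0.1
w = rng.random((n_in, n_out)) * 0.2

def step(pre_rates, target):
    global w
    post = pre_rates @ w                       # rate-based proxy for spike counts
    err = target - post                        # supervised error at the output
    # Outer product of pre activity and error approximates the gradient;
    # the w*(1-w) factor makes the update weight-dependent and self-limiting,
    # so weights stay inside [0, 1] without an explicit clip.
    w += lr * np.outer(pre_rates, err) * w * (1.0 - w)

target = np.array([1.0, 0.0, 0.0, 0.5, 0.0])
for _ in range(300):
    step(rng.random(n_in) * 0.1, target)
print((rng.random(n_in) * 0.1 @ w).round(2))   # outputs drift toward the target
```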
46. Camunas-Mesa LA, Serrano-Gotarredona T, Ieng SH, Benosman R, Linares-Barranco B. Event-Driven Stereo Visual Tracking Algorithm to Solve Object Occlusion. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:4223-4237. [PMID: 29989974] [DOI: 10.1109/tnnls.2017.2759326]
Abstract
Object tracking is a major problem for many computer vision applications, but it remains computationally expensive. The use of bio-inspired neuromorphic event-driven dynamic vision sensors (DVSs) has heralded new methods for vision processing, exploiting a reduced amount of data and very precise timing resolution. Previous studies have shown these neural spiking sensors to be well suited to implementing single-sensor object tracking systems, although they experience difficulties when resolving ambiguities caused by object occlusion. DVSs have also performed well in 3-D reconstruction, where event matching techniques are applied in stereo setups. In this paper, we propose a new event-driven stereo object tracking algorithm that simultaneously integrates 3-D reconstruction and cluster tracking, introducing feedback between the two tasks to improve their respective performances. This algorithm, inspired by human vision, identifies objects and learns their position and size in order to resolve ambiguities. The strategy has been validated in four experiments in which the 3-D positions of two objects were tracked in a stereo setup even under occlusion: 1) two swinging pens, the distance between which during movement was measured with an error of less than 0.5%; 2) a pen and a box, to confirm correctness with a more complex object; 3) two straws attached to a fan rotating at 6 revolutions per second, to demonstrate the high-speed capability of the approach; and 4) two people walking in a real-world environment.
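The single-sensor core of event-driven cluster tracking can be sketched simply: each event is assigned to the nearest cluster within a capture radius (or seeds a new one) and drags that cluster's centre toward it. Radius and update rate are illustrative, and the paper additionally couples two such trackers through stereo 3-D feedback:

```python
import numpy as np

clusters = []          # each cluster: {"c": np.array([x, y]), "n": event count}
R, alpha = 15.0, 0.05  # capture radius (pixels) and centre update rate

def on_event(x, y):
    p = np.array([x, y], float)
    if clusters:
        d = [np.linalg.norm(c["c"] - p) for c in clusters]
        k = int(np.argmin(d))
        if d[k] < R:                                   # event belongs to cluster k
            clusters[k]["c"] += alpha * (p - clusters[k]["c"])
            clusters[k]["n"] += 1
            return
    clusters.append({"c": p, "n": 1})                  # otherwise seed a new cluster

rng = np.random.default_rng(7)
for _ in range(500):                                   # two noisy moving blobs
    on_event(*(rng.normal([30, 30], 2)))
    on_event(*(rng.normal([90, 60], 2)))
print([c["c"].round(1) for c in clusters])             # centres near the two blobs
```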
47. Lee C, Panda P, Srinivasan G, Roy K. Training Deep Spiking Convolutional Neural Networks With STDP-Based Unsupervised Pre-training Followed by Supervised Fine-Tuning. Front Neurosci 2018; 12:435. [PMID: 30123103] [PMCID: PMC6085488] [DOI: 10.3389/fnins.2018.00435]
Abstract
Spiking Neural Networks (SNNs) are fast becoming a promising candidate for brain-inspired neuromorphic computing because of their inherent power efficiency and impressive inference accuracy across several cognitive tasks such as image classification and speech recognition. Recent efforts in SNNs have focused on implementing deeper networks with multiple hidden layers to incorporate exponentially more difficult functional representations. In this paper, we propose a pre-training scheme using biologically plausible unsupervised learning, namely Spike-Timing-Dependent Plasticity (STDP), to better initialize the parameters in multi-layer systems prior to supervised optimization. The multi-layer SNN comprises alternating convolutional and pooling layers followed by fully-connected layers, populated with leaky integrate-and-fire spiking neurons. We train the deep SNNs in two phases: first, convolutional kernels are pre-trained in a layer-wise manner with unsupervised learning; then the synaptic weights are fine-tuned with spike-based supervised gradient descent backpropagation. Our experiments on digit recognition demonstrate that STDP-based pre-training with gradient-based optimization provides improved robustness, faster (~2.5x) training time, and better generalization compared with purely gradient-based training without pre-training.
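The two-phase schedule reads directly as code. A rate-based miniature, assuming toy data and a Hebbian stand-in for the layer-wise STDP phase; the paper itself uses spiking convolutions and spike-based backpropagation:

```python
import numpy as np

rng = np.random.default_rng(8)

n, d, h, c = 200, 36, 12, 4
X = rng.random((n, d))
y = rng.integers(0, c, n)

# Phase 1: Hebbian/STDP-flavoured, layer-wise pre-training of the hidden layer.
W1 = rng.random((d, h)) * 0.1
for x in X:
    post = np.maximum(x @ W1, 0)
    W1 += 0.01 * np.outer(x, post) - 0.01 * W1 * post.sum()  # Hebb term + decay
W1 /= np.linalg.norm(W1, axis=0, keepdims=True)              # normalize features

# Phase 2: supervised fine-tuning of the readout (gradient steps on MSE).
W2 = np.zeros((h, c))
Y = np.eye(c)[y]
H = np.maximum(X @ W1, 0)
for _ in range(100):
    W2 += 0.01 / n * H.T @ (Y - H @ W2)

print(f"train acc: {((H @ W2).argmax(1) == y).mean():.2f}")  # illustrative only
```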
Affiliation(s)
- Chankyu Lee, Nanoelectronics Research Laboratory, School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, United States
48. Deep representation via convolutional neural network for classification of spatiotemporal event streams. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.02.019]
49. Paulun L, Wendt A, Kasabov N. A Retinotopic Spiking Neural Network System for Accurate Recognition of Moving Objects Using NeuCube and Dynamic Vision Sensors. Front Comput Neurosci 2018; 12:42. [PMID: 29946249] [PMCID: PMC6006267] [DOI: 10.3389/fncom.2018.00042]
Abstract
This paper introduces a new system for dynamic visual recognition that combines bio-inspired hardware with a brain-like spiking neural network. The system takes data from a dynamic vision sensor (DVS), which simulates the functioning of the human retina by producing an address-event output (spike trains) based on the movement of objects. The system then convolves the spike trains and feeds them into a brain-like spiking neural network, called NeuCube, which is organized in a three-dimensional manner representing the organization of the primary visual cortex. Spatio-temporal patterns of the data are learned during a deep unsupervised learning stage using spike-timing-dependent plasticity. In a second stage, supervised learning is performed to train the network for classification tasks. The convolution algorithm and the mapping into the network mimic the function of retinal ganglion cells and the retinotopic organization of the visual cortex. The NeuCube architecture can be used to visualize the deep connectivity inside the network before, during, and after training, allowing for a better understanding of the learning processes. The method was tested on the benchmark MNIST-DVS dataset and achieved a classification accuracy of 92.90%. The paper discusses advantages and limitations of the new method and concludes that it is worth exploring further on different datasets, aiming for advances in dynamic computer vision and multimodal systems that integrate visual, aural, tactile, and other kinds of information in a biologically plausible way.
Affiliation(s)
- Lukas Paulun, Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand; Mathematical Institute, Albert Ludwigs University of Freiburg, Freiburg im Breisgau, Germany
- Anne Wendt, Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand
- Nikola Kasabov, Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand
50. Xu X, Jin X, Yan R, Fang Q, Lu W. Visual Pattern Recognition Using Enhanced Visual Features and PSD-Based Learning Rule. IEEE Trans Cogn Dev Syst 2018. [DOI: 10.1109/tcds.2017.2769166]