51
Taherkhani A, Belatreche A, Li Y, Cosma G, Maguire LP, McGinnity TM. A review of learning in biologically plausible spiking neural networks. Neural Netw 2019; 122:253-272. [PMID: 31726331] [DOI: 10.1016/j.neunet.2019.09.036]
Abstract
Artificial neural networks have been used as a powerful processing tool in areas such as pattern recognition, control, robotics, and bioinformatics. Their wide applicability has encouraged researchers to improve artificial neural networks by investigating the biological brain. Neurological research has progressed significantly in recent years and continues to reveal new characteristics of biological neurons. New technologies can now capture temporal changes in the internal activity of the brain in more detail and help clarify the relationship between brain activity and the perception of a given stimulus. This new knowledge has led to a new type of artificial neural network, the Spiking Neural Network (SNN), that draws more faithfully on biological properties to provide higher processing abilities. This paper presents a review of recent developments in the learning of spiking neurons. First, the biological background of SNN learning algorithms is reviewed, and the important elements of a learning algorithm, namely the neuron model, synaptic plasticity, information encoding, and SNN topology, are presented. A critical review of state-of-the-art learning algorithms for SNNs using single and multiple spikes then follows. Finally, deep spiking neural networks are reviewed, and challenges and opportunities in the SNN field are discussed.
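Of the elements listed above, the neuron model is the most fundamental. As a quick orientation for readers new to the area, the following minimal leaky integrate-and-fire (LIF) sketch shows how membrane integration and threshold-triggered reset produce the discrete spike events that SNN learning rules operate on; the time constants and threshold are arbitrary illustrative values, not parameters from the review.

    import numpy as np

    def lif_simulate(input_current, dt=1e-3, tau_m=20e-3,
                     v_rest=0.0, v_thresh=1.0, v_reset=0.0):
        """Simulate a leaky integrate-and-fire neuron; return spike times (s)."""
        v = v_rest
        spike_times = []
        for step, i_in in enumerate(input_current):
            # Euler step of dv/dt = (-(v - v_rest) + i_in) / tau_m
            v += dt * (-(v - v_rest) + i_in) / tau_m
            if v >= v_thresh:              # threshold crossing emits a spike
                spike_times.append(step * dt)
                v = v_reset                # hard reset after the spike
        return spike_times

    # A constant supra-threshold current yields a regular spike train.
    print(lif_simulate(np.full(100, 1.5)))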
Affiliation(s)
- Aboozar Taherkhani
- School of Computer Science and Informatics, Faculty of Computing, Engineering and Media, De Montfort University, Leicester, UK
- Ammar Belatreche
- Department of Computer and Information Sciences, Northumbria University, Newcastle upon Tyne, UK
- Yuhua Li
- School of Computer Science and Informatics, Cardiff University, Cardiff, UK
- Georgina Cosma
- Department of Computer Science, Loughborough University, Loughborough, UK
- Liam P Maguire
- Intelligent Systems Research Centre, Ulster University, Derry, Northern Ireland, UK
- T M McGinnity
- Intelligent Systems Research Centre, Ulster University, Derry, Northern Ireland, UK; School of Science and Technology, Nottingham Trent University, Nottingham, UK
52
Abderrahmane N, Lemaire E, Miramond B. Design Space Exploration of Hardware Spiking Neurons for Embedded Artificial Intelligence. Neural Netw 2019; 121:366-386. [PMID: 31593842] [DOI: 10.1016/j.neunet.2019.09.024]
Abstract
Machine learning is attracting unprecedented interest in research and industry, due to recent success in many applied contexts such as image classification and object recognition. However, deploying these systems requires huge computing capabilities, making them unsuitable for embedded systems. To deal with this limitation, many researchers are investigating brain-inspired computing as an alternative to conventional von Neumann architectures (CPU/GPU), which meet the requirements for computing performance but not for energy efficiency. Neuromorphic hardware circuits that support both parallel and distributed computation therefore need to be designed. In this paper, we focus on Spiking Neural Networks (SNNs) with a comprehensive study of neural coding methods and hardware exploration. In this context, we propose a framework for neuromorphic hardware design space exploration, which makes it possible to define a suitable architecture based on application-specific constraints, starting from a wide variety of possible architectural choices. For this framework, we have developed NAXT, a behavioral-level simulator for neuromorphic hardware architectural exploration. Moreover, we propose modified versions of the standard rate coding technique to trade off against the time coding paradigm, which is characterized by the low number of spikes propagating in the network. We are thus able to reduce the number of spikes while keeping the same neuron model, which results in an SNN with fewer events to process and, consequently, lower power consumption in hardware. Furthermore, we present three neuromorphic hardware architectures in order to quantitatively study the implementation of SNNs. One of these architectures integrates a novel hybrid structure: a highly parallel computation core for the most solicited layers, and time-multiplexed computation units for deeper layers. These architectures are derived from a novel funnel-like design space exploration framework for neuromorphic hardware.
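The rate-versus-time coding trade-off at the heart of this work is easy to see in code. The sketch below contrasts a Poisson-style rate code (many spikes, count proportional to intensity) with a single-spike latency code (timing encodes intensity); both encodings and the window length are generic textbook choices, not the paper's modified schemes.

    import numpy as np

    rng = np.random.default_rng(0)

    def rate_code(intensity, window=100):
        """Poisson-style rate code: spike count grows with intensity (0..1)."""
        return np.nonzero(rng.random(window) < intensity)[0]   # spike step indices

    def latency_code(intensity, window=100):
        """Time code: one spike; stronger input spikes earlier."""
        return np.array([int((1.0 - intensity) * (window - 1))])

    x = 0.8
    print("rate code events:   ", len(rate_code(x)))     # ~80 spikes to transmit
    print("latency code events:", len(latency_code(x)))  # a single spike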
Affiliation(s)
| | - Edgar Lemaire
- Université Côte d'Azur, CNRS, LEAT, France; Thales Research Technology / STI Group / LCHP, Palaiseau, France.
| | | |
53
Doborjeh M, Kasabov N, Doborjeh Z, Enayatollahi R, Tu E, Gandomi AH. Personalised modelling with spiking neural networks integrating temporal and static information. Neural Netw 2019; 119:162-177. [PMID: 31446235] [DOI: 10.1016/j.neunet.2019.07.021]
Abstract
This paper proposes a new personalised prognostic/diagnostic system that supports classification, prediction and pattern recognition when both static and dynamic/spatiotemporal features are present in a dataset. The system is based on a proposed clustering method (named d2WKNN) for optimal selection of the samples most similar to an individual, integrating both static (vector-based) and temporal individual data. The most relevant samples to an individual are selected to train a Personalised Spiking Neural Network (PSNN) that learns from sets of streaming data to capture spatiotemporal association patterns. The generated time-dependent patterns resulted in higher classification/prediction accuracy (80% to 93%) compared with global modelling and conventional methods. In addition, the PSNN models support interpretability by creating a personalised profile of an individual, contributing to a better understanding of the interactions between features; an end-user can thus comprehend which interactions in the model led to a certain decision (outcome). The proposed PSNN model is an analytical tool applicable to several real-life health applications, where different data domains describe a person's health condition. The system was applied to two case studies: (1) classification of spatiotemporal neuroimaging data for the investigation of individual response to treatment, and (2) prediction of risk of stroke with respect to temporal environmental data. For both datasets, static health data were also available besides the temporal data. The hyper-parameters of the proposed system, including the PSNN models and the d2WKNN clustering parameters, are optimised for each individual.
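The exact d2WKNN rule is specific to the paper and is not reproduced here, but the underlying idea, selecting neighbours by a distance that blends a static feature vector with a temporal sequence, can be sketched generically. The Euclidean distances and the mixing weight alpha below are illustrative assumptions only.

    import numpy as np

    def combined_distance(static_a, static_b, temporal_a, temporal_b, alpha=0.5):
        """Blend a static (vector) distance with a temporal (sequence) distance."""
        d_static = np.linalg.norm(static_a - static_b)
        d_temporal = np.linalg.norm(temporal_a - temporal_b)  # equal-length series
        return alpha * d_static + (1.0 - alpha) * d_temporal

    def nearest_neighbours(query_static, query_temporal, dataset, k=3):
        """Indices of the k samples most similar to the query individual."""
        dists = [combined_distance(query_static, s, query_temporal, t)
                 for s, t in dataset]
        return np.argsort(dists)[:k]

    rng = np.random.default_rng(1)
    data = [(rng.random(4), rng.random(50)) for _ in range(20)]  # (static, temporal)
    print(nearest_neighbours(rng.random(4), rng.random(50), data))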
Affiliation(s)
- Maryam Doborjeh
- Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand; Computer Science Department, Auckland University of Technology, New Zealand
- Nikola Kasabov
- Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand; Computer Science Department, Auckland University of Technology, New Zealand
- Zohreh Doborjeh
- Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand
- Reza Enayatollahi
- BioDesign Lab, School of Engineering, Computer & Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand
- Enmei Tu
- School of Electronics, Information & Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Amir H Gandomi
- Faculty of Engineering & Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia; School of Business, Stevens Institute of Technology, Hoboken, NJ 07030, USA
54
Hu R, Chang S, Wang H, He J, Huang Q. Efficient Multispike Learning for Spiking Neural Networks Using Probability-Modulated Timing Method. IEEE Trans Neural Netw Learn Syst 2019; 30:1984-1997. [PMID: 30418889] [DOI: 10.1109/tnnls.2018.2875471]
Abstract
Error functions in supervised learning algorithms for spiking neural networks (SNNs) are normally based on the distance between output spikes and target spikes. Due to the discontinuous nature of the internal state of a spiking neuron, it is challenging to ensure that the number of output spikes and the number of target spikes remain identical in multispike learning. This problem is conventionally dealt with by using the smaller of the number of desired spikes and the number of actual output spikes during learning. However, this approach loses information, as some spikes are neglected. In this paper, a probability-modulated timing mechanism is built on stochastic neurons, where the discontinuous spike patterns are converted to the likelihood of generating the desired output spike trains. By applying this mechanism to a probability-modulated spiking classifier, a probability-modulated SNN (PMSNN) is constructed. In its multilayer and multispike learning structure, more inputs are incorporated and mapped to the target spike trains. A clustering-rule connection mechanism is also applied to a reservoir to improve the efficiency of information transmission among synapses, mapping highly correlated inputs to adjacent neurons. Comparisons between the proposed method and popular SNN algorithms showed that the PMSNN yields higher efficiency and requires fewer parameters.
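The enabling device here is general enough to sketch: if a neuron fires stochastically with a probability that is a smooth function of its membrane potential, a desired spike train acquires a differentiable log-likelihood even though spiking itself is discontinuous. The sigmoid escape function and all constants below are illustrative assumptions, not the paper's construction.

    import numpy as np

    def spike_train_log_likelihood(potential, target_spikes, beta=5.0):
        """Log-likelihood of a binary target spike train under a stochastic
        neuron whose per-step firing probability is sigmoid(beta * potential)."""
        p = 1.0 / (1.0 + np.exp(-beta * potential))   # P(spike) at each step
        p = np.clip(p, 1e-12, 1.0 - 1e-12)            # numerical safety
        # Bernoulli log-likelihood: smooth in the potential (and in the weights
        # behind it), so gradient-based multispike learning becomes possible.
        return float(np.sum(target_spikes * np.log(p)
                            + (1 - target_spikes) * np.log(1 - p)))

    potential = np.array([0.2, 0.9, -0.3, 1.1, 0.0])
    target = np.array([0, 1, 0, 1, 0])
    print(spike_train_log_likelihood(potential, target))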
55
Steffen L, Reichard D, Weinland J, Kaiser J, Roennau A, Dillmann R. Neuromorphic Stereo Vision: A Survey of Bio-Inspired Sensors and Algorithms. Front Neurorobot 2019; 13:28. [PMID: 31191287] [PMCID: PMC6546825] [DOI: 10.3389/fnbot.2019.00028]
Abstract
Any visual sensor, whether artificial or biological, maps the 3D world onto a 2D representation. The missing dimension is depth, and most species use stereo vision to recover it. Stereo vision implies multiple perspectives and matching; hence it obtains depth from a pair of images. Algorithms for stereo vision are also used successfully in robotics. Although biological systems seem to compute disparities effortlessly, artificial methods suffer from high energy demands and latency. The crucial part is the correspondence problem: finding the matching points of two images. The development of event-based cameras, inspired by the retina, enables the exploitation of an additional physical constraint: time. Because they operate asynchronously and take the precise timing of spikes into account, Spiking Neural Networks can exploit this constraint. In this work, we survey sensors and algorithms for event-based stereo vision that lead to more biologically plausible robots, focusing mainly on binocular stereo vision.
Affiliation(s)
- Lea Steffen
- FZI Research Center for Information Technology, Karlsruhe, Germany
- Daniel Reichard
- FZI Research Center for Information Technology, Karlsruhe, Germany
- Jakob Weinland
- FZI Research Center for Information Technology, Karlsruhe, Germany
- Jacques Kaiser
- FZI Research Center for Information Technology, Karlsruhe, Germany
- Arne Roennau
- FZI Research Center for Information Technology, Karlsruhe, Germany
- Rüdiger Dillmann
- FZI Research Center for Information Technology, Karlsruhe, Germany; Humanoids and Intelligence Systems Lab, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
56
Afshar S, Hamilton TJ, Tapson J, van Schaik A, Cohen G. Investigation of Event-Based Surfaces for High-Speed Detection, Unsupervised Feature Extraction, and Object Recognition. Front Neurosci 2019; 12:1047. [PMID: 30705618] [PMCID: PMC6344467] [DOI: 10.3389/fnins.2018.01047]
Abstract
In this work, we investigate event-based feature extraction through a rigorous framework of testing. We test a hardware-efficient variant of Spike Timing Dependent Plasticity (STDP) on a range of spatio-temporal kernels with different surface decaying methods, decay functions, receptive field sizes, feature numbers, and back-end classifiers. This detailed investigation can provide helpful insights and rules of thumb for performance vs. complexity trade-offs in more generalized networks, especially in the context of hardware implementation, where design choices can incur significant resource costs. The investigation is performed using a new dataset consisting of model airplanes being dropped free-hand close to the sensor. The target objects exhibit a wide range of relative orientations and velocities. This range of target velocities, analyzed in multiple configurations, allows a rigorous comparison of time-based decaying surfaces (time surfaces) vs. event index-based decaying surfaces (index surfaces), which are used to perform unsupervised feature extraction, followed by target detection and recognition. We examine each processing stage by comparison to the use of raw events, as well as a range of alternative layer structures and the use of random features. By comparing results from a linear classifier and an ELM classifier, we evaluate how each element of the system affects accuracy. To generate time and index surfaces, the most commonly used kernels, namely event-binning kernels and linearly and exponentially decaying kernels, are investigated. Index surfaces were found to outperform time surfaces in recognition when invariance to target velocity was made a requirement. In the investigation of network structure, larger networks of neurons with large receptive field sizes were found to perform best. We find that a small number of event-based feature extractors can project the complex spatio-temporal event patterns of the dataset to an almost linearly separable representation in feature space, with the best-performing linear classifier achieving 98.75% recognition accuracy using only 25 feature-extracting neurons.
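The time-surface/index-surface distinction compared throughout this paper is compact enough to state in code: both keep a per-pixel record of the most recent event, but one decays with elapsed time while the other decays with the number of events that have arrived since. The exponential kernel is one of the kernels the paper examines; the array sizes and decay constants below are illustrative.

    import numpy as np

    H, W = 8, 8
    last_time = np.full((H, W), -np.inf)    # timestamp of each pixel's last event
    last_index = np.full((H, W), -np.inf)   # global event count at last event

    def record_event(x, y, t, count):
        last_time[y, x] = t
        last_index[y, x] = count

    def time_surface(t_now, tau=0.05):
        """Decays with elapsed time: sensitive to how fast the target moves."""
        return np.exp(-(t_now - last_time) / tau)

    def index_surface(count_now, tau_events=100.0):
        """Decays with elapsed event count: largely invariant to target speed."""
        return np.exp(-(count_now - last_index) / tau_events)

    record_event(3, 4, t=0.010, count=42)
    print(time_surface(t_now=0.020)[4, 3], index_surface(count_now=60)[4, 3])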
Affiliation(s)
- Saeed Afshar
- Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- Tara Julia Hamilton
- Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- Jonathan Tapson
- Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- André van Schaik
- Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- Gregory Cohen
- Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
57
Zheng Y, Li S, Yan R, Tang H, Tan KC. Sparse Temporal Encoding of Visual Features for Robust Object Recognition by Spiking Neurons. IEEE Trans Neural Netw Learn Syst 2018; 29:5823-5833. [PMID: 29994102] [DOI: 10.1109/tnnls.2018.2812811]
Abstract
Robust object recognition in spiking neural systems remains challenging in the neuromorphic computing area, as it requires solving both the effective encoding of sensory information and its integration with downstream learning neurons. We target this problem by developing a spiking neural system consisting of a sparse temporal encoder and a temporal classifier. We propose a sparse temporal encoding algorithm that exploits both spatial and temporal information derived from a spike-timing-dependent-plasticity-based HMAX feature extraction process. The temporal feature representation thus becomes more appropriate for integration with a temporal classifier based on spiking neurons than with a nontemporal classifier. The algorithm has been validated on two benchmark datasets, and the results show that the temporal feature encoding and learning-based method achieves high recognition accuracy. The proposed model provides an efficient approach to perform feature representation and recognition in a consistent temporal learning framework, which is easily adapted to neuromorphic implementations.
58
59
Pfeiffer M, Pfeil T. Deep Learning With Spiking Neurons: Opportunities and Challenges. Front Neurosci 2018; 12:774. [PMID: 30410432] [PMCID: PMC6209684] [DOI: 10.3389/fnins.2018.00774]
Abstract
Spiking neural networks (SNNs) are inspired by information processing in biology, where sparse and asynchronous binary signals are communicated and processed in a massively parallel fashion. SNNs on neuromorphic hardware exhibit favorable properties such as low power consumption, fast inference, and event-driven information processing. This makes them interesting candidates for the efficient implementation of deep neural networks, the method of choice for many machine learning tasks. In this review, we address the opportunities that deep spiking networks offer and investigate in detail the challenges associated with training SNNs in a way that makes them competitive with conventional deep learning while allowing for efficient mapping to hardware. A wide range of training methods for SNNs is presented, ranging from the conversion of conventional deep networks into SNNs and constrained training before conversion to spiking variants of backpropagation and biologically motivated variants of STDP. The goal of our review is to define a categorization of SNN training methods and summarize their advantages and drawbacks. We further discuss relationships between SNNs and binary networks, which are becoming popular for efficient digital hardware implementation. Neuromorphic hardware platforms have great potential to enable deep spiking networks in real-world applications. We compare the suitability of various neuromorphic systems that have been developed over the past years and investigate potential use cases. Neuromorphic approaches and conventional machine learning should not be considered simply two solutions to the same classes of problems; instead, it is possible to identify and exploit their task-specific advantages. Deep SNNs offer great opportunities to work with new types of event-based sensors, exploit temporal codes and local on-chip learning, and we have so far just scratched the surface of realizing these advantages in practical applications.
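The first training route mentioned above, conversion of a conventional network, rests on a simple correspondence: the firing rate of an integrate-and-fire neuron with subtract-reset approximates a ReLU activation. The sketch below demonstrates that correspondence for one layer under constant input; thresholds, weight scales, and simulation length are illustrative choices, not a recipe from the review.

    import numpy as np

    def if_layer_rates(weights, inputs, steps=1000, v_thresh=1.0):
        """Drive one integrate-and-fire layer with constant inputs and
        return its empirical output rates (spikes per time step)."""
        v = np.zeros(weights.shape[0])
        spikes = np.zeros(weights.shape[0])
        for _ in range(steps):
            v += weights @ inputs          # integrate the same input each step
            fired = v >= v_thresh
            spikes += fired
            v[fired] -= v_thresh           # subtract-reset keeps the residual
        return spikes / steps

    rng = np.random.default_rng(0)
    W = rng.normal(size=(3, 5)) * 0.3
    x = rng.random(5) * 0.2                # input "rates" in [0, 0.2]
    print("ReLU activations:", np.maximum(0.0, W @ x))
    print("IF spike rates:  ", if_layer_rates(W, x))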
Affiliation(s)
- Michael Pfeiffer
- Bosch Center for Artificial Intelligence, Robert Bosch GmbH, Renningen, Germany
60
Yousefzadeh A, Stromatias E, Soto M, Serrano-Gotarredona T, Linares-Barranco B. On Practical Issues for Stochastic STDP Hardware With 1-bit Synaptic Weights. Front Neurosci 2018; 12:665. [PMID: 30374283] [PMCID: PMC6196279] [DOI: 10.3389/fnins.2018.00665]
Abstract
In computational neuroscience, synaptic plasticity learning rules are typically studied using the full 64-bit floating-point precision that computers provide. For dedicated hardware implementations, however, the precision used directly penalizes not only the required memory resources but also the computing, communication, and energy resources. When it comes to hardware engineering, a key question is always to find the minimum number of bits needed to keep the neurocomputational system working satisfactorily. Here we present techniques and results obtained when limiting synaptic weights to 1-bit precision, applied to a Spike-Timing-Dependent Plasticity (STDP) learning rule in Spiking Neural Networks (SNNs). We first illustrate the operation of STDP with 1-bit synapses by replicating a classical biological experiment on visual orientation tuning, using a simple four-neuron setup. After this, we apply 1-bit STDP learning to the hidden feature-extraction layer of a two-layer system, where for the second (and output) layer we use previously reported SNN classifiers. The systems are tested on two spiking datasets: a Dynamic Vision Sensor (DVS) recorded poker card symbols dataset and a Poisson-distributed spike representation of the MNIST dataset. Tests are performed using the in-house MegaSim event-driven behavioral simulator and by implementing the systems on FPGA (Field Programmable Gate Array) hardware.
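The paper's exact rule is not reproduced here, but the core trick behind stochastic STDP with 1-bit weights can be sketched: rather than accumulating small analog weight changes, each spike pairing flips a binary weight with some probability, so that the expected weight tracks what a high-precision rule would compute. The coincidence-based pairing semantics and the probabilities below are simplifying assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def stochastic_binary_stdp(w, pre_spike, post_spike, p_pot=0.1, p_dep=0.05):
        """1-bit synapse, w in {0, 1}: coincident pre/post activity sets the
        bit with probability p_pot; pre activity without a post spike clears
        it with probability p_dep."""
        if pre_spike and post_spike and w == 0 and rng.random() < p_pot:
            w = 1                           # stochastic potentiation
        elif pre_spike and not post_spike and w == 1 and rng.random() < p_dep:
            w = 0                           # stochastic depression
        return w

    w = 0
    for _ in range(100):                    # repeated causal pairings
        w = stochastic_binary_stdp(w, pre_spike=True, post_spike=True)
    print(w)                                # almost surely 1 by now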
Affiliation(s)
- Amirreza Yousefzadeh
- Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC and Universidad de Sevilla, Sevilla, Spain
- Evangelos Stromatias
- Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC and Universidad de Sevilla, Sevilla, Spain
- Miguel Soto
- Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC and Universidad de Sevilla, Sevilla, Spain
- Bernabé Linares-Barranco
- Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC and Universidad de Sevilla, Sevilla, Spain
61
Cohen G, Afshar S, Orchard G, Tapson J, Benosman R, van Schaik A. Spatial and Temporal Downsampling in Event-Based Visual Classification. IEEE Trans Neural Netw Learn Syst 2018; 29:5030-5044. [PMID: 29994752] [DOI: 10.1109/tnnls.2017.2785272]
Abstract
As interest in event-based vision sensors for mobile and aerial applications grows, there is an increasing need for high-speed and highly robust algorithms for performing visual tasks using event-based data. As event rate and network structure have a direct impact on the power consumed by such systems, it is important to explore the efficiency of the event-based encoding used by these sensors. The work presented in this paper is the first study focused solely on the effects of both spatial and temporal downsampling on event-based vision data, and it makes use of a variety of datasets chosen to fully explore and characterize the nature of downsampling operations. The results show that both spatial and temporal downsampling produce improved classification accuracy and a lower overall data rate, a finding that is particularly relevant for bandwidth- and power-constrained systems. For a given network containing 1000 hidden-layer neurons, the spatially downsampled systems achieved a best-case accuracy of 89.38% on N-MNIST, as opposed to 81.03% with no downsampling at the same hidden-layer size. On the N-Caltech101 dataset, the downsampled system achieved a best-case accuracy of 18.25%, compared with 7.43% with no downsampling. The results show that downsampling is an important preprocessing technique in event-based visual processing, especially for applications sensitive to power consumption and transmission bandwidth.
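Both operations studied here reduce to simple arithmetic on the event stream: spatial downsampling divides pixel coordinates by an integer factor, and temporal downsampling quantizes timestamps into bins. The sketch below also drops events that become duplicates within a bin, which is one plausible reading of the operation; the paper should be consulted for the exact variants it evaluates.

    def downsample_events(events, sx=2, st=10):
        """events: iterable of (x, y, t, polarity) with integer timestamps.
        Downsample spatially by factor sx and temporally into bins of st,
        keeping one event per (x, y, bin, polarity) cell."""
        seen, out = set(), []
        for x, y, t, p in events:
            key = (x // sx, y // sx, t // st, p)
            if key not in seen:            # collapse duplicates made by binning
                seen.add(key)
                out.append((key[0], key[1], key[2] * st, p))
        return out

    evs = [(10, 11, 3, 1), (11, 10, 7, 1), (40, 2, 8, 0)]
    print(downsample_events(evs))          # the first two events merge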
62
Camuñas-Mesa LA, Serrano-Gotarredona T, Ieng SH, Benosman R, Linares-Barranco B. Event-Driven Stereo Visual Tracking Algorithm to Solve Object Occlusion. IEEE Trans Neural Netw Learn Syst 2018; 29:4223-4237. [PMID: 29989974] [DOI: 10.1109/tnnls.2017.2759326]
Abstract
Object tracking is a major problem for many computer vision applications, but it remains computationally expensive. The use of bio-inspired neuromorphic event-driven dynamic vision sensors (DVSs) has heralded new methods for vision processing, exploiting a reduced amount of data and very precise timing resolution. Previous studies have shown these neural spiking sensors to be well suited to implementing single-sensor object tracking systems, although they experience difficulties when solving ambiguities caused by object occlusion. DVSs have also performed well in 3-D reconstruction, in which event matching techniques are applied in stereo setups. In this paper, we propose a new event-driven stereo object tracking algorithm that simultaneously integrates 3-D reconstruction and cluster tracking, introducing feedback information in both tasks to improve their respective performances. This algorithm, inspired by human vision, identifies objects and learns their position and size in order to solve ambiguities. This strategy has been validated in four different experiments where the 3-D positions of two objects were tracked in a stereo setup even when occlusion occurred. The objects studied in the experiments were: 1) two swinging pens, the distance between which during movement was measured with an error of less than 0.5%; 2) a pen and a box, to confirm the correctness of the results obtained with a more complex object; 3) two straws attached to a fan and rotating at six revolutions per second, to demonstrate the high-speed capabilities of this approach; and 4) two people walking in a real-world environment.
63
Yousefzadeh A, Orchard G, Serrano-Gotarredona T, Linares-Barranco B. Active Perception With Dynamic Vision Sensors: Minimum Saccades With Optimum Recognition. IEEE Trans Biomed Circuits Syst 2018; 12:927-939. [PMID: 29994268] [DOI: 10.1109/tbcas.2018.2834428]
Abstract
Vision processing with dynamic vision sensors (DVSs) is becoming increasingly popular. This type of bio-inspired vision sensor does not record static images; DVS pixel activity relies on changes in light intensity. In this paper, we introduce a platform for object recognition with a DVS in which the sensor is installed on a moving pan-tilt unit in a closed loop with a recognition neural network. This neural network is trained to recognize objects observed by a DVS while the pan-tilt unit is moved to emulate micro-saccades. We show that performing more saccades in different directions can provide more information about the object and, therefore, more accurate object recognition. However, in high-performance, low-latency platforms, additional saccades add latency and power consumption. Here, we show that the number of saccades can be reduced while keeping the same recognition accuracy by performing intelligent saccadic movements in a closed action-perception smart loop. We propose an algorithm for smart saccadic movement decisions that can reduce the number of necessary saccades by half, on average, for a predefined accuracy on the N-MNIST dataset. Additionally, we show that by replacing this control algorithm with an artificial neural network that learns to control the saccades, we can also halve the average number of saccades needed for N-MNIST recognition.
64
Ieng SH, Lehtonen E, Benosman R. Complexity Analysis of Iterative Basis Transformations Applied to Event-Based Signals. Front Neurosci 2018; 12:373. [PMID: 29946231] [PMCID: PMC6006676] [DOI: 10.3389/fnins.2018.00373]
Abstract
This paper introduces an event-based methodology to perform arbitrary linear basis transformations that encompass a broad range of practically important signal transforms, such as the discrete Fourier transform (DFT) and the discrete wavelet transform (DWT). We present a complexity analysis of the proposed method and show that the number of required multiply-and-accumulate operations is reduced in comparison to the frame-based method on natural video sequences when the required temporal resolution is high enough. Experimental results on natural video sequences acquired by the asynchronous time-based neuromorphic image sensor (ATIS) are provided to support the feasibility of the method and to illustrate the gain in computational resources.
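The core observation is generic linear algebra, so it can be shown exactly: if Y = Bx for a fixed basis matrix B, an event that changes a single input sample x_i by delta updates the transform as Y += delta * B[:, i], one column operation instead of a full recomputation. A sketch with a DFT basis:

    import numpy as np

    N = 8
    n = np.arange(N)
    B = np.exp(-2j * np.pi * np.outer(n, n) / N)   # DFT basis matrix

    x = np.random.default_rng(0).random(N)
    Y = B @ x                                      # full transform, computed once

    # An "event" changes sample i by delta; update Y with one column of B.
    i, delta = 3, 0.25
    x[i] += delta
    Y += delta * B[:, i]

    print(np.allclose(Y, B @ x))                   # True: incremental == full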
Affiliation(s)
- Sio-Hoi Ieng
- INSERM UMRI S 968, Sorbonne Universités, UPMC Univ Paris 06, UMR S 968, Centre National de la Recherche Scientifique, UMR 7210, Institut de la Vision, Paris, France
- Eero Lehtonen
- Department of Future Technologies, University of Turku, Turku, Finland
- Ryad Benosman
- INSERM UMRI S 968, Sorbonne Universités, UPMC Univ Paris 06, UMR S 968, Centre National de la Recherche Scientifique, UMR 7210, Institut de la Vision, Paris, France
65
Xu X, Jin X, Yan R, Fang Q, Lu W. Visual Pattern Recognition Using Enhanced Visual Features and PSD-Based Learning Rule. IEEE Trans Cogn Dev Syst 2018. [DOI: 10.1109/tcds.2017.2769166]
66
Shi C, Li J, Wang Y, Luo G. Exploiting Lightweight Statistical Learning for Event-Based Vision Processing. IEEE Access 2018; 6:19396-19406. [PMID: 29750138] [PMCID: PMC5937990] [DOI: 10.1109/access.2018.2823260]
Abstract
This paper presents a lightweight statistical learning framework potentially suitable for low-cost event-based vision systems, where visual information is captured by a dynamic vision sensor (DVS) and represented as an asynchronous stream of pixel addresses (events) indicating a relative intensity change at those locations. A simple random ferns classifier based on randomly selected patch-based binary features is employed to categorize pixel event flows. Our experimental results demonstrate that, compared to existing event-based processing algorithms such as spiking convolutional neural networks (SCNNs) and state-of-the-art bag-of-events (BoE)-based statistical algorithms, our framework excels in processing speed (2× faster than the BoE statistical methods and >100× faster than previous SCNNs in training speed) with an extremely simple online learning process, and achieves state-of-the-art classification accuracy on four popular address-event representation datasets: MNIST-DVS, Poker-DVS, Posture-DVS, and CIFAR10-DVS. Hardware estimation shows that our algorithm will be preferable for low-cost embedded system implementations.
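Random ferns are worth a brief sketch because of how light they are: each fern applies a few fixed binary pixel comparisons, the resulting bit pattern indexes a per-class count table, and classification sums per-fern class log-likelihoods. The fern below is a generic construction on raw patches, not the paper's event-flow configuration.

    import numpy as np

    rng = np.random.default_rng(0)

    class RandomFern:
        def __init__(self, patch_size, n_tests=8, n_classes=2):
            # Fixed random pairs of pixel positions to compare.
            self.pairs = rng.integers(0, patch_size ** 2, size=(n_tests, 2))
            self.counts = np.ones((2 ** n_tests, n_classes))  # Laplace prior

        def code(self, patch):
            flat = patch.ravel()
            bits = flat[self.pairs[:, 0]] > flat[self.pairs[:, 1]]
            return int(bits @ (2 ** np.arange(len(bits))))    # bit-pattern index

        def train(self, patch, label):
            self.counts[self.code(patch), label] += 1         # count update only

        def log_posterior(self, patch):
            row = self.counts[self.code(patch)]
            return np.log(row / row.sum())

    fern = RandomFern(patch_size=5)
    fern.train(rng.random((5, 5)), label=1)
    print(fern.log_posterior(rng.random((5, 5))))   # sum over ferns to classify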
Affiliation(s)
- Cong Shi
- Schepens Eye Research Institute, Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
- Jiajun Li
- State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100864, China
- Ying Wang
- State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100864, China
- Gang Luo
- Schepens Eye Research Institute, Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114, USA
67
Camuñas-Mesa LA, Domínguez-Cordero YL, Linares-Barranco A, Serrano-Gotarredona T, Linares-Barranco B. A Configurable Event-Driven Convolutional Node with Rate Saturation Mechanism for Modular ConvNet Systems Implementation. Front Neurosci 2018; 12:63. [PMID: 29515349] [PMCID: PMC5826227] [DOI: 10.3389/fnins.2018.00063]
Abstract
Convolutional Neural Networks (ConvNets) are a particular type of neural network often used for applications like image recognition, video analysis, and natural language processing. They are inspired by the human brain, following a specific organization of the connectivity pattern between layers of neurons known as the receptive field. These networks have traditionally been implemented in software, but they become more computationally expensive as they scale up, limiting real-time processing of high-speed stimuli. Hardware implementations, on the other hand, are difficult to reuse across applications due to their reduced flexibility. In this paper, we propose a fully configurable event-driven convolutional node with a rate saturation mechanism that can be used to implement arbitrary ConvNets on FPGAs. This node includes a convolutional processing unit and a routing element, allowing large 2D arrays to be built in which any multilayer structure can be implemented. The rate saturation mechanism emulates the refractory behavior of biological neurons, guaranteeing a minimum separation in time between consecutive events. A 4-layer ConvNet with 22 convolutional nodes trained for poker card symbol recognition has been implemented on a Spartan-6 FPGA. This network has been tested with a stimulus in which 40 poker cards were observed by a Dynamic Vision Sensor (DVS) in 1 s. Different slow-down factors were applied to characterize the behavior of the system for high-speed processing. For slow stimulus play-back, a 96% recognition rate is obtained with a power consumption of 0.85 mW. At maximum play-back speed, a traffic control mechanism downsamples the input stimulus, obtaining a recognition rate above 63% when less than 20% of the input events are processed, demonstrating the robustness of the network.
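The rate saturation mechanism amounts to refractory gating of the output event stream: an event is forwarded only if at least one refractory period has elapsed since the node last emitted one, which caps the maximum output rate. A minimal sketch (the period value is arbitrary):

    def make_refractory_gate(refractory_us=100):
        """Return a filter that suppresses events arriving within the
        refractory period of the last forwarded event."""
        last_emit = [-float("inf")]            # mutable closure state
        def gate(t_us):
            if t_us - last_emit[0] >= refractory_us:
                last_emit[0] = t_us
                return True                    # forward the event
            return False                       # too soon: suppress it
        return gate

    gate = make_refractory_gate(100)
    times = [0, 30, 120, 140, 260]
    print([t for t in times if gate(t)])       # [0, 120, 260]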
Affiliation(s)
- Luis A. Camuñas-Mesa
- Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC y Universidad de Sevilla, Sevilla, Spain
- Bernabé Linares-Barranco
- Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC y Universidad de Sevilla, Sevilla, Spain
68
Moradi S, Qiao N, Stefanini F, Indiveri G. A Scalable Multicore Architecture With Heterogeneous Memory Structures for Dynamic Neuromorphic Asynchronous Processors (DYNAPs). IEEE Trans Biomed Circuits Syst 2018; 12:106-122. [PMID: 29377800] [DOI: 10.1109/tbcas.2017.2759700]
Abstract
Neuromorphic computing systems comprise networks of neurons that use asynchronous events for both computation and communication. This type of representation offers several advantages in terms of bandwidth and power consumption in neuromorphic electronic systems. However, managing the traffic of asynchronous events in large-scale systems is a daunting task, in terms of both circuit complexity and memory requirements. Here, we present a novel routing methodology that employs both hierarchical and mesh routing strategies and combines heterogeneous memory structures for minimizing both memory requirements and latency, while maximizing programming flexibility to support a wide range of event-based neural network architectures, through parameter configuration. We validated the proposed scheme in a prototype multicore neuromorphic processor chip that employs hybrid analog/digital circuits for emulating synapse and neuron dynamics together with asynchronous digital circuits for managing the address-event traffic. We present a theoretical analysis of the proposed connectivity scheme, describe the methods and circuits used to implement such a scheme, and characterize the prototype chip. Finally, we demonstrate the use of the neuromorphic processor with a convolutional neural network for the real-time classification of visual symbols being flashed to a dynamic vision sensor (DVS) at high speed.
69
Rueckauer B, Lungu IA, Hu Y, Pfeiffer M, Liu SC. Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification. Front Neurosci 2017; 11:682. [PMID: 29375284] [PMCID: PMC5770641] [DOI: 10.3389/fnins.2017.00682]
Abstract
Spiking neural networks (SNNs) can potentially offer an efficient way of doing inference because the neurons in the networks are sparsely activated and computations are event-driven. Previous work showed that simple continuous-valued deep Convolutional Neural Networks (CNNs) can be converted into accurate spiking equivalents. These networks did not include certain common operations such as max-pooling, softmax, batch normalization, and Inception modules. This paper presents spiking equivalents of these operations, thereby allowing conversion of nearly arbitrary CNN architectures. We show conversion of popular CNN architectures, including VGG-16 and Inception-v3, into SNNs that produce the best results reported to date on MNIST, CIFAR-10, and the challenging ImageNet dataset. SNNs can trade off classification error rate against the number of available operations, whereas deep continuous-valued neural networks require a fixed number of operations to achieve their classification error rate. From the examples of LeNet for MNIST and BinaryNet for CIFAR-10, we show that with an increase in error rate of a few percentage points, the SNNs can achieve more than 2x reductions in operations compared to the original CNNs. This highlights the potential of SNNs, particularly when deployed on power-efficient neuromorphic spiking neuron chips, for use in embedded applications.
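A key ingredient in making such conversions accurate, in this line of work, is data-based weight normalization: each layer's weights are rescaled by the ratio of the maximum activation observed in the previous layer to that of the current layer, so converted neurons rarely need to exceed one spike per time step. The sketch below shows the basic per-layer rescaling; the robust percentile-based variant and the other spiking operations introduced in the paper are omitted.

    import numpy as np

    def normalize_layers(weights, activations):
        """Rescale layer l by max_act[l-1] / max_act[l] so that the
        normalized network's ReLU activations stay within [0, 1].
        weights[l]: (out, in) matrix; activations[l]: observed ReLU outputs."""
        prev_max = 1.0                       # inputs assumed scaled to [0, 1]
        normed = []
        for W, act in zip(weights, activations):
            cur_max = float(np.max(act))     # assumed > 0 on calibration data
            normed.append(W * prev_max / cur_max)
            prev_max = cur_max
        return normed

    rng = np.random.default_rng(0)
    Ws = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
    X = rng.random((100, 3))                 # calibration inputs in [0, 1]
    A1 = np.maximum(0, X @ Ws[0].T)
    A2 = np.maximum(0, A1 @ Ws[1].T)
    print([W.shape for W in normalize_layers(Ws, [A1, A2])])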
Affiliation(s)
- Bodo Rueckauer
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Iulia-Alexandra Lungu
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Yuhuang Hu
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Michael Pfeiffer
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland; Bosch Center for Artificial Intelligence, Renningen, Germany
- Shih-Chii Liu
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
70
Rebecq H, Gallego G, Mueggler E, Scaramuzza D. EMVS: Event-Based Multi-View Stereo—3D Reconstruction with an Event Camera in Real-Time. Int J Comput Vis 2017. [DOI: 10.1007/s11263-017-1050-6]
71
Lagorce X, Orchard G, Galluppi F, Shi BE, Benosman RB. HOTS: A Hierarchy of Event-Based Time-Surfaces for Pattern Recognition. IEEE Trans Pattern Anal Mach Intell 2017; 39:1346-1359. [PMID: 27411216] [DOI: 10.1109/tpami.2016.2574707]
Abstract
This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.
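One layer of the hierarchy can be sketched as follows: maintain per-pixel last-event times, cut an exponentially decayed patch around each incoming event (its time-surface context), and match that context to the nearest prototype, emitting a feature event labeled with the winning prototype's index. The prototypes below are random stand-ins for the clustered ones the paper learns, and boundary handling is ignored.

    import numpy as np

    H, W, R, TAU = 32, 32, 2, 0.02        # sensor size, patch radius, decay (s)
    last_t = np.zeros((H, W))             # per-pixel time of the last event
    prototypes = np.random.default_rng(0).random((8, 2 * R + 1, 2 * R + 1))

    def layer_event(x, y, t):
        """Process one event: update memory, build its time-surface context,
        and return the index of the best-matching prototype feature."""
        last_t[y, x] = t
        patch = last_t[y - R:y + R + 1, x - R:x + R + 1]   # local neighborhood
        context = np.exp(-(t - patch) / TAU)               # recent activity -> 1
        dists = [np.linalg.norm(context - p) for p in prototypes]
        return int(np.argmin(dists))                       # feature event label

    print(layer_event(10, 12, t=0.005))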
Affiliation(s)
- Xavier Lagorce
- Vision and Natural Computation Group, Institut National de la Santé et de la Recherche Médicale, Sorbonne Universités, Institut de la Vision, Université Paris 06, Paris, France
- Garrick Orchard
- Singapore Institute for Neurotechnology (SINAPSE), National University of Singapore, Singapore
- Francesco Galluppi
- Vision and Natural Computation Group, Institut National de la Santé et de la Recherche Médicale, Sorbonne Universités, Institut de la Vision, Université Paris 06, Paris, France
- Ryad B Benosman
- Vision and Natural Computation Group, Institut National de la Santé et de la Recherche Médicale, Sorbonne Universités, Institut de la Vision, Université Paris 06, Paris, France
72
Stromatias E, Soto M, Serrano-Gotarredona T, Linares-Barranco B. An Event-Driven Classifier for Spiking Neural Networks Fed with Synthetic or Dynamic Vision Sensor Data. Front Neurosci 2017; 11:350. [PMID: 28701911] [PMCID: PMC5487436] [DOI: 10.3389/fnins.2017.00350]
Abstract
This paper introduces a novel methodology for training an event-driven classifier within a Spiking Neural Network (SNN) system that yields good classification results when using both synthetic input data and real data captured from Dynamic Vision Sensor (DVS) chips. The proposed supervised method uses the spiking activity provided by an arbitrary topology of prior SNN layers to build histograms and train the classifier in the frame domain using the stochastic gradient descent algorithm. In addition, this approach can cope with leaky integrate-and-fire neuron models within the SNN, a desirable feature for real-world SNN applications, where neural activation must fade away after some time in the absence of inputs. Consequently, this way of building histograms captures the dynamics of spikes immediately before the classifier. We tested our method on the MNIST dataset using different synthetic encodings and on real DVS sensory datasets such as N-MNIST, MNIST-DVS, and Poker-DVS, using the same network topology and feature maps. We demonstrate the effectiveness of our approach by achieving the highest classification accuracy reported to date on the N-MNIST (97.77%) and Poker-DVS (100%) real DVS datasets with a spiking convolutional network. Moreover, using the proposed method we were able to retrain the output layer of a previously reported spiking neural network and increase its performance by 2%, suggesting that the proposed classifier can be used as the output layer in works where features are extracted using unsupervised spike-based learning methods. We also analyze SNN performance figures such as total event activity and network latencies, which are relevant for eventual hardware implementations. In summary, the paper combines unsupervised-trained SNNs with a supervised-trained SNN classifier and applies them to heterogeneous sets of benchmarks, both synthetic and from real DVS chips.
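Stated in code, the training procedure is very simple: accumulate the spike counts emitted by the last SNN layer for each sample into a histogram vector, then fit an ordinary softmax classifier on those histograms with stochastic gradient descent. The sketch below assumes the histograms have already been collected and fakes them with Poisson counts.

    import numpy as np

    def train_softmax(histograms, labels, n_classes, lr=0.1, epochs=50):
        """SGD on softmax regression over per-sample spike-count histograms."""
        rng = np.random.default_rng(0)
        W = np.zeros((n_classes, histograms.shape[1]))
        for _ in range(epochs):
            for i in rng.permutation(len(histograms)):
                h, y = histograms[i], labels[i]
                z = W @ h
                p = np.exp(z - z.max()); p /= p.sum()   # stable softmax
                grad = np.outer(p, h)
                grad[y] -= h                            # gradient of -log p_y
                W -= lr * grad
        return W

    rng = np.random.default_rng(1)
    X = rng.poisson(lam=[2.0, 8.0], size=(200, 2)).astype(float)  # fake counts
    y = (X[:, 1] > X[:, 0]).astype(int)
    W = train_softmax(X, y, n_classes=2)
    print((np.argmax(X @ W.T, axis=1) == y).mean())               # train accuracy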
Affiliation(s)
- Bernabé Linares-Barranco
- Instituto de Microelectrónica de Sevilla (CNM), Consejo Superior de Investigaciones Científicas (CSIC), Universidad de Sevilla, Sevilla, Spain
73
Broccard FD, Joshi S, Wang J, Cauwenberghs G. Neuromorphic neural interfaces: from neurophysiological inspiration to biohybrid coupling with nervous systems. J Neural Eng 2017; 14:041002. [PMID: 28573983] [DOI: 10.1088/1741-2552/aa67a9]
Abstract
OBJECTIVE Computation in nervous systems operates with different computational primitives, and on different hardware, than traditional digital computation, and is thus subject to different constraints from its digital counterpart regarding the use of physical resources such as time, space and energy. In an effort to better understand neural computation on a physical medium with similar spatiotemporal and energetic constraints, the field of neuromorphic engineering aims to design and implement electronic systems that emulate in very large-scale integration (VLSI) hardware the organization and functions of neural systems at multiple levels of biological organization, from individual neurons up to large circuits and networks. Mixed analog/digital neuromorphic VLSI systems are compact, consume little power and operate in real time independently of the size and complexity of the model. APPROACH This article highlights the current efforts to interface neuromorphic systems with neural systems at multiple levels of biological organization, from the synaptic to the system level, and discusses the prospects for future biohybrid systems with neuromorphic circuits of greater complexity. MAIN RESULTS Single silicon neurons have been interfaced successfully with invertebrate and vertebrate neural networks. This approach allowed the investigation of neural properties that are inaccessible with traditional techniques while providing a realistic biological context not achievable with traditional numerical modeling methods. At the network level, populations of neurons are envisioned to communicate bidirectionally with neuromorphic processors of hundreds or thousands of silicon neurons. Recent work on brain-machine interfaces suggests that this is feasible with current neuromorphic technology. SIGNIFICANCE Biohybrid interfaces between biological neurons and VLSI neuromorphic systems of varying complexity have started to emerge in the literature. Primarily intended as a computational tool for investigating fundamental questions related to neural dynamics, the sophistication of current neuromorphic systems now allows direct interfaces with large neuronal networks and circuits, resulting in potentially interesting clinical applications for neuroengineering systems, neuroprosthetics and neurorehabilitation.
Affiliation(s)
- Frédéric D Broccard
- Institute for Neural Computation, UC San Diego, United States of America; Department of Bioengineering, UC San Diego, United States of America
74
Sabatier Q, Ieng SH, Benosman R. Asynchronous Event-Based Fourier Analysis. IEEE Trans Image Process 2017; 26:2192-2202. [PMID: 28186889] [DOI: 10.1109/tip.2017.2661702]
Abstract
This paper introduces a method to compute the FFT of a visual scene at a high temporal precision, of around 1 μs, from the output of an asynchronous event-based camera. Event-based cameras make it possible to go beyond the widespread and ingrained belief that acquiring series of images at some rate is a good way to capture visual motion. Each pixel adapts its own sampling rate to the visual input it receives and defines the timing of its own sampling points by reacting to changes in the amount of incident light. As a consequence, the sampling process is no longer governed by a fixed timing source but by the signal to be sampled itself, or more precisely by the variations of the signal in the amplitude domain. This acquisition paradigm allows the FFT to be computed beyond the conventional frame-based method. The event-driven FFT algorithm relies on a heuristic methodology designed to operate directly on incoming gray-level events, updating the FFT incrementally while reducing both computation and data load. We show that for reasonable levels of approximation, at equivalent frame rates beyond the millisecond, the method performs faster and more efficiently than conventional image acquisition. Several experiments are carried out on indoor and outdoor scenes where both conventional and event-driven FFT computation are shown and compared.
75
Clady X, Maro JM, Barré S, Benosman RB. A Motion-Based Feature for Event-Based Pattern Recognition. Front Neurosci 2017; 10:594. [PMID: 28101001] [PMCID: PMC5209354] [DOI: 10.3389/fnins.2016.00594]
Abstract
This paper introduces an event-based, luminance-free feature computed from the output of asynchronous event-based neuromorphic retinas. The feature consists of mapping the distribution of the optical flow along the contours of the moving objects in the visual scene into a matrix. Asynchronous event-based neuromorphic retinas are composed of autonomous pixels, each asynchronously generating "spiking" events that encode relative changes in the pixel's illumination at high temporal resolution. The optical flow is computed at each event and is integrated locally or globally in a grid based on a speed-and-direction coordinate frame, using speed-tuned temporal kernels. The latter ensure that the resulting feature equitably represents the distribution of the normal motion along the current moving edges, whatever their respective dynamics. The usefulness and generality of the proposed feature are demonstrated in pattern recognition applications: local corner detection and global gesture recognition.
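A rough approximation of this descriptor is a 2D histogram over flow direction and speed, with each measurement weighted by a speed-tuned exponential decay so that fast and slow edges contribute comparably. The binning, the kernel, and every constant below are illustrative assumptions rather than the paper's exact construction.

    import numpy as np

    def flow_feature(flows, times, t_now, n_dir=8, n_speed=4, max_speed=2.0):
        """Accumulate per-event optical flow vectors (vx, vy) into a
        direction x speed matrix with a speed-tuned decay on event age."""
        M = np.zeros((n_dir, n_speed))
        for (vx, vy), t in zip(flows, times):
            speed = np.hypot(vx, vy)
            if speed == 0.0 or speed > max_speed:
                continue                       # discard degenerate flow
            d = int((np.arctan2(vy, vx) + np.pi) / (2 * np.pi) * n_dir) % n_dir
            s = min(int(speed / max_speed * n_speed), n_speed - 1)
            tau = 0.1 / speed                  # faster edges forget faster
            M[d, s] += np.exp(-(t_now - t) / tau)
        return M

    flows = [(0.5, 0.0), (0.0, 1.2), (0.4, 0.1)]
    times = [0.01, 0.02, 0.03]
    print(flow_feature(flows, times, t_now=0.04))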
Affiliation(s)
- Xavier Clady
- Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC University Paris 06, Paris, France
- Jean-Matthieu Maro
- Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC University Paris 06, Paris, France
- Sébastien Barré
- Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC University Paris 06, Paris, France
- Ryad B Benosman
- Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC University Paris 06, Paris, France
76
Wang H, Xu J, Gao Z, Lu C, Yao S, Ma J. An Event-Based Neurobiological Recognition System with Orientation Detector for Objects in Multiple Orientations. Front Neurosci 2016; 10:498. [PMID: 27867346] [PMCID: PMC5095131] [DOI: 10.3389/fnins.2016.00498]
Abstract
This paper proposes a new multiple-orientation event-based neurobiological recognition system that integrates recognition and tracking functions, intended for asynchronous address-event representation (AER) image sensors. The system can recognize objects in multiple orientations even though the training samples move in a single orientation only. It extracts multi-scale and multi-orientation line features inspired by models of the primate visual cortex. An orientation detector based on a modified Gaussian blob tracking algorithm is introduced for object tracking and orientation detection. The orientation detector and the feature extraction block work simultaneously, without any increase in categorization time. An address lookup table (address LUT) is also presented to adjust the feature maps by address mapping and reordering, after which they are categorized in the trained spiking neural network. The recognition system is evaluated on the MNIST dataset, which has played an important role in the development of computer vision, and its accuracy is increased by using both ON and OFF events. AER data acquired by a dynamic vision sensor (DVS), such as moving digits, poker cards, and vehicles, are also tested on the system. The experimental results show that the proposed system can realize event-based multi-orientation recognition. The work presented in this paper makes several contributions to event-based vision processing for multi-orientation object recognition: it develops a new tracking-recognition architecture for the feedforward categorization system and an address-reordering approach to classify multi-orientation objects using event-based data, and it provides a new way to recognize objects in multiple orientations with training samples in a single orientation only.
Collapse
Affiliation(s)
- Hanyu Wang
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Jiangtao Xu
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Zhiyuan Gao
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Chengye Lu
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Suying Yao
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Jianguo Ma
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| |
Collapse
|
77
|
Cohen GK, Orchard G, Ieng SH, Tapson J, Benosman RB, van Schaik A. Skimming Digits: Neuromorphic Classification of Spike-Encoded Images. Front Neurosci 2016; 10:184. [PMID: 27199646 PMCID: PMC4848313 DOI: 10.3389/fnins.2016.00184] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2016] [Accepted: 04/11/2016] [Indexed: 11/13/2022] Open
Abstract
The growing demands placed upon the field of computer vision have renewed the focus on alternative visual scene representations and processing paradigms. Silicon retinae provide an alternative means of imaging the visual environment and produce frame-free spatio-temporal data. This paper presents an investigation into event-based digit classification using N-MNIST, a neuromorphic dataset created with a silicon retina, and the Synaptic Kernel Inverse Method (SKIM), a learning method based on principles of dendritic computation. As this work represents the first large-scale, multi-class classification task performed with the SKIM network, it explores the different training patterns and output determination methods needed to extend the original SKIM method to multi-class problems. Applying SKIM networks to this real-world dataset, with the largest hidden-layer sizes and the largest number of simultaneously trained output neurons to date, the classification system achieved a best-case accuracy of 92.87% for a network containing 10,000 hidden-layer neurons. These results represent the highest accuracies achieved on the dataset to date and serve to validate the application of the SKIM method to event-based visual classification tasks. Additionally, the study found that using a square pulse as the supervisory training signal produced the highest accuracy for most output determination methods, while an exponential pattern is better suited to hardware implementations because it works with the simplest output determination method, based on the maximum value.
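SKIM belongs to the family of methods that fix the hidden layer and solve only the linear output weights analytically. As a rough, batch-mode illustration of that idea (the sizes, kernel shape, nonlinearity, and toy data below are all assumptions, not the paper's configuration), one might write:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 64, 1000, 10, 300    # illustrative sizes

# Toy input: a binary (n_in, T) spike raster.
spikes = (rng.random((n_in, T)) < 0.02).astype(float)

# An exponential synaptic kernel turns spike trains into continuous traces.
tau = 10.0
kernel = np.exp(-np.arange(50) / tau)
traces = np.apply_along_axis(lambda s: np.convolve(s, kernel)[:T], 1, spikes)

# Fixed random input weights plus a static nonlinearity play the role of the
# dendritic hidden layer; they are never trained.
W_in = rng.standard_normal((n_hidden, n_in)) / np.sqrt(n_in)
H = np.tanh(W_in @ traces)                      # (n_hidden, T)

# Supervisory target: a square pulse on the correct output channel (the
# pattern the paper found most accurate for most readout methods).
Y = np.zeros((n_out, T))
Y[3, 100:120] = 1.0                             # toy target for class 3

# Only the linear output weights are solved, via least squares: the
# "inverse" in Synaptic Kernel Inverse Method.
W_out = np.linalg.lstsq(H.T, Y.T, rcond=None)[0].T
predicted = int(np.argmax((W_out @ H).max(axis=1)))  # max-value readout
```

The max-value readout in the last line corresponds to the simplest of the output determination methods the abstract mentions; the square-pulse versus exponential target comparison would swap only the definition of Y.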
Collapse
Affiliation(s)
- Gregory K Cohen
- Biomedical Engineering and Neuroscience, The MARCS Institute, Western Sydney University, Sydney, NSW, Australia; Natural Vision and Computation Team, Vision Institute, University Pierre and Marie Curie-Centre National de la Recherche Scientifique, Paris, France
| | - Garrick Orchard
- Temasek Labs (TLAB), National University of Singapore, Singapore, Singapore; Neuromorphic Engineering and Robotics, Singapore Institute for Neurotechnology (SINAPSE), National University of Singapore, Singapore, Singapore
| | - Sio-Hoi Ieng
- Natural Vision and Computation Team, Vision Institute, University Pierre and Marie Curie-Centre National de la Recherche Scientifique, Paris, France
| | - Jonathan Tapson
- Biomedical Engineering and Neuroscience, The MARCS Institute, Western Sydney University, Sydney, NSW, Australia
| | - Ryad B Benosman
- Natural Vision and Computation Team, Vision Institute, University Pierre and Marie Curie-Centre National de la Recherche Scientifique, Paris, France
| | - André van Schaik
- Biomedical Engineering and Neuroscience, The MARCS Institute, Western Sydney University, Sydney, NSW, Australia
| |
Collapse
|
78
|
Reverter Valeiras D, Orchard G, Ieng SH, Benosman RB. Neuromorphic Event-Based 3D Pose Estimation. Front Neurosci 2016; 9:522. [PMID: 26834547 PMCID: PMC4722112 DOI: 10.3389/fnins.2015.00522] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2015] [Accepted: 12/24/2015] [Indexed: 11/13/2022] Open
Abstract
Pose estimation is a fundamental step in many artificial vision tasks. It consists of estimating the 3D pose of an object with respect to a camera from the object's 2D projection. Current state-of-the-art implementations operate on images and are computationally expensive, especially for real-time applications. Scenes with dynamics exceeding 30-60 Hz can rarely be processed in real time using conventional hardware. This paper presents a new method for event-based 3D object pose estimation that makes full use of the high temporal resolution (1 μs) of the asynchronous visual events output by a single neuromorphic camera. Given an initial estimate of the pose, each incoming event is used to update the pose by combining both 3D and 2D criteria. We show that the asynchronous, high temporal resolution of the neuromorphic camera allows the problem to be solved incrementally, achieving real-time performance at an update rate of several hundred kHz on a conventional laptop. We show that the high temporal resolution of neuromorphic cameras is a key feature for performing accurate pose estimation. Experiments on real data demonstrate the performance of the algorithm, including fast-moving objects, occlusions, and cases where the neuromorphic camera and the object are both in motion.
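The per-event update loop can be illustrated with a deliberately stripped-down sketch: each event is attributed to the closest projected model point, and the pose is nudged so that the projection moves toward the event. The version below handles translation only and uses an assumed gain and focal length; the paper's method combines full rigid 3D and 2D criteria, so treat this as a conceptual outline rather than the published algorithm.

```python
import numpy as np

def project(points_3d, pose_t, f=100.0):
    """Pinhole projection under a translation-only pose (rotation omitted
    to keep the sketch short)."""
    p = points_3d + pose_t
    return f * p[:, :2] / p[:, 2:3]

def event_update(pose_t, event_xy, model_3d, gain=0.05, f=100.0):
    """One incremental update per event: attribute the event to the closest
    projected model point, then nudge the pose so that point's projection
    moves toward the event (a damped 2D criterion)."""
    proj = project(model_3d, pose_t, f)
    i = int(np.argmin(np.sum((proj - event_xy) ** 2, axis=1)))
    residual = event_xy - proj[i]               # pixel-space error
    z = model_3d[i, 2] + pose_t[2]
    # Back-project the pixel residual into a small 3D translation step.
    step = gain * np.array([residual[0] * z / f, residual[1] * z / f, 0.0])
    return pose_t + step

model = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])  # toy edge points
pose = np.array([0.0, 0.0, 5.0])                              # initial guess
pose = event_update(pose, np.array([22.0, 1.0]), model)       # one event
```

Because each update is tiny and events arrive at microsecond resolution, many such damped corrections per millisecond can track fast motion, which is the intuition behind the several-hundred-kHz update rates reported.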
Collapse
Affiliation(s)
| | | | - Sio-Hoi Ieng
- Natural Vision and Computation Team, Institut de la Vision, Paris, France
| | - Ryad B Benosman
- Natural Vision and Computation Team, Institut de la Vision, Paris, France
| |
Collapse
|
79
|
Serrano-Gotarredona T, Linares-Barranco B. Poker-DVS and MNIST-DVS. Their History, How They Were Made, and Other Details. Front Neurosci 2015; 9:481. [PMID: 26733794 PMCID: PMC4686704 DOI: 10.3389/fnins.2015.00481] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2015] [Accepted: 11/30/2015] [Indexed: 11/20/2022] Open
Abstract
This article reports on two databases for event-driven object recognition using a Dynamic Vision Sensor (DVS). The first, which we call Poker-DVS and is being released together with this article, was obtained by browsing specially made poker card decks in front of a DVS camera for 2–4 s. Each card was visible for about 20–30 ms. The poker pips were tracked and isolated off-line to constitute the 131-recording Poker-DVS database. The second database, which we call MNIST-DVS and which was released in December 2013, consists of a set of 30,000 DVS camera recordings obtained by displaying 10,000 moving symbols from the standard MNIST 70,000-picture database on an LCD monitor for about 2–3 s each. Each of the 10,000 symbols was displayed at three different scales, so that event-driven object recognition algorithms could easily be tested for different object sizes. This article tells the story behind both databases, covering, among other aspects, details of how they work and the reasons for their creation. We provide not only the databases with corresponding scripts, but also the scripts and data used to generate the figures shown in this article (as Supplementary Material).
Collapse
Affiliation(s)
| | - Bernabé Linares-Barranco
- Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC and Universidad de Sevilla, Sevilla, Spain
| |
Collapse
|
80
|
Orchard G, Jayawant A, Cohen GK, Thakor N. Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades. Front Neurosci 2015; 9:437. [PMID: 26635513 PMCID: PMC4644806 DOI: 10.3389/fnins.2015.00437] [Citation(s) in RCA: 152] [Impact Index Per Article: 15.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2015] [Accepted: 10/30/2015] [Indexed: 11/13/2022] Open
Abstract
Creating datasets for Neuromorphic Vision is a challenging task. A lack of available recordings from Neuromorphic Vision sensors means that data must typically be recorded specifically for dataset creation, rather than collected and labeled from existing sources. The task is further complicated by a desire to simultaneously provide traditional frame-based recordings to allow for direct comparison with traditional Computer Vision algorithms. Here we propose a method for converting existing Computer Vision static image datasets into Neuromorphic Vision datasets using an actuated pan-tilt camera platform. Moving the sensor rather than the scene or image is a more biologically realistic approach to sensing, and it eliminates the timing artifacts introduced by monitor updates when motion is simulated on a computer monitor. We present conversions of two popular image datasets (MNIST and Caltech101), which have played important roles in the development of Computer Vision, and we provide performance metrics on these datasets using spike-based recognition algorithms. This work contributes datasets for future use in the field, as well as results from spike-based algorithms against which future works can compare. Furthermore, by converting datasets already popular in Computer Vision, we enable more direct comparison with frame-based approaches.
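The datasets themselves were recorded with a real DVS on the actuated pan-tilt platform described above, but a purely illustrative software analogue conveys the principle: translate a static image along a triangular three-saccade path and emit an event whenever a pixel's log intensity changes by more than a contrast threshold. Everything below (the path, the threshold, the 34×34 canvas) is an assumption for illustration, not the authors' recording pipeline.

```python
import numpy as np

def saccade_events(img, n_steps=30, threshold=0.15):
    """Toy software analogue: shift the image along a triangular three-saccade
    path and emit an event whenever a pixel's log intensity changes by more
    than a contrast threshold. (The real datasets were recorded with a DVS on
    an actuated pan-tilt platform, not simulated.)"""
    log_img = np.log1p(img.astype(float))
    corners = [(0, 0), (3, 3), (6, 0), (0, 0)]       # saccade endpoints in px
    events, ref, t = [], None, 0
    for (y0, x0), (y1, x1) in zip(corners, corners[1:]):
        for s in np.linspace(0.0, 1.0, n_steps):
            dy = int(round(y0 + s * (y1 - y0)))
            dx = int(round(x0 + s * (x1 - x0)))
            frame = np.roll(log_img, (dy, dx), axis=(0, 1))
            if ref is not None:
                diff = frame - ref
                ys, xs = np.where(np.abs(diff) > threshold)
                for y, x in zip(ys, xs):
                    events.append((x, y, t, 1 if diff[y, x] > 0 else 0))
            ref, t = frame, t + 1
    return events  # (x, y, timestamp, polarity) tuples

img = np.zeros((34, 34))                             # assumed 34x34 canvas
img[10:24, 10:24] = 255.0                            # toy "digit"
evts = saccade_events(img)
```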
Collapse
Affiliation(s)
- Garrick Orchard
- Singapore Institute for Neurotechnology (SINAPSE), National University of Singapore, Singapore, Singapore; Temasek Labs, National University of Singapore, Singapore, Singapore
| | - Ajinkya Jayawant
- Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India
| | - Gregory K Cohen
- MARCS Institute, University of Western Sydney, Penrith, NSW, Australia
| | - Nitish Thakor
- Singapore Institute for Neurotechnology (SINAPSE), National University of Singapore, Singapore, Singapore
| |
Collapse
|
81
|
Tan C, Lallee S, Orchard G. Benchmarking neuromorphic vision: lessons learnt from computer vision. Front Neurosci 2015; 9:374. [PMID: 26528120 PMCID: PMC4602133 DOI: 10.3389/fnins.2015.00374] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2015] [Accepted: 09/28/2015] [Indexed: 11/20/2022] Open
Abstract
Neuromorphic Vision sensors have improved greatly since the first silicon retina was presented almost three decades ago. They have recently matured to the point where they are commercially available and can be operated by laymen. However, despite improved availability of sensors, there remains a lack of good datasets, while algorithms for processing spike-based visual data are still in their infancy. On the other hand, frame-based computer vision algorithms are far more mature, thanks in part to widely accepted datasets which allow direct comparison between algorithms and encourage competition. We are presented with a unique opportunity to shape the development of Neuromorphic Vision benchmarks and challenges by leveraging what has been learnt from the use of datasets in frame-based computer vision. Taking advantage of this opportunity, in this paper we review the role that benchmarks and challenges have played in the advancement of frame-based computer vision, and suggest guidelines for the creation of Neuromorphic Vision benchmarks and challenges. We also discuss the unique challenges faced when benchmarking Neuromorphic Vision algorithms, particularly when attempting to provide direct comparison with frame-based computer vision.
Collapse
Affiliation(s)
- Cheston Tan
- Agency for Science, Technology, and Research (A*STAR), Institute for Infocomm Research, Singapore, Singapore
| | - Stephane Lallee
- Agency for Science, Technology, and Research (A*STAR), Institute for Infocomm Research, Singapore, Singapore
| | - Garrick Orchard
- Singapore Institute for Neurotechnology (SINAPSE), National University of Singapore, Singapore, Singapore; Temasek Labs, National University of Singapore, Singapore, Singapore
| |
Collapse
|