1
Elamin A, El-Rabbany A, Jacob S. Event-Based Visual/Inertial Odometry for UAV Indoor Navigation. Sensors (Basel, Switzerland) 2024; 25:61. PMID: 39796852; PMCID: PMC11722967; DOI: 10.3390/s25010061.
Abstract
Indoor navigation is becoming increasingly essential for multiple applications. It is complex and challenging due to dynamic scenes, limited space, and, more importantly, the unavailability of global navigation satellite system (GNSS) signals. Recently, new sensors have emerged, namely event cameras, which show great potential for indoor navigation due to their high dynamic range and low latency. In this study, an event-based visual-inertial odometry approach is proposed, emphasizing adaptive event accumulation and selective keyframe updates to reduce computational overhead. The proposed approach fuses events, standard frames, and inertial measurements for precise indoor navigation. Features are detected and tracked on the standard images. The events are accumulated into frames and used to track the features between the standard frames. Subsequently, the IMU measurements and the feature tracks are fused to continuously estimate the sensor states. The proposed approach is evaluated using both simulated and real-world datasets. Compared with the state-of-the-art U-SLAM algorithm, our approach achieves a substantial reduction in the mean positional error and RMSE in simulated environments, showing up to 50% and 47% reductions along the x- and y-axes, respectively. The approach achieves 5-10 ms latency per event batch and 10-20 ms for frame updates, demonstrating real-time performance on resource-constrained platforms. These results underscore the potential of our approach as a robust solution for real-world UAV indoor navigation scenarios.
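The central data-handling step described in this abstract, accumulating the events that arrive between two standard frames into an image-like array on which features can then be tracked, can be illustrated with a minimal sketch. This is a generic illustration rather than the authors' implementation; the structured-array field names (t, x, y, p) and the fixed time window are assumptions.

```python
import numpy as np

def accumulate_events(events, t_start, t_end, height, width):
    """Accumulate the polarity events falling between two standard frames
    into a single signed event frame (per-pixel sum of +1/-1 polarities).
    `events` is assumed to be a structured array with fields t, x, y, p."""
    frame = np.zeros((height, width), dtype=np.int32)
    batch = events[(events["t"] >= t_start) & (events["t"] < t_end)]
    np.add.at(frame, (batch["y"], batch["x"]), batch["p"])
    return frame
```

An adaptive variant, in the spirit of the paper's adaptive event accumulation, would close the window after a target event count instead of a fixed time span.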
Affiliation(s)
- Ahmed Elamin: Civil Engineering Department, Faculty of Engineering and Architectural Science, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada; Civil Engineering Department, Faculty of Engineering, Zagazig University, Zagazig 10162, Egypt
- Ahmed El-Rabbany: Civil Engineering Department, Faculty of Engineering and Architectural Science, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- Sunil Jacob: SOTI Aerospace, SOTI Inc., Mississauga, ON L5N 8L9, Canada
2
Ussa A, Rajen CS, Pulluri T, Singla D, Acharya J, Chuanrong GF, Basu A, Ramesh B. A Hybrid Neuromorphic Object Tracking and Classification Framework for Real-Time Systems. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:10726-10735. PMID: 37027553; DOI: 10.1109/tnnls.2023.3243679.
Abstract
Deep learning inference that needs to largely take place on the "edge" is a highly compute- and memory-intensive workload, making it intractable for low-power, embedded platforms such as mobile nodes and remote security applications. To address this challenge, this article proposes a real-time, hybrid neuromorphic framework for object tracking and classification using event-based cameras that possess desirable properties such as low power consumption (5-14 mW) and high dynamic range (120 dB). However, unlike traditional event-by-event processing approaches, this work uses a mixed frame and event approach to achieve energy savings with high performance. Using a frame-based region proposal method based on the density of foreground events, a hardware-friendly object tracking scheme is implemented using the apparent object velocity while tackling occlusion scenarios. The frame-based object track input is converted back to spikes for TrueNorth (TN) classification via the energy-efficient deep network (EEDN) pipeline. Using originally collected datasets, we train the TN model on the hardware track outputs, instead of using ground-truth object locations as commonly done, and demonstrate the ability of our system to handle practical surveillance scenarios. As an alternative tracker paradigm, we also propose a continuous-time tracker with a C++ implementation where each event is processed individually, which better exploits the low latency and asynchronous nature of neuromorphic vision sensors. Subsequently, we extensively compare the proposed methodologies to state-of-the-art event-based and frame-based methods for object tracking and classification, and demonstrate the use case of our neuromorphic approach for real-time and embedded applications without sacrificing performance. Finally, we also showcase the efficacy of the proposed neuromorphic system against a standard RGB camera setup when simultaneously evaluated over several hours of traffic recordings.
3
Grose M, Schmidt JD, Hirakawa K. Convolutional neural network for improved event-based Shack-Hartmann wavefront reconstruction. Applied Optics 2024; 63:E35-E47. PMID: 38856590; DOI: 10.1364/ao.520652.
Abstract
Shack-Hartmann wavefront sensing is a technique for measuring wavefront aberrations, whose use in adaptive optics relies on fast position tracking of an array of spots. These sensors conventionally use frame-based cameras operating at a fixed sampling rate to report pixel intensities, even though only a fraction of the pixels have signal. Prior in-lab experiments have shown feasibility of event-based cameras for Shack-Hartmann wavefront sensing (SHWFS), asynchronously reporting the spot locations as log intensity changes at a microsecond time scale. In our work, we propose a convolutional neural network (CNN) called event-based wavefront network (EBWFNet) that achieves highly accurate estimation of the spot centroid position in real time. We developed a custom Shack-Hartmann wavefront sensing hardware with a common aperture for the synchronized frame- and event-based cameras so that spot centroid locations computed from the frame-based camera may be used to train/test the event-CNN-based centroid position estimation method in an unsupervised manner. Field testing with this hardware allows us to conclude that the proposed EBWFNet achieves sub-pixel accuracy in real-world scenarios with substantial improvement over the state-of-the-art event-based SHWFS. An ablation study reveals the impact of data processing, CNN components, and training cost function; and an unoptimized MATLAB implementation is shown to run faster than 800 Hz on a single GPU.
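As background to the abstract above, the non-learned baseline that event-based Shack-Hartmann sensing starts from, estimating each subaperture's spot centroid from a short slice of events, can be sketched as follows. The subaperture cell size, gridding, and field names are assumptions; the paper's EBWFNet replaces this plain averaging with a trained CNN estimator.

```python
import numpy as np

def spot_centroids(events, rows, cols, cell=16):
    """Per-subaperture spot centroids from one slice of events.
    Each event is assigned to the lenslet cell containing it, and the
    centroid is the mean event coordinate within that cell (in pixels)."""
    cx = np.zeros((rows, cols)); cy = np.zeros((rows, cols)); n = np.zeros((rows, cols))
    r = np.clip(events["y"] // cell, 0, rows - 1).astype(int)
    c = np.clip(events["x"] // cell, 0, cols - 1).astype(int)
    np.add.at(cx, (r, c), events["x"])
    np.add.at(cy, (r, c), events["y"])
    np.add.at(n, (r, c), 1)
    n[n == 0] = 1  # leave empty subapertures at (0, 0) rather than divide by zero
    return cx / n, cy / n
```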
4
Lesage X, Tran R, Mancini S, Fesquet L. Velocity and Color Estimation Using Event-Based Clustering. Sensors (Basel, Switzerland) 2023; 23:9768. PMID: 38139614; PMCID: PMC10747939; DOI: 10.3390/s23249768.
Abstract
Event-based clustering provides a low-power embedded solution for low-level feature extraction in a scene. The algorithm utilizes the non-uniform sampling capability of event-based image sensors to measure local intensity variations within a scene. Consequently, the clustering algorithm forms similar event groups while simultaneously estimating their attributes. This work proposes taking advantage of additional event information in order to provide new attributes for further processing. We elaborate on the estimation of the object velocity using the mean motion of the cluster. Next, we examine a novel form of events that includes an intensity measurement of the color at the corresponding pixel. These events may be processed to estimate the rough color of a cluster, or the color distribution within a cluster. Lastly, this paper presents some applications that utilize these features. The resulting algorithms are applied and exercised using a custom event-based simulator that generates videos of outdoor scenes. The velocity estimation methods provide satisfactory results with a trade-off between accuracy and convergence speed. Regarding color estimation, luminance proves challenging to estimate in the test cases, while chrominance is estimated precisely. The estimated quantities are adequate for accurately classifying objects into predefined categories.
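A minimal sketch of the kind of per-cluster velocity estimate the abstract describes, the mean motion of the cluster centroid smoothed exponentially as events arrive, is shown below. The smoothing factor and the exact update rule are illustrative assumptions, not the authors' formulation.

```python
class EventCluster:
    """Running centroid and velocity estimate for one event cluster.
    Velocity is the exponentially smoothed motion of the centroid."""
    def __init__(self, x, y, t, alpha=0.05):
        self.cx, self.cy, self.t = float(x), float(y), float(t)
        self.vx = self.vy = 0.0
        self.alpha = alpha  # smoothing factor (assumed; tune per scene)

    def update(self, x, y, t):
        dt = max(t - self.t, 1e-6)
        new_cx = (1 - self.alpha) * self.cx + self.alpha * x
        new_cy = (1 - self.alpha) * self.cy + self.alpha * y
        # mean motion of the centroid, smoothed with the same factor
        self.vx = (1 - self.alpha) * self.vx + self.alpha * (new_cx - self.cx) / dt
        self.vy = (1 - self.alpha) * self.vy + self.alpha * (new_cy - self.cy) / dt
        self.cx, self.cy, self.t = new_cx, new_cy, t
```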
Affiliation(s)
- Xavier Lesage: Univ. Grenoble Alpes, CNRS (National Centre for Scientific Research), Grenoble INP (Institute of Engineering), TIMA (Techniques of Informatics and Microelectronics for Integrated Systems Architecture), F-38000 Grenoble, France; Orioma, F-38430 Moirans, France
- Rosalie Tran: Univ. Grenoble Alpes, CNRS, Grenoble INP, TIMA, F-38000 Grenoble, France
- Stéphane Mancini: Univ. Grenoble Alpes, CNRS, Grenoble INP, TIMA, F-38000 Grenoble, France
- Laurent Fesquet: Univ. Grenoble Alpes, CNRS, Grenoble INP, TIMA, F-38000 Grenoble, France
5
Ji M, Wang Z, Yan R, Liu Q, Xu S, Tang H. SCTN: Event-based object tracking with energy-efficient deep convolutional spiking neural networks. Front Neurosci 2023; 17:1123698. PMID: 36875665; PMCID: PMC9978206; DOI: 10.3389/fnins.2023.1123698.
Abstract
Event cameras are asynchronous and neuromorphically inspired visual sensors, which have shown great potential in object tracking because they can easily detect moving objects. Since event cameras output discrete events, they are inherently suited to Spiking Neural Networks (SNNs), which offer event-driven computation and energy-efficient computing. In this paper, we tackle the problem of event-based object tracking with a novel architecture built on a discriminatively trained SNN, called the Spiking Convolutional Tracking Network (SCTN). Taking a segment of events as input, SCTN not only better exploits implicit associations among events than event-wise processing, but also fully utilizes precise temporal information and maintains a sparse representation in segments instead of frames. To make SCTN more suitable for object tracking, we propose a new loss function that introduces an exponential Intersection over Union (IoU) in the voltage domain. To the best of our knowledge, this is the first tracking network directly trained with an SNN. In addition, we present a new event-based tracking dataset, dubbed DVSOT21. Experimental results on DVSOT21 demonstrate that our method achieves performance competitive with ANN-based trackers at a much lower energy consumption. With such low energy consumption, tracking on neuromorphic hardware will reveal its advantage.
Affiliation(s)
- Mingcheng Ji: College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Ziling Wang: College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Rui Yan: College of Computer Science, Zhejiang University of Technology, Hangzhou, China
- Qingjie Liu: Machine Intelligence Laboratory, China Nanhu Academy of Electronics and Information Technology, Jiaxing, China
- Shu Xu: Machine Intelligence Laboratory, China Nanhu Academy of Electronics and Information Technology, Jiaxing, China
- Huajin Tang: College of Computer Science and Technology, Zhejiang University, Hangzhou, China; Zhejiang Lab, Hangzhou, China
6
Zheng Y, Yu Z, Wang S, Huang T. Spike-Based Motion Estimation for Object Tracking Through Bio-Inspired Unsupervised Learning. IEEE Transactions on Image Processing 2022; 32:335-349. PMID: 37015554; DOI: 10.1109/tip.2022.3228168.
Abstract
Neuromorphic vision sensors, whose pixels output events/spikes asynchronously with a high temporal resolution according to the scene radiance change, are naturally appropriate for capturing high-speed motion in the scenes. However, how to utilize the events/spikes to smoothly track high-speed moving objects is still a challenging problem. Existing approaches either employ time-consuming iterative optimization, or require large amounts of labeled data to train the object detector. To this end, we propose a bio-inspired unsupervised learning framework, which takes advantage of the spatiotemporal information of events/spikes generated by neuromorphic vision sensors to capture the intrinsic motion patterns. Without offline training, our models can filter redundant signals with a dynamic adaptation module based on short-term plasticity, and extract motion patterns with a motion estimation module based on spike-timing-dependent plasticity. Combined with the spatiotemporal and motion information of the filtered spike stream, the traditional DBSCAN clustering algorithm and Kalman filter can effectively track multiple targets in extreme scenes. We evaluate the proposed unsupervised framework for object detection and tracking tasks on synthetic data, publicly available event-based datasets, and spiking camera datasets. The experimental results show that the proposed model can robustly detect and smoothly track the moving targets in various challenging scenarios and outperforms state-of-the-art approaches.
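A rough sketch of the final tracking stage mentioned above, DBSCAN clustering of an already-filtered event slice followed by a constant-velocity Kalman filter per target, is given below. It deliberately omits the paper's plasticity-based filtering and motion-estimation modules; parameter values, field names, and the data-association step are assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def detect_objects(events, eps=5.0, min_samples=20, time_scale=1e-3):
    """Cluster a slice of filtered events in (x, y, scaled t) space with DBSCAN
    and return one centroid measurement per detected object (noise label -1 dropped)."""
    pts = np.column_stack([events["x"], events["y"], events["t"] / time_scale])
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts)
    return [pts[labels == k, :2].mean(axis=0) for k in set(labels) if k != -1]

class ConstantVelocityKF:
    """Minimal constant-velocity Kalman filter for one tracked centroid."""
    def __init__(self, x, y, dt, q=1.0, r=2.0):
        self.s = np.array([x, y, 0.0, 0.0])            # state: x, y, vx, vy
        self.P = np.eye(4) * 10.0
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
        self.Q, self.R = np.eye(4) * q, np.eye(2) * r

    def step(self, z):
        # predict
        self.s = self.F @ self.s
        self.P = self.F @ self.P @ self.F.T + self.Q
        # update with centroid measurement z = (x, y)
        K = self.P @ self.H.T @ np.linalg.inv(self.H @ self.P @ self.H.T + self.R)
        self.s = self.s + K @ (np.asarray(z) - self.H @ self.s)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.s[:2]
```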
7
Ralph N, Joubert D, Jolley A, Afshar S, Tothill N, van Schaik A, Cohen G. Real-Time Event-Based Unsupervised Feature Consolidation and Tracking for Space Situational Awareness. Front Neurosci 2022; 16:821157. PMID: 35600627; PMCID: PMC9120364; DOI: 10.3389/fnins.2022.821157.
Abstract
Earth orbit is a limited natural resource that hosts a vast range of vital space-based systems that support the international community's national, commercial and defence interests. This resource is rapidly becoming depleted with over-crowding in high-demand orbital slots and a growing presence of space debris. We propose the Fast Iterative Extraction of Salient targets for Tracking Asynchronously (FIESTA) algorithm as a robust, real-time and reactive approach to optical Space Situational Awareness (SSA) using Event-Based Cameras (EBCs) to detect, localize, and track Resident Space Objects (RSOs) accurately and in a timely manner. We address the challenges posed by the asynchronous nature and high temporal resolution of the EBC output accurately, without supervision, and with few tunable parameters, using concepts established in the neuromorphic and conventional tracking literature. We show that this algorithm is capable of highly accurate in-frame RSO velocity estimation and average sub-pixel localization in a simulated test environment, which allows the capabilities of the EBC and optical setup to be distinguished from those of the proposed tracking system. This work is a fundamental step toward accurate end-to-end real-time optical event-based SSA, and develops the foundation for robust closed-form tracking evaluated using standardized tracking metrics.
Affiliation(s)
- Nicholas Ralph (corresponding author): International Centre for Neuromorphic Engineering, MARCS Institute for Brain Behaviour and Development, Western Sydney University, Werrington, NSW, Australia
- Damien Joubert: International Centre for Neuromorphic Engineering, MARCS Institute for Brain Behaviour and Development, Western Sydney University, Werrington, NSW, Australia
- Andrew Jolley: International Centre for Neuromorphic Engineering, MARCS Institute for Brain Behaviour and Development, Western Sydney University, Werrington, NSW, Australia; Air and Space Power Development Centre, Royal Australian Air Force, Canberra, ACT, Australia
- Saeed Afshar: International Centre for Neuromorphic Engineering, MARCS Institute for Brain Behaviour and Development, Western Sydney University, Werrington, NSW, Australia
- Nicholas Tothill: International Centre for Neuromorphic Engineering, MARCS Institute for Brain Behaviour and Development, Western Sydney University, Werrington, NSW, Australia
- André van Schaik: International Centre for Neuromorphic Engineering, MARCS Institute for Brain Behaviour and Development, Western Sydney University, Werrington, NSW, Australia
- Gregory Cohen: International Centre for Neuromorphic Engineering, MARCS Institute for Brain Behaviour and Development, Western Sydney University, Werrington, NSW, Australia
8
Gallego G, Delbruck T, Orchard G, Bartolozzi C, Taba B, Censi A, Leutenegger S, Davison AJ, Conradt J, Daniilidis K, Scaramuzza D. Event-Based Vision: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:154-180. PMID: 32750812; DOI: 10.1109/tpami.2020.3008413.
Abstract
Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of μs), very high dynamic range (140 dB versus 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as those requiring low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
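As the survey explains, each event encodes the time, pixel location, and sign (polarity) of a log-brightness change exceeding a contrast threshold. The sketch below is an idealized, frame-difference approximation of that generative model, useful for orientation only: it emits at most one event per pixel per frame pair and ignores the truly asynchronous timing of real sensors.

```python
import numpy as np

def events_from_log_frames(logI_prev, logI_curr, t_curr, C=0.2):
    """Idealized event generation: a pixel emits an event when its log-intensity
    change exceeds the contrast threshold C; the polarity is the sign of the change.
    Returns a record array of (t, x, y, p) tuples, the standard event-camera format."""
    d = logI_curr - logI_prev
    ys, xs = np.nonzero(np.abs(d) >= C)
    pol = np.sign(d[ys, xs]).astype(np.int8)
    t = np.full(xs.shape, t_curr, dtype=np.float64)
    return np.rec.fromarrays([t, xs, ys, pol], names="t,x,y,p")
```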
9
10
Tayarani-Najaran MH, Schmuker M. Event-Based Sensing and Signal Processing in the Visual, Auditory, and Olfactory Domain: A Review. Front Neural Circuits 2021; 15:610446. PMID: 34135736; PMCID: PMC8203204; DOI: 10.3389/fncir.2021.610446.
Abstract
The nervous system converts the physical quantities sensed by its primary receptors into trains of events that are then processed in the brain. This unmatched efficiency in information processing has long inspired engineers to seek brain-like approaches to sensing and signal processing. The key principle pursued in neuromorphic sensing is to shed the traditional approach of periodic sampling in favor of an event-driven scheme that mimics sampling as it occurs in the nervous system, where events are preferably emitted upon the change of the sensed stimulus. In this paper we highlight the advantages and challenges of event-based sensing and signal processing in the visual, auditory and olfactory domains. We also provide a survey of the literature covering neuromorphic sensing and signal processing in all three modalities. Our aim is to facilitate research in event-based sensing and signal processing by providing a comprehensive overview of the research performed previously, as well as by highlighting conceptual advantages, current progress and future challenges in the field.
Affiliation(s)
- Michael Schmuker: School of Physics, Engineering and Computer Science, University of Hertfordshire, Hatfield, United Kingdom
11
Seeing through Events: Real-Time Moving Object Sonification for Visually Impaired People Using Event-Based Camera. Sensors (Basel, Switzerland) 2021; 21:3558. PMID: 34065360; PMCID: PMC8161033; DOI: 10.3390/s21103558.
Abstract
Scene sonification is a powerful technique to help Visually Impaired People (VIP) understand their surroundings. Existing methods usually perform sonification on the entire image of the surrounding scene acquired by a standard camera, or on a priori static obstacles extracted by image processing algorithms from the RGB image of the surrounding scene. However, if all the information in the scene is delivered to VIP simultaneously, it causes information redundancy. In fact, biological vision is more sensitive to moving objects in the scene than to static objects, which is also the original intention behind the event-based camera. In this paper, we propose a real-time sonification framework to help VIP understand the moving objects in the scene. First, we capture the events in the scene using an event-based camera and cluster them into multiple moving objects without relying on any prior knowledge. Then, sonification based on MIDI is enabled on these objects synchronously. Finally, we conduct comprehensive experiments on the scene video with sonification audio, attended by 20 VIP and 20 Sighted People (SP). The results show that our method allows both groups of participants to clearly distinguish the number, size, motion speed, and motion trajectories of multiple objects, and that it is more comfortable to listen to than existing methods in terms of aesthetics.
12
Jiang R, Wang Q, Shi S, Mou X, Chen S. Flow-assisted visual tracking using event cameras. CAAI Transactions on Intelligence Technology 2021. DOI: 10.1049/cit2.12005.
Affiliation(s)
- Rui Jiang: CelePixel Technology Co. Ltd, 71 Nanyang Drive, Singapore 638075
- Qinyi Wang: CelePixel Technology Co. Ltd, 71 Nanyang Drive, Singapore 638075; School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798
- Shunshun Shi: CelePixel Technology Co. Ltd, 71 Nanyang Drive, Singapore 638075
- Xiaozheng Mou: CelePixel Technology Co. Ltd, 71 Nanyang Drive, Singapore 638075
- Shoushun Chen: CelePixel Technology Co. Ltd, 71 Nanyang Drive, Singapore 638075; School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798
13
Review on Vehicle Detection Technology for Unmanned Ground Vehicles. Sensors (Basel, Switzerland) 2021; 21:1354. PMID: 33672976; PMCID: PMC7918767; DOI: 10.3390/s21041354.
Abstract
Unmanned ground vehicles (UGVs) have great potential in both civilian and military applications, and have become a focus of research in many countries. Environmental perception technology is the foundation of UGVs and is of great significance for achieving safer and more efficient performance. This article first introduces commonly used sensors for vehicle detection, lists their application scenarios, and compares the strengths and weaknesses of the different sensors. Secondly, related work on vehicle detection, one of the most important aspects of environmental perception technology, is reviewed and compared in detail for each type of sensor. Thirdly, several simulation platforms related to UGVs are presented to facilitate simulation testing of vehicle detection algorithms. In addition, several UGV-related datasets are summarized to support the verification of vehicle detection algorithms in practical applications. Finally, promising research topics in the future study of vehicle detection technology for UGVs are discussed in detail.
14
Kong F, Lambert A, Joubert D, Cohen G. Shack-Hartmann wavefront sensing using spatial-temporal data from an event-based image sensor. Optics Express 2020; 28:36159-36175. PMID: 33379717; DOI: 10.1364/oe.409682.
Abstract
An event-based image sensor works dramatically differently from conventional frame-based image sensors: it responds only to local brightness changes, whereas its counterparts' output is a linear representation of the illumination over a fixed exposure time. The output of an event-based image sensor is therefore an asynchronous stream of spatio-temporal event data tagged with the location, timestamp and polarity of the triggered events. Compared to traditional frame-based image sensors, event-based image sensors have the advantages of high temporal resolution, low latency, high dynamic range and low power consumption. Although event-based image sensors have been used in many computer vision, navigation and even space situational awareness applications, little work has been done to explore their applicability in the field of wavefront sensing. In this work, we present the integration of an event camera in a Shack-Hartmann wavefront sensor and the use of event data to determine spot displacement and estimate the wavefront. We show that it can achieve the same functionality at substantially higher speed and can operate in extremely low-light conditions. This makes an event-based Shack-Hartmann wavefront sensor a preferable choice for adaptive optics systems where the light budget is limited or high bandwidth is required.
15
Ramesh B, Yang H, Orchard G, Le Thi NA, Zhang S, Xiang C. DART: Distribution Aware Retinal Transform for Event-Based Cameras. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020; 42:2767-2780. PMID: 31144625; DOI: 10.1109/tpami.2019.2919301.
Abstract
We introduce a generic visual descriptor, termed the distribution-aware retinal transform (DART), that encodes the structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-words classification framework and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101); (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) Statistical bootstrapping is leveraged with online learning for overcoming the low-sample problem during the one-shot learning of the tracker, (ii) Cyclical shifts are induced in the log-polar domain of the DART descriptor to achieve robustness to object scale and rotation variations; (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker to result in a high intersection-over-union score with augmented ground truth annotations on the publicly available event camera dataset; (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain.
16
Kirkland P, Di Caterina G, Soraghan J, Matich G. Perception Understanding Action: Adding Understanding to the Perception Action Cycle With Spiking Segmentation. Front Neurorobot 2020; 14:568319. PMID: 33192434; PMCID: PMC7604290; DOI: 10.3389/fnbot.2020.568319.
Abstract
Traditionally, the Perception Action cycle is the first stage of building an autonomous robotic system and a practical way to implement a low-latency reactive system within a low Size, Weight and Power (SWaP) package. However, within complex scenarios, this method can lack contextual understanding of the scene, such as object recognition-based tracking or system attention. Object detection, identification and tracking, along with semantic segmentation and attention, are all modern computer vision tasks in which Convolutional Neural Networks (CNNs) have shown significant success, although such networks often have a large computational overhead and power requirements, which are not ideal for smaller robotics tasks. Furthermore, cloud computing and massively parallel processing, as in Graphics Processing Units (GPUs), are outside the specification of many tasks due to their respective latency and SWaP constraints. In response to this, Spiking Convolutional Neural Networks (SCNNs) look to provide the feature extraction benefits of CNNs, while maintaining low latency and power overhead thanks to their asynchronous spiking event-based processing. A novel Neuromorphic Perception Understanding Action (PUA) system is presented, which aims to combine the feature extraction benefits of CNNs with the low-latency processing of SCNNs. The PUA utilizes a Neuromorphic Vision Sensor for Perception that facilitates asynchronous processing within a Spiking fully Convolutional Neural Network (SpikeCNN) to provide semantic segmentation and Understanding of the scene. The output is fed to a spiking control system providing Actions. With this approach, the aim is to bring features of deep learning into the lower levels of autonomous robotics, while maintaining a biologically plausible STDP rule throughout the learned encoding part of the network. The network is shown to provide more robust and predictable management of spiking activity with an improved thresholding response. The reported experiments show that this system can deliver robust results of over 96% accuracy and 81% Intersection over Union, ensuring such a system can be successfully used within object recognition, classification and tracking problems. This demonstrates that the attention of the system can be tracked accurately, while the asynchronous processing means the controller can give precise track updates with minimal latency.
Affiliation(s)
- Paul Kirkland: Neuromorphic Sensor Signal Processing Lab, Centre for Image and Signal Processing, Electrical and Electronic Engineering, University of Strathclyde, Glasgow, United Kingdom
- Gaetano Di Caterina: Neuromorphic Sensor Signal Processing Lab, Centre for Image and Signal Processing, Electrical and Electronic Engineering, University of Strathclyde, Glasgow, United Kingdom
- John Soraghan: Neuromorphic Sensor Signal Processing Lab, Centre for Image and Signal Processing, Electrical and Electronic Engineering, University of Strathclyde, Glasgow, United Kingdom
17
Lenz G, Ieng SH, Benosman R. Event-Based Face Detection and Tracking Using the Dynamics of Eye Blinks. Front Neurosci 2020; 14:587. PMID: 32848527; PMCID: PMC7397845; DOI: 10.3389/fnins.2020.00587.
Abstract
We present the first purely event-based method for face detection, using the high temporal resolution of an event-based camera to detect the presence of a face in a scene from eye blinks. Eye blinks are a unique and stable natural dynamic temporal signature of human faces across the population that can be fully captured by event-based sensors. We show that eye blinks have a unique temporal signature over time that can be easily detected by correlating the acquired local activity with a generic temporal model of eye blinks generated from a wide population of users. In a second stage, once a face has been located, it becomes possible to apply a probabilistic framework to track its spatial location for each incoming event, while using eye blinks to correct for drift and tracking errors. Results are shown for several indoor and outdoor experiments. We also release an annotated data set that can be used for future work on the topic.
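The blink-detection idea above, correlating local event activity with a generic temporal blink model, amounts to a normalized cross-correlation between an activity trace and a template. The sketch below assumes the activity has already been binned per time step for a candidate eye region; the binning and thresholding details are assumptions, not the authors' exact procedure.

```python
import numpy as np

def blink_scores(activity, template):
    """Normalized cross-correlation between an event-activity trace
    (events per time bin in a candidate eye region) and a blink template.
    A peak above a chosen threshold flags a probable blink.
    Requires len(activity) >= len(template)."""
    a = (activity - activity.mean()) / (activity.std() + 1e-9)
    b = (template - template.mean()) / (template.std() + 1e-9)
    return np.correlate(a, b, mode="valid") / len(b)
```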
Affiliation(s)
- Gregor Lenz: INSERM UMRI S 968, Sorbonne Université, UPMC Univ. Paris, UMRS 968, Paris, France; CNRS, UMR 7210, Institut de la Vision, Paris, France
- Sio-Hoi Ieng: INSERM UMRI S 968, Sorbonne Université, UPMC Univ. Paris, UMRS 968, Paris, France; CNRS, UMR 7210, Institut de la Vision, Paris, France
- Ryad Benosman: INSERM UMRI S 968, Sorbonne Université, UPMC Univ. Paris, UMRS 968, Paris, France; CNRS, UMR 7210, Institut de la Vision, Paris, France; Departments of Ophthalmology/ECE/BioE, University of Pittsburgh, Pittsburgh, PA, United States; Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, United States
18
Marcireau A, Ieng SH, Benosman R. Sepia, Tarsier, and Chameleon: A Modular C++ Framework for Event-Based Computer Vision. Front Neurosci 2020; 13:1338. PMID: 31969799; PMCID: PMC6960268; DOI: 10.3389/fnins.2019.01338.
Abstract
This paper introduces a new open-source, header-only and modular C++ framework to facilitate the implementation of event-driven algorithms. The framework relies on three independent components: sepia (file IO), tarsier (algorithms), and chameleon (display). Our benchmarks show that algorithms implemented with tarsier are faster and have lower latency than identical implementations in other state-of-the-art frameworks, thanks to static polymorphism (compile-time pipeline assembly). The observer pattern used throughout the framework encourages implementations that better reflect the event-driven nature of the algorithms and the way they process events, easing future translation to neuromorphic hardware. The framework integrates drivers to communicate with the DVS, the DAVIS, the Opal Kelly ATIS, and the CCam ATIS.
Affiliation(s)
- Alexandre Marcireau: INSERM UMRI S 968, Sorbonne Universites, UPMC Univ Paris 06, UMR S 968, CNRS, UMR 7210, Institut de la Vision, Paris, France
- Sio-Hoi Ieng: INSERM UMRI S 968, Sorbonne Universites, UPMC Univ Paris 06, UMR S 968, CNRS, UMR 7210, Institut de la Vision, Paris, France
- Ryad Benosman: INSERM UMRI S 968, Sorbonne Universites, UPMC Univ Paris 06, UMR S 968, CNRS, UMR 7210, Institut de la Vision, Paris, France; University of Pittsburgh Medical Center, Pittsburgh, PA, United States; Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, United States
19
Sehara K, Bahr V, Mitchinson B, Pearson MJ, Larkum ME, Sachdev RNS. Fast, Flexible Closed-Loop Feedback: Tracking Movement in "Real-Millisecond-Time". eNeuro 2019; 6:ENEURO.0147-19.2019. PMID: 31611334; PMCID: PMC6825957; DOI: 10.1523/eneuro.0147-19.2019.
Abstract
One of the principal functions of the brain is to control movement and rapidly adapt behavior to a changing external environment. Over the last decades, our ability to monitor activity in the brain and manipulate it, while also manipulating the environment the animal moves through, has grown increasingly sophisticated. However, our ability to track the movement of the animal in real time has not kept pace. Here, we use a dynamic vision sensor (DVS) based event-driven neuromorphic camera system to implement real-time, low-latency tracking of a single whisker that mice can move at ∼25 Hz. The customized DVS system described here converts whisker motion into a series of events that can be used to estimate the position of the whisker and to trigger a position-based output interactively within 2 ms. This neuromorphic chip-based closed-loop system provides feedback rapidly and flexibly. With this system, it becomes possible to use the movement of whiskers, or in principle the movement of any part of the body, to reward or punish in a rapidly reconfigurable way. These methods can be used to manipulate behavior and the neural circuits that help animals adapt to the changing values of a sequence of motor actions.
Affiliation(s)
- Keisuke Sehara: Institute of Biology, Humboldt University of Berlin, D-10117 Berlin, Germany
- Ben Mitchinson: Department of Computer Science, University of Sheffield, Sheffield, S10 2TP, United Kingdom
- Martin J Pearson: Bristol Robotics Laboratory, University of Bristol and University of the West of England, Bristol, BS16 1QY, United Kingdom
- Matthew E Larkum: Institute of Biology, Humboldt University of Berlin, D-10117 Berlin, Germany
- Robert N S Sachdev: Institute of Biology, Humboldt University of Berlin, D-10117 Berlin, Germany
20
Li H, Shi L. Robust Event-Based Object Tracking Combining Correlation Filter and CNN Representation. Front Neurorobot 2019; 13:82. PMID: 31649524; PMCID: PMC6795673; DOI: 10.3389/fnbot.2019.00082.
Abstract
Object tracking based on the event-based camera or dynamic vision sensor (DVS) remains a challenging task due to the noise events, rapid change of event-stream shape, chaos of complex background textures, and occlusion. To address the challenges, this paper presents a robust event-stream object tracking method based on correlation filter mechanism and convolutional neural network (CNN) representation. In the proposed method, rate coding is used to encode the event-stream object. Feature representations from hierarchical convolutional layers of a pre-trained CNN are used to represent the appearance of the rate encoded event-stream object. Results prove that the proposed method not only achieves good tracking performance in many complicated scenes with noise events, complex background textures, occlusion, and intersected trajectories, but also is robust to variable scale, variable pose, and non-rigid deformations. In addition, the correlation filter-based method has the advantage of high speed. The proposed approach will promote the potential applications of these event-based vision sensors in autonomous driving, robots and many other high-speed scenes.
Affiliation(s)
- Hongmin Li: Department of Precision Instrument, Center for Brain-Inspired Computing Research, Tsinghua University, Beijing, China
- Luping Shi: Department of Precision Instrument, Center for Brain-Inspired Computing Research, Tsinghua University, Beijing, China
21
Zhu G, Zhang Z, Wang J, Wu Y, Lu H. Dynamic Collaborative Tracking. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:3035-3046. PMID: 32175852; DOI: 10.1109/tnnls.2018.2861838.
Abstract
Correlation filters have recently demonstrated remarkable success in visual tracking. However, most existing methods often face model drift caused by several factors, such as the unlimited boundary effect, heavy occlusion, fast motion, and distracter perturbation. To address the issue, this paper proposes a unified dynamic collaborative tracking framework that can perform more flexible and robust position prediction. Specifically, the framework learns the object appearance model by jointly training an objective function with three components: a target regression submodule, a distracter suppression submodule, and a maximum margin relation submodule. The first submodule mainly takes advantage of the circulant structure of training samples to learn to distinguish the target from its surrounding background. The second submodule pushes the label response of possible distracting regions toward zero to reduce the peak value of the confidence map in those regions. Inspired by structured output support vector machines, the third submodule is introduced to utilize the differences between target and distracter appearance representations in the discriminative mapping space, alleviating the disturbance of the most likely hard negative samples. In addition, a CUR filter is embedded as an auxiliary detector to provide effective object candidates and alleviate the model drift problem. Comprehensive experimental results show that the proposed approach achieves state-of-the-art performance on several public benchmark data sets.
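For context, the closed-form building block that correlation-filter trackers of this kind extend is ridge regression over circularly shifted samples, solved element-wise in the Fourier domain (the MOSSE formulation). The expression below is that standard single-template solution, shown only for orientation; it is not the paper's full three-submodule objective.

```latex
% Standard correlation-filter solution (MOSSE-style), given training patches F_i,
% desired Gaussian responses G_i, and regularizer \lambda. Capital letters denote
% 2-D DFTs, ^{*} complex conjugation, \odot element-wise multiplication/division.
H^{*} \;=\; \frac{\sum_{i} G_{i} \odot F_{i}^{*}}{\sum_{i} F_{i} \odot F_{i}^{*} + \lambda},
\qquad
\text{response on a new patch } Z:\quad \mathcal{F}^{-1}\!\left( Z \odot H^{*} \right)
```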
22
Gehrig D, Rebecq H, Gallego G, Scaramuzza D. EKLT: Asynchronous Photometric Feature Tracking Using Events and Frames. Int J Comput Vis 2019. DOI: 10.1007/s11263-019-01209-w.
23
Jiang Z, Bing Z, Huang K, Knoll A. Retina-Based Pipe-Like Object Tracking Implemented Through Spiking Neural Network on a Snake Robot. Front Neurorobot 2019; 13:29. PMID: 31191288; PMCID: PMC6549545; DOI: 10.3389/fnbot.2019.00029.
Abstract
Vision-based target tracking is crucial for bio-inspired snake robots exploring unknown environments. However, it is difficult for the traditional vision modules of snake robots to overcome the image blur resulting from periodic swings. A promising approach is to use a neuromorphic vision sensor (NVS), which mimics the biological retina to detect a target at a higher temporal frequency and over a wider dynamic range. In this study, an NVS and a spiking neural network (SNN) were deployed on a snake robot for the first time to achieve pipe-like object tracking. An SNN based on the Hough Transform was designed to detect a target from the asynchronous event stream fed by the NVS. Combining this with the state of snake motion analyzed from the joint position sensors, a tracking framework was proposed. The experimental results obtained from the simulator demonstrate the validity of our framework and the autonomous locomotion ability of our snake robot. Comparing the performance of the SNN model on CPUs and GPUs, the model showed the best performance on a GPU under a simplified and synchronous update rule, while it achieved higher precision on a CPU with asynchronous updates.
Affiliation(s)
- Zhuangyi Jiang: Chair of Robotics, Artificial Intelligence and Real-time Systems, Department of Informatics, Technical University of Munich, Munich, Germany
- Zhenshan Bing: Chair of Robotics, Artificial Intelligence and Real-time Systems, Department of Informatics, Technical University of Munich, Munich, Germany
- Kai Huang: Department of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China
- Alois Knoll: Chair of Robotics, Artificial Intelligence and Real-time Systems, Department of Informatics, Technical University of Munich, Munich, Germany
24
Afshar S, Hamilton TJ, Tapson J, van Schaik A, Cohen G. Investigation of Event-Based Surfaces for High-Speed Detection, Unsupervised Feature Extraction, and Object Recognition. Front Neurosci 2019; 12:1047. PMID: 30705618; PMCID: PMC6344467; DOI: 10.3389/fnins.2018.01047.
Abstract
In this work, we investigate event-based feature extraction through a rigorous framework of testing. We test a hardware efficient variant of Spike Timing Dependent Plasticity (STDP) on a range of spatio-temporal kernels with different surface decaying methods, decay functions, receptive field sizes, feature numbers, and back end classifiers. This detailed investigation can provide helpful insights and rules of thumb for performance vs. complexity trade-offs in more generalized networks, especially in the context of hardware implementation, where design choices can incur significant resource costs. The investigation is performed using a new dataset consisting of model airplanes being dropped free-hand close to the sensor. The target objects exhibit a wide range of relative orientations and velocities. This range of target velocities, analyzed in multiple configurations, allows a rigorous comparison of time-based decaying surfaces (time surfaces) vs. event index-based decaying surface (index surfaces), which are used to perform unsupervised feature extraction, followed by target detection and recognition. We examine each processing stage by comparison to the use of raw events, as well as a range of alternative layer structures, and the use of random features. By comparing results from a linear classifier and an ELM classifier, we evaluate how each element of the system affects accuracy. To generate time and index surfaces, the most commonly used kernels, namely event binning kernels, linearly, and exponentially decaying kernels, are investigated. Index surfaces were found to outperform time surfaces in recognition when invariance to target velocity was made a requirement. In the investigation of network structure, larger networks of neurons with large receptive field sizes were found to perform best. We find that a small number of event-based feature extractors can project the complex spatio-temporal event patterns of the dataset to an almost linearly separable representation in feature space, with best performing linear classifier achieving 98.75% recognition accuracy, using only 25 feature extracting neurons.
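The two decaying surfaces compared in this study can be stated compactly: a time surface decays with the time elapsed since each pixel's last event, while an index surface decays with the number of events elapsed since it. The sketch below evaluates both at the end of an event slice; the decay constants and field names are assumptions.

```python
import numpy as np

def decayed_surfaces(events, height, width, tau=0.05, k=2000):
    """Build a time surface (exponential decay in seconds since each pixel's last
    event) and an index surface (exponential decay in events elapsed since that
    event), both evaluated at the end of the slice."""
    t_last = np.full((height, width), -np.inf)
    i_last = np.full((height, width), -np.inf)
    for i, (t, x, y) in enumerate(zip(events["t"], events["x"], events["y"])):
        t_last[y, x] = t
        i_last[y, x] = i
    t_now, i_now = events["t"][-1], len(events) - 1
    time_surface = np.exp((t_last - t_now) / tau)   # decays with elapsed time
    index_surface = np.exp((i_last - i_now) / k)    # decays with elapsed event count
    return time_surface, index_surface
```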
Affiliation(s)
- Saeed Afshar: Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- Tara Julia Hamilton: Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- Jonathan Tapson: Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- André van Schaik: Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
- Gregory Cohen: Biomedical Engineering and Neuroscience Program, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Sydney, NSW, Australia
25
Seifozzakerini S, Yau WY, Mao K, Nejati H. Hough Transform Implementation For Event-Based Systems: Concepts and Challenges. Front Comput Neurosci 2018; 12:103. PMID: 30622466; PMCID: PMC6308381; DOI: 10.3389/fncom.2018.00103.
Abstract
Hough transform (HT) is one of the most well-known techniques in computer vision and has been the basis of many practical image processing algorithms. HT, however, is designed to work for frame-based systems such as conventional digital cameras. Recently, event-based systems such as Dynamic Vision Sensor (DVS) cameras have become popular among researchers. Event-based cameras have a very high temporal resolution (1 μs), but each pixel can only detect change and not color. As such, conventional image processing algorithms cannot be readily applied to event-based output streams. Therefore, it is necessary to adapt conventional image processing algorithms for event-based cameras. This paper provides a systematic explanation, starting from extending the conventional HT to a 3D HT, adapting it to event-based systems, and implementing the 3D HT using Spiking Neural Networks (SNNs). Using SNNs enables the proposed solution to be easily realized in hardware on an FPGA, without requiring a CPU or additional memory. In addition, we discuss techniques for an optimal SNN-based implementation using an efficient number of neurons for the required accuracy and resolution along each dimension, without increasing the overall computational complexity. We hope that this will help to reduce the gap between event-based and frame-based systems.
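A minimal event-driven Hough accumulator with leaky, exponentially decaying cells, loosely analogous to the leaky integrate-and-fire neurons discussed in the paper, is sketched below. The binning, decay constant, and vectorized update are illustrative assumptions rather than the paper's SNN design; a line is declared detected when a cell exceeds a chosen threshold.

```python
import numpy as np

def hough_update(acc, last_t, x, y, t, thetas, rho_res=1.0, tau=0.02):
    """Event-driven Hough voting for lines: each event votes for every
    (theta, rho) cell it is consistent with, and each cell decays
    exponentially with the time elapsed since its last update.
    Usage: acc = np.zeros((len(thetas), n_rho)); last_t = np.zeros_like(acc)."""
    rhos = x * np.cos(thetas) + y * np.sin(thetas)
    bins = np.round(rhos / rho_res).astype(int) + acc.shape[1] // 2
    idx = (np.arange(len(thetas)), np.clip(bins, 0, acc.shape[1] - 1))
    acc[idx] *= np.exp(-(t - last_t[idx]) / tau)  # leak stale votes
    acc[idx] += 1.0                               # integrate the new vote
    last_t[idx] = t
    return acc, last_t
```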
Affiliation(s)
- Sajjad Seifozzakerini: Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore; School of Electrical and Electronic Engineering, Nanyang Technological University (NTU), Singapore
- Wei-Yun Yau: Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore
- Kezhi Mao: School of Electrical and Electronic Engineering, Nanyang Technological University (NTU), Singapore
- Hossein Nejati: Information Systems Technology and Design (ISTD), Singapore University of Technology and Design (SUTD), Singapore
26
Berthelon X, Chenegros G, Finateu T, Ieng SH, Benosman R. Effects of Cooling on the SNR and Contrast Detection of a Low-Light Event-Based Camera. IEEE Transactions on Biomedical Circuits and Systems 2018; 12:1467-1474. PMID: 30334806; DOI: 10.1109/tbcas.2018.2875202.
Abstract
Johnson-Nyquist noise is the electronic noise generated by the thermal agitation of charge carriers, which increases when the sensor overheats. Current high-speed cameras used in low-light conditions are often cooled down to reduce thermal noise and increase their signal-to-noise ratio. These sensors, however, record hundreds of frames per second, which takes time and requires energy and heavy computing power due to the substantial data load. Event-based sensors benefit from a high temporal resolution and record information in a sparse manner. Based on an asynchronous time-based image sensor, we developed another version of this event-based camera whose pixels were designed for low-light applications, and added a Peltier-effect-based cooling system at the back of the sensor in order to reduce thermal noise. We show the benefits of thermal noise reduction and study the improvement in signal-to-noise ratio for the event-based estimation of normal flow norm and angle, and for particle tracking in microscopy.
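For reference, the Johnson-Nyquist relation the abstract refers to is the standard thermal-noise formula below; it is textbook background, not a result of the paper.

```latex
% Mean-square thermal-noise voltage across a resistance R at absolute temperature T
% over bandwidth \Delta f, with k_B the Boltzmann constant. Cooling lowers T and
% hence the RMS noise, which scales as \sqrt{T}.
\overline{v_{n}^{2}} \;=\; 4\, k_{B}\, T\, R\, \Delta f,
\qquad
v_{n,\mathrm{rms}} \;=\; \sqrt{4\, k_{B}\, T\, R\, \Delta f}
```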
27
Pfeiffer M, Pfeil T. Deep Learning With Spiking Neurons: Opportunities and Challenges. Front Neurosci 2018; 12:774. PMID: 30410432; PMCID: PMC6209684; DOI: 10.3389/fnins.2018.00774.
Abstract
Spiking neural networks (SNNs) are inspired by information processing in biology, where sparse and asynchronous binary signals are communicated and processed in a massively parallel fashion. SNNs on neuromorphic hardware exhibit favorable properties such as low power consumption, fast inference, and event-driven information processing. This makes them interesting candidates for the efficient implementation of deep neural networks, the method of choice for many machine learning tasks. In this review, we address the opportunities that deep spiking networks offer and investigate in detail the challenges associated with training SNNs in a way that makes them competitive with conventional deep learning, but simultaneously allows for efficient mapping to hardware. A wide range of training methods for SNNs is presented, ranging from the conversion of conventional deep networks into SNNs, constrained training before conversion, spiking variants of backpropagation, and biologically motivated variants of STDP. The goal of our review is to define a categorization of SNN training methods, and summarize their advantages and drawbacks. We further discuss relationships between SNNs and binary networks, which are becoming popular for efficient digital hardware implementation. Neuromorphic hardware platforms have great potential to enable deep spiking networks in real-world applications. We compare the suitability of various neuromorphic systems that have been developed over the past years, and investigate potential use cases. Neuromorphic approaches and conventional machine learning should not be considered simply two solutions to the same classes of problems, instead it is possible to identify and exploit their task-specific advantages. Deep SNNs offer great opportunities to work with new types of event-based sensors, exploit temporal codes and local on-chip learning, and we have so far just scratched the surface of realizing these advantages in practical applications.
Collapse
Affiliation(s)
- Michael Pfeiffer
- Bosch Center for Artificial Intelligence, Robert Bosch GmbH, Renningen, Germany
| | | |
Collapse
|
28
|
Cohen G, Afshar S, Orchard G, Tapson J, Benosman R, van Schaik A. Spatial and Temporal Downsampling in Event-Based Visual Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:5030-5044. [PMID: 29994752 DOI: 10.1109/tnnls.2017.2785272] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
As the interest in event-based vision sensors for mobile and aerial applications grows, there is an increasing need for high-speed and highly robust algorithms for performing visual tasks using event-based data. As event rate and network structure have a direct impact on the power consumed by such systems, it is important to explore the efficiency of the event-based encoding used by these sensors. The work presented in this paper represents the first study solely focused on the effects of both spatial and temporal downsampling on event-based vision data and makes use of a variety of data sets chosen to fully explore and characterize the nature of downsampling operations. The results show that both spatial and temporal downsampling produce improved classification accuracy and, additionally, a lower overall data rate, a finding that is particularly relevant for bandwidth- and power-constrained systems. For a given network containing 1000 hidden layer neurons, the spatially downsampled systems achieved a best case accuracy of 89.38% on N-MNIST as opposed to 81.03% with no downsampling at the same hidden layer size. On the N-Caltech101 data set, the downsampled system achieved a best case accuracy of 18.25%, compared with 7.43% achieved with no downsampling. The results show that downsampling is an important preprocessing technique in event-based visual processing, especially for applications sensitive to power consumption and transmission bandwidth.
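A minimal sketch of how spatial and temporal downsampling of an address-event stream could be implemented is given below; the array-based event layout (x, y, t, p as NumPy integer arrays) and the specific factors are assumptions for illustration, not the authors' code.

```python
import numpy as np

def downsample_events(x, y, t, p, spatial=2, temporal_us=1000):
    """Spatially and temporally downsample an event stream.

    x, y : pixel addresses; t : timestamps in microseconds; p : polarities.
    spatial     : integer factor applied to the pixel grid.
    temporal_us : bin width; events sharing the same (x, y, polarity, bin)
                  cell are collapsed into the first event of that cell.
    """
    xs = x // spatial                  # coarser pixel grid
    ys = y // spatial
    tb = t // temporal_us              # quantized time bins
    keys = np.stack([xs, ys, p, tb], axis=1)
    _, keep = np.unique(keys, axis=0, return_index=True)
    keep = np.sort(keep)               # preserve temporal order of kept events
    return xs[keep], ys[keep], t[keep], p[keep]
```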
Collapse
|
29
|
Valeiras DR, Clady X, Ieng SH, Benosman R. Event-Based Line Fitting and Segment Detection Using a Neuromorphic Visual Sensor. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 30:1218-1230. [PMID: 30222585 DOI: 10.1109/tnnls.2018.2807983] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
This paper introduces an event-based luminance-free algorithm for line and segment detection from the output of asynchronous event-based neuromorphic retinas. These recent biomimetic vision sensors are composed of autonomous pixels, each of them asynchronously generating visual events that encode relative changes in pixels' illumination at high temporal resolutions. This frame-free approach results in an increased energy efficiency and in real-time operation, making these sensors especially suitable for applications such as autonomous robotics. The proposed algorithm is based on an iterative event-based weighted least squares fitting, and it is consequently well suited to the high temporal resolution and asynchronous acquisition of neuromorphic cameras: parameters of a current line are updated for each event attributed (i.e., spatio-temporally close) to it, while implicitly forgetting the contribution of older events according to a speed-tuned exponentially decaying function. A detection occurs if a measure of activity, i.e., an implicit measure of the number of contributing events computed with the same decay function, exceeds a given threshold. The speed-tuned decreasing function is based on a measure of the apparent motion, i.e., the optical flow computed around each event. The latter ensures that the algorithm behaves independently of the edges' dynamics. Line segments are then extracted from the lines, allowing for the tracking of the corresponding endpoints. We provide experiments showing the accuracy of our algorithm and study the influence of the apparent velocity and relative orientation of the observed edges. Finally, evaluations of its computational efficiency show that this algorithm can be envisioned for high-speed applications, such as vision-based robotic navigation.
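The sketch below illustrates the general idea of an incremental, exponentially forgetting least-squares line fit driven by individual events. It uses a fixed time constant for the decay, whereas the cited method tunes the decay to the locally estimated optical-flow speed; the class and parameter names are illustrative.

```python
import math

class EventLineTracker:
    """Incremental weighted least-squares fit of a line y = a*x + b over events,
    with exponential forgetting of old events (fixed time constant here;
    the cited method tunes the decay to the local optical-flow speed)."""

    def __init__(self, tau=0.05, activity_threshold=20.0):
        self.tau = tau
        self.activity_threshold = activity_threshold
        self.S1 = self.Sx = self.Sy = self.Sxx = self.Sxy = 0.0
        self.t_last = None

    def update(self, x, y, t):
        if self.t_last is not None:
            decay = math.exp(-(t - self.t_last) / self.tau)
            self.S1 *= decay; self.Sx *= decay; self.Sy *= decay
            self.Sxx *= decay; self.Sxy *= decay
        self.t_last = t
        # Add the new event with unit weight.
        self.S1 += 1.0; self.Sx += x; self.Sy += y
        self.Sxx += x * x; self.Sxy += x * y

    def line(self):
        denom = self.S1 * self.Sxx - self.Sx ** 2
        if self.S1 < 2 or abs(denom) < 1e-9:
            return None
        a = (self.S1 * self.Sxy - self.Sx * self.Sy) / denom
        b = (self.Sy - a * self.Sx) / self.S1
        return a, b

    def detected(self):
        # The decayed event count plays the role of the activity measure.
        return self.S1 >= self.activity_threshold
```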
Collapse
|
30
|
Camunas-Mesa LA, Serrano-Gotarredona T, Ieng SH, Benosman R, Linares-Barranco B. Event-Driven Stereo Visual Tracking Algorithm to Solve Object Occlusion. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:4223-4237. [PMID: 29989974 DOI: 10.1109/tnnls.2017.2759326] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Object tracking is a major problem for many computer vision applications, but it continues to be computationally expensive. The use of bio-inspired neuromorphic event-driven dynamic vision sensors (DVSs) has heralded new methods for vision processing, exploiting a reduced amount of data and very precise timing resolution. Previous studies have shown these neural spiking sensors to be well suited to implementing single-sensor object tracking systems, although they experience difficulties when solving ambiguities caused by object occlusion. DVSs have also performed well in 3-D reconstruction in which event matching techniques are applied in stereo setups. In this paper, we propose a new event-driven stereo object tracking algorithm that simultaneously integrates 3-D reconstruction and cluster tracking, introducing feedback information in both tasks to improve their respective performances. This algorithm, inspired by human vision, identifies objects and learns their position and size in order to solve ambiguities. This strategy has been validated in four different experiments where the 3-D positions of two objects were tracked in a stereo setup even when occlusion occurred. The objects studied in the experiments were: 1) two swinging pens, the distance between which during movement was measured with an error of less than 0.5%; 2) a pen and a box, to confirm the correctness of the results obtained with a more complex object; 3) two straws attached to a fan and rotating at 6 revolutions per second, to demonstrate the high-speed capabilities of this approach; and 4) two people walking in a real-world environment.
Collapse
|
31
|
Chen BH, Huang SC, Li CY, Kuo SY. Haze Removal Using Radial Basis Function Networks for Visibility Restoration Applications. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:3828-3838. [PMID: 28922130 DOI: 10.1109/tnnls.2017.2741975] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Restoration of visibility in hazy images is the first relevant step of information analysis in many outdoor computer vision applications. To this aim, the restored image must feature clear visibility with sufficient brightness and visible edges, while avoiding the production of noticeable artifacts. In this paper, we propose a haze removal approach based on the radial basis function (RBF) through artificial neural networks dedicated to effectively removing haze formation while retaining not only the visible edges but also the brightness of restored images. Unlike traditional haze-removal methods that consist of single atmospheric veils, the multiatmospheric veil is generated and then dynamically learned by the neurons of the proposed RBF networks according to the scene complexity. Through this process, more visible edges are retained in the restored images. Subsequently, the activation function during the testing process is employed to represent the brightness of the restored image. We compare the proposed method with the other state-of-the-art haze-removal methods and report experimental results in terms of qualitative and quantitative evaluations for benchmark color images captured in typical hazy weather conditions. The experimental results demonstrate that the proposed method is able to produce brighter and more vivid haze-free images with more visible edges than can the other state-of-the-art methods.
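For context, haze-removal methods of this kind build on the standard atmospheric scattering model shown below; this is a textbook relation rather than the paper's specific multi-veil formulation, which essentially replaces the single veil term with a spatially varying estimate learned by the RBF network.

```latex
% Observed intensity I, scene radiance J, global atmospheric light A,
% transmission t(x) = e^{-\beta d(x)}; t_0 is a small lower bound that
% avoids division blow-up in the restoration step.
I(x) = J(x)\,t(x) + A\bigl(1 - t(x)\bigr),
\qquad
\hat{J}(x) = \frac{I(x) - A}{\max\bigl(t(x),\,t_0\bigr)} + A.
```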
Collapse
|
32
|
Marcireau A, Ieng SH, Simon-Chane C, Benosman RB. Event-Based Color Segmentation With a High Dynamic Range Sensor. Front Neurosci 2018; 12:135. [PMID: 29695948 PMCID: PMC5904265 DOI: 10.3389/fnins.2018.00135] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2017] [Accepted: 02/20/2018] [Indexed: 12/01/2022] Open
Abstract
This paper introduces a color asynchronous neuromorphic event-based camera and a methodology to process color output from the device to perform color segmentation and tracking at the native temporal resolution of the sensor (down to one microsecond). Our color vision sensor prototype is a combination of three Asynchronous Time-based Image Sensors, sensitive to absolute color information. We devise a color processing algorithm leveraging this information. It is designed to be computationally cheap, thus showing how low-level processing benefits from asynchronous acquisition and high temporal resolution data. The resulting color segmentation and tracking performance is assessed both with an indoor controlled scene and two outdoor uncontrolled scenes. The tracker's mean error with respect to ground truth for objects in the outdoor scenes ranges from two to twenty pixels.
Collapse
Affiliation(s)
- Alexandre Marcireau
- Institut National de la Santé et de la Recherche Médicale, UMRI S 968, Sorbonne Universites, UPMC Univ Paris 06, UMR S 968, Centre National de la Recherche Scientifique, UMR 7210, Institut de la Vision, Paris, France
| | - Sio-Hoi Ieng
- Institut National de la Santé et de la Recherche Médicale, UMRI S 968, Sorbonne Universites, UPMC Univ Paris 06, UMR S 968, Centre National de la Recherche Scientifique, UMR 7210, Institut de la Vision, Paris, France
| | - Camille Simon-Chane
- Institut National de la Santé et de la Recherche Médicale, UMRI S 968, Sorbonne Universites, UPMC Univ Paris 06, UMR S 968, Centre National de la Recherche Scientifique, UMR 7210, Institut de la Vision, Paris, France
| | - Ryad B Benosman
- Institut National de la Santé et de la Recherche Médicale, UMRI S 968, Sorbonne Universites, UPMC Univ Paris 06, UMR S 968, Centre National de la Recherche Scientifique, UMR 7210, Institut de la Vision, Paris, France
| |
Collapse
|
33
|
Shi C, Li J, Wang Y, Luo G. Exploiting Lightweight Statistical Learning for Event-Based Vision Processing. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2018; 6:19396-19406. [PMID: 29750138 PMCID: PMC5937990 DOI: 10.1109/access.2018.2823260] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
This paper presents a lightweight statistical learning framework potentially suitable for low-cost event-based vision systems, where visual information is captured by a dynamic vision sensor (DVS) and represented as an asynchronous stream of pixel addresses (events) indicating a relative intensity change at those locations. A simple random ferns classifier based on randomly selected patch-based binary features is employed to categorize pixel event flows. Our experimental results demonstrate that, compared to existing event-based processing algorithms such as spiking convolutional neural networks (SCNNs) and the state-of-the-art bag-of-events (BoE)-based statistical algorithms, our framework excels in processing speed (2× faster than the BoE statistical methods and >100× faster than previous SCNNs in training speed) with an extremely simple online learning process, and achieves state-of-the-art classification accuracy on four popular address-event representation data sets: MNIST-DVS, Poker-DVS, Posture-DVS, and CIFAR10-DVS. Hardware estimation shows that our algorithm will be preferable for low-cost embedded system implementations.
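To make the classifier concrete, below is a generic random-ferns sketch over binary pixel-comparison features; it assumes flattened 32x32 inputs and uses illustrative fern sizes, and it is not the authors' implementation, which derives its binary features from event patches.

```python
import numpy as np

class RandomFerns:
    """Generic random-ferns classifier: each fern packs S binary features into
    an index in [0, 2**S); class posteriors are sums of per-fern log-likelihoods
    learned by simple counting."""

    def __init__(self, n_ferns=20, n_bits=8, n_classes=10, seed=0):
        self.n_ferns, self.n_bits, self.n_classes = n_ferns, n_bits, n_classes
        rng = np.random.default_rng(seed)
        # Random pixel-index pairs for binary comparisons (assumes 32x32 inputs).
        self.pairs = rng.integers(0, 32 * 32, size=(n_ferns, n_bits, 2))
        self.counts = np.ones((n_ferns, 2 ** n_bits, n_classes))  # Laplace smoothing

    def _fern_indices(self, img_flat):
        bits = img_flat[self.pairs[..., 0]] > img_flat[self.pairs[..., 1]]
        return (bits * (1 << np.arange(self.n_bits))).sum(axis=1)  # pack bits per fern

    def fit(self, images, labels):
        for img, y in zip(images, labels):
            idx = self._fern_indices(img.ravel())
            self.counts[np.arange(self.n_ferns), idx, y] += 1

    def predict(self, img):
        idx = self._fern_indices(img.ravel())
        probs = self.counts / self.counts.sum(axis=2, keepdims=True)
        log_post = np.log(probs[np.arange(self.n_ferns), idx]).sum(axis=0)
        return int(np.argmax(log_post))
```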
Collapse
Affiliation(s)
- Cong Shi
- Schepens Eye Research Institute, Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114 USA
| | - Jiajun Li
- State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100864, China
| | - Ying Wang
- State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100864, China
| | - Gang Luo
- Schepens Eye Research Institute, Massachusetts Eye and Ear, Department of Ophthalmology, Harvard Medical School, Boston, MA 02114 USA
| |
Collapse
|
34
|
Everding L, Conradt J. Low-Latency Line Tracking Using Event-Based Dynamic Vision Sensors. Front Neurorobot 2018; 12:4. [PMID: 29515386 PMCID: PMC5825909 DOI: 10.3389/fnbot.2018.00004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2017] [Accepted: 01/25/2018] [Indexed: 11/13/2022] Open
Abstract
In order to safely navigate and orient in their local surroundings autonomous systems need to rapidly extract and persistently track visual features from the environment. While there are many algorithms tackling those tasks for traditional frame-based cameras, these have to deal with the fact that conventional cameras sample their environment with a fixed frequency. Most prominently, the same features have to be found in consecutive frames and corresponding features then need to be matched using elaborate techniques, as any information between the two frames is lost. We introduce a novel method to detect and track line structures in data streams of event-based silicon retinae [also known as dynamic vision sensors (DVS)]. In contrast to conventional cameras, these biologically inspired sensors generate a quasicontinuous stream of vision information analogous to the information stream created by the ganglion cells in mammal retinae. All pixels of a DVS operate asynchronously without a periodic sampling rate and emit a so-called DVS address event as soon as they perceive a luminance change exceeding an adjustable threshold. We use the high temporal resolution achieved by the DVS to track features continuously through time instead of only at fixed points in time. The focus of this work lies on tracking lines in a mostly static environment, which is observed by a moving camera, a typical setting in mobile robotics. Since DVS events are mostly generated at object boundaries and edges, which in man-made environments often form lines, lines were chosen as the feature to track. Our method is based on detecting planes of DVS address events in x-y-t-space and tracing these planes through time. It is robust against noise and runs in real time on a standard computer, hence it is suitable for low-latency robotics. The efficacy and performance are evaluated on real-world data sets showing artificial structures in an office building, using event data for tracking and frame data from a DAVIS240C sensor for ground-truth estimation.
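The core geometric operation, fitting a plane t = a*x + b*y + c to a local batch of DVS events in x-y-t space, can be sketched as below; the full method additionally updates such planes asynchronously per event and traces them through time, so the batch least-squares form here is an illustrative simplification.

```python
import numpy as np

def fit_event_plane(x, y, t):
    """Least-squares fit of a plane t = a*x + b*y + c to events in x-y-t space.

    The fitted coefficients encode the line orientation and its apparent
    velocity in the image plane (a and b are inverse speeds along x and y).
    """
    A = np.column_stack([x, y, np.ones_like(x)])
    coeffs, _, _, _ = np.linalg.lstsq(A, t, rcond=None)
    a, b, c = coeffs
    return a, b, c

# Toy example: a vertical edge moving in +x at 100 px/s gives t ≈ x / 100.
x = np.repeat(np.arange(20.0), 5)
y = np.tile(np.arange(5.0), 20)
t = x / 100.0 + 0.001 * np.random.default_rng(0).standard_normal(x.size)
print(fit_event_plane(x, y, t))   # a ≈ 0.01, b ≈ 0, c ≈ 0
```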
Collapse
Affiliation(s)
- Lukas Everding
- Department of Electrical and Computer Engineering, Neuroscientific Systemtheory, Technical University of Munich, Munich, Germany
| | - Jörg Conradt
- Department of Electrical and Computer Engineering, Neuroscientific Systemtheory, Technical University of Munich, Munich, Germany
| |
Collapse
|
35
|
Asynchronous, Photometric Feature Tracking Using Events and Frames. COMPUTER VISION – ECCV 2018 2018. [DOI: 10.1007/978-3-030-01258-8_46] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/02/2022]
|
36
|
Sabatier Q, Ieng SH, Benosman R. Asynchronous Event-Based Fourier Analysis. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2017; 26:2192-2202. [PMID: 28186889 DOI: 10.1109/tip.2017.2661702] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
This paper introduces a method to compute the FFT of a visual scene at a high temporal precision, of around 1 μs, from the output of an asynchronous event-based camera. Event-based cameras make it possible to go beyond the widespread and ingrained belief that acquiring series of images at some rate is a good way to capture visual motion. Each pixel adapts its own sampling rate to the visual input it receives and defines the timing of its own sampling points in response to its visual input by reacting to changes in the amount of incident light. As a consequence, the sampling process is no longer governed by a fixed timing source but by the signal to be sampled itself, or more precisely by the variations of the signal in the amplitude domain. This acquisition paradigm allows the conventional way of computing the FFT to be rethought. The event-driven FFT algorithm relies on a heuristic methodology designed to operate directly on incoming gray-level events to update the FFT incrementally while reducing both computation and data load. We show that for reasonable levels of approximation at equivalent frame rates beyond the millisecond, the method performs faster and more efficiently than conventional image acquisition. Several experiments are carried out on indoor and outdoor scenes where both conventional and event-driven FFT computations are shown and compared.
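For intuition, the sketch below shows the naive incremental update that a single gray-level change at one pixel induces on a 2-D DFT; it touches every coefficient and is therefore only a baseline, whereas the cited algorithm uses heuristics and approximations to cut this per-event cost.

```python
import numpy as np

class IncrementalDFT:
    """Maintain the 2-D DFT of an intensity image under per-pixel updates.

    A change of `delta` at pixel (x0, y0) shifts every coefficient by
    delta * exp(-2j*pi*(u*x0/H + v*y0/W)).  This naive version touches all
    coefficients per event (O(H*W)); the cited method keeps the per-event
    cost much lower through heuristics and approximations.
    """

    def __init__(self, height, width):
        self.H, self.W = height, width
        self.F = np.zeros((height, width), dtype=complex)
        self._u = np.arange(height)[:, None]
        self._v = np.arange(width)[None, :]

    def update(self, x0, y0, delta):
        phase = np.exp(-2j * np.pi * (self._u * x0 / self.H + self._v * y0 / self.W))
        self.F += delta * phase

# Sanity check against a full FFT of the accumulated image.
H, W = 16, 16
img = np.zeros((H, W))
dft = IncrementalDFT(H, W)
for (x0, y0, d) in [(3, 5, 1.0), (7, 2, -0.5), (3, 5, 0.25)]:
    img[x0, y0] += d
    dft.update(x0, y0, d)
print(np.allclose(dft.F, np.fft.fft2(img)))   # True
```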
Collapse
|
37
|
Peng X, Zhao B, Yan R, Tang H, Yi Z. Bag of Events: An Efficient Probability-Based Feature Extraction Method for AER Image Sensors. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2017; 28:791-803. [PMID: 28113870 DOI: 10.1109/tnnls.2016.2536741] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Address event representation (AER) image sensors represent the visual information as a sequence of events that denote the luminance changes of the scene. In this paper, we introduce a feature extraction method for AER image sensors based on probability theory, namely, bag of events (BOE). The proposed approach represents each object as the joint probability distribution of the concurrent events, where each event corresponds to a unique activated pixel of the AER sensor. The advantages of BOE include: 1) it is a statistical learning method with good mathematical interpretability; 2) BOE can significantly reduce the effort to tune parameters for different data sets, because it has only one hyperparameter and is robust to its value; 3) BOE is an online learning algorithm, which does not require the training data to be collected in advance; 4) BOE can achieve competitive results in real time for feature extraction (>275 frames/s and >120,000 events/s); and 5) the implementation complexity of BOE only involves basic operations, e.g., addition and multiplication, which guarantees the hardware friendliness of our method. The experimental results on three popular AER databases (i.e., MNIST-dynamic vision sensor, Poker Card, and Posture) show that our method is remarkably faster than two recently proposed AER categorization systems while preserving a good classification accuracy.
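A simplified stand-in for this kind of probability-based event feature is a normalized histogram of event counts over pixel cells, sketched below; the grid size and event layout are illustrative assumptions, and the actual BOE construction is richer than this.

```python
import numpy as np

def bag_of_events(x, y, height, width, grid=8):
    """Build a normalized histogram of event counts over a coarse spatial grid.

    Each bin estimates the probability that an event falls in that cell,
    yielding a fixed-length feature vector for any downstream classifier.
    """
    gx = np.clip(x * grid // width, 0, grid - 1)
    gy = np.clip(y * grid // height, 0, grid - 1)
    hist = np.zeros(grid * grid)
    np.add.at(hist, gy * grid + gx, 1)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Example: random events on a 128x128 sensor -> 64-dimensional feature vector.
rng = np.random.default_rng(1)
feat = bag_of_events(rng.integers(0, 128, 1000), rng.integers(0, 128, 1000), 128, 128)
print(feat.shape, feat.sum())   # (64,) 1.0
```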
Collapse
|
38
|
Mishra A, Ghosh R, Principe JC, Thakor NV, Kukreja SL. A Saccade Based Framework for Real-Time Motion Segmentation Using Event Based Vision Sensors. Front Neurosci 2017; 11:83. [PMID: 28316563 PMCID: PMC5334512 DOI: 10.3389/fnins.2017.00083] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2016] [Accepted: 02/06/2017] [Indexed: 11/25/2022] Open
Abstract
Motion segmentation is a critical pre-processing step for autonomous robotic systems to facilitate tracking of moving objects in cluttered environments. Event-based sensors are low-power analog devices that represent a scene by means of asynchronous information updates of only the dynamic details at high temporal resolution and, hence, require significantly fewer computations. However, motion segmentation using spatiotemporal data is a challenging task due to data asynchrony. Prior approaches for object tracking using neuromorphic sensors perform well while the sensor is static or when a known model of the object to be followed is available. To address these limitations, in this paper we develop a technique for generalized motion segmentation based on spatial statistics across time frames. First, we create micromotion on the platform to facilitate the separation of static and dynamic elements of a scene, inspired by human saccadic eye movements. Second, we introduce the concept of spike-groups as a methodology to partition spatio-temporal event groups, which facilitates computation of scene statistics and characterization of the objects in it. Experimental results show that our algorithm is able to classify dynamic objects with a moving camera with a maximum accuracy of 92%.
Collapse
Affiliation(s)
- Abhishek Mishra
- Singapore Institute for Neurotechnology, National University of Singapore, Singapore, Singapore
| | - Rohan Ghosh
- Singapore Institute for Neurotechnology, National University of Singapore, Singapore, Singapore
| | - Jose C Principe
- Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA
| | - Nitish V Thakor
- Singapore Institute for Neurotechnology, National University of Singapore, Singapore, Singapore; Biomedical Engineering Department, Johns Hopkins University, Baltimore, MD, USA
| | - Sunil L Kukreja
- Singapore Institute for Neurotechnology, National University of Singapore, Singapore, Singapore
| |
Collapse
|
39
|
Clady X, Maro JM, Barré S, Benosman RB. A Motion-Based Feature for Event-Based Pattern Recognition. Front Neurosci 2017; 10:594. [PMID: 28101001 PMCID: PMC5209354 DOI: 10.3389/fnins.2016.00594] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2016] [Accepted: 12/13/2016] [Indexed: 11/13/2022] Open
Abstract
This paper introduces an event-based luminance-free feature from the output of asynchronous event-based neuromorphic retinas. The feature consists in mapping the distribution of the optical flow along the contours of the moving objects in the visual scene into a matrix. Asynchronous event-based neuromorphic retinas are composed of autonomous pixels, each of them asynchronously generating "spiking" events that encode relative changes in pixels' illumination at high temporal resolutions. The optical flow is computed at each event, and is integrated locally or globally in a speed and direction coordinate frame based grid, using speed-tuned temporal kernels. The latter ensures that the resulting feature equitably represents the distribution of the normal motion along the current moving edges, whatever their respective dynamics. The usefulness and the generality of the proposed feature are demonstrated in pattern recognition applications: local corner detection and global gesture recognition.
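The sketch below accumulates per-event optical-flow vectors into a speed-by-direction histogram with exponential temporal forgetting, which conveys the gist of the feature; the fixed time constant and bin counts are illustrative, whereas the cited feature uses speed-tuned kernels and is built along the moving contours.

```python
import math
import numpy as np

class MotionFeature:
    """Accumulate per-event optical-flow vectors (vx, vy) into a
    speed x direction histogram with exponential temporal forgetting."""

    def __init__(self, n_speed=8, n_dir=16, max_speed=500.0, tau=0.05):
        self.hist = np.zeros((n_speed, n_dir))
        self.n_speed, self.n_dir = n_speed, n_dir
        self.max_speed, self.tau = max_speed, tau
        self.t_last = None

    def update(self, vx, vy, t):
        if self.t_last is not None:
            self.hist *= math.exp(-(t - self.t_last) / self.tau)  # forget old motion
        self.t_last = t
        speed = math.hypot(vx, vy)
        angle = math.atan2(vy, vx) % (2 * math.pi)
        si = min(int(speed / self.max_speed * self.n_speed), self.n_speed - 1)
        di = int(angle / (2 * math.pi) * self.n_dir) % self.n_dir
        self.hist[si, di] += 1.0

    def feature(self):
        total = self.hist.sum()
        return (self.hist / total).ravel() if total > 0 else self.hist.ravel()
```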
Collapse
Affiliation(s)
- Xavier Clady
- Centre National de la Recherche Scientifique, Institut National de la Santé Et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC University Paris 06, Paris, France
| | - Jean-Matthieu Maro
- Centre National de la Recherche Scientifique, Institut National de la Santé Et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC University Paris 06, Paris, France
| | - Sébastien Barré
- Centre National de la Recherche Scientifique, Institut National de la Santé Et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC University Paris 06, Paris, France
| | - Ryad B Benosman
- Centre National de la Recherche Scientifique, Institut National de la Santé Et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC University Paris 06, Paris, France
| |
Collapse
|
40
|
Wang H, Xu J, Gao Z, Lu C, Yao S, Ma J. An Event-Based Neurobiological Recognition System with Orientation Detector for Objects in Multiple Orientations. Front Neurosci 2016; 10:498. [PMID: 27867346 PMCID: PMC5095131 DOI: 10.3389/fnins.2016.00498] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2016] [Accepted: 10/19/2016] [Indexed: 11/24/2022] Open
Abstract
This paper proposes a new multi-orientation, event-based neurobiological recognition system for asynchronous address-event representation (AER) image sensors that integrates recognition and tracking. The system can recognize objects in multiple orientations even though the training samples move in a single orientation. It extracts multi-scale and multi-orientation line features inspired by models of the primate visual cortex. An orientation detector based on a modified Gaussian blob tracking algorithm is introduced for object tracking and orientation detection. The orientation detector and the feature extraction block run simultaneously, without any increase in categorization time. An address lookup table (address LUT) is also presented to adjust the feature maps by address mapping and reordering before they are categorized by the trained spiking neural network. The recognition system is evaluated on the MNIST dataset, which has played an important role in the development of computer vision, and the accuracy is increased by using both ON and OFF events. AER data acquired by a dynamic vision sensor (DVS), such as moving digits, poker cards, and vehicles, are also tested on the system. The experimental results show that the proposed system achieves event-based multi-orientation recognition. This work contributes a new tracking-recognition architecture for feedforward categorization systems and an address-reordering approach for classifying objects in multiple orientations from event-based data, providing a new way to recognize objects in multiple orientations using training samples in a single orientation.
Collapse
Affiliation(s)
- Hanyu Wang
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Jiangtao Xu
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Zhiyuan Gao
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Chengye Lu
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Suying Yao
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| | - Jianguo Ma
- School of Electronic Information Engineering, Tianjin University, Tianjin, China
| |
Collapse
|
41
|
Simon Chane C, Ieng SH, Posch C, Benosman RB. Event-Based Tone Mapping for Asynchronous Time-Based Image Sensor. Front Neurosci 2016; 10:391. [PMID: 27642275 PMCID: PMC5015463 DOI: 10.3389/fnins.2016.00391] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2015] [Accepted: 08/09/2016] [Indexed: 11/30/2022] Open
Abstract
The asynchronous time-based neuromorphic image sensor ATIS is an array of autonomously operating pixels able to encode luminance information with an exceptionally high dynamic range (>143 dB). This paper introduces an event-based methodology to display data from this type of event-based imager, taking into account the large dynamic range and high temporal accuracy that go beyond available mainstream display technologies. We introduce an event-based tone mapping methodology for asynchronously acquired, time-encoded gray-level data. A global and a local tone mapping operator are proposed. Both are designed to operate on a stream of incoming events rather than on time frame windows. Experimental results on real outdoor scenes are presented to evaluate the performance of the tone mapping operators in terms of quality, temporal stability, adaptation capability, and computational time.
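As a rough illustration of per-event tone mapping, the sketch below applies a global log-domain operator whose range estimate adapts with every incoming gray-level event; the adaptation rule and constants are assumptions for illustration and are much simpler than the operators proposed in the paper.

```python
import math

class GlobalToneMapper:
    """Minimal global, per-event tone mapper: map absolute gray-level
    measurements with a very large dynamic range to 8-bit display values,
    using running log-domain estimates of the darkest and brightest levels
    that adapt with every incoming event."""

    def __init__(self, adapt=0.001):
        self.adapt = adapt
        self.lo = None   # running estimate of the darkest level (log domain)
        self.hi = None   # running estimate of the brightest level (log domain)

    def map_event(self, luminance):
        L = math.log(max(luminance, 1e-9))
        if self.lo is None:
            self.lo, self.hi = L, L
        # Track the observed range: expand immediately, contract slowly.
        self.lo = min(L, self.lo + self.adapt * (L - self.lo))
        self.hi = max(L, self.hi + self.adapt * (L - self.hi))
        if self.hi - self.lo < 1e-9:
            return 128
        return int(round(255.0 * (L - self.lo) / (self.hi - self.lo)))

# Example: events spanning ~5 decades of luminance still map into [0, 255].
tm = GlobalToneMapper()
for lum in (0.01, 0.5, 10.0, 400.0, 1000.0):
    print(lum, tm.map_event(lum))
```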
Collapse
Affiliation(s)
| | - Sio-Hoi Ieng
- Institut National de la Santé et de la Recherche Médicale UMRI S 968, Sorbonne Universités, UPMC Univ Paris 06, UMR S 968, Centre National de la Recherche Scientifique, UMR 7210, Institut de la Vision, Paris, France
| | - Christoph Posch
- Institut National de la Santé et de la Recherche Médicale UMRI S 968, Sorbonne Universités, UPMC Univ Paris 06, UMR S 968, Centre National de la Recherche Scientifique, UMR 7210, Institut de la Vision, Paris, France
| | - Ryad B Benosman
- Institut National de la Santé et de la Recherche Médicale UMRI S 968, Sorbonne Universités, UPMC Univ Paris 06, UMR S 968, Centre National de la Recherche Scientifique, UMR 7210, Institut de la Vision, Paris, France
| |
Collapse
|
42
|
Reverter Valeiras D, Kime S, Ieng SH, Benosman RB. An Event-Based Solution to the Perspective-n-Point Problem. Front Neurosci 2016; 10:208. [PMID: 27242412 PMCID: PMC4870282 DOI: 10.3389/fnins.2016.00208] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2016] [Accepted: 04/25/2016] [Indexed: 11/13/2022] Open
Abstract
The goal of the Perspective-n-Point problem (PnP) is to find the relative pose between an object and a camera from a set of n pairings between 3D points and their corresponding 2D projections on the focal plane. Current state-of-the-art solutions, designed to operate on images, rely on computationally expensive minimization techniques. For the first time, this work introduces an event-based PnP algorithm designed to work on the output of a neuromorphic event-based vision sensor. The problem is formulated here as a least-squares minimization problem, where the error function is updated with every incoming event. The optimal translation is then computed in closed form, while the desired rotation is given by the evolution of a virtual mechanical system whose energy is proven to be equal to the error function. This allows for a simple yet robust solution of the problem, showing how event-based vision can simplify computer vision tasks. The approach takes full advantage of the high temporal resolution of the sensor, as the estimated pose is incrementally updated with every incoming event. Two approaches are proposed: the Full and the Efficient methods. These two methods are compared against a state-of-the-art PnP algorithm on both synthetic and real data, producing similar accuracy while being faster.
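A generic per-event least-squares formulation consistent with the description above can be written as follows; the exact weighting and the virtual-mechanical-system update for the rotation are specific to the paper.

```latex
% X_k: known 3-D points; x_k: their event-based 2-D observations on the
% focal plane; \pi: pinhole projection; w_k(t): a decay weight that
% favors recent events.  Each incoming event updates one term of the sum.
E(R, T) \;=\; \sum_{k} w_k(t)\,
\bigl\lVert \pi\!\bigl(R\,X_k + T\bigr) - x_k \bigr\rVert^{2}.
```

For a fixed rotation R, the translation T minimizing E admits a closed-form solution, while R is evolved continuously, as described above.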
Collapse
Affiliation(s)
- David Reverter Valeiras
- Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC Université Paris 06, Paris, France
| | - Sihem Kime
- Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC Université Paris 06, Paris, France
| | - Sio-Hoi Ieng
- Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC Université Paris 06, Paris, France
| | - Ryad Benjamin Benosman
- Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut de la Vision, Sorbonne Universités, UPMC Université Paris 06, Paris, France
| |
Collapse
|
43
|
Reverter Valeiras D, Lagorce X, Clady X, Bartolozzi C, Ieng SH, Benosman R. An Asynchronous Neuromorphic Event-Driven Visual Part-Based Shape Tracking. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2015; 26:3045-3059. [PMID: 25794399 DOI: 10.1109/tnnls.2015.2401834] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
Object tracking is an important step in many artificial vision tasks. The current state-of-the-art implementations remain too computationally demanding for the problem to be solved in real time with high dynamics. This paper presents a novel real-time method for visual part-based tracking of complex objects from the output of an asynchronous event-based camera. This paper extends the pictorial structures model introduced by Fischler and Elschlager 40 years ago and introduces a new formulation of the problem, allowing the dynamic processing of visual input in real time at high temporal resolution using a conventional PC. It relies on the concept of representing an object as a set of basic elements linked by springs. These basic elements consist of simple trackers capable of successfully tracking a target with an ellipse-like shape at several kilohertz on a conventional computer. For each incoming event, the method updates the elastic connections established between the trackers and guarantees a desired geometric structure corresponding to the tracked object in real time. This introduces a high temporal elasticity to adapt to projective deformations of the tracked object in the focal plane. The elastic energy of this virtual mechanical system provides a quality criterion for tracking and can be used to determine whether the measured deformations are caused by the perspective projection of the perceived object or by occlusions. Experiments on real-world data show the robustness of the method in the context of dynamic face tracking.
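The elastic energy used as the tracking-quality criterion can be written generically as below; the spring topology, stiffnesses, and rest lengths are whatever the part-based model defines, and the notation here is illustrative rather than taken from the paper.

```latex
% p_i: current position of tracker i; S: the set of spring-linked tracker
% pairs; l_{ij}: rest length of the spring between i and j; k_{ij}: its
% stiffness.  Low energy means the observed deformation is consistent
% with the learned geometric structure; high energy flags occlusion or
% a deformation that perspective projection alone cannot explain.
E_{\mathrm{elastic}} \;=\; \frac{1}{2}\sum_{(i,j)\in S}
k_{ij}\,\bigl(\lVert p_i - p_j\rVert - l_{ij}\bigr)^{2}.
```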
Collapse
|