1
Zhang S, Wang W, Li H, Zhang S. EVtracker: An Event-Driven Spatiotemporal Method for Dynamic Object Tracking. Sensors (Basel, Switzerland) 2022; 22:6090. [PMID: 36015851] [PMCID: PMC9414578] [DOI: 10.3390/s22166090]
Abstract
An event camera is a novel bio-inspired sensor that effectively compensates for the shortcomings of current frame cameras, such as high latency, low dynamic range, and motion blur. Rather than capturing images at a fixed frame rate, an event camera produces an asynchronous signal by measuring the brightness change at each pixel. Consequently, an appropriate algorithmic framework is required to handle the unique data type of event-based vision. In this paper, we propose a dynamic object tracking framework using an event camera to achieve long-term, stable tracking of event objects. A key novel feature of our approach is an adaptive strategy that adjusts the spatiotemporal domain of the event data. To achieve this, we reconstruct event images from high-speed asynchronous streaming data via online learning. Additionally, we apply a Siamese network to extract features from the event data. In contrast to earlier models that extract only hand-crafted features, our method provides a powerful feature description and a more flexible reconstruction strategy for event data. We assess our algorithm in three challenging scenarios: 6-DoF (six degrees of freedom), translation, and rotation. Unlike fixed cameras in traditional object tracking tasks, all three scenarios involve simultaneous violent rotation and shaking of both the camera and the objects. Results from extensive experiments suggest that our proposed approach achieves superior accuracy and robustness compared to other state-of-the-art methods. Without reducing time efficiency, our method exhibits a 30% increase in accuracy over other recent models. Furthermore, the results indicate that event cameras enable robust object tracking in situations that conventional cameras cannot handle adequately, especially super-fast motion and challenging lighting conditions.
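The preprocessing step described above, turning an asynchronous event stream into a frame-like image over a chosen time window, can be pictured with a minimal sketch. The function name, event tuple layout, and fixed window below are illustrative assumptions; the paper's adaptive spatiotemporal strategy and online learning are not reproduced here.

```python
import numpy as np

def accumulate_events(events, t_start, t_end, height, width):
    """Accumulate asynchronous events (t, x, y, polarity) into a 2D frame.

    Each event adds +1 (brightness increase) or -1 (decrease) at its pixel;
    the window [t_start, t_end) plays the role of the spatiotemporal slice
    that an adaptive strategy would tune online.
    """
    frame = np.zeros((height, width), dtype=np.float32)
    for t, x, y, p in events:
        if t_start <= t < t_end:
            frame[y, x] += 1.0 if p > 0 else -1.0
    return frame

# Toy usage: three synthetic events on a 4x4 sensor.
events = [(0.001, 1, 2, 1), (0.002, 1, 2, 1), (0.004, 3, 0, -1)]
print(accumulate_events(events, 0.0, 0.005, height=4, width=4))
```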
Affiliation(s)
- Wenmin Wang
- School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau
2
Online Semantic Subspace Learning with Siamese Network for UAV Tracking. Remote Sensing 2020. [DOI: 10.3390/rs12020325]
Abstract
In urban environment monitoring, visual tracking on unmanned aerial vehicles (UAVs) enables more applications owing to their inherent advantages, but it also brings new challenges for existing visual tracking approaches, such as complex background clutter, rotation, fast motion, small objects, and real-time constraints caused by camera motion and viewpoint changes. Based on the Siamese network, tracking can be conducted efficiently on recent UAV datasets. Unfortunately, the learned convolutional neural network (CNN) features are not discriminative enough to separate the target from the background and clutter, in particular from distractors, and they cannot capture appearance variations over time. Additionally, occlusion and disappearance are further causes of tracking failure. In this paper, a semantic subspace module is designed and integrated into a Siamese network tracker to encode the local fine-grained details of the target for UAV tracking. More specifically, the target's semantic subspace is learned online to adapt to the target in the temporal domain. Additionally, the pixel-wise response of the semantic subspace can be used to detect occlusion and disappearance of the target, which enables reasonable updating and relieves model drift. Substantial experiments conducted on challenging UAV benchmarks illustrate that the proposed method obtains competitive results in both accuracy and efficiency when applied to UAV videos.
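As a rough illustration of the online subspace idea, the sketch below learns a low-dimensional linear subspace from target feature samples via SVD and scores new features by their reconstruction error; high errors can serve as a simple occlusion cue. The function names, feature dimensions, and the use of plain PCA are assumptions for illustration, not the paper's exact semantic subspace model.

```python
import numpy as np

def learn_subspace(samples, k=5):
    """Learn a k-dimensional linear subspace from row-vector samples via SVD."""
    mean = samples.mean(axis=0)
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:k]          # basis rows span the learned subspace

def response(features, mean, basis):
    """Per-pixel response: reconstruction error of each feature in the subspace.

    Large errors suggest occlusion or disappearance and can gate model updates.
    """
    centered = features - mean
    recon = centered @ basis.T @ basis
    return np.linalg.norm(centered - recon, axis=1)

rng = np.random.default_rng(0)
target_feats = rng.normal(size=(200, 16))      # toy per-pixel descriptors
mean, basis = learn_subspace(target_feats, k=5)
print(response(rng.normal(size=(4, 16)), mean, basis))
```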
3
Wang Y, Hu S, Wu S. Object Tracking Based on Huber Loss Function. The Visual Computer 2019; 35:1641-1654. [PMID: 31741545] [PMCID: PMC6860376] [DOI: 10.1007/s00371-018-1563-1]
Abstract
In this paper, we present a novel visual tracking algorithm in which object tracking is achieved using subspace learning and Huber loss regularization in a particle filter framework. The changing appearance of the tracked target is modeled by Principal Component Analysis (PCA) basis vectors and row group sparsity. This method takes advantage of the strengths of subspace representation and explicitly takes the underlying relationship between particle candidates into consideration in the tracker. The representation of each particle is learned via a multi-task sparse learning method. The Huber loss function is employed to model the error between candidates and templates, yielding robust tracking. We utilize the Alternating Direction Method of Multipliers (ADMM) to solve the proposed representation model. In experiments, we tested sixty representative video sequences that reflect the specific challenges of tracking and used both qualitative and quantitative metrics to evaluate the performance of our tracker. The experimental results demonstrate that the proposed tracking algorithm achieves superior performance compared to nine state-of-the-art tracking methods.
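The Huber loss at the heart of this error model is quadratic for small residuals and linear for large ones, which is what suppresses the influence of outliers such as occluded pixels. A minimal sketch (the threshold delta and residual values are illustrative assumptions):

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones,
    making the candidate-template error model robust to outliers."""
    r = np.abs(residual)
    quadratic = 0.5 * r ** 2
    linear = delta * (r - 0.5 * delta)
    return np.where(r <= delta, quadratic, linear)

# Small residuals behave like least squares; large ones grow only linearly.
print(huber_loss(np.array([0.2, 1.0, 5.0]), delta=1.0))
```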
Affiliation(s)
- Yong Wang
- School of Electrical and Computer Science, University of Ottawa, Ottawa, Canada
- Shiqiang Hu
- School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai, China
- Shandong Wu
- Departments of Radiology, Biomedical Informatics, Bioengineering, and Intelligent System (Computer Science), University of Pittsburgh, USA
4
Masood H, Rehman S, Khan A, Riaz F, Hassan A, Abbas M. Approximate Proximal Gradient-Based Correlation Filter for Target Tracking in Videos: A Unified Approach. Arabian Journal for Science and Engineering 2019. [DOI: 10.1007/s13369-019-03861-3]
5
Kim C, Song D, Kim CS, Park SK. Object tracking under large motion: Combining coarse-to-fine search with superpixels. Information Sciences 2019. [DOI: 10.1016/j.ins.2018.12.042]
6
Bo C, Zhang J, Liu J, Yao Q. Robust online object tracking via the convex hull representation model. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.02.013]
7
Computationally Efficient Automatic Coast Mode Target Tracking Based on Occlusion Awareness in Infrared Images. Sensors 2018; 18:996. [PMID: 29584667] [PMCID: PMC5948633] [DOI: 10.3390/s18040996]
Abstract
This paper proposes automatic coast mode tracking for centroid trackers in infrared images to cope with target occlusion. The centroid tracking method, which uses only the brightness information of an image, is still widely used in infrared imaging tracking systems because it is difficult to extract meaningful features from infrared images. However, centroid trackers are prone to losing track because they are highly vulnerable to the target being screened by clutter or background. Coast mode, one of the tracking modes, maintains the servo slew rate at the tracking rate measured right before the loss of track. The proposed automatic coast mode tracking method decides whether to enter coast mode by predicting target occlusion, and it tries to re-lock the target and resume tracking after the blind time. The algorithm comprises three steps. The first step predicts occlusion by checking both objects that have target-like brightness and objects that may screen the target despite having different brightness. The second step generates inertial tracking commands for the servo. The last step re-locks the target based on a histogram-ratio model of the target. The effectiveness of the proposed algorithm is demonstrated through experimental results based on computer simulation with various test imagery sequences, compared with published tracking algorithms. The proposed algorithm is also tested in a real environment with a naval electro-optical tracking system (EOTS) and an airborne EO/IR system.
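The re-lock step can be pictured as histogram back-projection: pixels whose intensities are more typical of the target than of the background score highly, and the peak of the score map is taken as the re-acquisition point. The sketch below is a simplified, assumed version of that idea; the bin count, normalization, and function names are illustrative, not taken from the paper.

```python
import numpy as np

def histogram_ratio_relock(search_patch, target_hist, background_hist, bins=16):
    """Re-lock sketch: score each pixel by the target/background histogram
    ratio and return the highest-scoring location as the re-acquisition point."""
    ratio = target_hist / np.maximum(background_hist, 1e-6)
    idx = np.clip((search_patch * bins).astype(int), 0, bins - 1)
    score = ratio[idx]                     # back-projected likelihood map
    return np.unravel_index(np.argmax(score), score.shape)

rng = np.random.default_rng(1)
patch = rng.random((32, 32))               # normalized IR intensities in [0, 1)
t_hist = rng.random(16)
b_hist = rng.random(16)
print(histogram_ratio_relock(patch, t_hist, b_hist))
```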
8
Chen Z, You X, Zhong B, Li J, Tao D. Dynamically Modulated Mask Sparse Tracking. IEEE Transactions on Cybernetics 2017; 47:3706-3718. [PMID: 28113386] [DOI: 10.1109/tcyb.2016.2577718]
Abstract
Visual tracking is a critical task in many computer vision applications, such as surveillance and robotics. However, although robustness to local corruptions has improved, prevailing trackers are still sensitive to large-scale corruptions such as occlusions and illumination variations. In this paper, we propose a novel robust object tracking technique that depends on a subspace learning-based appearance model. Our contributions are twofold. First, mask templates produced by frame differencing are introduced into our template dictionary. Since the mask templates contain abundant structural information about the corruptions, the model can encode information about corruptions on the object more efficiently. Meanwhile, the robustness of the tracker is further enhanced by adopting system dynamics, which consider the moving tendency of the object. Second, we provide a theoretical guarantee that, by adapting the modulated template dictionary system, our new sparse model can be solved by the accelerated proximal gradient algorithm as efficiently as in traditional sparse tracking methods. Extensive experimental evaluations demonstrate that our method significantly outperforms 21 other cutting-edge algorithms in both speed and tracking accuracy, especially under challenges such as pose variation, occlusion, and illumination changes.
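The accelerated proximal gradient (APG) solver referred to above alternates a gradient step on the least-squares term with soft-thresholding (the l1 proximal operator), plus a Nesterov momentum extrapolation. A generic, self-contained sketch for a plain l1-regularized least-squares problem follows; the dictionary, step size, and regularization weight are illustrative assumptions rather than the paper's modulated mask model.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def apg_lasso(D, y, lam=0.1, iters=200):
    """Accelerated proximal gradient for min_c 0.5*||D c - y||^2 + lam*||c||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    c = z = np.zeros(D.shape[1])
    t = 1.0
    for _ in range(iters):
        grad = D.T @ (D @ z - y)
        c_next = soft_threshold(z - grad / L, lam / L)
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = c_next + (t - 1) / t_next * (c_next - c)   # momentum extrapolation
        c, t = c_next, t_next
    return c

rng = np.random.default_rng(2)
D = rng.normal(size=(30, 10))
y = D @ (np.eye(10)[0] * 2.0)              # sparse ground-truth coefficients
print(np.round(apg_lasso(D, y), 2))
```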
9
10
Li X, Han Z, Wang L, Lu H. Visual Tracking via Random Walks on Graph Model. IEEE Transactions on Cybernetics 2016; 46:2144-2155. [PMID: 26292358] [DOI: 10.1109/tcyb.2015.2466437]
Abstract
In this paper, we formulate visual tracking as random walks on graph models, with nodes representing superpixels and edges denoting relationships between superpixels. We integrate two novel graphs with the theory of Markov random walks, resulting in two Markov chains. First, an ergodic Markov chain is employed to globally search for candidate nodes with features similar to the template nodes. Second, an absorbing Markov chain is utilized to model the temporal coherence between consecutive frames. The final confidence map is generated by a structural model that combines the appearance similarity measurements derived from the random walks with the internal spatial layout exhibited by different target parts. The effectiveness of the proposed Markov chains as well as the structural model is evaluated both qualitatively and quantitatively. Experimental results on challenging sequences show that the proposed tracking algorithm performs favorably against state-of-the-art methods.
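For intuition on the ergodic-chain component, the sketch below builds a transition matrix by row-normalizing a superpixel affinity matrix and iterates to its stationary distribution, which assigns more mass to strongly connected nodes. The affinity values and iteration count are toy assumptions; the absorbing chain and the structural model are not covered.

```python
import numpy as np

def stationary_distribution(W, iters=100):
    """Stationary distribution of the ergodic chain defined by row-normalizing
    a nonnegative affinity matrix W between superpixel nodes."""
    P = W / W.sum(axis=1, keepdims=True)   # transition probabilities
    pi = np.full(len(W), 1.0 / len(W))
    for _ in range(iters):
        pi = pi @ P                        # power iteration on the chain
    return pi

# Toy affinity graph of 4 superpixels; tightly linked nodes receive more mass.
W = np.array([[0.0, 1.0, 0.2, 0.1],
              [1.0, 0.0, 0.2, 0.1],
              [0.2, 0.2, 0.0, 1.0],
              [0.1, 0.1, 1.0, 0.0]])
print(stationary_distribution(W))
```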
11
Yang H, Qu S. Online Hierarchical Sparse Representation of Multifeature for Robust Object Tracking. Computational Intelligence and Neuroscience 2016; 2016:5894639. [PMID: 27630710] [PMCID: PMC5008034] [DOI: 10.1155/2016/5894639]
Abstract
Object tracking based on sparse representation has produced promising results in recent years. However, trackers built on the sparse representation framework tend to overemphasize sparsity and ignore the correlation of visual information. In addition, these sparse coding methods encode each local region independently and ignore the spatial neighborhood information of the image. In this paper, we propose a robust tracking algorithm. First, multiple complementary features are used to describe the object appearance; the appearance model of the tracked target is modeled by instantaneous and stable appearance features simultaneously. A two-stage sparse coding method, which takes the spatial neighborhood information of the image patch and the computational burden into consideration, is used to compute the reconstructed object appearance. Then, the reliability of each tracker is measured by the tracking likelihood function of the transient and reconstructed appearance models. Finally, the most reliable tracker is selected within a well-established particle filter framework; the training set and the template library are incrementally updated based on the current tracking results. Experimental results on different challenging video sequences show that the proposed algorithm performs well, with superior tracking accuracy and robustness.
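The reliability measurement feeding the particle filter can be thought of as turning each candidate's reconstruction error into a likelihood weight. The tiny sketch below shows that conversion with a Gaussian likelihood; the bandwidth sigma and the error values are illustrative assumptions, not the paper's two-stage model.

```python
import numpy as np

def particle_weights(recon_errors, sigma=0.1):
    """Convert per-particle reconstruction errors into normalized
    particle-filter weights under a Gaussian likelihood; the most reliable
    candidate is the particle with the largest weight."""
    w = np.exp(-recon_errors ** 2 / (2 * sigma ** 2))
    return w / w.sum()

errors = np.array([0.05, 0.20, 0.08, 0.50])   # toy reconstruction errors
w = particle_weights(errors)
print(np.round(w, 3), "most reliable particle:", int(np.argmax(w)))
```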
Affiliation(s)
- Honghong Yang
- Department of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Shiru Qu
- Department of Automation, Northwestern Polytechnical University, Xi'an 710072, China
12
Lan X, Ma AJ, Yuen PC, Chellappa R. Joint Sparse Representation and Robust Feature-Level Fusion for Multi-Cue Visual Tracking. IEEE Transactions on Image Processing 2015; 24:5826-5841. [PMID: 26415172] [DOI: 10.1109/tip.2015.2481325]
Abstract
Visual tracking using multiple features has been proven to be a robust approach because features can complement each other. Since different types of variations, such as illumination, occlusion, and pose changes, may occur in a video sequence, especially in long sequences, how to properly select and fuse appropriate features has become one of the key problems in this approach. To address this issue, this paper proposes a new joint sparse representation model for robust feature-level fusion. The proposed method dynamically removes unreliable features from the fusion used for tracking by exploiting the advantages of sparse representation. In order to capture the non-linear similarity of features, we extend the proposed method into a general kernelized framework, which is able to perform feature fusion in various kernel spaces. As a result, robust tracking performance is obtained. Both qualitative and quantitative experimental results on publicly available videos show that the proposed method outperforms both sparse representation-based and fusion-based trackers.
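Joint sparsity across feature cues is commonly enforced with a row-wise (l2,1) penalty, whose proximal step shrinks whole coefficient rows so that unreliable dictionary atoms are dropped jointly for every cue. A minimal sketch of that proximal operator follows; the matrix layout and threshold are illustrative assumptions, and the paper's full fusion model and kernelization are not reproduced.

```python
import numpy as np

def prox_l21(C, tau):
    """Row-wise group soft-thresholding: the proximal operator of tau*||C||_{2,1}.

    Rows of C hold the coefficients of one dictionary atom across all feature
    cues; shrinking whole rows removes unreliable atoms jointly for every cue."""
    norms = np.linalg.norm(C, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return C * scale

C = np.array([[3.0, 4.0],     # strong atom: kept, only shrunk (row norm 5)
              [0.3, 0.4]])    # weak atom: zeroed jointly across both cues
print(prox_l21(C, tau=1.0))
```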
13
Mei X, Hong Z, Prokhorov D, Tao D. Robust Multitask Multiview Tracking in Videos. IEEE Transactions on Neural Networks and Learning Systems 2015; 26:2874-2890. [PMID: 25730831] [DOI: 10.1109/tnnls.2015.2399233]
Abstract
Various sparse-representation-based methods have been proposed to solve tracking problems, and most of them employ least squares (LS) criteria to learn the sparse representation. In many tracking scenarios, traditional LS-based methods may not perform well owing to the presence of heavy-tailed noise. In this paper, we present a tracking approach that uses an approximate least absolute deviation (LAD)-based multitask multiview sparse learning method to enjoy the robustness of LAD and take advantage of multiple types of visual features, such as intensity, color, and texture. The proposed method is integrated into a particle filter framework, where learning the sparse representation for each view of a single particle is regarded as an individual task. The underlying relationship between tasks across different views and different particles is jointly exploited in a unified robust multitask formulation based on LAD. In addition, to capture the frequently emerging outlier tasks, we decompose the representation matrix into two collaborative components that enable a more robust and accurate approximation. We show that the proposed formulation can be effectively approximated by Nesterov's smoothing method and efficiently solved using the accelerated proximal gradient method. The presented tracker is implemented using four types of features and is tested on numerous synthetic sequences and real-world video sequences, including the CVPR2013 tracking benchmark and the ALOV++ data set. Both the qualitative and quantitative results demonstrate the superior performance of the proposed approach compared with several state-of-the-art trackers.
14
Visual tracking based on extreme learning machine and sparse representation. Sensors 2015; 15:26877-26905. [PMID: 26506359] [PMCID: PMC4634458] [DOI: 10.3390/s151026877]
Abstract
Existing sparse representation-based visual trackers mostly suffer from being time-consuming and having poor robustness. To address these issues, a novel tracking method is presented that combines sparse representation with an emerging learning technique, namely the extreme learning machine (ELM). Specifically, visual tracking is divided into two consecutive processes. First, ELM is utilized to find the optimal separating hyperplane between target observations and background ones. Thus, the trained ELM classification function is able to efficiently remove most of the candidate samples related to background content, thereby reducing the total computational cost of the subsequent sparse representation. Second, to further combine ELM and sparse representation, the resulting confidence values (i.e., probabilities of being the target) of samples under the ELM classification function are used to construct a new manifold learning constraint term in the sparse representation framework, which tends to achieve more robust results. Moreover, the accelerated proximal gradient method is used to derive the optimal solution (in matrix form) of the constrained sparse tracking model. Additionally, the matrix-form solution allows the candidate samples to be evaluated in parallel, leading to higher efficiency. Experiments demonstrate the effectiveness of the proposed tracker.
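An extreme learning machine keeps its hidden layer random and solves only the output weights in closed form, which is what makes a target-versus-background classifier cheap enough to retrain during tracking. A self-contained sketch under assumed feature dimensions and a tanh activation (not necessarily the paper's configuration):

```python
import numpy as np

def train_elm(X, y, hidden=50, seed=0):
    """Extreme learning machine: random hidden layer, closed-form output weights.

    Only the output weights beta are learned (least squares via pseudo-inverse),
    which keeps the target/background classifier cheap to retrain online."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = np.tanh(X @ W + b)                  # random feature map
    beta = np.linalg.pinv(H) @ y            # closed-form least-squares solution
    return W, b, beta

def elm_score(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta        # confidence of being the target

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 8))
y = (X[:, 0] > 0).astype(float)             # toy target/background labels
W, b, beta = train_elm(X, y)
print(np.round(elm_score(X[:3], W, b, beta), 2), y[:3])
```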
15
Wang D, Lu H, Xiao Z, Yang MH. Inverse sparse tracker with a locally weighted distance metric. IEEE Transactions on Image Processing 2015; 24:2646-2657. [PMID: 25935033] [DOI: 10.1109/tip.2015.2427518]
Abstract
Sparse representation has recently been extensively studied for visual tracking and generally yields more accurate tracking results than classic methods. In this paper, we propose a sparsity-based tracking algorithm featuring two components: 1) an inverse sparse representation formulation and 2) a locally weighted distance metric. In the inverse sparse representation formulation, the target template is reconstructed from the particles, which enables the tracker to compute the weights of all particles by solving only one l1 optimization problem and thereby provides a very efficient model. This is in direct contrast to most previous sparse trackers, which entail solving one optimization problem for each particle. However, we notice that this formulation with the ordinary Euclidean distance metric is sensitive to partial noise such as occlusion and illumination changes. To this end, we design a locally weighted distance metric to replace the Euclidean one. Similar ideas of using local features appear in other works, but they are supported only by popular assumptions, such as that local models handle partial noise better than holistic models, without any solid theoretical analysis. In this paper, we attempt to explain this explicitly from a mathematical viewpoint. On that basis, we further propose a method to assign local weights by exploiting temporal and spatial continuity. In the proposed method, appearance changes caused by partial occlusion and shape deformation are carefully considered, thereby facilitating accurate similarity measurement and model updating. The experimental validation is conducted from two aspects: 1) self-validation of the key components and 2) comparison with other state-of-the-art algorithms. Results on 15 challenging sequences show that the proposed tracking algorithm performs favorably against existing sparsity-based trackers and other state-of-the-art methods.
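The inverse formulation can be sketched by representing the template as a sparse, non-negative combination of the particle observations, so a single l1-regularized regression yields one weight per particle. The example below uses scikit-learn's Lasso as a stand-in solver; the patch size, regularization weight, and non-negativity choice are illustrative assumptions, not the paper's exact model or metric.

```python
import numpy as np
from sklearn.linear_model import Lasso

def inverse_sparse_weights(template, particles, alpha=0.01):
    """Inverse formulation sketch: represent the target TEMPLATE as a sparse,
    non-negative combination of PARTICLE observations (columns), so a single
    l1 problem yields one weight per particle instead of one problem each."""
    model = Lasso(alpha=alpha, positive=True, max_iter=5000)
    model.fit(particles, template)          # particles: shape (dim, n_particles)
    return model.coef_                      # one weight per particle

rng = np.random.default_rng(4)
template = rng.random(64)                    # vectorized template patch
particles = np.column_stack(
    [template + 0.05 * rng.normal(size=64) for _ in range(10)]  # target-like
    + [rng.random(64) for _ in range(10)])                       # background
print(np.round(inverse_sparse_weights(template, particles), 2))
```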
16
Yang Y, Xie Y, Zhang W, Hu W, Tan Y. Global Coupled Learning and Local Consistencies Ensuring for sparse-based tracking. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2014.12.060]