1. Yang Z, Liu K, Li Q, Hou Y, Yan Z. Three-stage cascade architecture-based Siamese sliding window network algorithm for object tracking. Heliyon 2025;11:e41612. PMID: 39897889; PMCID: PMC11782978; DOI: 10.1016/j.heliyon.2024.e41612.
Abstract
To enhance the correlation of feature information and enrich the cross-correlation patterns, we propose the Siam ST algorithm, which is based on a three-stage cascade (TSC) architecture. A sliding window is introduced in the last three convolution blocks, which captures the global information of the image and fully extracts the target features. The TSC structure is built on the region proposal network and makes the features of the current frame interact with those of the previous frame. As a result, our method is robust and extracts well-associated features. Ablation experiments are conducted on the VOT2016 dataset, and comparison experiments are conducted on four datasets: VOT2018, LaSOT, TrackingNet, and UAV123. The proposed algorithm demonstrates a significant improvement over SiamRPN++ on all four datasets.
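The matching step that such Siamese trackers build on, sliding a template feature map over a search-region feature map and scoring every offset by an inner product, can be sketched in a few lines of NumPy. All shapes and data below are illustrative, not taken from the paper.

```python
import numpy as np

# A minimal sketch of the sliding-window cross-correlation at the core of
# Siamese trackers: slide template features over search-region features and
# score every offset by an inner product.
def xcorr_sliding(search, template):
    C, Hs, Ws = search.shape
    _, Ht, Wt = template.shape
    Ho, Wo = Hs - Ht + 1, Ws - Wt + 1
    out = np.zeros((Ho, Wo))
    for i in range(Ho):
        for j in range(Wo):
            out[i, j] = np.sum(search[:, i:i + Ht, j:j + Wt] * template)
    return out

rng = np.random.default_rng(0)
z = rng.standard_normal((4, 6, 6))      # template feature map (C, Ht, Wt)
x = np.zeros((4, 22, 22))               # search feature map (C, Hs, Ws)
x[:, 8:14, 9:15] = z                    # plant the target at offset (8, 9)
resp = xcorr_sliding(x, z)
peak = tuple(int(v) for v in np.unravel_index(np.argmax(resp), resp.shape))
print(peak)                             # -> (8, 9), the planted offset
```

Production trackers compute the same response map with a batched depthwise convolution rather than explicit loops; the arithmetic is identical.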
Affiliation(s)
- Zheng Yang: School of Electrical Engineering, Yellow River Conservancy Technical Institute, Dongjing Street, Kaifeng, 475004, Henan, China
- Kaiwen Liu: School of Artificial Intelligence, Henan University, Mingli Street, Zhengzhou, 450000, Henan, China
- Quanlong Li: School of Artificial Intelligence, Henan University, Mingli Street, Zhengzhou, 450000, Henan, China
- Yandong Hou: School of Artificial Intelligence, Henan University, Mingli Street, Zhengzhou, 450000, Henan, China
- Zhiyu Yan: School of Electrical Engineering, Yellow River Conservancy Technical Institute, Dongjing Street, Kaifeng, 475004, Henan, China
2. Gao H, Li H, Liu Y, Lu H, Kim H, Pun CM. High-quality-guided artificial bee colony algorithm for designing loudspeaker. Neural Comput Appl 2020. DOI: 10.1007/s00521-018-3568-0.
3. Kang B, Liang D, Ding W, Zhou H, Zhu WP. Grayscale-thermal tracking via inverse sparse representation based collaborative encoding. IEEE Trans Image Process 2019;29:3401-3415. PMID: 31880552; DOI: 10.1109/tip.2019.2959912.
Abstract
Grayscale-thermal tracking has attracted a great deal of attention due to its capability of fusing two different yet complementary target observations. Existing methods often treat extracting the discriminative target information and exploring the target correlation among different images as two separate issues, ignoring their interdependence. This may cause tracking drifts in challenging video pairs. This paper presents a collaborative encoding model called joint correlation and discriminant analysis based inverse sparse representation (JCDA-InvSR) to jointly encode the target candidates in the grayscale and thermal video sequences. In particular, we develop a multi-objective programming model to integrate the feature selection and the multi-view correlation analysis into a unified optimization problem in JCDA-InvSR, which can simultaneously highlight the specific characteristics of the grayscale and thermal targets by alternately optimizing two aspects: the target discrimination within a given image and the target correlation across different images. For robust grayscale-thermal tracking, we also incorporate the prior knowledge of target candidate codes into the SVM-based target classifier to overcome the overfitting caused by limited training labels. Extensive experiments on the GTOT and RGBT234 datasets illustrate the promising performance of our tracking framework.
5. Zhang D, Zakir A. Top-down saliency detection based on deep-learned features. Int J Comput Intell Appl 2019. DOI: 10.1142/s1469026819500093.
Abstract
How to localize objects in images accurately and efficiently is a challenging problem in computer vision. In this paper, a novel top-down fine-grained salient object detection method based on deep-learned features is proposed, which can detect in the input image the same object as in the query image. The query image and its three subsampled images are used as top-down cues to guide saliency detection. We adapt a convolutional neural network (CNN) by using the fast VGG network (VGG-f) pre-trained on ImageNet and re-trained on the Pascal VOC 2012 dataset. An experiment on the FiFA dataset demonstrates that the proposed method can localize the salient region and find the specific object (e.g., a human face) given as the query. Experiments on the David1 and Face1 sequences conclusively show that the proposed algorithm effectively deals with many challenging factors, including illumination change, shape deformation, scale change, and partial occlusion.
Affiliation(s)
- Duzhen Zhang: School of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221116, Jiangsu, P. R. China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, P. R. China
- Ali Zakir: School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, P. R. China
6. Zuo W, Wu X, Lin L, Zhang L, Yang MH. Learning support correlation filters for visual tracking. IEEE Trans Pattern Anal Mach Intell 2019;41:1158-1172. PMID: 29993910; DOI: 10.1109/tpami.2018.2829180.
Abstract
For visual tracking methods based on kernel support vector machines (SVMs), data sampling is usually adopted to reduce the computational cost of training. In addition, budgeting of support vectors is required for computational efficiency. Instead of sampling and budgeting, the circulant matrix formed by dense sampling of translated image patches has recently been utilized in kernel correlation filters for fast tracking. In this paper, we derive an equivalent formulation of an SVM model with the circulant matrix expression and present an efficient alternating optimization method for visual tracking. We incorporate the discrete Fourier transform into the proposed alternating optimization process, and pose the tracking problem as an iterative learning of support correlation filters (SCFs). In the fully-supervised setting, our SCF can find the globally optimal solution with real-time performance. For a given circulant data matrix with n² samples of n×n pixels, the computational complexity of the proposed algorithm is O(n² log n), whereas that of standard SVM-based approaches is at least O(n⁴). In addition, we extend the SCF-based tracking algorithm with multi-channel features, kernel functions, and scale-adaptive approaches to further improve tracking performance. Experimental results on a large benchmark dataset show that the proposed SCF-based algorithms perform favorably against state-of-the-art tracking methods in terms of accuracy and speed.
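The circulant trick this family of methods rests on can be illustrated with a plain ridge-regression correlation filter solved in the Fourier domain. This is a simplified stand-in for the SCF model, not the authors' full alternating optimization; sizes and data are illustrative.

```python
import numpy as np

# Ridge regression solved in the Fourier domain: the circulant matrix of all
# cyclic shifts is diagonalized by the DFT, so training costs O(n^2 log n)
# rather than the O(n^4) of standard SVM-style solvers.
def train_filter(x, y, lam=1e-2):
    X, Y = np.fft.fft2(x), np.fft.fft2(y)
    return np.conj(X) * Y / (np.conj(X) * X + lam)

def detect(w_hat, z):
    return np.real(np.fft.ifft2(w_hat * np.fft.fft2(z)))

n = 32
x = np.random.default_rng(1).standard_normal((n, n))
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
d2 = np.minimum(i, n - i) ** 2 + np.minimum(j, n - j) ** 2
y = np.exp(-d2 / 8.0)                    # Gaussian label peaked at the origin
w_hat = train_filter(x, y)
resp = detect(w_hat, np.roll(x, (5, 7), axis=(0, 1)))
loc = tuple(int(v) for v in np.unravel_index(np.argmax(resp), resp.shape))
print(loc)                               # -> (5, 7): the shift is recovered
```

The DFT shift theorem is doing the work here: a cyclic translation of the input translates the response map by the same amount, which is exactly what dense-sampling trackers exploit.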
8. Du M, Ding Y, Meng X, Wei HL, Zhao Y. Distractor-aware deep regression for visual tracking. Sensors 2019;19:387. PMID: 30669369; PMCID: PMC6359135; DOI: 10.3390/s19020387.
Abstract
In recent years, regression trackers have drawn increasing attention in the visual object tracking community due to their favorable performance and easy implementation. These trackers directly learn a mapping from dense samples around the target object to Gaussian-like soft labels. However, in many real applications, the extremely imbalanced distribution of training samples usually hinders the robustness and accuracy of regression trackers on test data. In this paper, we propose a novel and effective distractor-aware loss function to mitigate this imbalance by highlighting the significant domain and severely penalizing pure background. In addition, we introduce a fully differentiable hierarchy-normalized concatenation connection to exploit abstractions across multiple convolutional layers. Extensive experiments were conducted on five challenging benchmark tracking datasets: OTB-13, OTB-15, TC-128, UAV-123, and VOT17. The experimental results are promising and show that the proposed tracker performs much better than nearly all the compared state-of-the-art approaches.
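The reweighting idea can be sketched as follows. The threshold, weight, and functional form below are illustrative guesses at a distractor-aware weighting, not the loss actually derived in the paper.

```python
import numpy as np

# Hedged sketch of a distractor-aware regression loss: samples inside the
# significant (target) domain keep unit weight, while pure-background samples
# are penalized in proportion to how confidently they respond.
def distractor_aware_loss(pred, label, bg_thresh=0.05, bg_weight=4.0):
    residual = (pred - label) ** 2
    weights = np.where(label < bg_thresh,         # pure background
                       bg_weight * np.abs(pred),  # punish confident distractors
                       1.0)                       # significant domain
    return np.mean(weights * residual)

label = np.exp(-((np.arange(9) - 4) ** 2) / 2.0)  # Gaussian-like soft label
perfect = label.copy()
distracted = label.copy()
distracted[0] = 0.9                               # a confident background response
print(distractor_aware_loss(perfect, label))      # -> 0.0
print(distractor_aware_loss(distracted, label) > 0)  # -> True
```

The point is only that a confident response on an easy background location costs more than the same residual inside the target region.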
Affiliation(s)
- Ming Du: Key Laboratory of Dynamics and Control of Flight Vehicle, Ministry of Education, School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
- Yan Ding: Key Laboratory of Dynamics and Control of Flight Vehicle, Ministry of Education, School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
- Xiuyun Meng: Key Laboratory of Dynamics and Control of Flight Vehicle, Ministry of Education, School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
- Hua-Liang Wei: Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield S1 3JD, UK
- Yifan Zhao: Through-Life Engineering Services Institute, School of Aerospace, Transport and Manufacturing, Cranfield University, Cranfield, Bedfordshire MK43 0AL, UK
9. A novel reverse sparse model utilizing the spatio-temporal relationship of target templates for object tracking. Neurocomputing 2019. DOI: 10.1016/j.neucom.2018.10.007.
10. Qi Y, Qin L, Zhang J, Zhang S, Huang Q, Yang MH. Structure-aware local sparse coding for visual tracking. IEEE Trans Image Process 2018;27:3857-3869. PMID: 29727271; DOI: 10.1109/tip.2018.2797482.
Abstract
Sparse coding has been applied to visual tracking and related vision problems with demonstrated success in recent years. Existing tracking methods based on local sparse coding sample patches from a target candidate and sparsely encode them using a dictionary consisting of patches sampled from target template images. The discriminative strength of such methods is limited because spatial structure constraints among the template patches are not exploited. To address this problem, we propose a structure-aware local sparse coding algorithm, which encodes a target candidate using templates with both global and local sparsity constraints. For robust tracking, we show that the local regions of a candidate should be encoded only with the corresponding local regions of the target templates that are most similar from the global view. Thus, a more precise and discriminative sparse representation is obtained to account for appearance changes. To alleviate tracking drift, we design an effective template update scheme. Extensive experiments on challenging image sequences demonstrate the effectiveness of the proposed algorithm against numerous state-of-the-art methods.
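The l1-regularized encoding step that local-sparse-coding trackers rely on can be reproduced with a minimal ISTA solver. The paper's global and local structure constraints are omitted here; dictionary sizes and data are illustrative.

```python
import numpy as np

# Minimal ISTA solver for the l1-regularized coding step that local sparse
# coding trackers build on: encode a candidate patch over a dictionary of
# template patches.
def ista(D, x, lam=0.1, iters=200):
    L = np.linalg.norm(D, 2) ** 2               # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        a = a - D.T @ (D @ a - x) / L           # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft threshold
    return a

rng = np.random.default_rng(2)
D = rng.standard_normal((16, 8))
D /= np.linalg.norm(D, axis=0)                  # unit-norm template atoms
x = 2.0 * D[:, 3]                               # candidate matches template atom 3
a = ista(D, x)
print(int(np.argmax(np.abs(a))))                # -> 3: the matching template dominates
```

A candidate that matches one template atom receives a code concentrated on that atom, which is what makes the codes usable for appearance scoring.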
11. Moujahid D, Elharrouss O, Tairi H. Visual object tracking via the local soft cosine similarity. Pattern Recognit Lett 2018. DOI: 10.1016/j.patrec.2018.03.026.
12. Bo C, Zhang J, Liu J, Yao Q. Robust online object tracking via the convex hull representation model. Neurocomputing 2018. DOI: 10.1016/j.neucom.2018.02.013.
13. Yuen PC, Chellappa R. Learning common and feature-specific patterns: a novel multiple-sparse-representation-based tracker. IEEE Trans Image Process 2018;27:2022-2037. PMID: 29989985; DOI: 10.1109/tip.2017.2777183.
Abstract
The use of multiple features has been shown to be an effective strategy for visual tracking because of their complementary contributions to appearance modeling. The key problem is how to learn a fused representation from multiple features for appearance modeling. Different features extracted from the same object should share some commonalities in their representations while each feature should also have some feature-specific representation patterns which reflect its complementarity in appearance modeling. Different from existing multi-feature sparse trackers which only consider the commonalities among the sparsity patterns of multiple features, this paper proposes a novel multiple sparse representation framework for visual tracking which jointly exploits the shared and feature-specific properties of different features by decomposing multiple sparsity patterns. Moreover, we introduce a novel online multiple metric learning to efficiently and adaptively incorporate the appearance proximity constraint, which ensures that the learned commonalities of multiple features are more representative. Experimental results on tracking benchmark videos and other challenging videos demonstrate the effectiveness of the proposed tracker.
14. Sui Y, Wang G, Zhang L, Yang MH. Exploiting spatial-temporal locality of tracking via structured dictionary learning. IEEE Trans Image Process 2018;27:1282-1296. PMID: 29990191; DOI: 10.1109/tip.2017.2779275.
Abstract
In this paper, a novel spatial-temporal locality is proposed and unified via a discriminative dictionary learning framework for visual tracking. By exploring the strong local correlations between the temporally obtained targets and their spatially distributed nearby background neighbors, a spatial-temporal locality is obtained. The locality is formulated as a subspace model and exploited under a unified structure of discriminative dictionary learning with a subspace constraint. Using the learned dictionary, the target and its background can be described and distinguished effectively through their sparse codes. As a result, the target is localized by integrating both the descriptive and the discriminative qualities. Extensive experiments on various challenging video sequences demonstrate the superior performance of the proposed algorithm over other state-of-the-art approaches.
15. Xue W, Feng Z, Xu C, Liu T, Meng Z, Zhang C. Visual tracking via improving motion model and model updater. Int J Adv Robot Syst 2018. DOI: 10.1177/1729881418756238.
Abstract
Motion model and model updater are two necessary components of online visual tracking. On the one hand, an effective motion model needs to strike the right balance between target processing, which accounts for the target appearance, and scene analysis, which describes stable background information. Most conventional trackers focus on only one of the two and hence cannot achieve the right balance. On the other hand, a good model updater needs to consider both the tracking speed and model drift. Most tracking models are updated every frame or at fixed intervals, so they cannot achieve the best performance. In this article, we address the motion model problem by collaboratively using salient region detection and image segmentation. The two methods serve different purposes: in the absence of prior knowledge, the former considers image attributes such as color, gradient, edges, and boundaries and then forms a robust object estimate; the latter aggregates individual pixels into meaningful atomic regions by using prior knowledge of the target and background in the video sequence. Taking advantage of their complementary roles, we construct a more reasonable confidence map. For the model update problem, we dynamically update the model by analyzing the scene with image similarity, which not only reduces the update frequency of the model but also suppresses model drift. Finally, we use these improved building blocks both for comparative tests and to build a basic tracker, and extensive experimental results on OTB50 show that the proposed methods perform favorably against state-of-the-art methods.
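The similarity-gated update idea can be sketched as follows, with histogram intersection standing in for the paper's image-similarity measure; the threshold and metric are illustrative, not the authors' choices.

```python
import numpy as np

# Sketch of similarity-gated model updating: refresh the model only when the
# current frame has drifted away from the frame used at the last update.
def hist_similarity(a, b, bins=16):
    ha, _ = np.histogram(a, bins=bins, range=(0, 1), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(0, 1), density=True)
    return np.minimum(ha, hb).sum() / bins       # 1.0 for identical histograms

def should_update(frame, last_update_frame, thresh=0.8):
    return hist_similarity(frame, last_update_frame) < thresh

f = np.random.default_rng(7).random(1000)
print(bool(should_update(f, f)))                 # -> False: no drift, skip the update
```

Gating like this both lowers the update frequency and keeps a drifted appearance from being written into the model on every frame.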
Affiliation(s)
- Wanli Xue: School of Computer Science and Technology, Tianjin University, Tianjin, China
- Zhiyong Feng: School of Computer Software, Tianjin University, Tianjin, China
- Chao Xu: School of Computer Software, Tianjin University, Tianjin, China
- Tong Liu: School of Computer Software, Tianjin University, Tianjin, China
- Zhaopeng Meng: School of Computer Software, Tianjin University, Tianjin, China
- Chengwei Zhang: School of Computer Science and Technology, Tianjin University, Tianjin, China
16. Deng C, Wang B, Lin W, Huang GB, Zhao B. Effective visual tracking by pairwise metric learning. Neurocomputing 2017. DOI: 10.1016/j.neucom.2016.05.115.
17. Abdechiri M, Faez K, Amindavar H. Visual object tracking with online weighted chaotic multiple instance learning. Neurocomputing 2017. DOI: 10.1016/j.neucom.2017.03.032.
18. Ma B, Huang L, Shen J, Shao L. Discriminative tracking using tensor pooling. IEEE Trans Cybern 2016;46:2411-2422. PMID: 26441435; DOI: 10.1109/tcyb.2015.2477879.
Abstract
How to effectively organize local descriptors to build a global representation has a critical impact on the performance of vision tasks. Recently, local sparse representation has been successfully applied to visual tracking, owing to its discriminative nature and robustness against local noise and partial occlusions. Local sparse codes computed with a template actually form a third-order tensor according to their original layout, whereas most existing pooling operators convert the codes to a vector by concatenating them or computing statistics on them. We argue that, compared with pooled vectors, the tensor form delivers more intrinsic structural information about the target appearance and also avoids the high-dimensionality learning problems suffered by concatenation-based pooling methods. Therefore, in this paper, we propose to represent target templates and candidates directly with sparse coding tensors, and to build the appearance model by incrementally learning on these tensors. We further propose a discriminative framework to improve the robustness of our method against drifting and environmental noise. Experiments on a recent comprehensive benchmark indicate that our method performs better than state-of-the-art trackers.
19. Li X, Han Z, Wang L, Lu H. Visual tracking via random walks on graph model. IEEE Trans Cybern 2016;46:2144-2155. PMID: 26292358; DOI: 10.1109/tcyb.2015.2466437.
Abstract
In this paper, we formulate visual tracking as random walks on graph models with nodes representing superpixels and edges denoting relationships between superpixels. We integrate two novel graphs with the theory of Markov random walks, resulting in two Markov chains. First, an ergodic Markov chain is enforced to globally search for the candidate nodes with similar features to the template nodes. Second, an absorbing Markov chain is utilized to model the temporal coherence between consecutive frames. The final confidence map is generated by a structural model which combines both appearance similarity measurement derived by the random walks and internal spatial layout demonstrated by different target parts. The effectiveness of the proposed Markov chains as well as the structural model is evaluated both qualitatively and quantitatively. Experimental results on challenging sequences show that the proposed tracking algorithm performs favorably against state-of-the-art methods.
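The absorbing-chain computation behind such graph trackers is standard: for a transition matrix P = [[Q, R], [0, I]], the absorption probabilities are B = (I - Q)^(-1) R. A minimal sketch follows; the toy graph values are made up.

```python
import numpy as np

# Absorption probabilities of an absorbing Markov chain, the quantity such
# graph-based trackers use to score candidate nodes against template
# (absorbing) nodes: B = N R with fundamental matrix N = (I - Q)^-1.
def absorption_probs(Q, R):
    N = np.linalg.inv(np.eye(Q.shape[0]) - Q)   # fundamental matrix
    return N @ R

Q = np.array([[0.1, 0.3],                       # transient -> transient
              [0.2, 0.2]])
R = np.array([[0.5, 0.1],                       # transient -> absorbing
              [0.1, 0.5]])
B = absorption_probs(Q, R)
print(np.allclose(B.sum(axis=1), 1.0))          # -> True: absorption is certain
```

Each row of B tells how likely a candidate superpixel is to be absorbed by each template node, which is the confidence the tracker reads off.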
20. Sun C, Wang D, Lu H. Occlusion-aware fragment-based tracking with spatial-temporal consistency. IEEE Trans Image Process 2016;25:3814-3825. PMID: 27323362; DOI: 10.1109/tip.2016.2580463.
Abstract
In this paper, we present a robust tracking method by exploiting a fragment-based appearance model with consideration of both temporal continuity and discontinuity information. From the perspective of probability theory, the proposed tracking algorithm can be viewed as a two-stage optimization problem. In the first stage, by adopting the estimated occlusion state as a prior, the optimal state of the tracked object can be obtained by solving an optimization problem, where the objective function is designed based on the classification score, occlusion prior, and temporal continuity information. In the second stage, we propose a discriminative occlusion model, which exploits both foreground and background information to detect the possible occlusion, and also models the consistency of occlusion labels among different frames. In addition, a simple yet effective training strategy is introduced during the model training (and updating) process, with which the effects of spatial-temporal consistency are properly weighted. The proposed tracker is evaluated by using the recent benchmark data set, on which the results demonstrate that our tracker performs favorably against other state-of-the-art tracking algorithms.
21. Mazzu A, Morerio P, Marcenaro L, Regazzoni CS. A cognitive control-inspired approach to object tracking. IEEE Trans Image Process 2016;25:2697-2711. PMID: 27093628; DOI: 10.1109/tip.2016.2553781.
Abstract
Under a tracking framework, the definition of the target state is the basic step toward automatic understanding of dynamic scenes. More specifically, far-object tracking raises challenges related to the potentially abrupt size changes of targets as they approach the sensor. If not handled, size changes can introduce heavy issues in data association and position estimation. This is why adaptability and self-awareness of a tracking module are desirable features. The paradigm of cognitive dynamic systems (CDSs) can provide a framework under which a continuously learning cognitive module can be designed. In particular, CDS theory describes a basic vocabulary of components that can be used as the founding blocks of a module capable of learning behavioral rules from continuous active interaction with the environment. This quality is fundamental for dealing with dynamic situations. In this paper, we propose a general CDS-based approach to tracking. We show that such a CDS-inspired design can lead to the self-adaptability of a Bayesian tracker in fusing heterogeneous object features, overcoming size-change issues. Experimental results on infrared sequences show that the proposed framework is able to outperform other existing far-object tracking methods.
22. Zhang L, Lu H, Du D, Liu L. Sparse hashing tracking. IEEE Trans Image Process 2016;25:840-849. PMID: 26685241; DOI: 10.1109/tip.2015.2509244.
Abstract
In this paper, we propose a novel tracking framework based on a sparse and discriminative hashing method. Different from previous work, we treat object tracking as an approximate nearest neighbor search in a binary space. Using the hash functions, the target templates and the candidates can be projected into the Hamming space, facilitating the distance calculation and improving tracking efficiency. First, we integrate both the inter-class and intra-class information to train multiple hash functions for better classification, whereas most classifiers in previous tracking methods neglect the inter-class correlation, which may cause inaccuracy. Then, we introduce sparsity into the hash coefficient vectors for dynamic feature selection, which is crucial for selecting discriminative and stable features that adapt to visual variations during tracking. Extensive experiments on various challenging sequences show that the proposed algorithm performs favorably against state-of-the-art methods.
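The Hamming-space search that makes hashing trackers fast can be sketched with random hyperplane hashes standing in for the learned sparse hash functions; all data and sizes are illustrative.

```python
import numpy as np

# Nearest-neighbour search in Hamming space, the speed trick behind hashing
# trackers: project features to sign bits, then compare candidates to the
# template by Hamming distance.
rng = np.random.default_rng(4)
W = rng.standard_normal((32, 8))                 # 32 hash functions for 8-D features

def hash_bits(x):
    return (W @ x > 0).astype(np.uint8)          # sign bits = binary code

template = rng.standard_normal(8)
candidates = rng.standard_normal((50, 8))
candidates[17] = template + 0.01 * rng.standard_normal(8)   # near-duplicate
t = hash_bits(template)
dists = [(hash_bits(c) != t).sum() for c in candidates]     # Hamming distances
print(int(np.argmin(dists)))                     # -> 17: the near-duplicate wins
```

Comparing bit strings with XOR and popcount is what replaces the dense distance computations of conventional template matching.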
24. Wang D, Lu H, Bo C. Fast and robust object tracking via probability continuous outlier model. IEEE Trans Image Process 2015;24:5166-5176. PMID: 26390456; DOI: 10.1109/tip.2015.2478399.
Abstract
This paper presents a novel visual tracking method based on linear representation. First, we present a novel probability continuous outlier model (PCOM) to depict the continuous outliers within the linear representation model. In the proposed model, each element of the noisy observation sample can either be represented by a principal component analysis subspace with small Gaussian noise or treated as an arbitrary value with a uniform prior, in which a simple Markov random field model is adopted to exploit the spatial consistency information among outliers (or inliers). Then, we derive the objective function of the PCOM method from the perspective of probability theory. The objective function can be solved iteratively by using outlier-free least squares and standard max-flow/min-cut steps. Finally, for visual tracking, we develop an effective observation likelihood function based on the proposed PCOM method and background information, and design a simple update scheme. Both qualitative and quantitative evaluations demonstrate that our tracker achieves competitive performance in terms of both accuracy and speed.
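Leaving out the Markov random field smoothing, the alternation between an outlier-free least squares and residual-based outlier relabeling can be sketched as follows; all values are illustrative.

```python
import numpy as np

# Simplified sketch of the PCOM alternation: (1) solve an outlier-free least
# squares on the pixels currently marked as inliers, then (2) relabel as
# outliers the pixels with large residuals.  The paper's MRF smoothing of the
# outlier mask is omitted.
def robust_fit(U, y, thresh=0.5, iters=5):
    mask = np.ones(len(y), dtype=bool)           # start with all pixels as inliers
    for _ in range(iters):
        c, *_ = np.linalg.lstsq(U[mask], y[mask], rcond=None)
        mask = np.abs(U @ c - y) < thresh        # residual-based relabelling
    return c, mask

rng = np.random.default_rng(5)
U = rng.standard_normal((100, 4))                # PCA subspace basis
c_true = np.array([1.0, -2.0, 0.5, 3.0])
y = U @ c_true
y[:10] += 8.0                                    # occluded (outlier) pixels
c, mask = robust_fit(U, y)
print(np.allclose(c, c_true), bool(mask[:10].any()))   # -> True False
```

Once the grossly corrupted pixels are excluded, the least-squares fit recovers the subspace coefficients exactly, and the final mask flags exactly the occluded region.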
25. Visual tracking based on extreme learning machine and sparse representation. Sensors 2015;15:26877-26905. PMID: 26506359; PMCID: PMC4634458; DOI: 10.3390/s151026877.
Abstract
Existing sparse representation-based visual trackers mostly suffer from high computational cost and poor robustness. To address these issues, a novel tracking method is presented that combines sparse representation with an emerging learning technique, the extreme learning machine (ELM). Specifically, visual tracking is divided into two consecutive processes. First, ELM is used to find the optimal separating hyperplane between target observations and background ones; the trained ELM classification function can efficiently remove most candidate samples related to background content, thereby reducing the total computational cost of the subsequent sparse representation. Second, to further combine ELM and sparse representation, the resulting confidence values (i.e., probabilities of being the target) of samples under the ELM classification function are used to construct a new manifold-learning constraint term in the sparse representation framework, which tends to achieve more robust results. Moreover, the accelerated proximal gradient method is used to derive the optimal solution (in matrix form) of the constrained sparse tracking model. Additionally, the matrix-form solution allows the candidate samples to be evaluated in parallel, leading to higher efficiency. Experiments demonstrate the effectiveness of the proposed tracker.