1. Zhang W, Jiao L, Li Y, Liu J. Sparse Learning-Based Correlation Filter for Robust Tracking. IEEE Transactions on Image Processing 2020; 30:878-891. [PMID: 33237861] [DOI: 10.1109/tip.2020.3039392]
Abstract
Many object tracking methods are built on the correlation filtering (CF) framework because of its high efficiency. In this paper, we propose an ℓ2-norm-based sparse response regularization term that suppresses unexpected crests in the response map of the CF framework. CF trackers learn online to regress the region of interest onto a Gaussian response. However, because of unpredictable transformations of the tracked object, many unexpected crests appear in the response map, and when the crest of the tracked object is corrupted by these spurious crests the tracker loses the object. The sparse response is therefore used to increase robustness to transformations of the tracked object. Since the new term is incorporated directly into the objective function of the CF framework, it can improve many methods based on this framework, and the solutions we derive show that it does not increase computational complexity. Experiments on the OTB-100, TempleColor, VOT2016, and VOT2017 benchmarks show that the proposed regularization term improves the tracking performance of various CF trackers, including those based on the standard discriminative CF framework and those based on the context-aware CF framework. We also embed the sparse response regularization term in the state-of-the-art integrated tracker MCCT to test its generalization. Although MCCT is an expert-integrated tracker with an elaborate algorithm for selecting experts, the experimental results show that our method still improves its long-term tracking performance without increasing computational complexity.
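For context, the standard discriminative CF framework that this and several entries below build on has a closed-form solution in the Fourier domain. A minimal single-channel sketch (illustrative only, with plain ℓ2 ridge regularization rather than the sparse-response term proposed in the paper; sizes and the toy patch are made up):

```python
import numpy as np

def train_cf(patch, target_response, lam=1e-2):
    # Closed-form correlation filter in the Fourier domain:
    # H* = conj(X) * Y / (conj(X) * X + lam)
    x_hat = np.fft.fft2(patch)
    y_hat = np.fft.fft2(target_response)
    return (np.conj(x_hat) * y_hat) / (np.conj(x_hat) * x_hat + lam)

def detect(filter_hat, patch):
    # Response map: correlate the learned filter with a new patch.
    z_hat = np.fft.fft2(patch)
    return np.real(np.fft.ifft2(filter_hat * z_hat))

# Toy usage: a Gaussian-shaped desired response, peak shifted to (0, 0).
h = w = 32
ys, xs = np.mgrid[0:h, 0:w]
gauss = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * 3.0 ** 2))
gauss = np.roll(gauss, (-h // 2, -w // 2), axis=(0, 1))
patch = np.random.default_rng(0).standard_normal((h, w))
f_hat = train_cf(patch, gauss)
resp = detect(f_hat, patch)
# On the training patch the response should peak at the zero shift.
print(np.unravel_index(resp.argmax(), resp.shape))
```

The paper's contribution changes the regularizer in this objective; the Fourier-domain machinery stays the same, which is why the solver cost does not grow.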
2. Zhao L, Huang P, Liu F, Huang H, Chen H. Drift-free tracking via the construction of an effective dictionary. International Journal of Advanced Robotic Systems 2020. [DOI: 10.1177/1729881420929651]
Abstract
Template dictionary construction is an important issue in sparse representation (SR)-based tracking algorithms. In this article, a drift-free visual tracking algorithm is proposed via the construction of an effective template dictionary. The constructed dictionary is composed of three categories of atoms (templates): non-polluted atoms, variational atoms, and noise atoms. Linear combinations of non-polluted atoms are also added to the dictionary to diversify the atoms. All atoms are selectively updated to capture appearance changes and alleviate model drift. A bidirectional tracking process is used, and each direction is optimized by two-step SR, which greatly reduces the computational burden. Compared with related works, both the constructed dictionary and the tracking algorithm are robust and efficient.
Affiliation(s)
- Li Zhao, Pengcheng Huang, Fei Liu, Hui Huang, Huiling Chen: College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, China
3. Masood H, Rehman S, Khan A, Riaz F, Hassan A, Abbas M. Approximate Proximal Gradient-Based Correlation Filter for Target Tracking in Videos: A Unified Approach. Arabian Journal for Science and Engineering 2019. [DOI: 10.1007/s13369-019-03861-3]
4. Qi Y, Qin L, Zhang J, Zhang S, Huang Q, Yang MH. Structure-Aware Local Sparse Coding for Visual Tracking. IEEE Transactions on Image Processing 2018; 27:3857-3869. [PMID: 29727271] [DOI: 10.1109/tip.2018.2797482]
Abstract
Sparse coding has been applied to visual tracking and related vision problems with demonstrated success in recent years. Existing tracking methods based on local sparse coding sample patches from a target candidate and sparsely encode them using a dictionary of patches sampled from target template images. The discriminative strength of these methods is limited because spatial structure constraints among the template patches are not exploited. To address this problem, we propose a structure-aware local sparse coding algorithm that encodes a target candidate using templates with both global and local sparsity constraints. For robust tracking, we show that the local regions of a candidate should be encoded only with the corresponding local regions of the target templates that are most similar from the global view. A more precise and discriminative sparse representation is thus obtained to account for appearance changes. To alleviate tracking drift, we design an effective template update scheme. Extensive experiments on challenging image sequences demonstrate the effectiveness of the proposed algorithm against numerous state-of-the-art methods.
5. Chen B, Peng M, Liu L, Lu T. Visual Tracking with Multilevel Sparse Representation and Metric Learning. Journal of Information Technology Research 2018. [DOI: 10.4018/jitr.2018040101]
Abstract
Visual tracking arises in various real-world tasks in which an object must be located in a video. Sparse representation addresses the tracking problem by linearly representing the object with a few templates. However, this approach has two main shortcomings: setting the template updating frequency is difficult, and it is relatively weak at distinguishing the object from the background. To solve these problems, the authors model a multilevel object template set stratified by different updating time spans. The hierarchical structure and updating strategy ensure the efficiency, stability, and diversity of the object templates. Additionally, metric learning is combined to evaluate the object candidates and thereby improve discriminative ability. Experiments on well-known visual tracking datasets demonstrate that the proposed method tracks an object more robustly and accurately than state-of-the-art approaches.
Affiliation(s)
- Baifan Chen: School of Information Science and Engineering, Central South University, Changsha, China
- Meng Peng: School of Computer and Communication, Hunan Institute of Engineering, Xiangtan, China
- Lijue Liu: School of Information Science and Engineering, Central South University, Changsha, China
- Tao Lu: Hubei Province Key Laboratory of Intelligent Robot, Wuhan Institute of Technology, Wuhan, China
6. Gundogdu E, Alatan AA. Good Features to Correlate for Visual Tracking. IEEE Transactions on Image Processing 2018; 27:2526-2540. [PMID: 29994635] [DOI: 10.1109/tip.2018.2806280]
Abstract
In recent years, correlation filters have shown dominant results for visual object tracking. The features employed in this family of trackers significantly affect tracking performance. The ultimate goal is to utilize robust features invariant to any appearance change of the object while predicting the object location as accurately as in the case of no appearance change. With the emergence of deep learning, the study of learning features for specific tasks has accelerated; for instance, discriminative visual tracking methods based on deep architectures have shown promising performance. Nevertheless, correlation filter-based (CFB) trackers confine themselves to pre-trained networks trained for object classification. To this end, this paper formulates the problem of learning deep fully convolutional features for CFB visual tracking. To learn the proposed model, a novel and efficient backpropagation algorithm is presented based on the loss function of the network. The proposed learning framework makes the network model flexible for a custom design and alleviates the dependency on networks trained for classification. Extensive performance analysis shows the efficacy of the proposed custom design in the CFB tracking framework. By fine-tuning the convolutional parts of a state-of-the-art network and integrating the model into the top-performing CFB tracker of VOT2016, an 18% increase in expected average overlap is achieved and tracking failures are decreased by 25%, while superiority over state-of-the-art methods is maintained on the OTB-2013 and OTB-2015 datasets.
7. Feng P, Xu C, Zhao Z, Liu F, Yuan C, Wang T, Duan K. Sparse representation combined with context information for visual tracking. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.11.009]
8. Liu R, Wang J, Shang X, Wang Y, Su Z, Cai Y. Sparse Coding and Counting for Robust Visual Tracking. PLoS One 2016; 11:e0168093. [PMID: 27992474] [PMCID: PMC5161354] [DOI: 10.1371/journal.pone.0168093]
Abstract
In this paper, we propose a novel sparse coding and counting method under a Bayesian framework for visual tracking. In contrast to existing methods, the proposed method combines the ℓ0 and ℓ1 norms to regularize the linear coefficients of an incrementally updated linear basis. The sparsity constraint enables the tracker to handle difficult challenges such as occlusion and image corruption. To achieve real-time processing, we propose a fast and efficient numerical algorithm for solving the model: although the problem is NP-hard, the proposed accelerated proximal gradient (APG) approach is guaranteed to converge to a solution quickly. We also provide a closed-form solution of the combined ℓ0- and ℓ1-regularized representation to obtain better sparsity. Experimental results on challenging video sequences demonstrate that the proposed method achieves state-of-the-art results in both accuracy and speed.
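The ℓ1 half of such a model is handled by proximal-gradient iterations whose key step is soft-thresholding. A small self-contained sketch of plain ISTA (not the authors' APG solver, and without the ℓ0 counting term; the dictionary and sizes are made up for illustration):

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrink each coefficient toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, y, lam=0.1, n_iter=500):
    # Proximal gradient for: min_c 0.5 * ||y - D c||^2 + lam * ||c||_1
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the smooth part
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ c - y)         # gradient of the quadratic term
        c = soft_threshold(c - grad / L, lam / L)
    return c

rng = np.random.default_rng(0)
D = rng.standard_normal((50, 20))        # toy template dictionary
c_true = np.zeros(20)
c_true[[2, 7]] = [1.5, -2.0]             # sparse ground-truth coefficients
y = D @ c_true                           # noiseless observation
c_est = ista(D, y)
print(np.flatnonzero(np.abs(c_est) > 0.5))
```

APG accelerates exactly this iteration with a momentum step; the per-iteration cost is the same matrix-vector work shown here.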
Affiliation(s)
- Risheng Liu: School of Software Technology, Dalian University of Technology, Dalian, Liaoning, China; Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, Liaoning, China
- Jing Wang: School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning, China
- Xiaoke Shang: Dalian Campus, Luxun Academy of Fine Arts, Dalian, Liaoning, China
- Yiyang Wang: School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning, China
- Zhixun Su: School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning, China
- Yu Cai: School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning, China
9. Kong J, Liu C, Jiang M, Wu J, Tian S, Lai H. Generalized ℓp-regularized representation for visual tracking. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.03.100]
10. Jia X, Lu H, Yang MH. Visual Tracking via Coarse and Fine Structural Local Sparse Appearance Models. IEEE Transactions on Image Processing 2016; 25:4555-4564. [PMID: 27448350] [DOI: 10.1109/tip.2016.2592701]
Abstract
Sparse representation has been successfully applied to visual tracking by finding the best candidate with a minimal reconstruction error using target templates. However, most sparse representation-based tracking methods only consider holistic rather than local appearance to discriminate between target and background regions, and hence may not perform well when target objects are heavily occluded. In this paper, we develop a simple yet robust tracking algorithm based on a coarse and fine structural local sparse appearance model. The proposed method exploits both partial and structural information of a target object based on sparse coding using the dictionary composed of patches from multiple target templates. The likelihood obtained by averaging and pooling operations exploits consistent appearance of object parts, thereby helping not only locate targets accurately but also handle partial occlusion. To update templates more accurately without introducing occluding regions, we introduce an occlusion detection scheme to account for pixels belonging to the target objects. The proposed method is evaluated on a large benchmark data set with three evaluation metrics. Experimental results demonstrate that the proposed tracking algorithm performs favorably against several state-of-the-art methods.
11. Li X, Han Z, Wang L, Lu H. Visual Tracking via Random Walks on Graph Model. IEEE Transactions on Cybernetics 2016; 46:2144-2155. [PMID: 26292358] [DOI: 10.1109/tcyb.2015.2466437]
Abstract
In this paper, we formulate visual tracking as random walks on graph models with nodes representing superpixels and edges denoting relationships between superpixels. We integrate two novel graphs with the theory of Markov random walks, resulting in two Markov chains. First, an ergodic Markov chain is enforced to globally search for the candidate nodes with similar features to the template nodes. Second, an absorbing Markov chain is utilized to model the temporal coherence between consecutive frames. The final confidence map is generated by a structural model which combines both appearance similarity measurement derived by the random walks and internal spatial layout demonstrated by different target parts. The effectiveness of the proposed Markov chains as well as the structural model is evaluated both qualitatively and quantitatively. Experimental results on challenging sequences show that the proposed tracking algorithm performs favorably against state-of-the-art methods.
12. Yang H, Qu S. Online Hierarchical Sparse Representation of Multifeature for Robust Object Tracking. Computational Intelligence and Neuroscience 2016; 2016:5894639. [PMID: 27630710] [PMCID: PMC5008034] [DOI: 10.1155/2016/5894639]
Abstract
Object tracking based on sparse representation has produced promising results in recent years. However, trackers under this framework often overemphasize the sparse representation and ignore correlations in the visual information; in addition, typical sparse coding methods encode each local region independently and ignore the spatial neighborhood information of the image. In this paper, we propose a robust tracking algorithm. First, multiple complementary features describe the object appearance, and the appearance model of the tracked target is built from instantaneous and stable appearance features simultaneously. A two-stage sparse coding method that accounts for both the spatial neighborhood information of image patches and the computational burden is used to compute the reconstructed object appearance. The reliability of each tracker is then measured by the tracking likelihood of the transient and reconstructed appearance models. Finally, the most reliable tracker is obtained within a well-established particle filter framework, and the training set and template library are incrementally updated from the current tracking results. Experimental results on challenging video sequences show that the proposed algorithm performs well, with superior tracking accuracy and robustness.
Affiliation(s)
- Honghong Yang, Shiru Qu: Department of Automation, Northwestern Polytechnical University, Xi'an 710072, China
15. Wang G, Qin X, Zhong F, Liu Y, Li H, Peng Q, Yang MH. Visual Tracking via Sparse and Local Linear Coding. IEEE Transactions on Image Processing 2015; 24:3796-3809. [PMID: 26353352] [DOI: 10.1109/tip.2015.2445291]
Abstract
State search is an important component of any object tracking algorithm. Numerous algorithms have been proposed, but stochastic sampling methods (e.g., particle filters) are arguably among the most effective. However, the discretization of the state space complicates the search for the precise object location. In this paper, we propose a novel tracking algorithm that extends the state space of particle observations from discrete to continuous; the solution is determined accurately via iterative linear coding between two convex hulls. The algorithm is modeled by an objective function that can be efficiently solved by either convex sparse coding or locality-constrained linear coding, and it is flexible enough to combine with many generic object representations. We first use sparse representation to obtain an efficient search mechanism and demonstrate its accuracy. Two other object representation models, least soft-threshold squares and adaptive structural local sparse appearance, are then implemented with improved accuracy to demonstrate the flexibility of the algorithm. Qualitative and quantitative experimental results demonstrate that the proposed tracking algorithm performs favorably against state-of-the-art methods in dynamic scenes.
16. Wang D, Lu H, Xiao Z, Yang MH. Inverse sparse tracker with a locally weighted distance metric. IEEE Transactions on Image Processing 2015; 24:2646-2657. [PMID: 25935033] [DOI: 10.1109/tip.2015.2427518]
Abstract
Sparse representation has recently been extensively studied for visual tracking and generally yields more accurate results than classic methods. In this paper, we propose a sparsity-based tracking algorithm featuring two components: 1) an inverse sparse representation formulation and 2) a locally weighted distance metric. In the inverse formulation, the target template is reconstructed with particles, which lets the tracker compute the weights of all particles by solving only one ℓ1 optimization problem, yielding an efficient model. This is in direct contrast to most previous sparse trackers, which solve one optimization problem per particle. We notice, however, that this formulation with the ordinary Euclidean distance metric is sensitive to partial noise such as occlusion and illumination changes. To this end, we design a locally weighted distance metric to replace the Euclidean one. Similar ideas of using local features appear in other works, but they are supported only by the popular assumption that local models handle partial noise better than holistic models, without solid theoretical analysis; in this paper we attempt to explain this explicitly from a mathematical view. On that basis, we further propose a method to assign local weights by exploiting temporal and spatial continuity. Appearance changes caused by partial occlusion and shape deformation are carefully considered, facilitating accurate similarity measurement and model update. The experimental validation covers two aspects: 1) self-validation of key components and 2) comparison with state-of-the-art algorithms. Results over 15 challenging sequences show that the proposed algorithm performs favorably against existing sparsity-based trackers and other state-of-the-art methods.
17. Yang Y, Xie Y, Zhang W, Hu W, Tan Y. Global Coupled Learning and Local Consistencies Ensuring for sparse-based tracking. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2014.12.060]
19. Hu W, Li W, Zhang X, Maybank S. Single and Multiple Object Tracking Using a Multi-Feature Joint Sparse Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2015; 37:816-833. [PMID: 26353296] [DOI: 10.1109/tpami.2014.2353628]
Abstract
In this paper, we propose a tracking algorithm based on a multi-feature joint sparse representation. The templates for the sparse representation can include pixel values, textures, and edges. In the multi-feature joint optimization, noise or occlusion is dealt with using a set of trivial templates. A sparse weight constraint is introduced to dynamically select the relevant templates from the full template set, and a variance ratio measure adaptively adjusts the weights of the different features. The multi-feature template set is updated adaptively. We further propose an algorithm for tracking multiple objects with occlusion handling based on the multi-feature joint sparse reconstruction. The observation model based on sparse reconstruction automatically focuses on the visible parts of an occluded object by using the information in the trivial templates, and the multi-object tracking reduces to a joint Bayesian inference. The experimental results show the superiority of our algorithm over several state-of-the-art tracking algorithms.
21. Zhang T, Liu S, Ahuja N, Yang MH, Ghanem B. Robust Visual Tracking via Consistent Low-Rank Sparse Learning. International Journal of Computer Vision 2014. [DOI: 10.1007/s11263-014-0738-0]
22. Xie Y, Zhang W, Li C, Lin S, Qu Y, Zhang Y. Discriminative object tracking via sparse representation and online dictionary learning. IEEE Transactions on Cybernetics 2014; 44:539-553. [PMID: 23757567] [DOI: 10.1109/tcyb.2013.2259230]
Abstract
We propose a robust tracking algorithm based on local sparse coding with discriminative dictionary learning and a new keypoint matching scheme. The algorithm consists of two parts: local sparse coding with an online-updated discriminative dictionary for tracking (the SOD part), and keypoint matching refinement for enhancing tracking performance (the KP part). In the SOD part, local image patches of the target object and background are represented by their sparse codes over an over-complete discriminative dictionary. This dictionary, which encodes information about both foreground and background, provides more discriminative power, and an online learning method updates it to adapt to variations of the foreground and background during tracking. The KP part uses a refined keypoint matching scheme to improve the performance of the SOD part; with the help of sparse representation and the online-updated discriminative dictionary, it rejects incorrect matches and eliminates outliers more robustly than traditional methods. The proposed method is embedded into a Bayesian inference framework for visual tracking. Experimental results on several challenging video sequences demonstrate the effectiveness and robustness of our approach.
24. Mei X, Ling H, Wu Y, Blasch EP, Bai L. Efficient minimum error bounded particle resampling L1 tracker with occlusion detection. IEEE Transactions on Image Processing 2013; 22:2661-2675. [PMID: 23549892] [DOI: 10.1109/tip.2013.2255301]
Abstract
Recently, sparse representation has been applied to visual tracking to find the target with the minimum reconstruction error in a target template subspace. Though effective, these L1 trackers incur high computational costs due to the numerous ℓ1 minimizations involved, and the inherent occlusion insensitivity of the ℓ1 minimization has not been fully characterized. In this paper, we propose an efficient L1 tracker, named the bounded particle resampling (BPR-L1) tracker, with a minimum error bound and occlusion detection. First, a minimum error bound is calculated from a linear least-squares equation and serves as a guide for particle resampling in a particle filter (PF) framework: most insignificant samples are removed before the computationally expensive ℓ1 minimization, in a two-step test. The first step, τ testing, compares each sample's observation likelihood to an ordered set of thresholds to remove insignificant samples without loss of resampling precision. The second step, max testing, identifies the largest sample probability relative to the target to remove further insignificant samples without altering the tracking result of the current frame; though it sacrifices minimal precision during resampling, max testing achieves a significant speedup on top of τ testing. The BPR-L1 technique can also benefit other trackers that have minimum error bounds in a PF framework, especially those based on sparse representations. After the error-bound calculation, BPR-L1 performs occlusion detection by inspecting the trivial coefficients of the ℓ1 minimization; these coefficients, by design, carry rich information about image corruptions, including occlusion, and detected occlusions are then used to enhance the template updating. For evaluation, we conduct experiments on three video applications: biometrics (head movement, hand-held objects, singers on stage), pedestrians (urban travel, hallway monitoring), and cars in traffic (wide-area motion imagery, ground-mounted perspectives). The proposed BPR-L1 method demonstrates excellent performance compared with nine state-of-the-art trackers on eleven challenging benchmark sequences.
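The pruning rests on a simple fact: the unconstrained least-squares residual over the templates lower-bounds the reconstruction error of any coefficient vector, including the ℓ1-regularized one. A schematic sketch (the paper's actual bound also involves trivial templates and its τ/max thresholds; the compare-to-best pruning rule below is a simplification for illustration):

```python
import numpy as np

def least_squares_residual(T, y):
    # Unconstrained least squares over the template subspace. Its residual
    # lower-bounds the reconstruction error of ANY coefficient vector,
    # so it is a cheap bound computed before any l1 minimization.
    c, *_ = np.linalg.lstsq(T, y, rcond=None)
    return np.linalg.norm(T @ c - y)

rng = np.random.default_rng(1)
T = rng.standard_normal((100, 10))           # target template matrix
particles = rng.standard_normal((5, 100))    # candidate observations
bounds = [least_squares_residual(T, y) for y in particles]

# A particle whose lower bound already exceeds the best bound seen cannot
# yield the minimum reconstruction error, so its l1 problem can be skipped.
best_so_far = min(bounds)
pruned = [i for i, b in enumerate(bounds) if b > best_so_far]
print(len(pruned))
```

The saving comes from replacing many ℓ1 solves with one least-squares solve per particle, which is orders of magnitude cheaper.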
Affiliation(s)
- Xue Mei: Toyota Research Institute, North America, Ann Arbor, MI 48105, USA
25. Li X, Dick A, Shen C, van den Hengel A, Wang H. Incremental learning of 3D-DCT compact representations for robust visual tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 2013; 35:863-881. [PMID: 22868649] [DOI: 10.1109/tpami.2012.166]
Abstract
Visual tracking usually requires an object appearance model that is robust to changing illumination, pose, and other factors encountered in video. Many recent trackers utilize appearance samples in previous frames to form the bases upon which the object appearance model is built. This approach has the following limitations: 1) The bases are data driven, so they can be easily corrupted, and 2) it is difficult to robustly update the bases in challenging situations. In this paper, we construct an appearance model using the 3D discrete cosine transform (3D-DCT). The 3D-DCT is based on a set of cosine basis functions which are determined by the dimensions of the 3D signal and thus independent of the input video data. In addition, the 3D-DCT can generate a compact energy spectrum whose high-frequency coefficients are sparse if the appearance samples are similar. By discarding these high-frequency coefficients, we simultaneously obtain a compact 3D-DCT-based object representation and a signal reconstruction-based similarity measure (reflecting the information loss from signal reconstruction). To efficiently update the object representation, we propose an incremental 3D-DCT algorithm which decomposes the 3D-DCT into successive operations of the 2D discrete cosine transform (2D-DCT) and 1D discrete cosine transform (1D-DCT) on the input video data. As a result, the incremental 3D-DCT algorithm only needs to compute the 2D-DCT for newly added frames as well as the 1D-DCT along the third dimension, which significantly reduces the computational complexity. Based on this incremental 3D-DCT algorithm, we design a discriminative criterion to evaluate the likelihood of a test sample belonging to the foreground object. We then embed the discriminative criterion into a particle filtering framework for object state inference over time. Experimental results demonstrate the effectiveness and robustness of the proposed tracker.
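The decomposition at the heart of the incremental algorithm, a 3D-DCT equaling a 2D-DCT per frame followed by a 1D-DCT along time, can be checked numerically. A sketch using SciPy with made-up array sizes (not the paper's template dimensions; requires SciPy >= 1.4 for `scipy.fft`):

```python
import numpy as np
from scipy.fft import dct, dctn

rng = np.random.default_rng(0)
cube = rng.standard_normal((16, 16, 8))   # height x width x frames

# Full 3D-DCT computed in one call.
full = dctn(cube, type=2, norm='ortho')

# Separable form: 2D-DCT of every frame, then a 1D-DCT along the time axis.
per_frame = dctn(cube, type=2, norm='ortho', axes=(0, 1))
separable = dct(per_frame, type=2, norm='ortho', axis=2)

# Exactness of this split is what allows the incremental update: appending a
# frame costs its own 2D-DCT plus 1D-DCTs along time, not a full recompute.
print(np.allclose(full, separable))
```

Since the cosine bases are fixed by the signal dimensions, none of this depends on the video content, which is the robustness argument the abstract makes against data-driven bases.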
Affiliation(s)
- Xi Li: Australian Centre for Visual Technologies, School of Computer Science, The University of Adelaide, North Terrace, SA 5005, Australia
26. Kim DY, Jeon M. Spatio-temporal auxiliary particle filtering with ℓ1-norm-based appearance model learning for robust visual tracking. IEEE Transactions on Image Processing 2013; 22:511-522. [PMID: 22997266] [DOI: 10.1109/tip.2012.2218824]
Abstract
In this paper, we propose an efficient and accurate visual tracker equipped with a new particle filtering algorithm and a robust subspace learning-based appearance model. The proposed tracker avoids the drifting problems caused by abrupt motion changes and severe appearance variations, which are well-known difficulties in visual tracking. The algorithm is based on a type of auxiliary particle filtering that uses a spatio-temporal sliding window; compared with conventional particle filtering algorithms, spatio-temporal auxiliary particle filtering is computationally efficient and is successfully applied to visual tracking. In addition, real-time robust principal component pursuit (RRPCP) with ℓ1-norm optimization is used to build a new appearance model learning block for reliable visual tracking, especially under occlusions of the object appearance. The overall framework, based on these dual ideas, is robust against occlusions and out-of-plane motions because of the proposed spatio-temporal filtering and the recursive form of RRPCP. The tracker has been evaluated on challenging video sequences, and the results confirm its advantages.
Affiliation(s)
- Du Yong Kim: School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju 500-712, Korea
27. Zhang T, Ghanem B, Liu S, Ahuja N. Robust Visual Tracking via Structured Multi-Task Sparse Learning. International Journal of Computer Vision 2012. [DOI: 10.1007/s11263-012-0582-z]
28. Zhang T, Ghanem B, Liu S, Ahuja N. Low-Rank Sparse Learning for Robust Visual Tracking. Computer Vision – ECCV 2012, 2012. [DOI: 10.1007/978-3-642-33783-3_34]
29. Wu Y, Ling H, Blasch E, Bai L, Chen G. Visual Tracking Based on Log-Euclidean Riemannian Sparse Representation. Advances in Visual Computing, 2011. [DOI: 10.1007/978-3-642-24028-7_68]