1
|
Zhang H, Qian F, Shang F, Du W, Qian J, Yang J. Global Convergence Guarantees of (A)GIST for a Family of Nonconvex Sparse Learning Problems. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:3276-3288. [PMID: 32784147 DOI: 10.1109/tcyb.2020.3010960] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/11/2023]
Abstract
Recent studies have shown that generalized iterated shrinkage thresholding (GIST) algorithms have become commonly used first-order optimization methods for sparse learning problems. Nonconvex relaxations of the l0-norm usually achieve better performance than convex ones (e.g., the l1-norm) because they yield a nearly unbiased solution. To improve computational efficiency, this work provides an accelerated version of GIST, called AGIST, based on an extrapolation-based acceleration technique, which helps reduce the number of iterations needed to solve a family of nonconvex sparse learning problems. In addition, we present an algorithmic analysis, including both local and global convergence guarantees as well as other intermediate results, for GIST and AGIST, jointly denoted (A)GIST, by virtue of the Kurdyka-Łojasiewicz (KŁ) property and some milder assumptions. Numerical experiments on both synthetic data and real-world databases demonstrate that the convergence behavior of the objective function accords with the theoretical properties and that nonconvex sparse learning methods can achieve superior performance over some convex ones.
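As a rough illustration of the shrinkage-thresholding iteration with extrapolation that this abstract refers to, the sketch below runs a proximal-gradient loop on a least-squares loss. It is not the paper's method: the convex l1 soft-threshold is swapped in for the nonconvex penalties the paper studies, and the fixed step size, the fixed extrapolation weight `beta`, and all function names are assumptions made for illustration only.

```python
def soft_threshold(v, tau):
    """Elementwise proximal operator of tau * |x| (the convex l1 stand-in)."""
    return [max(abs(x) - tau, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def grad_least_squares(A, b, x):
    """Gradient of 0.5 * ||A x - b||^2, that is, A^T (A x - b)."""
    r = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(x))]


def shrinkage_thresholding(A, b, lam, step, iters=200, beta=0.0):
    """Proximal-gradient iterations; beta > 0 adds an extrapolation step.

    beta = 0 gives the plain (unaccelerated) iteration. The fixed step and
    fixed beta are simplifying assumptions, not the paper's update rules.
    """
    n = len(A[0])
    x_prev = [0.0] * n
    x = [0.0] * n
    for _ in range(iters):
        # Extrapolate from the last two iterates before the gradient step.
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad_least_squares(A, b, y)
        x_prev = x
        x = soft_threshold([yi - step * gi for yi, gi in zip(y, g)], step * lam)
    return x
```

With an identity design matrix the minimizer is simply the soft-thresholded observation, which makes the sketch easy to check by hand, for example `shrinkage_thresholding([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0, step=1.0)`.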
|
2
|
Chen M, Li X. Robust Matrix Factorization With Spectral Embedding. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:5698-5707. [PMID: 33090957 DOI: 10.1109/tnnls.2020.3027351] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
Nonnegative matrix factorization (NMF) and spectral clustering are two of the most widely used clustering techniques. However, NMF cannot deal with nonlinear data, and spectral clustering relies on postprocessing. In this article, we propose a Robust Matrix factorization with Spectral embedding (RMS) approach for data clustering, which inherits the advantages of NMF and spectral clustering while avoiding their shortcomings. In addition, to cluster data represented by multiple views, we present a multiview version of RMS (M-RMS), in which the weights of the different views are self-tuned. The main contributions of this research are threefold: 1) by integrating spectral clustering and matrix factorization, the proposed methods are able to capture the nonlinear data structure and obtain the cluster indicator directly; 2) instead of using the squared Frobenius norm, the objectives are developed with the l2,1-norm, such that the effects of outliers are alleviated; and 3) the proposed methods are totally parameter-free, which increases their applicability to various real-world problems. Extensive experiments on several single-view/multiview data sets demonstrate the effectiveness of our methods and verify their superior clustering performance over state-of-the-art methods.
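To make the robustness claim in contribution 2) concrete, the sketch below compares how a single outlying sample contributes to the squared Frobenius norm and to the l2,1-norm of a residual matrix. It assumes each row holds one sample's residual (the row/column convention varies between papers and is not stated in the abstract); the function names are hypothetical.

```python
import math


def squared_frobenius(residual):
    """Sum of squared entries: a sample's contribution grows quadratically."""
    return sum(x * x for row in residual for x in row)


def l21_norm(residual):
    """Sum of row-wise l2 norms: a sample's contribution grows linearly.

    Assumes one sample per row; some papers use columns instead.
    """
    return sum(math.sqrt(sum(x * x for x in row)) for row in residual)
```

For a residual row such as `[30.0, 40.0]`, the squared Frobenius norm adds 2500 to the objective while the l2,1-norm adds only 50, which is one way to see why a single outlier dominates the former far more than the latter.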
|
3
|
Zhang X, Ma S, Wang S, Zhang J, Sun H, Gao W. Divisively Normalized Sparse Coding: Toward Perceptual Visual Signal Representation. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:4237-4250. [PMID: 30843814 DOI: 10.1109/tcyb.2019.2899005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Sparse representation has been shown to be highly correlated with the visual perception of natural images, which can be characterized by a linear combination of neuronal responses in the visual cortex. Divisive normalization transform (DNT) has been proven to be an effective method in reducing statistical and perceptual dependencies for nonlinear properties in primary visual cortex. In this paper, we develop a divisively normalized sparse coding scheme, aiming to further bridge the gap between sparse representation and human visual perception. We show that such a scheme is perceptually meaningful for representing visual signals, with which the pixel-domain image representation and processing tasks can be feasibly and efficiently achieved in the divisively normalized sparse-domain. Specifically, we develop a sparse-domain similarity (SDS) index for perceptual quality evaluation, where the DNT is employed for transforming image signals into a perceptually uniform space. Furthermore, the proposed SDS index is employed to optimize the sparse coding process when representing natural images. The experimental results indicate that the SDS can provide accurate and consistent predictions of perceived image quality, and the performance of sparse coding can be significantly improved in terms of both objective and subjective quality evaluations.
|
4
|
Zhang Z, Zhang Y, Xu M, Zhang L, Yang Y, Yan S. A Survey on Concept Factorization: From Shallow to Deep Representation Learning. Inf Process Manag 2021. [DOI: 10.1016/j.ipm.2021.102534] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]
|
5
|
JROTM: Jointly reinforced object tracking with temporal content reference and motion guidance. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.12.111] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
|
6
|
Dong G, Liu H. Global Receptive-Based Neural Network for Target Recognition in SAR Images. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:1954-1967. [PMID: 31794417 DOI: 10.1109/tcyb.2019.2952400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Recent years have witnessed a revival of neural networks and learning strategies. These models configure multiple hidden layers hierarchically and require large amounts of labeled samples to estimate the model parameters, a requirement that is difficult to meet for target recognition in realistic environments. For both spaceborne and airborne radars, collecting multiple samples with label information is very expensive and difficult. In addition, the huge computational cost and slow convergence limit practical applications. To address these problems, this article presents a new receptive-based idea, under which a special hierarchy of feedforward neural networks is built. The proposed strategy consists of two sequential modules: 1) feature generation and 2) feature refinement. We first build pairwise baseline signals by means of the Riesz transform along the range and the azimuth, and extend them to a family of receptive signals using a bandpass filter bank. The input SAR image is then convolved with the set of receptive signals to extract global features, from which certain kinds of information can be exploited. We make the receptive signals predefined, rather than learned automatically, to handle small sample sizes; in addition, expert knowledge can thereby be transferred into the neural network. The resulting features are further refined by a special unit, wherein the input neurons and the latent states are bridged by randomly generated weights and biases that remain fixed during training. Furthermore, we cast the latent state into a Hilbert space, forming the kernel version of the refinement. We aim to achieve comparable or even better performance with limited training resources.
|
7
|
Zhou Q, Fan H, Yang H, Su H, Zheng S, Wu S, Ling H. Robust and Efficient Graph Correspondence Transfer for Person Re-Identification. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:1623-1638. [PMID: 31071040 DOI: 10.1109/tip.2019.2914575] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/09/2023]
Abstract
Spatial misalignment caused by variations in poses and viewpoints is one of the most critical issues that hinder the performance improvement in existing person re-identification (Re-ID) algorithms. Although it is straightforward to explore correspondence learning algorithms for alignment, online learning is intractable for negative pairs due to the intrinsic visual difference between negative pairs and efficiency concern. To address this problem, in this paper, we present a robust and efficient graph correspondence transfer (REGCT) approach for explicit spatial alignment in Re-ID. Specifically, we propose the off-line correspondence learning and on-line correspondence transfer framework. During training, patch-wise correspondences between positive training pairs are established via graph matching. By exploiting both spatial and visual contexts of human appearance in graph matching, meaningful semantic correspondences can be obtained. During testing, the off-line learned patch-wise correspondence templates are transferred to test pairs with similar pose-pair configurations for local feature distance calculation. To enhance the robustness of correspondence transfer, we design a novel pose context descriptor to accurately model human body configurations, and present an approach to measure the similarity between a pair of pose context descriptors. Meanwhile, to improve testing efficiency, we propose a correspondence template ensemble method using the voting mechanism, which significantly reduces the amount of patch-wise matchings involved in distance calculation. With the aforementioned strategies, the REGCT model can effectively and efficiently handle the spatial misalignment problem in Re-ID. Extensive experiments on five challenging benchmarks, including VIPeR, Road, PRID450S, 3DPES, and CUHK01, evidence the superior performance of REGCT over other state-of-the-art approaches.
|
8
|
Dubey SR, Chakraborty S, Roy SK, Mukherjee S, Singh SK, Chaudhuri BB. diffGrad: An Optimization Method for Convolutional Neural Networks. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:4500-4511. [PMID: 31880565 DOI: 10.1109/tnnls.2019.2955777] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 06/10/2023]
Abstract
Stochastic gradient descent (SGD) is one of the core techniques behind the success of deep neural networks. The gradient provides information on the direction in which a function has the steepest rate of change. The main problem with basic SGD is that it updates all parameters with equal-sized steps, irrespective of the gradient behavior. Hence, an efficient way to optimize deep networks is to use an adaptive step size for each parameter. Recently, several attempts have been made to improve gradient descent methods, such as AdaGrad, AdaDelta, RMSProp, and adaptive moment estimation (Adam). These methods rely on the square roots of exponential moving averages of squared past gradients and thus do not take advantage of local changes in the gradients. In this article, a novel optimizer, diffGrad, is proposed based on the difference between the present and the immediate past gradient. In the proposed diffGrad optimization technique, the step size is adjusted for each parameter so that parameters whose gradients change quickly receive a larger step size and parameters whose gradients change slowly receive a smaller one. The convergence analysis is done using the regret bound approach of the online learning framework. A thorough analysis is carried out on three synthetic complex nonconvex functions. Image categorization experiments are also conducted on the CIFAR10 and CIFAR100 data sets to compare the performance of diffGrad with state-of-the-art optimizers such as SGDM, AdaGrad, AdaDelta, RMSProp, AMSGrad, and Adam. A residual unit (ResNet)-based convolutional neural network (CNN) architecture is used in the experiments. The experiments show that diffGrad outperforms the other optimizers. We also show that diffGrad performs uniformly well for training CNNs with different activation functions. The source code is made publicly available at https://github.com/shivram1987/diffGrad.
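The step-size idea described above can be sketched as an Adam-style update multiplied by a per-parameter factor that grows with the change between the previous and current gradient. The abstract does not give the formula, so the sigmoid of the absolute gradient difference used here, the hyperparameter defaults, and the function name `diffgrad_step` are assumptions for illustration; the linked repository is the authoritative implementation.

```python
import math


def diffgrad_step(theta, grad, prev_grad, m, v, t,
                  lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One update on plain Python lists; returns (theta, m, v).

    t is the 1-based iteration count used for bias correction.
    """
    new_theta, new_m, new_v = [], [], []
    for p, g, pg, mi, vi in zip(theta, grad, prev_grad, m, v):
        # Adam-style exponential moving averages of gradient and its square.
        mi = b1 * mi + (1 - b1) * g
        vi = b2 * vi + (1 - b2) * g * g
        m_hat = mi / (1 - b1 ** t)
        v_hat = vi / (1 - b2 ** t)
        # Assumed scaling factor in [0.5, 1): near 0.5 when the gradient
        # barely changed, near 1 when it changed a lot.
        friction = 1.0 / (1.0 + math.exp(-abs(pg - g)))
        new_theta.append(p - lr * friction * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_theta, new_m, new_v
```

In this sketch a parameter whose gradient is unchanged since the last step moves about half as far as Adam would move it, while a parameter whose gradient changed sharply moves almost the full Adam step.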
|
9
|
Wan A. Stable recovery of approximately k-sparse signals in noisy cases via ℓ minimization. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.04.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
10
|
Rout DK, Subudhi BN, Veerakumar T, Chaudhury S. Walsh–Hadamard-Kernel-Based Features in Particle Filter Framework for Underwater Object Tracking. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS 2020; 16:5712-5722. [DOI: 10.1109/tii.2019.2937902] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 07/19/2023]
|
11
|
Talal TM, Attiya G, Metwalli MR, Abd El-Samie FE, Dessouky MI. Satellite image fusion based on modified central force optimization. MULTIMEDIA TOOLS AND APPLICATIONS 2020; 79:21129-21154. [DOI: 10.1007/s11042-019-08471-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/18/2018] [Revised: 08/10/2019] [Accepted: 11/12/2019] [Indexed: 09/02/2023]
|
12
|
Walia GS, Ahuja H, Kumar A, Bansal N, Sharma K. Unified Graph-Based Multicue Feature Fusion for Robust Visual Tracking. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:2357-2368. [PMID: 31251204 DOI: 10.1109/tcyb.2019.2920289] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/09/2023]
Abstract
Visual tracking is a complex problem due to unconstrained appearance variations and a dynamic environment. The extraction of complementary information from the object environment via multiple features and adaption to the target's appearance variations are the key problems of this paper. To this end, we propose a robust object tracking framework based on the unified graph fusion (UGF) of multicue to adapt to the object's appearance. The proposed cross-diffusion of sparse and dense features not only suppresses the individual feature deficiencies but also extracts the complementary information from multicue. This iterative process builds robust unified features which are invariant to object deformations, fast motion, and occlusion. Robustness of the unified feature also enables the random forest classifier to precisely distinguish the foreground from the background, adding resilience to background clutter. In addition, we present a novel kernel-based adaptation strategy using outlier detection and a transductive reliability metric. The adaptation strategy updates the appearance model to accommodate variations in scale, illumination, and rotation. Both qualitative and quantitative analyses on benchmark video sequences from OTB-50, OTB-100, VOT2017/18, and UAV123 show that the proposed UGF tracker performs favorably against 18 other state-of-the-art trackers under various object tracking challenges.
|
13
|
Deng C, Han Y, Zhao B. High-Performance Visual Tracking With Extreme Learning Machine Framework. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:2781-2792. [PMID: 30624237 DOI: 10.1109/tcyb.2018.2886580] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 06/09/2023]
Abstract
In real-time applications, a fast and robust visual tracker should generally have the following important properties: 1) feature representation of an object that is not only efficient but also has a good discriminative capability and 2) appearance modeling which can quickly adapt to the variations of foreground and backgrounds. However, most of the existing tracking algorithms cannot achieve satisfactory performance in both of the two aspects. To address this issue, in this paper, we advocate a novel and efficient visual tracker by exploiting the excellent feature learning and classification capabilities of an emerging learning technique, that is, extreme learning machine (ELM). The contributions of the proposed work are as follows: 1) motivated by the simplicity and learning ability of the ELM autoencoder (ELM-AE), an ELM-AE-based feature extraction model is presented, and this model can provide a compact and discriminative representation of the inputs efficiently and 2) due to the fast learning speed of an ELM classifier, an ELM-based appearance model is developed for feature classification, and is able to rapidly distinguish the object of interest from its surroundings. In addition, in order to cope with the visual changes of the target and its backgrounds, the online sequential ELM is used to incrementally update the appearance model. Plenty of experiments on challenging image sequences demonstrate the effectiveness and robustness of the proposed tracker.
|
14
|
|
15
|
A new graph-preserving unsupervised feature selection embedding LLE with low-rank constraint and feature-level representation. Artif Intell Rev 2020. [DOI: 10.1007/s10462-019-09749-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/26/2022]
|
16
|
Zhou T, Zhang C, Gong C, Bhaskar H, Yang J. Multiview Latent Space Learning With Feature Redundancy Minimization. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:1655-1668. [PMID: 30571651 DOI: 10.1109/tcyb.2018.2883673] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 06/09/2023]
Abstract
Multiview learning has received extensive research interest and has demonstrated promising results in recent years. Despite the progress made, two significant challenges remain within multiview learning. First, some of the existing methods directly use the original features to reconstruct data points without considering the issue of feature redundancy. Second, existing methods cannot fully exploit the complementary information across multiple views while preserving the view-specific properties, which degrades learning performance. To address these issues, we propose a novel multiview latent space learning framework with feature redundancy minimization. We aim to learn a latent space that mitigates feature redundancy and to use the learned representation to reconstruct every original data point. More specifically, we first project the original features from multiple views onto a latent space, and then learn a shared dictionary and view-specific dictionaries to, respectively, exploit the correlations across multiple views and preserve the view-specific properties. Furthermore, the Hilbert-Schmidt independence criterion is adopted as a diversity constraint to explore the complementarity of multiview representations, which further ensures the diversity across multiple views and preserves the local structure of the data in each view. Experimental results on six public datasets demonstrate the effectiveness of our multiview learning approach against other state-of-the-art methods.
|
17
|
Jia F, Wang X, Guan J, Liao Q, Zhang J, Li H, Qi S. Bi-Connect Net for salient object detection. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
|
18
|
Ashiba MI, Tolba MS, El-Fishawy AS, El-Samie FEA. Hybrid enhancement of infrared night vision imaging system. MULTIMEDIA TOOLS AND APPLICATIONS 2020; 79:6085-6108. [DOI: 10.1007/s11042-019-7510-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2018] [Revised: 02/07/2019] [Accepted: 03/18/2019] [Indexed: 09/02/2023]
|
19
|
Lan X, Ye M, Zhang S, Zhou H, Yuen PC. Modality-correlation-aware sparse representation for RGB-infrared object tracking. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2018.10.002] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Indexed: 11/16/2022]
|
20
|
Lightweight Attention Pyramid Network for Object Detection and Instance Segmentation. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10030883] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/16/2022]
Abstract
Feature pyramids of convolutional neural networks (ConvNets), built from bottom to top, are used in most recent work to improve object detection accuracy, but such work seldom addresses the correlation among feature channels or the fusion of low-level and high-level features. In this paper, an Attention Pyramid Network (APN) is proposed, which mainly contains an adaptive transformation module and a feature attention block. The adaptive transformation module performs multiscale feature fusion and makes full use of the accurate target location information of low-level features and the semantic information of high-level features. The feature attention block then learns to strengthen the features of important channels and weaken the features of unimportant channels. By implementing the APN in a basic Mask R-CNN system, our method achieves state-of-the-art results on the MS COCO dataset and the 2018 WAD database without bells and whistles. In addition, the structure of the APN keeps the number of network parameters small, and the APN runs in 4 ms on average, which is negligible compared with the inference time of the ConvNet backbone.
|
21
|
Shi Y, Suk HI, Gao Y, Lee SW, Shen D. Leveraging Coupled Interaction for Multimodal Alzheimer's Disease Diagnosis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:186-200. [PMID: 30908241 DOI: 10.1109/tnnls.2019.2900077] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 06/09/2023]
Abstract
As the population becomes older worldwide, accurate computer-aided diagnosis for Alzheimer's disease (AD) in the early stage has been regarded as a crucial step for neurodegeneration care in recent years. Since it extracts the low-level features from the neuroimaging data, previous methods regarded this computer-aided diagnosis as a classification problem that ignored latent featurewise relation. However, it is known that multiple brain regions in the human brain are anatomically and functionally interlinked according to the current neuroscience perspective. Thus, it is reasonable to assume that the extracted features from different brain regions are related to each other to some extent. Also, the complementary information between different neuroimaging modalities could benefit multimodal fusion. To this end, we consider leveraging the coupled interactions in the feature level and modality level for diagnosis in this paper. First, we propose capturing the feature-level coupled interaction using a coupled feature representation. Then, to model the modality-level coupled interaction, we present two novel methods: 1) the coupled boosting (CB) that models the correlation of pairwise coupled-diversity on both inconsistently and incorrectly classified samples between different modalities and 2) the coupled metric ensemble (CME) that learns an informative feature projection from different modalities by integrating the intrarelation and interrelation of training samples. We systematically evaluated our methods with the AD neuroimaging initiative data set. By comparison with the baseline learning-based methods and the state-of-the-art methods that are specially developed for AD/MCI (mild cognitive impairment) diagnosis, our methods achieved the best performance with accuracy of 95.0% and 80.7% (CB), 94.9% and 79.9% (CME) for AD/NC (normal control), and MCI/NC identification, respectively.
|
22
|
Zhang H, Qian J, Zhang B, Yang J, Gong C, Wei Y. Low-Rank Matrix Recovery via Modified Schatten-p Norm Minimization with Convergence Guarantees. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:3132-3142. [PMID: 31831418 DOI: 10.1109/tip.2019.2957925] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 06/10/2023]
Abstract
In recent years, low-rank matrix recovery problems have attracted much attention in computer vision and machine learning. The corresponding rank minimization problems are both combinatorial and NP-hard in general, which are mainly solved by both nuclear norm and Schatten-p (0
|
23
|
Zheng P, Zhao H, Zhan J, Yan Y, Ren J, Lv J, Huang Z. Incremental learning-based visual tracking with weighted discriminative dictionaries. INT J ADV ROBOT SYST 2019. [DOI: 10.1177/1729881419890155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Existing sparse representation-based visual tracking methods detect the target position by minimizing the reconstruction error. However, due to complex backgrounds, illumination changes, and occlusion, these methods have difficulty locating the target properly. In this article, we propose a novel visual tracking method based on weighted discriminative dictionaries and a pyramidal feature selection strategy. First, we utilize color features and texture features of the training samples to obtain multiple discriminative dictionaries. Then, we use the position information of those samples to assign weights to the base vectors in the dictionaries. For robust visual tracking, we propose a pyramidal sparse feature selection strategy in which the weights of the base vectors and the reconstruction errors of different features are integrated to obtain the best target regions. At the same time, we measure feature reliability to dynamically adjust the weights of different features. In addition, we introduce a scenario-aware mechanism and an incremental dictionary update method based on noise energy analysis. Comparison experiments show that the proposed algorithm outperforms several state-of-the-art methods, and useful quantitative and qualitative analyses are also carried out.
Affiliation(s)
- Penggen Zheng, School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Huimin Zhao, School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Jin Zhan, School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Yijun Yan, Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow, UK
- Jinchang Ren, Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow, UK
- Jujian Lv, School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Zhihui Huang, School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
|
24
|
Abdellatef E, Ismail NA, Abd Elrahman SESE, Ismail KN, Rihan M, Abd El-Samie FE. Cancelable fusion-based face recognition. MULTIMEDIA TOOLS AND APPLICATIONS 2019; 78:31557-31580. [DOI: 10.1007/s11042-019-07848-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/14/2018] [Revised: 05/15/2019] [Accepted: 05/31/2019] [Indexed: 09/01/2023]
|
25
|
Zhu G, Zhang Z, Wang J, Wu Y, Lu H. Dynamic Collaborative Tracking. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:3035-3046. [PMID: 32175852 DOI: 10.1109/tnnls.2018.2861838] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
Correlation filters have recently demonstrated remarkable success in visual tracking. However, most existing methods often suffer from model drift caused by several factors, such as unlimited boundary effects, heavy occlusion, fast motion, and distracter perturbation. To address this issue, this paper proposes a unified dynamic collaborative tracking framework that can perform more flexible and robust position prediction. Specifically, the framework learns the object appearance model by jointly training an objective function with three components: a target regression submodule, a distracter suppression submodule, and a maximum margin relation submodule. The first submodule mainly takes advantage of the circulant structure of the training samples to learn to distinguish the target from its surrounding background. The second submodule drives the label response of possible distracting regions toward zero, reducing the peak value of the confidence map in those regions. Inspired by structured output support vector machines, the third submodule utilizes the differences between the target appearance representation and the distracter appearance representation in the discriminative mapping space to alleviate the disturbance of the most likely hard negative samples. In addition, a CUR filter is embedded as an assistant detector to provide effective object candidates and alleviate the model drift problem. Comprehensive experimental results show that the proposed approach achieves state-of-the-art performance on several public benchmark data sets.
|
26
|
Group variable selection via ℓp,0 regularization and application to optimal scoring. Neural Netw 2019; 118:220-234. [DOI: 10.1016/j.neunet.2019.05.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 08/28/2018] [Revised: 05/12/2019] [Accepted: 05/19/2019] [Indexed: 11/22/2022]
|
27
|
Zhang H, Gong C, Qian J, Zhang B, Xu C, Yang J. Efficient Recovery of Low-Rank Matrix via Double Nonconvex Nonsmooth Rank Minimization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2916-2925. [PMID: 30892254 DOI: 10.1109/tnnls.2019.2900572] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/09/2023]
Abstract
Recently, there has been rapidly growing interest in the efficient recovery of low-rank matrices in computer vision and machine learning. The popular convex approach to rank minimization is nuclear norm-based minimization (NNM), which usually leads to a biased solution since NNM tends to overshrink the rank components and treats each rank component equally. To address this issue, some nonconvex nonsmooth rank (NNR) relaxations have been widely exploited. Different from these convex and nonconvex rank substitutes, this paper first introduces a general and flexible rank relaxation function named the weighted NNR relaxation function, which is derived from the initial double NNR (DNNR) relaxations, i.e., the DNNR relaxation function acts on the nonconvex singular values function (SVF). An iteratively reweighted SVF optimization algorithm with a continuation technique, which computes supergradient values to define the weighting vector, is devised to solve the DNNR minimization problem, and the closed-form solution of the subproblem can be efficiently obtained by a general proximal operator, in which the elements of the desired weighting vector are usually in nondecreasing order. We next prove that the objective function values decrease monotonically and that any limit point of the generated subsequence is a critical point. Combining the Kurdyka-Łojasiewicz property with some milder assumptions, we further give its global convergence guarantee. As an application to the matrix completion problem, experimental results on both synthetic data and real-world data show that our methods are competitive with several state-of-the-art convex and nonconvex matrix completion methods.
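The overshrinking argument can be illustrated on a list of singular values alone. The sketch below contrasts the uniform threshold implied by the nuclear norm with a reweighted threshold whose weights come from a supergradient of a concave surrogate. The surrogate log(sigma + eps), the value of `eps`, and the function names are assumptions chosen for illustration; they are not the paper's DNNR functions or its algorithm.

```python
def uniform_shrink(sigmas, tau):
    """Nuclear-norm style shrinkage: every singular value loses tau."""
    return [max(s - tau, 0.0) for s in sigmas]


def supergradient_weights(sigmas, eps=1e-3):
    """Supergradient of the assumed surrogate log(s + eps) at each value.

    For singular values sorted in descending order the weights come out
    in nondecreasing order, so large components are penalized less.
    """
    return [1.0 / (s + eps) for s in sigmas]


def reweighted_shrink(sigmas, tau, eps=1e-3):
    """Weighted shrinkage: each value loses tau times its own weight."""
    weights = supergradient_weights(sigmas, eps)
    return [max(s - tau * w, 0.0) for s, w in zip(sigmas, weights)]
```

For singular values `[10.0, 1.0, 0.1]` and `tau = 0.5`, the uniform rule removes 0.5 from the dominant component, whereas the reweighted rule removes only about 0.05 from it while still driving the smallest component to zero.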
|
28
|
|
29
|
Computational Imaging Method with a Learned Plug-and-Play Prior for Electrical Capacitance Tomography. Cognit Comput 2019. [DOI: 10.1007/s12559-019-09682-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
|
30
|
Zhou JT, Fang M, Zhang H, Gong C, Peng X, Cao Z, Goh RSM. Learning With Annotation of Various Degrees. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2794-2804. [PMID: 30640630 DOI: 10.1109/tnnls.2018.2885854] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 06/09/2023]
Abstract
In this paper, we study a new problem in the scenario of sequence labeling. To be exact, we consider training data with annotation of various degrees, namely, fully labeled, unlabeled, and partially labeled sequences. Learning with fully labeled or unlabeled sequences refers to the standard setting in traditional supervised or unsupervised learning, and the proposed partial labeling specifies the subject to which an element does not belong. Partially labeled data are cheaper to obtain than fully labeled data, though they are less informative, especially when the tasks require a lot of domain knowledge. To solve such a practical challenge, we propose a novel deep conditional random field (CRF) model which is trained end to end to handle fully labeled, unlabeled, and partially labeled sequences within a unified framework. To the best of our knowledge, this could be one of the first works to utilize partially labeled instances for sequence labeling, and the proposed algorithm unifies deep learning and CRF in an end-to-end framework. Extensive experiments show that our method achieves state-of-the-art performance in two sequence labeling tasks on some popular data sets.
|
31
|
Zhang H, Qian J, Gao J, Yang J, Xu C. Scalable Proximal Jacobian Iteration Method With Global Convergence Analysis for Nonconvex Unconstrained Composite Optimizations. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2825-2839. [PMID: 30668503 DOI: 10.1109/tnnls.2018.2885699] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/09/2023]
Abstract
Recent studies have found that nonconvex relaxation functions usually perform better than their convex counterparts in l0-norm and rank function minimization problems. However, due to the absence of convexity in these nonconvex problems, developing efficient algorithms with convergence guarantees becomes very challenging. Inspired by the basic ideas of both the Jacobian alternating direction method of multipliers (JADMM) for solving linearly constrained problems with separable objectives and the proximal gradient methods (PGMs) for optimizing unconstrained problems with one variable, this paper focuses on extending the PGMs to the proximal Jacobian iteration methods (PJIMs) for handling a family of nonconvex composite optimization problems with two splitting variables. To reduce the total computational complexity by decreasing the number of iterations, we devise an accelerated version of PJIMs through the well-known Nesterov acceleration strategy and further extend both to solve the multivariable cases. Most importantly, we provide a rigorous convergence analysis to show that the generated variable sequence globally converges to a critical point by exploiting the Kurdyka-Łojasiewicz (KŁ) property for a broad class of functions. Furthermore, we also establish the linear and sublinear convergence rates of the obtained variable sequence in the objective function. As a specific application to nonconvex sparse and low-rank recovery problems, several numerical experiments verify that the newly proposed algorithms not only keep fast convergence speed but also have high precision.
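For readers unfamiliar with the building blocks named in this abstract, the following is a minimal sketch of a single-variable proximal gradient method with Nesterov-style extrapolation. It is not the PJIM algorithm from the paper: it assumes a least-squares data term and, as a convex stand-in for the nonconvex penalties the paper studies, an l1 penalty whose proximal operator is soft-thresholding. All function names, the test problem, and the step size are hypothetical.

```python
import math


def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1, applied elementwise.
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


def grad_least_squares(A, b, x):
    # Gradient of 0.5 * ||A x - b||^2, i.e. A^T (A x - b).
    r = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(x))]


def accelerated_proximal_gradient(A, b, lam, step, x0, iters=200):
    # Proximal gradient steps taken at an extrapolated point y (FISTA-style).
    x_prev, x, y, t = list(x0), list(x0), list(x0), 1.0
    for _ in range(iters):
        g = grad_least_squares(A, b, y)
        x_prev, x = x, soft_threshold(
            [yi - step * gi for yi, gi in zip(y, g)], step * lam
        )
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next  # extrapolation weight
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        t = t_next
    return x


# Small separable problem (illustrative numbers); step = 1 / L with L = 4,
# the largest eigenvalue of A^T A.
A = [[2.0, 0.0], [0.0, 1.0]]
b = [4.0, 0.3]
solution = accelerated_proximal_gradient(A, b, lam=0.5, step=0.25, x0=[0.0, 3.0])
```

On this separable example the minimizer can be checked by hand: the first coordinate solves 2(2x - 4) + 0.5 = 0, giving 1.875, and the second is thresholded to zero, so `solution` should approach `[1.875, 0.0]`.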
|
32
|
Zhang C, Cheng J, Tian Q. Multi-View Image Classification With Visual, Semantic And View Consistency. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:617-627. [PMID: 31425078 DOI: 10.1109/tip.2019.2934576] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
Multi-view visual classification methods have been widely applied to use discriminative information of different views. This strategy has been proven very effective by many researchers. On the one hand, images are often treated independently without fully considering their visual and semantic correlations. On the other hand, view consistency is often ignored. To solve these problems, in this paper, we propose a novel multi-view image classification method with visual, semantic and view consistency (VSVC). For each image, we linearly combine multi-view information for image classification. The combination parameters are determined by considering both the classification loss and the visual, semantic and view consistency. Visual consistency is imposed by ensuring that visually similar images of the same view are predicted to have similar values. For semantic consistency, we impose the locality constraint that nearby images should be predicted to have the same class by multiview combination. View consistency is also used to ensure that similar images have consistent multi-view combination parameters. An alternative optimization strategy is used to learn the combination parameters. To evaluate the effectiveness of VSVC, we perform image classification experiments on several public datasets. The experimental results on these datasets show the effectiveness of the proposed VSVC method.
|
33
|
A sum-modified-Laplacian and sparse representation based multimodal medical image fusion in Laplacian pyramid domain. Med Biol Eng Comput 2019; 57:2265-2275. [DOI: 10.1007/s11517-019-02023-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/23/2018] [Accepted: 07/29/2019] [Indexed: 10/26/2022]
|
34
|
Abstract
Curriculum Learning (CL) is a recently proposed learning paradigm that aims to achieve satisfactory performance by properly organizing the learning sequence from simple curriculum examples to more difficult ones. Up to now, few works have been done to explore CL for the data with graph structure. Therefore, this article proposes a novel CL algorithm that can be utilized to guide the Label Propagation (LP) over graphs, of which the target is to “learn” the labels of unlabeled examples on the graphs. Specifically, we assume that different unlabeled examples have different levels of difficulty for propagation, and their label learning should follow a simple-to-difficult sequence with the updated curricula. Furthermore, considering that the practical data are often characterized by multiple modalities, every modality in our method is associated with a “teacher” that not only evaluates the difficulties of examples from its own viewpoint, but also cooperates with other teachers to generate the overall simplest curriculum examples for propagation. By taking the curriculums suggested by the teachers as a whole, the common preference (i.e., commonality) of teachers on selecting the simplest examples can be discovered by a row-sparse matrix, and their distinct opinions (i.e., individuality) are captured by a sparse noise matrix. As a result, an accurate curriculum sequence can be established and the propagation quality can thus be improved. Theoretically, we prove that the propagation risk bound is closely related to the examples’ difficulty information, and empirically, we show that our method can generate higher accuracy than the state-of-the-art CL approach and LP algorithms on various multi-modal tasks.
Affiliation(s)
- Chen Gong
- PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
- Jian Yang
- PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
- Dacheng Tao
- UBTECH Sydney Artificial Intelligence Centre and the School of Computer Science, Faculty of Engineering and Information Technologies, the University of Sydney, Sydney, Australia
|
35
|
Liu Y, Nie F, Gao Q, Gao X, Han J, Shao L. Flexible unsupervised feature extraction for image classification. Neural Netw 2019; 115:65-71. [DOI: 10.1016/j.neunet.2019.03.008] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 07/27/2018] [Revised: 03/13/2019] [Accepted: 03/13/2019] [Indexed: 11/27/2022]
|
36
|
Lücke J, Forster D. k-means as a variational EM approximation of Gaussian mixture models. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2019.04.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 11/24/2022]
|
37
|
|
38
|
Zhao Y, Liu Y, Wen G, Huang T. Finite-Time Distributed Average Tracking for Second-Order Nonlinear Systems. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:1780-1789. [PMID: 30371392 DOI: 10.1109/tnnls.2018.2873676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
This paper studies the distributed average tracking (DAT) problem for multiple reference signals described by second-order nonlinear dynamical systems. Leveraging state-dependent gain design and adaptive control approaches, two DAT algorithms are developed, namely finite-time and adaptive-gain DAT algorithms. With the finite-time algorithm, the states of the physical agents can track the average of the time-varying reference signals within a finite settling time. Furthermore, the finite settling time is estimated by considering a well-designed Lyapunov function. Compared with asymptotic DAT algorithms, the proposed finite-time algorithm not only solves finite-time DAT problems but also ensures that the states of the physical agents achieve an accurate average of the multiple signals. Then, an adaptive-gain DAT algorithm is designed, with which the DAT problem is solved without global information. Thus, it is fully distributed. Finally, numerical simulations show the effectiveness of the theoretical results.
|
39
|
|
40
|
Ye M, Li J, Ma AJ, Zheng L, Yuen PC. Dynamic Graph Co-Matching for Unsupervised Video-based Person Re-Identification. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:2976-2990. [PMID: 30640612 DOI: 10.1109/tip.2019.2893066] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 06/09/2023]
Abstract
Cross-camera label estimation from a set of unlabelled training data is an extremely important component in unsupervised person re-identification (re-ID) systems. With the estimated labels, existing advanced supervised learning methods can be leveraged to learn discriminative re-ID models. In this paper, we utilize the graph matching technique for accurate label estimation due to its advantages in optimal global matching and intra-camera relationship mining. However, the graph structure constructed with non-learnt similarity measurement cannot handle the large cross-camera variations, which leads to noisy and inaccurate label outputs. This paper designs a Dynamic Graph Matching (DGM) framework, which improves the label estimation process by iteratively refining the graph structure with better similarity measurement learnt from intermediate estimated labels. In addition, we design a positive re-weighting strategy to refine the intermediate labels, which enhances the robustness against inaccurate matching output and noisy initial training data. To fully utilize the abundant video information and reduce false matchings, a co-matching strategy is further incorporated into the framework. Comprehensive experiments conducted on three video benchmarks demonstrate that DGM outperforms state-of-the-art unsupervised re-ID methods and yields competitive performance to fully supervised upper bounds.
|
41
|
A novel reverse sparse model utilizing the spatio-temporal relationship of target templates for object tracking. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.10.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022]
|
42
|
|
43
|
Cui A, Peng J, Li H. Exact recovery low-rank matrix via transformed affine matrix rank minimization. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.05.092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]
|
44
|
Yun X, Sun Y, Wang S, Shi Y, Lu N. Multi-layer convolutional network-based visual tracking via important region selection. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.07.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
|
45
|
Zhou T, Liu F, Bhaskar H, Yang J. Robust Visual Tracking via Online Discriminative and Low-Rank Dictionary Learning. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:2643-2655. [PMID: 28920914 DOI: 10.1109/tcyb.2017.2747998] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 06/07/2023]
Abstract
In this paper, we propose a novel and robust tracking framework based on online discriminative and low-rank dictionary learning. The primary aim of this paper is to obtain compact and low-rank dictionaries that can provide good discriminative representations of both target and background. We accomplish this by exploiting the recovery ability of low-rank matrices. That is, if we assume that the data from the same class are linearly correlated, then the corresponding basis vectors learned from the training set of each class render the dictionary approximately low-rank. The proposed dictionary learning technique incorporates a reconstruction error that improves the reliability of classification. Also, a multiconstraint objective function is designed to enable active learning of a discriminative and robust dictionary. Further, an optimal solution is obtained by iteratively computing the dictionary and coefficients while simultaneously learning the classifier parameters. Finally, a simple yet effective likelihood function is implemented to estimate the optimal state of the target during tracking. Moreover, to make the dictionary adaptive to the variations of the target and background during tracking, an online update criterion is employed while learning the new dictionary. Experimental results on a publicly available benchmark dataset demonstrate that the proposed tracking algorithm performs better than other state-of-the-art trackers.
|
46
|
|
47
|
Zhang C, Cheng J, Tian Q. Incremental Codebook Adaptation for Visual Representation and Categorization. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:2012-2023. [PMID: 28749362 DOI: 10.1109/tcyb.2017.2726079] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/07/2023]
Abstract
The bag-of-visual-words model is widely used for visual content analysis. For visual data, the codebook plays an important role for efficient representation. However, the codebook has to be relearned with the changes of training images. Once the codebook is changed, the encoding parameters of local features have to be recomputed. To alleviate this problem, in this paper, we propose an incremental codebook adaptation method for efficient visual representation. Instead of learning a new codebook, we gradually adapt a prelearned codebook using new images in an incremental way. To make use of the prelearned codebook, we try to make changes to the prelearned codebook with sparsity constraint and low-rank correlation. Besides, we also encode visually similar local features within a neighborhood to take advantage of locality information and ensure the encoded parameters are consistent. To evaluate the effectiveness of the proposed method, we apply the proposed method for categorization tasks on several public image datasets. Experimental results prove the effectiveness and usefulness of the proposed method over other codebook-based methods.
|
48
|
Zhou JT, Zhao H, Peng X, Fang M, Qin Z, Goh RSM. Transfer Hashing: From Shallow to Deep. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:6191-6201. [PMID: 29993900 DOI: 10.1109/tnnls.2018.2827036] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/08/2023]
Abstract
One major assumption used in most existing hashing approaches is that the domain of interest (i.e., the target domain) could provide sufficient training data, either labeled or unlabeled. However, this assumption may be violated in practice. To address this so-called data sparsity issue in hashing, a new framework termed transfer hashing with privileged information (THPI) is proposed, which marries hashing and transfer learning (TL). To show the efficacy of THPI, we propose three variants of the well-known iterative quantization (ITQ) as a showcase. The proposed methods, ITQ+, LapITQ+, and deep transfer hashing (DTH), solve the aforementioned data sparsity issue from different aspects. Specifically, ITQ+ is a shallow model, which makes ITQ achieve hashing in a TL manner. ITQ+ learns a new slack function from the source domain to approximate the quantization error on the target domain given by ITQ. To further improve the performance of ITQ+, LapITQ+ is proposed by embedding the geometric relationship of the source domain into the target domain. Moreover, DTH is proposed to show the generality of our framework by utilizing the powerful representative capacity of deep learning. To the best of our knowledge, this could be one of the first DTH works. Extensive experiments on several popular data sets demonstrate the effectiveness of our shallow and DTH approaches compared with several state-of-the-art hashing approaches.
|
49
|
|
50
|
Yuen PC, Chellappa R. Learning Common and Feature-Specific Patterns: A Novel Multiple-Sparse-Representation-Based Tracker. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:2022-2037. [PMID: 29989985 DOI: 10.1109/tip.2017.2777183] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 06/08/2023]
Abstract
The use of multiple features has been shown to be an effective strategy for visual tracking because of their complementary contributions to appearance modeling. The key problem is how to learn a fused representation from multiple features for appearance modeling. Different features extracted from the same object should share some commonalities in their representations while each feature should also have some feature-specific representation patterns which reflect its complementarity in appearance modeling. Different from existing multi-feature sparse trackers which only consider the commonalities among the sparsity patterns of multiple features, this paper proposes a novel multiple sparse representation framework for visual tracking which jointly exploits the shared and feature-specific properties of different features by decomposing multiple sparsity patterns. Moreover, we introduce a novel online multiple metric learning to efficiently and adaptively incorporate the appearance proximity constraint, which ensures that the learned commonalities of multiple features are more representative. Experimental results on tracking benchmark videos and other challenging videos demonstrate the effectiveness of the proposed tracker.
|