1. Wei X, Shen H, Kleinsteuber M. Trace Quotient with Sparsity Priors for Learning Low Dimensional Image Representations. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020; 42:3119-3135. PMID: 31180888. DOI: 10.1109/tpami.2019.2921031.
Abstract
This work studies the problem of learning appropriate low dimensional image representations. We propose a generic algorithmic framework, which leverages two classic representation learning paradigms, i.e., sparse representation and the trace quotient criterion, to disentangle underlying factors of variation in high dimensional images. Specifically, we aim to learn simple representations of low dimensional, discriminant factors by applying the trace quotient criterion to well-engineered sparse representations. We construct a unified cost function, coined the SPARse LOW dimensional representation (SparLow) function, for jointly learning both a sparsifying dictionary and a dimensionality reduction transformation. The SparLow function is widely applicable for developing various algorithms in three classic machine learning scenarios, namely, unsupervised, supervised, and semi-supervised learning. To develop efficient joint learning algorithms for maximizing the SparLow function, we deploy a framework of sparse coding with appropriate convex priors to ensure that the sparse representations are locally differentiable. Moreover, we develop an efficient geometric conjugate gradient algorithm to maximize the SparLow function on its underlying Riemannian manifold. Performance of the proposed SparLow algorithmic framework is investigated on several image processing tasks, such as 3D data visualization, face/digit recognition, and object/scene categorization.
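As a minimal illustration of the trace quotient criterion (not the paper's Riemannian conjugate gradient solver on sparse codes), the classic ratio-trace relaxation of max_W tr(W^T A W)/tr(W^T B W) can be solved with generalized eigenvectors; the matrices and dimensions below are toy assumptions:

```python
import numpy as np
from scipy.linalg import eigh

def ratio_trace(A, B, d):
    """Maximize tr(W^T A W) / tr(W^T B W) via the classic ratio-trace
    relaxation: take the top-d generalized eigenvectors of the pencil (A, B)."""
    vals, vecs = eigh(A, B)   # generalized eigenvalues in ascending order
    return vecs[:, -d:]       # columns spanning the top-d subspace

# Toy example: A concentrates energy on two coordinates, B is the identity.
A = np.diag([5.0, 4.0, 0.1, 0.1])
B = np.eye(4)
W = ratio_trace(A, B, 2)
score = np.trace(W.T @ A @ W) / np.trace(W.T @ B @ W)
```

Here `score` equals the mean of the two largest generalized eigenvalues, the best achievable trace quotient for a 2-dimensional subspace of this toy pencil.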
2. Cai M, Zhang Z, Shi X, Yang J, Hu Z, Tian J. Non-Negative Iterative Convex Refinement Approach for Accurate and Robust Reconstruction in Cerenkov Luminescence Tomography. IEEE Transactions on Medical Imaging 2020; 39:3207-3217. PMID: 32324543. DOI: 10.1109/tmi.2020.2987640.
Abstract
Cerenkov luminescence tomography (CLT) is a promising imaging tool for obtaining three-dimensional (3D) non-invasive visualization of the in vivo distribution of radiopharmaceuticals. However, its reconstruction performance remains unsatisfactory for biomedical applications because the inverse problem of CLT is severely ill-conditioned. In this study, a novel non-negative iterative convex refinement (NNICR) approach was therefore developed to improve the reconstruction accuracy, robustness, and shape recovery capability of CLT. A spike-and-slab prior was employed to capture the sparsity of the Cerenkov source, which leads to a non-convex optimization problem; the NNICR approach solves this problem by refining the solutions of a sequence of convex sub-problems. To evaluate the performance of the NNICR approach, numerical simulations and experiments on in vivo tumor-bearing mouse models were conducted. The conjugate-gradient-based Tikhonov regularization approach (CG-Tikhonov), the fast iterative shrinkage-thresholding algorithm based Lasso approach (Fista-Lasso), and the Elastic-Net regularization approach were used as baselines for comparing reconstruction performance. The results of these experiments demonstrated that the NNICR approach achieves superior reconstruction performance in terms of location accuracy, shape recovery capability, robustness, and in vivo practicability. This study is expected to facilitate the preclinical and clinical applications of CLT.
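The Fista-Lasso baseline mentioned for comparison can be sketched in a few lines. This is a generic FISTA iteration for the Lasso on synthetic data, not the NNICR method itself; all problem sizes and the regularization weight are illustrative:

```python
import numpy as np

def fista_lasso(A, b, lam, n_iter=500):
    """FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = z = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(n_iter):
        g = z - A.T @ (A @ z - b) / L      # gradient step on the smooth part
        x_new = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x

# Recover a 3-sparse vector from noiseless underdetermined measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 100))
x_true = np.zeros(100)
x_true[[3, 40, 77]] = [2.0, -1.5, 1.0]
b = A @ x_true
x_hat = fista_lasso(A, b, lam=0.05)
```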
3. Zheng J, Lou K, Yang X, Bai C, Tang J. Weighted Mixed-Norm Regularized Regression for Robust Face Identification. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:3788-3802. PMID: 30908239. DOI: 10.1109/tnnls.2019.2899073.
Abstract
Face identification (FI) via regression-based classification has been extensively studied in recent years. Most vector-based methods achieve appealing performance in handling noncontiguous pixelwise noise, while some matrix-based regression methods show great potential in dealing with contiguous imagewise noise. However, the mixture-noise case, where contiguous and noncontiguous noises are jointly present, has received little attention. In this paper, we propose a weighted mixed-norm regression (WMNR) method to cope with such mixture image corruption. WMNR reveals certain essential characteristics of FI problems and bridges the vector- and matrix-based methods. In particular, WMNR provides two advantages for both theoretical analysis and practical implementation. First, it generalizes possible distributions of the residuals into a unified feature-weighted loss function. Second, it constrains the residual image to a low-rank structure that can be quantified with general nonconvex functions and a weight factor. Moreover, a new reweighted alternating direction method of multipliers (ADMM) algorithm is derived for the proposed WMNR model. The algorithm is computationally efficient because it divides the original optimization problem into subproblems that either have analytical solutions or can be solved in parallel. Extensive experiments on several public face databases demonstrate the advantages of WMNR over state-of-the-art regression-based approaches. More specifically, WMNR achieves an appealing tradeoff between identification accuracy and computational efficiency. Compared with pure vector-based methods, our approach achieves more than 10% performance improvement and saves more than 70% of the runtime, especially in severe corruption scenarios. Compared with pure matrix-based methods, although it requires slightly more computation time, the performance benefits are even larger: up to 20% improvement can be obtained.
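The ADMM splitting idea behind such schemes can be illustrated on the simplest robust special case, regression with an l1 residual loss. This is a generic sketch on toy data, not the reweighted WMNR algorithm itself; the splitting `z = Ax - b` gives a least-squares x-step and a soft-thresholding z-step:

```python
import numpy as np

def admm_l1_regression(A, b, rho=1.0, n_iter=300):
    """ADMM for min_x ||Ax - b||_1 via the splitting z = Ax - b.
    x-step: least squares; z-step: elementwise soft thresholding."""
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(m), np.zeros(m)
    AtA_inv = np.linalg.inv(A.T @ A)       # small scale; use a solver in practice
    for _ in range(n_iter):
        x = AtA_inv @ (A.T @ (b + z - u))  # minimize ||Ax - (b + z - u)||^2
        v = A @ x - b + u
        z = np.sign(v) * np.maximum(np.abs(v) - 1.0 / rho, 0.0)
        u += A @ x - b - z                 # dual ascent on the constraint
    return x

# The l1 loss shrugs off a few gross outliers that would wreck least squares.
rng = np.random.default_rng(2)
A = rng.standard_normal((80, 5))
x_true = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
b = A @ x_true
b[:8] += 10.0                              # gross outliers in 10% of the entries
x_hat = admm_l1_regression(A, b)
```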
4. Cherukuri V, G VKB, Bala R, Monga V. Deep Retinal Image Segmentation with Regularization Under Geometric Priors. IEEE Transactions on Image Processing 2019; 29:2552-2567. PMID: 31613766. DOI: 10.1109/tip.2019.2946078.
Abstract
Vessel segmentation of retinal images is a key diagnostic capability in ophthalmology. The problem faces several challenges, including low contrast, variable vessel size and thickness, and the presence of interfering pathology such as micro-aneurysms and hemorrhages. Early approaches employed hand-crafted filters to capture vessel structures, accompanied by morphological post-processing. More recently, deep learning techniques have been employed with significantly enhanced segmentation accuracy. We propose a novel domain-enriched deep network that consists of two components: 1) a representation network that learns geometric features specific to retinal images, and 2) a custom-designed, computationally efficient residual task network that utilizes the features obtained from the representation layer to perform pixel-level segmentation. The representation and task networks are jointly learned for any given training set. To obtain physically meaningful and practically effective representation filters, we propose two new constraints inspired by the expected prior structure of these filters: 1) an orientation constraint that promotes geometric diversity of curvilinear features, and 2) a data-adaptive noise regularizer that penalizes false positives. Multi-scale extensions are developed to enable accurate detection of thin vessels. Experiments performed on three challenging benchmark databases under a variety of training scenarios show that the proposed prior-guided deep network outperforms state-of-the-art alternatives as measured by common evaluation metrics, while being more economical in network size and inference time.
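A hand-crafted analogue of orientation-diverse curvilinear filters is a bank of rotated anisotropic Gaussian kernels, the classic matched-filter approach to vessel detection. These are not the learned, regularized filters of the paper; the kernel size, widths, and number of angles below are illustrative:

```python
import numpy as np

def oriented_line_filters(size=15, sigma_along=4.0, sigma_across=1.5, n_angles=8):
    """Bank of rotated anisotropic Gaussian kernels: elongated along one
    axis (sigma_along) and narrow across it (sigma_across), sampled at
    n_angles orientations."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    bank = []
    for k in range(n_angles):
        th = k * np.pi / n_angles
        xr = x * np.cos(th) + y * np.sin(th)     # coordinates rotated by th
        yr = -x * np.sin(th) + y * np.cos(th)
        f = np.exp(-(xr**2 / (2 * sigma_along**2) + yr**2 / (2 * sigma_across**2)))
        f -= f.mean()                            # zero mean: respond to structure, not brightness
        bank.append(f / np.linalg.norm(f))       # unit-norm atoms for comparable responses
    return np.stack(bank)

bank = oriented_line_filters()
```

Correlating an image with each filter and taking the per-pixel maximum over orientations gives the usual matched-filter vesselness map.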
5. Joint Sparse and Low-Rank Multitask Learning with Laplacian-Like Regularization for Hyperspectral Classification. Remote Sensing 2018. DOI: 10.3390/rs10020322.
Abstract
Multitask learning (MTL) has recently provided significant performance improvements in the supervised classification of hyperspectral images (HSIs) by incorporating shared information across multiple tasks. However, the original MTL cannot effectively exploit both the local and global structures of the HSI, and the class label information is not fully used. Moreover, although mathematical morphology (MM) has attracted considerable interest for feature extraction from HSIs, it remains challenging to fully utilize multiple morphological profiles obtained with various structuring elements (SEs). In this paper, we propose a joint sparse and low-rank MTL method with Laplacian-like regularization (termed sllMTL) for hyperspectral classification that utilizes three-dimensional morphological profile (3D-MP) features. The main steps of the proposed method are twofold. First, the 3D-MPs are extracted by 3D-opening and 3D-closing operators; multiple 3D-MPs are obtained by adopting different SEs. Second, sllMTL performs hyperspectral classification by taking the 3D-MPs as the features of different tasks. In sllMTL, joint sparse and low-rank structures are exploited to capture task specificity and relatedness, respectively. Laplacian-like regularization is also added to make full use of the label information of the training samples. Experiments on three datasets demonstrate that the overall accuracy (OA) of the proposed method is at least about 2% higher than that of other state-of-the-art methods with very limited training samples.
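The first step, extracting 3D-MPs with opening and closing operators under several structuring elements, can be sketched with standard greyscale morphology; the cube dimensions and SE sizes below are illustrative:

```python
import numpy as np
from scipy import ndimage

def morphological_profiles_3d(cube, se_sizes=(3, 5, 7)):
    """Stack 3D opening and closing profiles of a hyperspectral cube,
    computed with cubic structuring elements of several sizes."""
    profiles = []
    for s in se_sizes:
        profiles.append(ndimage.grey_opening(cube, size=(s, s, s)))  # removes bright details smaller than the SE
        profiles.append(ndimage.grey_closing(cube, size=(s, s, s)))  # fills dark details smaller than the SE
    return np.stack(profiles)

cube = np.random.default_rng(3).random((8, 8, 8))   # toy (rows, cols, bands) cube
mps = morphological_profiles_3d(cube)               # 6 profiles: 2 operators x 3 SE sizes
```

Opening is anti-extensive and closing extensive, so each profile pair brackets the original cube pointwise.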
7. Monga V. Fast Low-Rank Shared Dictionary Learning for Image Classification. IEEE Transactions on Image Processing 2017; 26:5160-5175. PMID: 28742035. DOI: 10.1109/tip.2017.2729885.
Abstract
Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. This observation has been exploited partially in a recently proposed dictionary learning framework that separates the particularity and the commonality (COPAR). Inspired by this, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification, with more intuitive constraints. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e., we require that its spanning subspace have low dimension and that the coefficients corresponding to this dictionary be similar. For the particular dictionaries, we impose the well-known constraints of Fisher discrimination dictionary learning (FDDL). Furthermore, we develop new fast and accurate algorithms to solve the subproblems in the learning step, accelerating convergence. These algorithms can also be applied to FDDL and its extensions. Their efficiency is verified theoretically and experimentally by comparing their complexity and running time with those of other well-known dictionary learning methods. Experimental results on widely used image data sets establish the advantages of our method over state-of-the-art dictionary learning methods.
9. Zheng J, Yang P, Chen S, Shen G, Wang W. Iterative Re-Constrained Group Sparse Face Recognition With Adaptive Weights Learning. IEEE Transactions on Image Processing 2017; 26:2408-2423. PMID: 28320663. DOI: 10.1109/tip.2017.2681841.
Abstract
In this paper, we consider the robust face recognition problem via an iterative re-constrained group sparse classifier (IRGSC) with adaptive weights learning. Specifically, we propose a group sparse representation classification (GSRC) approach in which weighted features and groups are collaboratively adopted to encode more structure and discriminative information than other regression-based methods. In addition, we derive an efficient algorithm to optimize the proposed objective function and theoretically prove its convergence. Several appealing aspects are associated with IRGSC. First, adaptively learned weights can be seamlessly incorporated into the GSRC framework; this integrates the locality structure of the data and the validity information of the features into l2,p-norm regularization to form a unified formulation. Second, IRGSC is flexible with respect to the training-set size and the feature dimension, thanks to the l2,p-norm regularization. Third, the derived solution is proved to be a stationary point (globally optimal if p ≥ 1). Comprehensive experiments on representative data sets demonstrate that IRGSC is a robust discriminative classifier that significantly improves performance and efficiency compared with state-of-the-art methods in dealing with face occlusion, corruption, illumination changes, and so on.
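The l2,p-norm group regularization can be illustrated for the p = 1 special case with a short iteratively reweighted least-squares loop: each pass solves a ridge system whose per-group weights 1/(2||w_g||) majorize the group penalty. This is a generic group-lasso sketch on toy data, not the IRGSC algorithm with its adaptive feature weights:

```python
import numpy as np

def group_sparse_irls(X, y, groups, lam=0.5, n_iter=50, eps=1e-6):
    """IRLS for min_w ||y - Xw||^2 + lam * sum_g ||w_g||_2 (the l2,1 case)."""
    groups = np.asarray(groups)
    w = np.linalg.lstsq(X, y, rcond=None)[0]      # least-squares warm start
    for _ in range(n_iter):
        d = np.empty(X.shape[1])
        for g in np.unique(groups):
            idx = np.flatnonzero(groups == g)
            d[idx] = 1.0 / (2.0 * max(np.linalg.norm(w[idx]), eps))  # group reweighting
        w = np.linalg.solve(X.T @ X + lam * np.diag(d), X.T @ y)     # weighted ridge step
    return w

# Three groups of four coefficients; only the first group is active.
rng = np.random.default_rng(4)
X = rng.standard_normal((40, 12))
groups = [0] * 4 + [1] * 4 + [2] * 4
w_true = np.concatenate([rng.standard_normal(4), np.zeros(8)])
y = X @ w_true
w_hat = group_sparse_irls(X, y, groups)
```

The `eps` floor keeps the reweighting finite once a group collapses to zero, which is what drives whole inactive groups toward zero together.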
10. Luo L, Chen L, Yang J, Qian J, Zhang B. Tree-Structured Nuclear Norm Approximation with Applications to Robust Face Recognition. IEEE Transactions on Image Processing 2016; 25:5757-5767. PMID: 28113977. DOI: 10.1109/tip.2016.2612885.
Abstract
Structured sparsity, as an extension of standard sparsity, has shown outstanding performance when dealing with highly correlated variables in computer vision and pattern recognition. However, the traditional mixed (L1, L2) or (L1, L∞) group norms are weak at characterizing the internal structure of each group, since they cannot alleviate the correlations between variables. Recently, the nuclear norm has been shown to be useful for describing a spatially structured matrix variable: it captures the global structure of the matrix variable but overlooks the local structure. To combine the advantages of structured sparsity and the nuclear norm, this paper presents a tree-structured nuclear norm approximation (TSNA) model, which assumes that the representation residual with a tree-structured prior is a random matrix variable following a dependent matrix distribution. The extended alternating direction method of multipliers (EADMM) is utilized to solve the proposed model. An efficient bound condition based on extended restricted isometry constants is provided to show the exact recovery of the proposed model in the noisy case. In addition, TSNA is connected with recent methods such as the sparse representation based classifier (SRC), nuclear-L1 norm joint regression (NL1R), and nuclear norm based matrix regression (NMR), which can be regarded as special cases of TSNA. Experiments with face reconstruction and recognition demonstrate the benefits of TSNA over other approaches.
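The nuclear norm enters such models through its proximal operator, singular value thresholding (SVT), which shrinks the singular values of a matrix toward zero. A minimal sketch on a synthetic low-rank-plus-noise matrix (the threshold and sizes are illustrative, and this is only the building block, not the tree-structured TSNA model):

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*.
    Soft-thresholds the singular values while keeping the singular vectors."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# Denoise a rank-4 matrix observed under small Gaussian perturbations.
rng = np.random.default_rng(5)
L0 = rng.standard_normal((30, 4)) @ rng.standard_normal((4, 30))  # rank-4 ground truth
M = L0 + 0.01 * rng.standard_normal((30, 30))
L_hat = svt(M, tau=1.0)
```

Because SVT acts only on the spectrum, the singular values of `L_hat` are exactly those of `M` soft-thresholded by `tau`, which is what makes the operator cheap to embed in ADMM-style solvers.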
11. Wan M, Gu G, Qian W, Ren K, Chen Q. Robust infrared small target detection via non-negativity constraint-based sparse representation. Applied Optics 2016; 55:7604-7612. PMID: 27661588. DOI: 10.1364/ao.55.007604.
Abstract
Infrared (IR) small target detection is a vital technique in many military applications, including IR remote sensing, early warning, and IR precise guidance. Sparse representation over an over-complete dictionary is an effective image representation method that can capture the geometrical features of IR small targets through the redundancy of the dictionary. In this paper, we concentrate on robust infrared small target detection under various scenes via sparse representation theory. First, a frequency-saliency-based preprocessing is developed to extract suspected regions that may contain a target, so that the subsequent computing load is reduced. Second, a target over-complete dictionary is constructed from a varietal two-dimensional Gaussian model with an extent feature constraint and a background term. Third, a sparse representation model with a non-negativity constraint is proposed for the suspected regions to calculate the corresponding coefficient vectors. Fourth, the detection problem is converted to an l1-regularized optimization solved with an accelerated proximal gradient (APG) method. Finally, based on the distinct difference in sparsity, an evaluation index called the sparse rate (SR) is presented to extract the real target directly by adaptive segmentation. Extensive experiments demonstrate both the effectiveness and the robustness of this method.
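The target dictionary construction can be sketched as a grid of small 2D Gaussian templates with varying widths and center shifts. The parameter grid below is assumed for illustration, and the sketch omits the paper's extent-feature constraint and background term:

```python
import numpy as np

def gaussian_target_dictionary(patch=9, sigmas=(0.8, 1.2, 1.6, 2.0), shifts=(-1, 0, 1)):
    """Over-complete dictionary whose columns are unit-norm 2D Gaussian
    templates, mimicking the blob-like appearance of IR small targets."""
    r = patch // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    atoms = []
    for s in sigmas:
        for dy in shifts:
            for dx in shifts:                      # vary width and sub-patch center
                a = np.exp(-(((x - dx) ** 2 + (y - dy) ** 2) / (2 * s ** 2)))
                atoms.append((a / np.linalg.norm(a)).ravel())
    return np.stack(atoms, axis=1)                 # one vectorized atom per column

D = gaussian_target_dictionary()                   # 81 x 36 dictionary for 9x9 patches
```

Since every atom is non-negative, a non-negativity constraint on the coefficients keeps reconstructions physically plausible blob superpositions.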
12. Luo Y, Wen Y, Tao D, Gui J, Xu C. Large Margin Multi-Modal Multi-Task Feature Extraction for Image Classification. IEEE Transactions on Image Processing 2016; 25:414-427. PMID: 26529763. DOI: 10.1109/tip.2015.2495116.
Abstract
The features used in many image analysis-based applications are frequently of very high dimension. Feature extraction offers several advantages in high-dimensional cases, and many recent studies have used multi-task feature extraction approaches, which often outperform single-task approaches. However, most of these methods are limited in that they only consider data represented by a single type of feature, even though images are usually described by features from multiple modalities. We therefore propose a novel large margin multi-modal multi-task feature extraction (LM3FE) framework for handling multi-modal features for image classification. In particular, LM3FE simultaneously learns the feature extraction matrix for each modality and the modality combination coefficients. In this way, LM3FE not only handles correlated and noisy features, but also utilizes the complementarity of different modalities to further reduce feature redundancy in each modality. The large margin principle employed also helps to extract strongly predictive features, making them more suitable for prediction (e.g., classification). An alternating algorithm is developed for problem optimization, and each subproblem can be efficiently solved. Experiments on two challenging real-world image data sets demonstrate the effectiveness and superiority of the proposed method.