1
Li B, Qin H, Xiong W, Li Y, Feng S, Hu W, Maybank S. Ranking-Based Color Constancy With Limited Training Samples. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:12304-12320. [PMID: 37216258] [DOI: 10.1109/tpami.2023.3278832]
Abstract
Computational color constancy is an important component of the Image Signal Processor (ISP) used for white balancing in many imaging devices. Recently, deep convolutional neural networks (CNNs) have been introduced for color constancy. They achieve prominent performance improvements compared with statistics-based or shallow learning-based methods. However, the need for a large number of training samples, a high computational cost, and a large model size make CNN-based methods unsuitable for deployment on low-resource ISPs in real-time applications. To overcome these limitations while achieving performance comparable to CNN-based methods, an efficient method is defined for selecting the optimal simple statistics-based method (SM) for each image. To this end, we propose a novel ranking-based color constancy method (RCC) that formulates the selection of the optimal SM method as a label ranking problem. RCC designs a specific ranking loss function, and uses a low-rank constraint to control the model complexity and a grouped sparse constraint for feature selection. Finally, we apply the RCC model to predict the order of the candidate SM methods for a test image, and then estimate its illumination using the predicted optimal SM method (or by fusing the results estimated by the top k SM methods). Comprehensive experimental results show that the proposed RCC outperforms nearly all shallow learning-based methods and achieves performance comparable to (sometimes even better than) deep CNN-based methods with only 1/2000 of the model size and training time. RCC also shows good robustness to limited training samples and good cross-camera generalization. Furthermore, to remove the dependence on ground-truth illumination, we extend RCC to a novel ranking-based method without ground-truth illumination (RCC_NO) that learns the ranking model using simple partial binary preference annotations provided by untrained annotators rather than experts. RCC_NO also achieves better performance than the SM methods and most shallow learning-based methods, with low costs of sample collection and illumination measurement.
2
Zhang X, Zheng J, Wang D, Tang G, Zhou Z, Lin Z. Structured Sparsity Optimization With Non-Convex Surrogates of ℓ2,0-Norm: A Unified Algorithmic Framework. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:6386-6402. [PMID: 36219668] [DOI: 10.1109/tpami.2022.3213716]
Abstract
In this article, we present a general optimization framework that leverages structured sparsity to achieve superior recovery results. The traditional approach to solving structured sparse objectives based on the ℓ2,0-norm is to use the ℓ2,1-norm as a convex surrogate. However, such an approximation often yields a large performance gap. To tackle this issue, we first provide a framework that allows for a wide range of surrogate functions (including non-convex surrogates), which exhibit better performance in harnessing structured sparsity. Moreover, we develop a fixed-point algorithm that solves a key underlying non-convex structured sparse recovery optimization problem to global optimality with a guaranteed super-linear convergence rate. Building on this, we consider three specific applications, i.e., outlier pursuit, supervised feature selection, and structured dictionary learning, which can benefit from the proposed structured sparsity optimization framework. In each application, we explain in detail how the optimization problem can be formulated and relaxed under a generic surrogate function. We conduct extensive experiments on both synthetic and real-world data and demonstrate the effectiveness and efficiency of the proposed framework.
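For readers unfamiliar with the notation, the row-wise ℓ2,0 count and its ℓ2,1 convex surrogate can be illustrated in a few lines of NumPy (a generic sketch, not code from the paper):

```python
import numpy as np

def l20_norm(W):
    # Row-wise l2,0 "norm": the number of nonzero rows (structured sparsity).
    return int(np.sum(np.linalg.norm(W, axis=1) > 0))

def l21_norm(W):
    # Row-wise l2,1 norm: sum of Euclidean row norms (the convex surrogate).
    return float(np.sum(np.linalg.norm(W, axis=1)))

W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
print(l20_norm(W))  # 2 (two nonzero rows)
print(l21_norm(W))  # 6.0 (5.0 + 0.0 + 1.0)
```

Minimizing the ℓ2,1 norm encourages whole rows to vanish, which is why it stands in for the combinatorial ℓ2,0 objective; the paper's point is that non-convex surrogates can narrow the gap between the two.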
3
Rebollo-Neira L, Inacio A. Enhancing sparse representation of color images by cross channel transformation. PLoS One 2023; 18:e0279917. [PMID: 36701348] [PMCID: PMC9879438] [DOI: 10.1371/journal.pone.0279917]
Abstract
Transformations for enhancing sparsity in the approximation of color images by 2D atomic decomposition are discussed. Sparsity is first considered with respect to the most significant coefficients in the wavelet decomposition of the color image. The discrete cosine transform is singled out as an effective 3-point transformation across channels for this purpose. The enhanced sparsity is further exploited by approximating the transformed arrays using an effective greedy strategy with a separable, highly redundant dictionary. The relevance of the achieved sparsity is illustrated by a simple encoding procedure. On typical test images, the compression at high-quality recovery is shown to improve significantly upon the JPEG and WebP formats.
Affiliation(s)
- Laura Rebollo-Neira
- Mathematics Department, Aston University B4 7ET, Birmingham, United Kingdom
4
Nai K, Li Z, Gan Y, Wang Q. Robust Visual Tracking via Multitask Sparse Correlation Filters Learning. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:502-515. [PMID: 34310327] [DOI: 10.1109/tnnls.2021.3097498]
Abstract
In this article, a novel multitask sparse correlation filters (MTSCF) model, which introduces multitask sparse learning into the correlation filters (CFs) framework, is proposed for visual tracking. Specifically, the proposed MTSCF method exploits multitask learning to take the interdependencies among different visual features (e.g., histogram of oriented gradients (HOG), color names, and CNN features) into account to simultaneously learn the CFs, making the learned filters enhance and complement each other to boost tracking performance. Moreover, it also performs feature selection to dynamically select discriminative spatial features from the target region to distinguish the target object from the background. An ℓ2,1 regularization term is adopted to realize multitask sparse learning. To solve the objective model, the alternating direction method of multipliers is utilized for learning the CFs. By considering multitask sparse learning, the proposed MTSCF model can fully utilize the strengths of different visual features and select effective spatial features to better model the appearance of the target object. Extensive experimental results on multiple tracking benchmarks demonstrate that our MTSCF tracker achieves competitive tracking performance in comparison with several state-of-the-art trackers.
5
Feature fusion based on joint sparse representations and wavelets for multiview classification. Pattern Anal Appl 2022. [DOI: 10.1007/s10044-022-01110-2]
Abstract
Feature-level fusion has attracted much interest. Generally, a dataset can be created from different views, features, or modalities. To improve the classification rate, local information is shared among different views by various fusion methods. However, almost all such methods use the views without considering their common aspects. In this paper, the wavelet transform is used to extract the high and low frequencies of the views as common aspects to improve the classification rate. The fusion method for the decomposed parts is based on joint sparse representation, in which a number of scenarios can be considered. The presented approach is tested on three datasets. The results obtained by this method show competitive performance on these datasets compared with state-of-the-art results.
6
Peng J, Tang B, Jiang H, Li Z, Lei Y, Lin T, Li H. Overcoming Long-Term Catastrophic Forgetting Through Adversarial Neural Pruning and Synaptic Consolidation. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4243-4256. [PMID: 33577459] [DOI: 10.1109/tnnls.2021.3056201]
Abstract
Enabling a neural network to sequentially learn multiple tasks is of great significance for expanding the applicability of neural networks in real-world applications. However, artificial neural networks face the well-known problem of catastrophic forgetting. What is worse, the degradation of previously learned skills becomes more severe as the task sequence grows, a phenomenon known as long-term catastrophic forgetting. This is due to two facts: first, as the model learns more tasks, the intersection of the low-error parameter subspaces for these tasks becomes smaller or may not even exist; second, when the model learns a new task, the cumulative error keeps increasing as the model tries to protect the parameter configurations of previous tasks from interference. Inspired by the memory consolidation mechanism in mammalian brains with synaptic plasticity, we propose a confrontation mechanism in which Adversarial Neural Pruning and synaptic Consolidation (ANPyC) is used to overcome long-term catastrophic forgetting. The neural pruning acts as long-term depression to prune task-irrelevant parameters, while the novel synaptic consolidation acts as long-term potentiation to strengthen task-relevant parameters. During training, this confrontation achieves a balance in which only crucial parameters remain, and non-significant parameters are freed to learn subsequent tasks. ANPyC avoids forgetting important information and makes the model efficient at learning a large number of tasks. Specifically, the neural pruning iteratively relaxes the current task's parameter conditions to expand the common parameter subspace of the tasks; the synaptic consolidation strategy, which consists of a structure-aware parameter-importance measurement and an element-wise parameter-updating strategy, decreases the cumulative error when learning new tasks. Our approach encourages the synapses to be sparse and polarized, which enables long-term learning and memory. ANPyC exhibits effectiveness and generalization on both image classification and generation tasks with multilayer perceptrons, convolutional neural networks, generative adversarial networks, and variational autoencoders. The full source code is available at https://github.com/GeoX-Lab/ANPyC.
7
An Improved Dictionary-Based Method for Gas Identification with Electronic Nose. Applied Sciences-Basel 2022. [DOI: 10.3390/app12136650]
Abstract
The dictionary learning algorithm has been successfully applied to electronic noses because of its high recognition rate. However, most dictionary learning algorithms use the ℓ0-norm or ℓ1-norm to regularize the sparse coefficients, which means that the electronic nose takes a long time to test samples, making the system inefficient. Aiming at accelerating the recognition speed of the electronic nose system, an efficient dictionary learning algorithm that performs multi-column atom updates is proposed in this paper. Meanwhile, to address the limited discriminative power of dictionaries learned by K-SVD (K-singular value decomposition), a novel classification model is proposed: a coefficient matrix is obtained by a linear projection of the training samples, and a constraint is imposed so that coefficients within the same class remain large and close to their class centers, while coefficients across different classes remain sparse. The algorithm was evaluated and analyzed through comparisons with several traditional classification algorithms. When the dimension of the sample was larger than 10, the average recognition rate of the algorithm was maintained above 92%, and the average training time was kept within 4 s. The experimental results show that the improved algorithm is an effective method for the development of an electronic nose.
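As background, the ℓ0-style greedy sparse coding whose cost motivates this work can be sketched with orthogonal matching pursuit in NumPy (a generic illustration, not the paper's improved algorithm):

```python
import numpy as np

def omp(D, x, k):
    # Orthogonal matching pursuit: greedily pick the atom most correlated
    # with the residual, refit on the selected support, repeat k times.
    # D has unit-norm columns (atoms); returns a k-sparse coefficient vector.
    r, support = x.copy(), []
    a = np.zeros(D.shape[1])
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ r))))
        Ds = D[:, support]
        coef = np.linalg.lstsq(Ds, x, rcond=None)[0]  # least-squares refit
        r = x - Ds @ coef                             # updated residual
    a[support] = coef
    return a

D = np.eye(4)                      # trivial dictionary: unit atoms
x = np.array([0.0, 3.0, 0.0, 1.0])
print(omp(D, x, k=2))  # recovers [0, 3, 0, 1]
```

The per-sample least-squares refits inside this loop are what make ℓ0/ℓ1 coding slow at test time, which is the inefficiency the abstract targets.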
8
An Approach for Selecting the Most Explanatory Features for Facial Expression Recognition. Applied Sciences-Basel 2022. [DOI: 10.3390/app12115637]
Abstract
The objective of this work is to analyze which features are most important in the recognition of facial expressions. To achieve this, we built a facial expression recognition system that learns from a controlled-capture dataset. The system uses different representations and combines them via a learned model. We studied the most important features by applying different feature extraction methods for facial expression representation, transforming each obtained representation into a sparse representation (SR) domain, and training combination models to classify signals, using the extended Cohn–Kanade (CK+), BU-3DFE, and JAFFE datasets for validation. We compared 14 combination methods over the 247 possible combinations of eight different feature spaces and obtained the most explanatory features for each facial expression. The results indicate that the LPQ (83%), HOG (82%), and RAW (82%) features are the most effective at improving expression classification, and that some features apply specifically to one expression (e.g., RAW for neutral, LPQ for angry and happy, LBP for disgust, and HOG for surprise).
9
Hierarchical Fusion Using Subsets of Multi-Features for Historical Arabic Manuscript Dating. J Imaging 2022; 8:60. [PMID: 35324615] [PMCID: PMC8954291] [DOI: 10.3390/jimaging8030060]
Abstract
Automatic dating tools for historical documents can greatly assist paleographers and save them time and effort. This paper describes a novel method for estimating the date of historical Arabic documents that employs hierarchical fusion of multiple features. A set of traditional features and features extracted by a residual network (ResNet) are fused in a hierarchical approach using joint sparse representation. To address noise during the fusion process, a new approach based on subsets of multiple features is considered. Following that, supervised and unsupervised classifiers are used for classification. We show that hierarchical fusion based on subsets of multiple features produces promising results on the KERTAS dataset and significantly improves dating performance.
10
Zhu Y, Zhang W, Zhang M, Zhang K, Zhu Y. Image emotion distribution learning based on enhanced fuzzy KNN algorithm with sparse learning. Journal of Intelligent & Fuzzy Systems 2021. [DOI: 10.3233/jifs-210251]
Abstract
With the trend of people expressing opinions and emotions via images online, increasing attention has been paid to affective analysis of visual content. Traditional image affective analysis mainly focuses on single-label classification, but an image usually evokes multiple emotions. To this end, emotion distribution learning has been proposed to describe emotions more explicitly. However, most current studies ignore the ambiguity inherent in emotions and their elusive correlations with complex visual features. Considering that emotions evoked by images are delivered through various visual features, and each feature in an image may have multiple emotion attributes, this paper develops a novel model that extracts multiple features and proposes an enhanced fuzzy k-nearest neighbor (EFKNN) algorithm to calculate fuzzy emotional memberships. Specifically, the multiple visual features are converted into fuzzy emotional memberships of each feature belonging to the emotion classes, which can be regarded as an intermediate representation that bridges the affective gap. Then, the fuzzy emotional memberships are fed into a fully connected neural network to learn the relationships between the fuzzy memberships and image emotion distributions. To obtain the fuzzy memberships of test images, a novel sparse learning method is introduced that learns combination coefficients between test images and training images. Extensive experimental results on several datasets verify the superiority of our proposed approach for emotion distribution learning on images.
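For context, classic fuzzy KNN memberships (Keller-style, not the paper's enhanced EFKNN variant) can be computed in a few lines of NumPy:

```python
import numpy as np

def fuzzy_knn_memberships(X_train, y_train, x, k=3, m=2, n_classes=None):
    # Classic fuzzy KNN: class memberships for one query from its k
    # nearest neighbors, weighted by inverse distance. The exponent
    # 2/(m-1) is the standard fuzzifier; eps avoids division by zero.
    if n_classes is None:
        n_classes = int(y_train.max()) + 1
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] ** (2.0 / (m - 1)) + 1e-12)
    u = np.zeros(n_classes)
    for weight, i in zip(w, idx):
        u[y_train[i]] += weight
    return u / u.sum()  # memberships sum to 1

X = np.array([[0.0], [0.1], [1.0], [1.1]])
y = np.array([0, 0, 1, 1])
print(fuzzy_knn_memberships(X, y, np.array([0.05]), k=3))
```

The output is a soft distribution over classes rather than a hard label, which is the property that lets memberships serve as the intermediate representation the abstract describes.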
Affiliation(s)
- Yunwen Zhu
- Shanghai Film Academy, Shanghai University, Shanghai, China
- Wenjun Zhang
- College of Information Technology, Shanghai Jian Qiao University, Shanghai, China
- Meixian Zhang
- Shanghai Film Academy, Shanghai University, Shanghai, China
- Ke Zhang
- Shanghai Film Academy, Shanghai University, Shanghai, China
- Yonghua Zhu
- Shanghai Film Academy, Shanghai University, Shanghai, China
11
Kontar R, Raskutti G, Zhou S. Minimizing Negative Transfer of Knowledge in Multivariate Gaussian Processes: A Scalable and Regularized Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021; 43:3508-3522. [PMID: 32305903] [DOI: 10.1109/tpami.2020.2987482]
Abstract
Recently there has been increasing interest in the multivariate Gaussian process (MGP), which extends the Gaussian process (GP) to deal with multiple outputs. One approach to constructing the MGP and accounting for non-trivial commonalities among outputs employs a convolution process (CP). The CP is based on the idea of sharing latent functions across several convolutions. Despite the elegance of the CP construction, it poses new challenges that have yet to be tackled. First, even with a moderate number of outputs, model building is prohibitively expensive due to the huge increase in computational demands and in the number of parameters to be estimated. Second, negative transfer of knowledge may occur when some outputs do not share commonalities. In this paper we address these issues. We propose a regularized pairwise modeling approach for the MGP established using the CP. The key feature of our approach is to distribute the estimation of the full multivariate model across a group of bivariate GPs which are built individually. Interestingly, pairwise modeling turns out to possess unique characteristics, which allow us to tackle the challenge of negative transfer by penalizing the latent function that facilitates information sharing in each bivariate model. Predictions are then made by combining predictions from the bivariate models within a Bayesian framework. The proposed method has excellent scalability when the number of outputs is large and minimizes the negative transfer of knowledge between uncorrelated outputs. Statistical guarantees for the proposed method are studied and its advantageous features are demonstrated through numerical studies.
12
Wu B, Wei B, Liu J, Wu K, Wang M. Faceted Text Segmentation via Multitask Learning. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:3846-3857. [PMID: 32894723] [DOI: 10.1109/tnnls.2020.3015996]
Abstract
Text segmentation is a fundamental step in natural language processing (NLP) and information retrieval (IR) tasks. Most existing approaches do not explicitly take the facet information of documents into account for segmentation. Text segmentation and facet annotation are often addressed as separate problems, but they operate in a common input space. This article proposes FTS, a novel model for faceted text segmentation via multitask learning (MTL). FTS models faceted text segmentation as an MTL problem comprising text segmentation and facet annotation. The model employs a bidirectional long short-term memory (Bi-LSTM) network to learn the feature representation of sentences within a document. The feature representation is shared and adjusted with common parameters through MTL, which helps the model learn a better shared and more robust feature representation across text segmentation and facet annotation. Moreover, text segmentation is modeled as a sequence tagging task using an LSTM with a conditional random field (CRF) classification layer. Extensive experiments are conducted on five datasets from five domains: data structures, data mining, computer networks, solid mechanics, and crystallography. The results indicate that the FTS model outperforms several highly cited and state-of-the-art approaches to text segmentation and facet annotation.
13
Chen Y, Luo Z, Kong L. ℓ2,0-norm based selection and estimation for multivariate generalized linear models. J Multivariate Anal 2021. [DOI: 10.1016/j.jmva.2021.104782]
14
A Two-Step Classification Method Based on Collaborative Representation for Positive and Unlabeled Learning. Neural Process Lett 2021. [DOI: 10.1007/s11063-021-10590-y]
15
Abeywardhana D, Dangalle C, Nugaliyadde A, Mallawarachchi Y. Deep learning approach to classify Tiger beetles of Sri Lanka. Ecol Inform 2021. [DOI: 10.1016/j.ecoinf.2021.101286]
16
Medical image fusion using segment graph filter and sparse representation. Comput Biol Med 2021; 131:104239. [PMID: 33550015] [DOI: 10.1016/j.compbiomed.2021.104239]
Abstract
This study proposes a novel medical image fusion approach based on the segment graph filter (SGF) and sparse representation (SR). Specifically, using the SGF, source images are decomposed into base and detail images, which allows edge information to be preserved in the fused image as much as possible. The base images are then fused by applying a fusion rule based on the normalized Shannon entropy, whereas the detail images are fused using an SR-based fusion method. Finally, the resultant fused image is computed by combining the fused base and detail images. For quantitative performance evaluation, five metrics are adopted: a feature-based metric, a structure-based metric, normalized mutual information, nonlinear correlation information entropy, and a phase congruency metric. Experimental results indicate that the fusion performance of the proposed method is comparable to that of state-of-the-art methods with respect to both subjective visual quality and objective quantification.
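The base/detail decomposition-and-fusion pattern can be sketched in NumPy. Note the assumptions: a naive mean filter stands in for the segment graph filter, a plain average for the entropy-weighted base rule, and a max-absolute rule for the SR-based detail fusion; none of these are the paper's exact components.

```python
import numpy as np

def box_filter(img, r=1):
    # Naive mean filter with edge padding; a stand-in for the SGF.
    h, w = img.shape
    p = np.pad(img, r, mode='edge')
    out = np.zeros((h, w))
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += p[r + dy:r + dy + h, r + dx:r + dx + w]
    return out / (2 * r + 1) ** 2

def fuse(a, b, r=1):
    # Decompose each source into a base (smooth) and a detail layer.
    base_a, base_b = box_filter(a, r), box_filter(b, r)
    det_a, det_b = a - base_a, b - base_b
    # Base fusion: plain average (the paper weights by Shannon entropy).
    base = 0.5 * (base_a + base_b)
    # Detail fusion: keep the larger-magnitude coefficient
    # (the paper uses an SR-based rule instead).
    det = np.where(np.abs(det_a) >= np.abs(det_b), det_a, det_b)
    return base + det
```

A quick sanity check for any base/detail scheme of this shape: fusing an image with itself returns the image unchanged, since base + detail reconstructs each source exactly.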
17
Liu J, Wu Z, Xiao L, Yan H. Learning Multiple Parameters for Kernel Collaborative Representation Classification. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:5068-5078. [PMID: 31976913] [DOI: 10.1109/tnnls.2019.2962878]
Abstract
In this article, the problem of automatically learning multiple parameters for kernel collaborative representation classification (KCRC) is considered. We investigate KCRC and measure its generalization error via leave-one-out cross-validation (LOO-CV). By taking advantage of specific properties of KCRC, a closed-form expression is derived for the outputs of LOO-CV. Then, a simple classification rule that provides probabilistic outputs is adopted, and thereby an effective loss function, which is an explicit function of the parameters, is proposed as the generalization error. The gradients of the loss function are calculated, and the parameters are learned by minimizing the loss function with a gradient-based optimization algorithm. Furthermore, the proposed approach makes it possible to solve the multiple kernel/feature learning problems of KCRC effectively. Experimental results on six datasets taken from different scenes demonstrate the effectiveness of the proposed approach.
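The idea of closed-form LOO-CV can be illustrated with the standard ridge-regression identity e_i = (y_i − ŷ_i)/(1 − H_ii), where H is the hat matrix. This is a generic analogue only: the paper derives the KCRC-specific expression, which this is not.

```python
import numpy as np

def loo_residuals_ridge(X, y, lam):
    # Hat matrix H = X (X^T X + lam I)^{-1} X^T for ridge regression.
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T)
    # Exact leave-one-out residuals without refitting n separate models.
    return (y - H @ y) / (1.0 - np.diag(H))

rng = np.random.default_rng(0)
X, y = rng.normal(size=(20, 3)), rng.normal(size=20)
fast = loo_residuals_ridge(X, y, lam=0.5)

# Check against brute-force refitting with each sample held out.
for i in range(20):
    m = np.arange(20) != i
    w = np.linalg.solve(X[m].T @ X[m] + 0.5 * np.eye(3), X[m].T @ y[m])
    assert abs((y[i] - X[i] @ w) - fast[i]) < 1e-8
```

Because the residuals come out in closed form, the CV loss becomes an explicit, differentiable function of the model parameters, which is what enables the gradient-based parameter learning the abstract describes.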
18
CLRS: Continual Learning Benchmark for Remote Sensing Image Scene Classification. Sensors 2020; 20:1226. [PMID: 32102294] [PMCID: PMC7070946] [DOI: 10.3390/s20041226]
Abstract
Remote sensing image scene classification has high application value in agriculture, the military, and other fields. A large amount of remote sensing data is obtained every day. After learning a new batch of data, scene classification algorithms based on deep learning face the problem of catastrophic forgetting: they cannot maintain their performance on the old batches. Therefore, it has become increasingly important to ensure that scene classification models are capable of continual learning, that is, of learning new batches of data without forgetting the old ones. However, existing remote sensing image scene classification datasets all use static benchmarks and lack a standard for dividing the datasets into sequential training batches, which largely limits the development of continual learning in remote sensing image scene classification. This study first gives criteria for partitioning training batches under three continual learning scenarios, and proposes a large-scale remote sensing image scene classification database called the Continual Learning Benchmark for Remote Sensing (CLRS). The goal of CLRS is to help develop state-of-the-art continual learning algorithms in the field of remote sensing image scene classification. In addition, this paper proposes a new method for constructing a large-scale remote sensing image classification database based on a pretrained object detection model, which can effectively reduce manual annotation. Finally, several mainstream continual learning methods are tested and analyzed under the three continual learning scenarios, and the results can be used as a baseline for future work.
19
Shi Y, Suk HI, Gao Y, Lee SW, Shen D. Leveraging Coupled Interaction for Multimodal Alzheimer's Disease Diagnosis. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:186-200. [PMID: 30908241] [DOI: 10.1109/tnnls.2019.2900077]
Abstract
As the population worldwide becomes older, accurate computer-aided diagnosis of Alzheimer's disease (AD) in its early stage has come to be regarded as a crucial step in neurodegeneration care. Previous methods extracted low-level features from neuroimaging data and treated computer-aided diagnosis as a classification problem, ignoring latent featurewise relations. However, it is known from the current neuroscience perspective that multiple brain regions are anatomically and functionally interlinked. Thus, it is reasonable to assume that features extracted from different brain regions are related to each other to some extent. Also, the complementary information between different neuroimaging modalities can benefit multimodal fusion. To this end, we consider leveraging the coupled interactions at the feature level and the modality level for diagnosis. First, we propose capturing the feature-level coupled interaction using a coupled feature representation. Then, to model the modality-level coupled interaction, we present two novel methods: 1) coupled boosting (CB), which models the correlation of pairwise coupled diversity on samples classified inconsistently and incorrectly between different modalities, and 2) the coupled metric ensemble (CME), which learns an informative feature projection from different modalities by integrating the intrarelations and interrelations of training samples. We systematically evaluated our methods on the AD Neuroimaging Initiative dataset. Compared with baseline learning-based methods and state-of-the-art methods specially developed for AD/MCI (mild cognitive impairment) diagnosis, our methods achieved the best performance, with accuracies of 95.0% and 80.7% (CB) and 94.9% and 79.9% (CME) for AD/NC (normal control) and MCI/NC identification, respectively.
20
Liu D, Liang C, Zhang Z, Qi L, Lovell BC. Exploring Inter-Instance Relationships within the Query Set for Robust Image Set Matching. Sensors 2019; 19:5051. [PMID: 31752415] [PMCID: PMC6891765] [DOI: 10.3390/s19225051]
Abstract
Image set matching (ISM) has attracted increasing attention in the fields of computer vision and pattern recognition. Some studies attempt to model the query and gallery sets under a joint or collaborative representation framework, achieving impressive performance. However, existing models consider only the competition and collaboration among gallery sets, neglecting the inter-instance relationships within the query set, which are also an important clue for ISM. In this paper, inter-instance relationships within the query set are explored for robust image set matching. Specifically, we propose to represent the query set instances jointly via a combined dictionary learned from the gallery sets. To explore the commonality and variations within the query set simultaneously to benefit the matching, both low-rank and class-level sparsity constraints are imposed on the representation coefficients. Then, to deal with nonlinear data in real scenarios, the kernelized version is also proposed. Moreover, to tackle gross corruptions mixed into the query set, the proposed model is extended for robust ISM. The optimization problems are solved efficiently by employing singular value thresholding and block soft thresholding operators in an alternating direction manner. Experiments on five public datasets demonstrate the effectiveness of the proposed method, which compares favorably with state-of-the-art methods.
Affiliation(s)
- Deyin Liu
- School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane 4072, Australia
- Chengwu Liang
- School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- School of Electrical and Control Engineering, Henan University of Urban Construction, Pingdingshan 467036, China
- Correspondence:
- Zhiming Zhang
- School of Control Science and Engineering, Shandong University, Jinan 250100, China
- Lin Qi
- School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Brian C. Lovell
- School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane 4072, Australia
|
21
|
Abstract
HYPOTHESIS Artificial intelligence and image processing technology can be used to develop an automatic diagnostic algorithm for pediatric otitis media (OM) with accuracy comparable to that of well-trained otologists. BACKGROUND OM is a public health issue that commonly occurs in the pediatric population. Caring for OM can incur significant indirect costs, stemming mainly from school or working days lost in seeking medical consultation, which makes home care of OM highly desirable. In this study, we aim to develop an automatic diagnostic algorithm for pediatric OM. METHODS A total of 1,230 otoscopic images were collected. Among them, 214 images diagnosed as acute otitis media (AOM) or otitis media with effusion (OME) are used as the database for image classification in this study. For the OM image classification system, the image database is randomly partitioned into test and training subsets. For each image in the training and test sets, the desired eardrum region is first segmented; then multiple image features, such as color and shape, are extracted. Multitask joint sparse representation-based classification is used to combine the different features of an OM image for classification. RESULTS The multitask joint sparse representation algorithm was applied to classify the AOM and OME images. The approach is able to differentiate OME from AOM images and achieves a classification accuracy as high as 91.41%. CONCLUSION Our results demonstrate that this automatic diagnosis algorithm has acceptable accuracy for diagnosing pediatric OM. The cost-effective algorithm can assist parents with early detection and continuous monitoring at home to reduce the consequences of the disease.
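Sparse representation-based classification of the kind used here assigns a sample to the class whose dictionary atoms reconstruct it with the smallest residual under an l1 penalty. The sketch below is a single-feature simplification of the multitask joint model, with a made-up toy dictionary and parameters:

```python
import numpy as np

def ista_sparse_code(D, y, lam=0.1, n_iter=300):
    """Solve min_x 0.5*||y - Dx||^2 + lam*||x||_1 by ISTA."""
    L = np.linalg.norm(D, 2) ** 2                 # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = x + D.T @ (y - D @ x) / L             # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

def src_classify(D, labels, y, lam=0.1):
    """Assign y to the class whose atoms reconstruct it with the least residual."""
    x = ista_sparse_code(D, y, lam)
    residuals = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        residuals[c] = np.linalg.norm(y - D[:, mask] @ x[mask])
    return min(residuals, key=residuals.get)

# toy dictionary: two atoms per class, columns l2-normalized (illustrative only)
D = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9]])
D = D / np.linalg.norm(D, axis=0)
labels = ["AOM", "AOM", "OME", "OME"]
pred = src_classify(D, labels, np.array([1.0, 0.05]), lam=0.01)
```

The multitask variant in the paper additionally couples the sparsity patterns across the color and shape feature channels; that coupling is omitted here.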
|
22
|
Yang Y, Wu QMJ. Features Combined From Hundreds of Midlayers: Hierarchical Networks With Subnetwork Nodes. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:3313-3325. [PMID: 30703046 DOI: 10.1109/tnnls.2018.2890787] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
In this paper, we posit that the mixed selectivity of neurons in the top layer, which encodes distributed information produced by the other neurons, offers a significant computational advantage for recognition accuracy. Accordingly, this paper proposes a hierarchical network framework that learns the behavior of features combined from hundreds of midlayers. First, a subnetwork neuron, which may itself be constructed from other nodes, functions as a subspace feature extractor. The top layer of the hierarchical network uses the subspace features produced by the subnetwork neurons to discard irrelevant factors while recasting those features into a mapping space, so that the hierarchical network can generate more reliable cognition. Second, this paper shows that with a noniterative learning strategy, the proposed method has a wider and shallower structure, which plays a significant role in improving generalization performance. Compared with other state-of-the-art methods, multiple channel features with the proposed method provide comparable or even better performance while dramatically boosting learning speed. Our experimental results show that our platform provides much better generalization performance than 55 other state-of-the-art methods.
|
23
|
Multiple-relations-constrained image classification with limited training samples via Pareto optimization. Neural Comput Appl 2019. [DOI: 10.1007/s00521-018-3491-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
24
|
|
25
|
|
26
|
Early identification of ischemic stroke in noncontrast computed tomography. Biomed Signal Process Control 2019. [DOI: 10.1016/j.bspc.2019.03.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
27
|
Yoo HJ, Park HJ, Lee B. Myoelectric Signal Classification of Targeted Muscles Using Dictionary Learning. SENSORS 2019; 19:s19102370. [PMID: 31126025 PMCID: PMC6567142 DOI: 10.3390/s19102370] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/25/2019] [Revised: 05/14/2019] [Accepted: 05/21/2019] [Indexed: 11/16/2022]
Abstract
Surface electromyography (sEMG) signals comprise electrophysiological information related to muscle activity. Because the signal is easy to record, it is used to control several myoelectric prosthetic devices. Several studies have been conducted to process sEMG signals more efficiently. However, research on optimal algorithms and electrode placements for processing sEMG signals is still inconclusive, and very few studies have focused on minimizing the number of electrodes. In this study, we investigated the most effective method for myoelectric signal classification with a small number of electrodes. A total of 23 subjects participated in the study, and sEMG data for 14 different hand movements were acquired from targeted and untargeted muscles. The study compared the classification accuracy of the sEMG data using discriminative feature-oriented dictionary learning (DFDL) and other conventional classifiers. DFDL achieved the highest classification accuracy among the classifiers, and its advantage became more apparent as the number of channels decreased. The targeted method was superior to the untargeted method, particularly when classifying sEMG signals with DFDL. We therefore conclude that the combination of the targeted method and the DFDL algorithm can classify myoelectric signals more effectively with a minimal number of channels.
Affiliation(s)
- Hyun-Joon Yoo
- Department of Biomedical Science and Engineering (BMSE), Institute of Integrated Technology (IIT), Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
- Hyeong-Jun Park
- Department of Biomedical Science and Engineering (BMSE), Institute of Integrated Technology (IIT), Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
- Boreom Lee
- Department of Biomedical Science and Engineering (BMSE), Institute of Integrated Technology (IIT), Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
|
28
|
Li X, Zhao L, Ji W, Wu Y, Wu F, Yang MH, Tao D, Reid I. Multi-Task Structure-Aware Context Modeling for Robust Keypoint-Based Object Tracking. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019; 41:915-927. [PMID: 29993768 DOI: 10.1109/tpami.2018.2818132] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
In the fields of computer vision and graphics, keypoint-based object tracking is a fundamental and challenging problem, which is typically formulated in a spatio-temporal context modeling framework. However, many existing keypoint trackers are incapable of effectively modeling and balancing the following three aspects in a simultaneous manner: temporal model coherence across frames, spatial model consistency within frames, and discriminative feature construction. To address this problem, we propose a robust keypoint tracker based on spatio-temporal multi-task structured output optimization driven by discriminative metric learning. Consequently, temporal model coherence is characterized by multi-task structured keypoint model learning over several adjacent frames; spatial model consistency is modeled by solving a geometric verification based structured learning problem; discriminative feature construction is enabled by metric learning to ensure the intra-class compactness and inter-class separability. To achieve the goal of effective object tracking, we jointly optimize the above three modules in a spatio-temporal multi-task learning scheme. Furthermore, we incorporate this joint learning scheme into both single-object and multi-object tracking scenarios, resulting in robust tracking results. Experiments over several challenging datasets have justified the effectiveness of our single-object and multi-object trackers against the state-of-the-art.
|
29
|
Shabbir S, Majeed N, Dawood H, Dawood H, Xiu B. Integrating the Local Patches of Weber Orientation with Sparse Distribution Method for Object Recognition. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2019. [DOI: 10.1007/s13369-018-3612-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
|
30
|
|
31
|
A novel reverse sparse model utilizing the spatio-temporal relationship of target templates for object tracking. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.10.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
32
|
Rebollo-Neira L, Whitehouse D. Sparse representation of 3D images for piecewise dimensionality reduction with high quality reconstruction. ARRAY 2019. [DOI: 10.1016/j.array.2019.100001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022] Open
|
33
|
Zhao Z, Feng G, Zhang L, Zhu J, Shen Q. Novel orthogonal based collaborative dictionary learning for efficient face recognition. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2018.09.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
34
|
|
35
|
Tuo Q, Zhao H, Hu Q. Hierarchical feature selection with subtree based graph regularization. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2018.10.023] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
|
36
|
Zhang Z, Lin H, Zhao X, Ji R, Gao Y. Inductive Multi-Hypergraph Learning and Its Application on View-Based 3D Object Classification. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:5957-5968. [PMID: 30072328 DOI: 10.1109/tip.2018.2862625] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The wide range of 3D applications has led to an increasing amount of 3D object data, making effective 3D object classification techniques an urgent requirement. One important and challenging task in 3D object classification is formulating and exploiting the correlations within 3D data. Most previous work focuses on learning an optimal pairwise distance metric for object comparison, which may lose the global correlation among 3D objects. Recently, transductive hypergraph learning has been investigated for classification; it can jointly explore the correlation among multiple objects, including both labeled and unlabeled data. Although these methods show better performance, they are still limited because 1) a considerable amount of testing data may not be available in practice and 2) testing newly arriving data incurs a high computational cost. To handle this problem, and considering the multi-modal representations of 3D objects in practice, we propose an inductive multi-hypergraph learning algorithm that learns an optimal projection for the multi-modal training data. In this method, all the training data are formulated in multi-hypergraphs based on their features, and inductive learning is conducted to learn the projection matrices and the optimal multi-hypergraph combination weights simultaneously. Unlike transductive learning on hypergraphs, the costly training process is performed off-line, and the testing process of inductive hypergraph learning is very efficient. We conducted experiments on two 3D benchmarks, the NTU and ModelNet40 data sets, and compared the proposed algorithm with state-of-the-art methods and traditional transductive multi-hypergraph learning methods. Experimental results demonstrate that the proposed method achieves effective and efficient classification performance. We also note that the proposed method is a general framework and has the potential to be applied in other practical applications.
|
37
|
Yao Z, Dong Y, Wu G, Zhang Q, Yang D, Yu JH, Wang WP. Preoperative diagnosis and prediction of hepatocellular carcinoma: Radiomics analysis based on multi-modal ultrasound images. BMC Cancer 2018; 18:1089. [PMID: 30419849 PMCID: PMC6233500 DOI: 10.1186/s12885-018-5003-4] [Citation(s) in RCA: 93] [Impact Index Per Article: 13.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2018] [Accepted: 10/28/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND This study aims to establish a radiomics analysis system for the diagnosis and clinical behaviour prediction of hepatocellular carcinoma (HCC) based on multi-parametric ultrasound imaging. METHODS A total of 177 patients with focal liver lesions (FLLs) were included in the study. Every patient underwent multi-modal ultrasound examination, including B-mode ultrasound (BMUS), shear wave elastography (SWE), and shear wave viscosity (SWV) imaging. The radiomics analysis system was built on sparse representation theory (SRT) and support vector machine (SVM) for asymmetric data. Through the sparse regulation from the SRT, the proposed radiomics system can effectively avoid over-fitting issues that occur in regular radiomics analysis. The purpose of the proposed system includes differential diagnosis between benign and malignant FLLs, pathologic diagnosis of HCC, and clinical prognostic prediction. Three biomarkers, including programmed cell death protein 1 (PD-1), antigen Ki-67 (Ki-67) and microvascular invasion (MVI), were included and analysed. We calculated the accuracy (ACC), sensitivity (SENS), specificity (SPEC) and area under the receiver operating characteristic curve (AUC) to evaluate the performance of the radiomics models. RESULTS A total of 2560 features were extracted from the multi-modal ultrasound images for each patient. Five radiomics models were built, and leave-one-out cross-validation (LOOCV) was used to evaluate the models. In LOOCV, the AUC was 0.94 for benign and malignant classification (95% confidence interval [CI]: 0.88 to 0.98), 0.97 for malignant subtyping (95% CI: 0.93 to 0.99), 0.97 for PD-1 prediction (95% CI: 0.89 to 0.98), 0.94 for Ki-67 prediction (95% CI: 0.87 to 0.97), and 0.98 for MVI prediction (95% CI: 0.93 to 0.99). The performance of each model improved when the viscosity modality was included. 
CONCLUSIONS Radiomics analysis based on multi-modal ultrasound images could aid in comprehensive liver tumor evaluations, including diagnosis, differential diagnosis, and clinical prognosis.
Affiliation(s)
- Zhao Yao
- Department of Electronic Engineering, Fudan University, No. 220, Handan Road, Yangpu District, Shanghai, 200433, China
- Yi Dong
- Department of Ultrasound, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, China
- Guoqing Wu
- Department of Electronic Engineering, Fudan University, No. 220, Handan Road, Yangpu District, Shanghai, 200433, China
- Qi Zhang
- Department of Ultrasound, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, China
- Daohui Yang
- Department of Ultrasound, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, China
- Jin-Hua Yu
- Department of Electronic Engineering, Fudan University, No. 220, Handan Road, Yangpu District, Shanghai, 200433, China
- Wen-Ping Wang
- Department of Ultrasound, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, China
|
38
|
|
39
|
He L, Li H, Zhang Q, Sun Z. Dynamic Feature Matching for Partial Face Recognition. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 28:791-802. [PMID: 30235130 DOI: 10.1109/tip.2018.2870946] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Partial face recognition (PFR) in an unconstrained environment is a very important task, especially in situations where partial face images are likely to be captured due to occlusion, out-of-view regions, and large viewing angles, e.g., in video surveillance and on mobile devices. However, little attention has been paid to PFR so far, and thus the problem of recognizing an arbitrary patch of a face image remains largely unsolved. This study proposes a novel partial face recognition approach, called Dynamic Feature Matching (DFM), which combines Fully Convolutional Networks (FCNs) and Sparse Representation Classification (SRC) to address partial face recognition regardless of face size. DFM does not require prior position information of partial faces relative to a holistic face. By sharing computation, the feature maps are calculated from the entire input image only once, which yields a significant speedup. Experimental results demonstrate the effectiveness and advantages of DFM in comparison with state-of-the-art PFR methods on several partial face databases, including the CASIA-NIR-Distance, CASIA-NIR-Mobile, and LFW databases. The performance of DFM is also impressive in partial person re-identification on the Partial RE-ID and iLIDS databases. The source code of DFM can be found at https://github.com/lingxiao-he/dfm new.
|
40
|
Li J, Zhang B, Zhang D. Shared Autoencoder Gaussian Process Latent Variable Model for Visual Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:4272-4286. [PMID: 29990089 DOI: 10.1109/tnnls.2017.2761401] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Multiview learning reveals the latent correlations among different modalities and utilizes complementary information to achieve better performance in many applications. In this paper, we propose a novel multiview learning model based on the Gaussian process latent variable model (GPLVM) that learns a set of nonlinear, nonparametric mapping functions and obtains a shared latent variable in the manifold space. Unlike previous work on the GPLVM, the proposed shared autoencoder Gaussian process (SAGP) latent variable model assumes an additional mapping from the observed data to the shared manifold space. Owing to the autoencoder framework, nonlinear projections both from and to the observations are considered simultaneously. Additionally, instead of the full connections used in a conventional autoencoder, the SAGP achieves these mappings using the GP, which remarkably reduces the number of estimated parameters and avoids overfitting. To make the method suitable for classification, a discriminative regularization is embedded into it. In the optimization process, an efficient algorithm based on the alternating direction method and gradient descent techniques is designed to solve the encoder and decoder parts alternately. Experimental results on three real-world data sets substantiate the effectiveness and superiority of the proposed approach compared with the state of the art.
|
41
|
Du B, Wang S, Xu C, Wang N, Zhang L, Tao D. Multi-Task Learning for Blind Source Separation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:4219-4231. [PMID: 29870343 DOI: 10.1109/tip.2018.2836324] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Blind source separation (BSS) aims to recover the underlying source signals from a set of linear mixture signals without any prior information about the mixing system, which is a fundamental problem in the signal and image processing field. Most state-of-the-art algorithms handle the decompositions of the mixture signals independently. In this paper, we propose a new algorithm, a multi-task sparse model, to solve the BSS problem. Source signals are characterized via sparse techniques. Meanwhile, we regard the decomposition of each mixture signal as a task and employ the idea of multi-task learning to discover connections between tasks, improving the accuracy of source signal separation. Theoretical analyses of the optimization convergence and sample complexity of the proposed algorithm are provided. Experimental results on extensive synthetic and real-world data demonstrate the necessity of exploiting connections between mixture signals and the effectiveness of the proposed algorithm.
|
42
|
Emerging topics and challenges of learning from noisy data in nonstandard classification: a survey beyond binary class noise. Knowl Inf Syst 2018. [DOI: 10.1007/s10115-018-1244-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
|
43
|
|
44
|
Kontar R, Zhou S, Sankavaram C, Du X, Zhang Y. Nonparametric Modeling and Prognosis of Condition Monitoring Signals Using Multivariate Gaussian Convolution Processes. Technometrics 2018. [DOI: 10.1080/00401706.2017.1383310] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Affiliation(s)
- Raed Kontar
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI
- Shiyu Zhou
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI
- Xinyu Du
- General Motors Research & Development, Warren, MI
- Yilu Zhang
- General Motors Research & Development, Warren, MI
|
45
|
Su C, Yang F, Zhang S, Tian Q, Davis LS, Gao W. Multi-Task Learning with Low Rank Attribute Embedding for Multi-Camera Person Re-Identification. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2018; 40:1167-1181. [PMID: 28287958 DOI: 10.1109/tpami.2017.2679002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
We propose Multi-Task Learning with Low Rank Attribute Embedding (MTL-LORAE) to address the problem of person re-identification across multiple cameras. Re-identification on different cameras is treated as a set of related tasks, which allows the information shared among tasks to be exploited to improve re-identification accuracy. The MTL-LORAE framework integrates low-level features with mid-level attributes as person descriptions. To improve the accuracy of such descriptions, we introduce a low-rank attribute embedding, which maps the original binary attributes into a continuous space by exploiting the correlative relationships between pairs of attributes. In this way, inaccurate attributes are rectified and missing attributes are recovered. The resulting objective function combines an attribute embedding error with a quadratic loss on the class labels and is solved by an alternating optimization strategy. The proposed MTL-LORAE is tested on four datasets and is validated to outperform existing methods by significant margins.
|
46
|
Fang X, Xu Y, Li X, Lai Z, Wong WK, Fang B. Regularized Label Relaxation Linear Regression. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:1006-1018. [PMID: 28166507 DOI: 10.1109/tnnls.2017.2648880] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
Linear regression (LR) and some of its variants have been widely used for classification problems. Most of these methods assume that, during the learning phase, the training samples can be exactly transformed into a strict binary label matrix, which leaves too little freedom to fit the labels adequately. To address this problem, in this paper we propose a novel regularized label relaxation LR method with the following notable characteristics. First, the proposed method relaxes the strict binary label matrix into a slack variable matrix by introducing a nonnegative label relaxation matrix into LR, which provides more freedom to fit the labels while enlarging the margins between different classes as much as possible. Second, the proposed method constructs a class compactness graph based on manifold learning and uses it as a regularization term to avoid overfitting; the class compactness graph ensures that samples sharing the same labels remain close after they are transformed. Two different algorithms, based respectively on two different norm loss functions, are devised. Both algorithms have compact closed-form solutions in each iteration, so they are easily implemented. Extensive experiments show that these two algorithms outperform state-of-the-art algorithms in terms of classification accuracy and running time.
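The label relaxation idea can be sketched as an alternating scheme: solve a regularized regression against a relaxed target, then update the nonnegative slack so it only drags outputs further past their binary labels. For brevity this illustration substitutes a plain ridge term (lam * I) for the paper's class-compactness-graph regularizer, so it is a simplified sketch rather than the authors' algorithm:

```python
import numpy as np

def label_relaxation_lr(X, Y, lam=0.1, n_iter=20):
    """Alternating solver: ridge regression on a relaxed target, then a
    nonnegative slack update that enlarges margins past the binary labels."""
    n, d = X.shape
    B = 2 * Y - 1                      # dragging directions: +1 toward the true class
    M = np.zeros_like(Y, dtype=float)  # nonnegative label relaxation matrix
    for _ in range(n_iter):
        T = Y + B * M                  # relaxed (slack) target matrix
        W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ T)
        M = np.maximum(B * (X @ W - Y), 0.0)   # keep only margin-enlarging slack
    return W

# toy usage: two well-separated 2-D classes with one-hot labels
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2.0, 0.0], 0.3, (20, 2)),
               rng.normal([-2.0, 0.0], 0.3, (20, 2))])
Y = np.vstack([np.tile([1.0, 0.0], (20, 1)),
               np.tile([0.0, 1.0], (20, 1))])
W = label_relaxation_lr(X, Y)
pred = (X @ W).argmax(axis=1)          # 0 for the first class, 1 for the second
```

The slack update M = max(B * (XW - Y), 0) is the key step: it lets an output exceed 1 for its own class or fall below 0 for other classes, which is rewarded rather than penalized.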
|
47
|
Wu G, Chen Y, Wang Y, Yu J, Lv X, Ju X, Shi Z, Chen L, Chen Z. Sparse Representation-Based Radiomics for the Diagnosis of Brain Tumors. IEEE TRANSACTIONS ON MEDICAL IMAGING 2018; 37:893-905. [PMID: 29610069 DOI: 10.1109/tmi.2017.2776967] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Brain tumors are the most common malignant neurologic tumors, with the highest mortality and disability rates. Because of the delicate structure of the brain, the clinical use of several commonly employed biopsy-based diagnostic procedures is limited for brain tumors. Radiomics is an emerging technique for noninvasive diagnosis based on quantitative medical image analysis. However, current radiomics techniques are not standardized with regard to feature extraction, feature selection, and decision making. In this paper, we propose a sparse representation-based radiomics (SRR) system for the diagnosis of brain tumors. First, we developed a dictionary learning- and sparse representation-based feature extraction method that exploits the statistical characteristics of the lesion area, leading to finer and more effective feature extraction than the traditional explicit calculation-based methods. Then, we set up an iterative sparse representation method to resolve the redundancy of the extracted features. Finally, we proposed a novel multi-feature collaborative sparse representation classification framework that introduces a new regularization-term coefficient to combine features from multi-modal images at the level of the sparse representation coefficients. Two clinical problems were used to validate the performance and usefulness of the proposed SRR system: the differential diagnosis between primary central nervous system lymphoma (PCNSL) and glioblastoma (GBM), and isocitrate dehydrogenase 1 (IDH1) estimation for gliomas. The SRR system showed superior PCNSL/GBM differentiation performance compared with some advanced imaging techniques and yielded 11% better performance for estimating IDH1 than traditional radiomics methods.
|
48
|
Spatial-aware hyperspectral image classification via multifeature kernel dictionary learning. INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS 2018. [DOI: 10.1007/s41060-018-0115-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
49
|
Lu J, Shang C, Yue C, Morillo R, Ware S, Kamath J, Bamis A, Russell A, Wang B, Bi J. Joint Modeling of Heterogeneous Sensing Data for Depression Assessment via Multi-task Learning. ACTA ACUST UNITED AC 2018. [DOI: 10.1145/3191753] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
Depression is a common mood disorder that causes severe medical problems and interferes negatively with daily life. Identifying human behavior patterns that are predictive or indicative of depressive disorder is therefore important. Clinical diagnosis of depression relies on costly clinician assessment using survey instruments that may not objectively reflect the fluctuation of daily behavior. Self-administered surveys, such as the Quick Inventory of Depressive Symptomatology (QIDS) commonly used to monitor depression, may show disparities from clinical decisions. Smartphones provide easy access to many behavioral parameters, and Fitbit wristbands are becoming another important tool for assessing variables, such as heart rate and sleep efficiency, that are complementary to smartphone sensors. However, data used to identify depression indicators have been limited to a single platform, either iPhone, Android, or Fitbit alone, owing to the variation in their data collection methods. The present work represents a large-scale effort to collect and integrate data from mobile phones, wearable devices, and self-reports for depression analysis by designing a new machine learning approach. This approach constructs sparse mappings from the sensing variables collected by the various tools to two separate targets: self-reported QIDS scores and clinical assessments of depression severity. We propose a so-called heterogeneous multi-task feature learning method that jointly builds inference models for related tasks of different types, including classification and regression tasks. The proposed method was evaluated using data collected from 103 college students and could predict the QIDS score with an R2 reaching 0.44 and depression severity with an F1-score as high as 0.77. By imposing appropriate regularizers, our approach identified strong depression indicators such as time spent at home and total time asleep.
Affiliation(s)
- Jin Lu
- University of Connecticut, Department of Computer Science and Engineering, Storrs, CT, USA
- Chao Shang
- University of Connecticut, Department of Computer Science and Engineering, Storrs, CT, USA
- Chaoqun Yue
- University of Connecticut, Department of Computer Science and Engineering, Storrs, CT, USA
- Reynaldo Morillo
- University of Connecticut, Department of Computer Science and Engineering, Storrs, CT, USA
- Shweta Ware
- University of Connecticut, Department of Computer Science and Engineering, Storrs, CT, USA
- Jayesh Kamath
- University of Connecticut Health Center, Department of Psychiatry, Farmington, CT, USA
- Alexander Russell
- University of Connecticut, Department of Computer Science and Engineering, Storrs, CT, USA
- Bing Wang
- University of Connecticut, Department of Computer Science and Engineering, Storrs, CT, USA
- Jinbo Bi
- University of Connecticut, Department of Computer Science and Engineering, Storrs, CT, USA
|
50
|
SAR Image Recognition with Monogenic Scale Selection-Based Weighted Multi-task Joint Sparse Representation. REMOTE SENSING 2018. [DOI: 10.3390/rs10040504] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|