1
|
Wang K, Lu J, Liu A, Zhang G. TS-DM: A Time Segmentation-Based Data Stream Learning Method for Concept Drift Adaptation. IEEE TRANSACTIONS ON CYBERNETICS 2024; 54:6000-6011. [PMID: 39133590 DOI: 10.1109/tcyb.2024.3429459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2024]
Abstract
Concept drift arises from the uncertainty of data distribution over time and is common in data streams. While numerous methods have been developed to assist machine learning models in adapting to such changeable data, the problem of improperly keeping or discarding data samples remains. This may result in the loss of valuable knowledge that could be utilized in subsequent time points, ultimately affecting the model's accuracy. To address this issue, a novel method called time segmentation-based data stream learning method (TS-DM) is developed to help segment and learn the streaming data for concept drift adaptation. First, a chunk-based segmentation strategy is given to segment normal and drift chunks. Building upon this, a chunk-based evolving segmentation (CES) strategy is proposed to mine and segment the data chunk when both old and new concepts coexist. Furthermore, a warning level data segmentation process (CES-W) and a high-low-drift tradeoff handling process are developed to enhance the generalization and robustness. To evaluate the performance and efficiency of our proposed method, we conduct experiments on both synthetic and real-world datasets. By comparing the results with several state-of-the-art data stream learning methods, the experimental findings demonstrate the efficiency of the proposed method.
|
2
|
Wang K, Lu J, Liu A, Zhang G, Xiong L. Evolving Gradient Boost: A Pruning Scheme Based on Loss Improvement Ratio for Learning Under Concept Drift. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:2110-2123. [PMID: 34613927 DOI: 10.1109/tcyb.2021.3109796] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
In nonstationary environments, data distributions can change over time. This phenomenon is known as concept drift, and the related models need to adapt if they are to remain accurate. With gradient boosting (GB) ensemble models, selecting which weak learners to keep/prune to maintain model accuracy under concept drift is nontrivial research. Unlike existing models such as AdaBoost, which can directly compare weak learners' performance by their accuracy (a metric between [0, 1]), in GB, weak learners' performance is measured with different scales. To address the performance measurement scaling issue, we propose a novel criterion to evaluate weak learners in GB models, called the loss improvement ratio (LIR). Based on LIR, we develop two pruning strategies: 1) naive pruning (NP), which simply deletes all learners with increasing loss and 2) statistical pruning (SP), which removes learners if their loss increase meets a significance threshold. We also devise a scheme to dynamically switch between NP and SP to achieve the best performance. We implement the scheme as a concept drift learning algorithm, called evolving gradient boost (LIR-eGB). On average, LIR-eGB delivered the best performance against state-of-the-art methods on both stationary and nonstationary data.
|
3
|
Yang Y, Hu Y, Zhang X, Wang S. Two-Stage Selective Ensemble of CNN via Deep Tree Training for Medical Image Classification. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:9194-9207. [PMID: 33705343 DOI: 10.1109/tcyb.2021.3061147] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
Medical image classification is an important task in computer-aided diagnosis systems. Its performance is critically determined by the descriptiveness and discriminative power of features extracted from images. With rapid development of deep learning, deep convolutional neural networks (CNNs) have been widely used to learn the optimal high-level features from the raw pixels of images for a given classification task. However, due to the limited amount of labeled medical images with certain quality distortions, such techniques crucially suffer from the training difficulties, including overfitting, local optimums, and vanishing gradients. To solve these problems, in this article, we propose a two-stage selective ensemble of CNN branches via a novel training strategy called deep tree training (DTT). In our approach, DTT is adopted to jointly train a series of networks constructed from the hidden layers of CNN in a hierarchical manner, leading to the advantage that vanishing gradients can be mitigated by supplementing gradients for hidden layers of CNN, and intrinsically obtain the base classifiers on the middle-level features with minimum computation burden for an ensemble solution. Moreover, the CNN branches as base learners are combined into the optimal classifier via the proposed two-stage selective ensemble approach based on both accuracy and diversity criteria. Extensive experiments on CIFAR-10 benchmark and two specific medical image datasets illustrate that our approach achieves better performance in terms of accuracy, sensitivity, specificity, and F1 score measurement.
|
4
|
Shi M, Tang Y, Zhu X, Zhuang Y, Lin M, Liu J. Feature-Attention Graph Convolutional Networks for Noise Resilient Learning. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:7719-7731. [PMID: 35104237 DOI: 10.1109/tcyb.2022.3143798] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/14/2023]
Abstract
Noise and inconsistency commonly exist in real-world information networks, due to the inherent error-prone nature of human or user privacy concerns. To date, tremendous efforts have been made to advance feature learning from networks, including the most recent graph convolutional networks (GCNs) or attention GCN, by integrating node content and topology structures. However, all existing methods consider networks as error-free sources and treat feature content in each node as independent and equally important to model node relations. Noisy node content, combined with sparse features, provides essential challenges for existing methods to be used in real-world noisy networks. In this article, we propose feature-based attention GCN (FA-GCN), a feature-attention graph convolution learning framework, to handle networks with noisy and sparse node content. To tackle noise and sparse content in each node, FA-GCN first employs a long short-term memory (LSTM) network to learn dense representation for each node feature. To model interactions between neighboring nodes, a feature-attention mechanism is introduced to allow neighboring nodes to learn and vary feature importance, with respect to their connections. By using a spectral-based graph convolution aggregation process, each node is allowed to concentrate more on the most determining neighborhood features aligned with the corresponding learning task. Experiments and validations, w.r.t. different noise levels, demonstrate that FA-GCN achieves better performance than the state-of-the-art methods in both noise-free and noisy network environments.
|
5
|
|
6
|
Mao S, Lin W, Jiao L, Gou S, Chen JW. End-to-End Ensemble Learning by Exploiting the Correlation Between Individuals and Weights. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:2835-2846. [PMID: 31425063 DOI: 10.1109/tcyb.2019.2931071] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/10/2023]
Abstract
Ensemble learning performs better than a single classifier in most tasks due to the diversity among multiple classifiers. However, the enhancement of the diversity is at the expense of reducing the accuracies of individual classifiers in general and, thus, how to balance the diversity and accuracies is crucial for improving the ensemble performance. In this paper, we propose a new ensemble method which exploits the correlation between individual classifiers and their corresponding weights by constructing a joint optimization model to achieve the tradeoff between the diversity and the accuracy. Specifically, the proposed framework can be modeled as a shallow network and efficiently trained in an end-to-end manner. In the proposed ensemble method, not only can a high total classification performance be achieved by the weighted classifiers but also the individual classifiers can be updated based on the error of the optimized weighted classifier ensemble. Furthermore, the sparsity constraint is imposed on the weights to enforce that only a subset of the individual classifiers is selected for final classification. Finally, the experimental results on the UCI datasets demonstrate that the proposed method effectively improves the performance of classification compared with relevant existing ensemble methods.
|
7
|
Huang L, Wang CD, Chao HY, Yu PS. MVStream: Multiview Data Stream Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:3482-3496. [PMID: 31675346 DOI: 10.1109/tnnls.2019.2944851] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 06/10/2023]
Abstract
This article studies a new problem of data stream clustering, namely, multiview data stream (MVStream) clustering. Although many data stream clustering algorithms have been developed, they are restricted to the single-view streaming data, and clustering MVStreams still remains largely unsolved. In addition to the many issues encountered by the conventional single-view data stream clustering, such as capturing cluster evolution and discovering clusters of arbitrary shapes under the limited computational resources, the main challenge of MVStream clustering lies in integrating information from multiple views in a streaming manner and abstracting summary statistics from the integrated features simultaneously. In this article, we propose a novel MVStream clustering algorithm for the first time. The main idea is to design a multiview support vector domain description (MVSVDD) model, by which the information from multiple insufficient views can be integrated, and the outputting support vectors (SVs) are utilized to abstract the summary statistics of the historical multiview data objects. Based on the MVSVDD model, a new multiview cluster labeling method is designed, whereby clusters of arbitrary shapes can be discovered for each view. By tracking the cluster labels of SVs in each view, the cluster evolution associated with concept drift can be captured. Since the SVs occupy only a small portion of data objects, the proposed MVStream algorithm is quite efficient with the limited computational resources. Extensive experiments are conducted to demonstrate the effectiveness and efficiency of the proposed method.
|
8
|
Zheng Y, Fan J, Zhang J, Gao X. Discriminative Fast Hierarchical Learning for Multiclass Image Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:2779-2790. [PMID: 31751253 DOI: 10.1109/tnnls.2019.2948881] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/10/2023]
Abstract
In this article, a discriminative fast hierarchical learning algorithm is developed for supporting multiclass image classification, where a visual tree is seamlessly integrated with multitask learning to achieve fast training of the tree classifier hierarchically (i.e., a set of structural node classifiers over the visual tree). By partitioning a large number of categories hierarchically in a coarse-to-fine fashion, a visual tree is first constructed and further used to handle data imbalance and identify the interrelated learning tasks automatically (e.g., the tasks for learning the node classifiers for the sibling child nodes under the same parent node are strongly interrelated), and a multitask SVM classifier is trained for each nonleaf node to achieve more effective separation of its sibling child nodes at the next level of the visual tree. Both the internode visual similarities and the interlevel visual correlations are utilized to train more discriminative multitask SVM classifiers and control the interlevel error propagation effectively, and a stochastic gradient descent (SGD) algorithm is developed for learning such multitask SVM classifiers with higher efficiency. Our experimental results have demonstrated that our fast hierarchical learning algorithm can achieve very competitive results on both the classification accuracy rates and the computational efficiency.
|
9
|
Incremental learning imbalanced data streams with concept drift: The dynamic updated ensemble algorithm. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105694] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 11/24/2022]
|
10
|
Joint Local Block Grouping with Noise-Adjusted Principal Component Analysis for Hyperspectral Remote-Sensing Imagery Sparse Unmixing. REMOTE SENSING 2019. [DOI: 10.3390/rs11101223] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/16/2022]
Abstract
Spatial regularized sparse unmixing has proved to be an effective spectral unmixing technique, combining spatial information and standard spectral signatures known in advance into the traditional spectral unmixing model in the form of sparse regression. In a spatial regularized sparse unmixing model, spatial consideration plays an important role and develops from local neighborhood pixels to global structures. However, incorporating spatial relationships will increase the computational complexity, and it is inevitable that negative influences from inaccurately estimated abundances’ spatial correlations will reduce the accuracy of the algorithms. To obtain more reliable and efficient spatial regularized sparse unmixing results, a joint local block grouping with noise-adjusted principal component analysis for hyperspectral remote-sensing imagery sparse unmixing is proposed in this paper. In this work, local block grouping is first utilized to gather and classify abundant spatial information in local blocks, and noise-adjusted principal component analysis is used to compress this series of classified local blocks and select the most significant ones. Then the representative spatial correlations are drawn and replace the traditional spatial regularization in the spatial regularized sparse unmixing method. Compared with total variation-based and non-local means-based sparse unmixing algorithms, the proposed approach can yield comparable experimental results with three simulated hyperspectral data cubes and two real hyperspectral remote-sensing images.
|
11
|
Yang Y, Jiang J. Adaptive Bi-Weighting Toward Automatic Initialization and Model Selection for HMM-Based Hybrid Meta-Clustering Ensembles. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:1657-1668. [PMID: 29994293 DOI: 10.1109/tcyb.2018.2809562] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 06/08/2023]
Abstract
Temporal data clustering can provide underpinning techniques for the discovery of intrinsic structures, which have proved important in condensing or summarizing information demanded in various fields of information sciences, ranging from time series analysis to sequential data understanding. In this paper, we propose a novel hidden Markov model (HMM)-based hybrid meta-clustering ensemble with a bi-weighting scheme to solve the problems of initialization and model selection associated with temporal data clustering. To improve the performance of the ensemble techniques, the proposed bi-weighting scheme adaptively examines the partition process and hence optimizes the fusion of consensus functions. Specifically, three consensus functions are used to combine the input partitions, generated by HMM-based K-models under different initializations, into a robust consensus partition. An optimal consensus partition is then selected from the three candidates by a normalized mutual information-based objective function. Finally, the optimal consensus partition is further refined by the HMM-based agglomerative clustering algorithm in association with a dendrogram-based similarity partitioning algorithm, leading to the advantage that the number of clusters can be automatically and adaptively determined. Extensive experiments on synthetic data, time series, and real-world motion trajectory datasets illustrate that our proposed approach outperforms all the selected benchmarks and hence provides promising potential for developing improved clustering tools for information analysis and management.
|
12
|
Wang S, Minku LL, Yao X. A Systematic Study of Online Class Imbalance Learning With Concept Drift. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:4802-4821. [PMID: 29993955 DOI: 10.1109/tnnls.2017.2771290] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Indexed: 06/08/2023]
Abstract
As an emerging research topic, online class imbalance learning often combines the challenges of both class imbalance and concept drift. It deals with data streams having very skewed class distributions, where concept drift may occur. It has recently received increased research attention; however, very little work addresses the combined problem where both class imbalance and concept drift coexist. As the first systematic study of handling concept drift in class-imbalanced data streams, this paper first provides a comprehensive review of current research progress in this field, including current research focuses and open challenges. Then, an in-depth experimental study is performed, with the goal of understanding how to best overcome concept drift in online learning with class imbalance.
|
13
|
Wu J, Pan S, Zhu X, Zhang C, Yu PS. Multiple Structure-View Learning for Graph Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:3236-3251. [PMID: 28945603 DOI: 10.1109/tnnls.2017.2703832] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/07/2023]
Abstract
Many applications involve objects containing structure and rich content information, each describing different feature aspects of the object. Graph learning and classification is a common tool for handling such objects. To date, existing graph classification has been limited to the single-graph setting with each object being represented as one graph from a single structure-view. This inherently limits its use to the classification of complicated objects containing complex structures and uncertain labels. In this paper, we advance graph classification to handle multigraph learning for complicated objects from multiple structure views, where each object is represented as a bag containing several graphs and the label is only available for each graph bag but not individual graphs inside the bag. To learn such graph classification models, we propose a multistructure-view bag constrained learning (MSVBL) algorithm, which aims to explore substructure features across multiple structure views for learning. By enabling joint regularization across multiple structure views and enforcing labeling constraints at the bag and graph levels, MSVBL is able to discover the most effective substructure features across all structure views. Experiments and comparisons on real-world data sets validate and demonstrate the superior performance of MSVBL in representing complicated objects as multigraph for classification, e.g., MSVBL outperforms the state-of-the-art multiview graph classification and multiview multi-instance learning approaches.
|
14
|
Chi L, Li B, Zhu X, Pan S, Chen L. Hashing for Adaptive Real-Time Graph Stream Classification With Concept Drifts. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:1591-1604. [PMID: 28858820 DOI: 10.1109/tcyb.2017.2708979] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/07/2023]
Abstract
Many applications involve processing networked streaming data in a timely manner. Graph stream classification aims to learn a classification model from a stream of graphs with only one pass over the data, requiring real-time processing in training and prediction. This is a nontrivial task, as many existing methods require multiple passes over the graph stream to extract subgraph structures as features for graph classification, which does not simultaneously satisfy the "one-pass" and "real-time" requirements. In this paper, we propose an adaptive real-time graph stream classification method to address this challenge. We partition the unbounded graph stream data into consecutive graph chunks, each consisting of a fixed number of graphs and delivering a corresponding chunk-level classifier. We employ a random hashing function to compress the original node set of graphs in each chunk for fast feature detection when training chunk-level classifiers. Furthermore, a differential hashing strategy is applied to map unlimited increasing features (i.e., cliques) into a fixed-size feature space which is then used as a feature vector for stochastic learning. Finally, the chunk-level classifiers are weighted in an ensemble learning model for graph classification. The proposed method substantially speeds up the graph feature extraction and avoids unbounded graph feature growth. Moreover, it effectively offsets concept drifts in graph stream classification. Experiments on real-world and synthetic graph streams demonstrate that our method significantly outperforms existing methods in both classification accuracy and learning efficiency.
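As a rough illustration of mapping an unbounded set of features into a fixed-size feature space, the sketch below uses the generic hashing trick. It is not the differential hashing strategy of the paper, whose details the abstract does not give; the function name `hashed_feature_vector`, the default `dim`, and the use of string identifiers for cliques are all assumptions made for this example.

```python
import hashlib

def hashed_feature_vector(features, dim=16):
    """Count features into a fixed-length vector using a stable hash.

    `features` is an iterable of strings (hypothetical clique identifiers).
    Distinct features may collide in the same slot; that is the price of a
    bounded feature space.
    """
    vec = [0] * dim
    for f in features:
        # md5 is used only as a deterministic hash, not for security.
        digest = hashlib.md5(f.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:8], "big") % dim
        vec[idx] += 1
    return vec

# Hypothetical clique identifiers seen in one graph.
cliques = ["a-b-c", "b-c-d", "a-b-c", "x-y"]
vector = hashed_feature_vector(cliques, dim=8)
```

However many distinct cliques arrive over the stream, the vector length stays at `dim`, which is the property the abstract relies on to avoid unbounded feature growth.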
|
15
|
Zhang X, Zhuang Y, Wang W, Pedrycz W. Transfer Boosting With Synthetic Instances for Class Imbalanced Object Recognition. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:357-370. [PMID: 28026795 DOI: 10.1109/tcyb.2016.2636370] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Indexed: 06/06/2023]
Abstract
A challenging problem in object recognition is to train a robust classifier with a small and imbalanced data set. In such cases, the learned classifier tends to overfit the training data and has low prediction accuracy on the minority class. In this paper, we address the problem of class imbalanced object recognition by combining synthetic minority over-sampling technique (SMOTE) and instance-based transfer boosting to rebalance the skewed class distribution. We present ways of generating synthetic instances under the learning framework of transfer Adaboost. A novel weighted SMOTE technique (WSMOTE) is proposed to generate weighted synthetic instances with weighted source and target instances at each boosting round. Based on WSMOTE, we propose a novel class imbalanced transfer boosting algorithm called WSMOTE-TrAdaboost and experimentally demonstrate its effectiveness on four datasets (Office, Caltech256, SUN2012, and VOC2012) for object recognition application. A bag-of-words model with SURF features and histogram of oriented gradient features are separately used to represent an image. We experimentally demonstrated the effectiveness and robustness of our approach by comparing it with several baseline algorithms in the boosting family for class imbalanced learning.
|
16
|
Kang Q, Chen X, Li S, Zhou M. A Noise-Filtered Under-Sampling Scheme for Imbalanced Classification. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:4263-4274. [PMID: 28113413 DOI: 10.1109/tcyb.2016.2606104] [Citation(s) in RCA: 59] [Impact Index Per Article: 7.4] [Indexed: 06/06/2023]
Abstract
Under-sampling is a popular data preprocessing method in dealing with class imbalance problems, with the purposes of balancing datasets to achieve a high classification rate and avoiding the bias toward majority class examples. It always uses full minority data in a training dataset. However, some noisy minority examples may reduce the performance of classifiers. In this paper, a new under-sampling scheme is proposed by incorporating a noise filter before executing resampling. In order to verify the efficiency, this scheme is implemented based on four popular under-sampling methods, i.e., Undersampling + Adaboost, RUSBoost, UnderBagging, and EasyEnsemble through benchmarks and significance analysis. Furthermore, this paper also summarizes the relationship between algorithm performance and imbalanced ratio. Experimental results indicate that the proposed scheme can improve the original undersampling-based methods with significance in terms of three popular metrics for imbalanced classification, i.e., the area under the curve, F-measure, and G-mean.
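Two of the imbalanced-classification metrics named at the end of this abstract, taken here to be the F-measure and the G-mean, can be computed from a binary confusion matrix as sketched below. The sketch assumes the standard definitions (F-measure as the harmonic mean of precision and recall, G-mean as the geometric mean of sensitivity and specificity); the function name `imbalance_metrics` and the example counts are made up for illustration and are not taken from the paper.

```python
import math

def imbalance_metrics(tp, fn, fp, tn):
    """Return (f_measure, g_mean) for the positive (minority) class.

    tp, fn, fp, tn are confusion-matrix counts. Ratios with an empty
    denominator are treated as 0.0.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * specificity)
    return f_measure, g_mean

# Made-up counts: 10 minority and 90 majority examples.
f_measure, g_mean = imbalance_metrics(tp=8, fn=2, fp=4, tn=86)
```

Unlike plain accuracy, both values fall sharply when the minority class is mostly misclassified, which is why they are commonly reported for imbalanced data.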
|
17
|
Yu Z, Wang Z, You J, Zhang J, Liu J, Wong HS, Han G. A New Kind of Nonparametric Test for Statistical Comparison of Multiple Classifiers Over Multiple Datasets. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:4418-4431. [PMID: 28113414 DOI: 10.1109/tcyb.2016.2611020] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 06/06/2023]
Abstract
Nonparametric statistical analysis, such as the Friedman test (FT), is gaining more and more attention due to its useful applications in a lot of experimental studies. However, traditional FT for the comparison of multiple learning algorithms on different datasets adopts the naive ranking approach. The ranking is based on the average accuracy values obtained by the set of learning algorithms on the datasets, which neither considers the differences of the results obtained by the learning algorithms on each dataset nor takes into account the performance of the learning algorithms in each run. In this paper, we will first propose three kinds of ranking approaches, which are the weighted ranking approach, the global ranking approach (GRA), and the weighted GRA. Then, a theoretical analysis is performed to explore the properties of the proposed ranking approaches. Next, a set of the modified FTs based on the proposed ranking approaches are designed for the comparison of the learning algorithms. Finally, the modified FTs are evaluated through six classifier ensemble approaches on 34 real-world datasets. The experiments show the effectiveness of the modified FTs.
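For context, the traditional ranking that this abstract calls naive can be sketched as follows: algorithms are ranked per dataset by their average accuracy, and the Friedman statistic is computed from the mean ranks. This is a minimal sketch of the standard test only, without the tie correction to the statistic, and it does not implement the weighted or global ranking approaches the paper proposes; the function names and the accuracy table are illustrative assumptions.

```python
def average_ranks(scores):
    """Rank one dataset's scores: highest score gets rank 1, ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def friedman_statistic(table):
    """table[d][a] is the average accuracy of algorithm a on dataset d.

    Returns (mean_ranks, chi2), where chi2 is the Friedman chi-square statistic.
    """
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for a, r in enumerate(average_ranks(row)):
            rank_sums[a] += r
    mean_ranks = [s / n for s in rank_sums]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2

# Made-up accuracies: 4 datasets (rows), 3 algorithms (columns).
accuracies = [
    [0.90, 0.80, 0.70],
    [0.85, 0.80, 0.60],
    [0.95, 0.90, 0.70],
    [0.80, 0.75, 0.70],
]
mean_ranks, chi2 = friedman_statistic(accuracies)
```

Because only the per-dataset order enters the statistic, a 0.01 gap and a 0.30 gap between two algorithms contribute identically, which is the limitation the abstract points to.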
|
18
|
Li P, Yu J, Wang M, Zhang L, Cai D, Li X. Constrained Low-Rank Learning Using Least Squares-Based Regularization. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:4250-4262. [PMID: 27849552 DOI: 10.1109/tcyb.2016.2623638] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 06/06/2023]
Abstract
Low-rank learning has attracted much attention recently due to its efficacy in a rich variety of real-world tasks, e.g., subspace segmentation and image categorization. Most low-rank methods are incapable of capturing low-dimensional subspace for supervised learning tasks, e.g., classification and regression. This paper aims to learn both the discriminant low-rank representation (LRR) and the robust projecting subspace in a supervised manner. To achieve this goal, we cast the problem into a constrained rank minimization framework by adopting the least squares regularization. Naturally, the data label structure tends to resemble that of the corresponding low-dimensional representation, which is derived from the robust subspace projection of clean data by low-rank learning. Moreover, the low-dimensional representation of original data can be paired with some informative structure by imposing an appropriate constraint, e.g., Laplacian regularizer. Therefore, we propose a novel constrained LRR method. The objective function is formulated as a constrained nuclear norm minimization problem, which can be solved by the inexact augmented Lagrange multiplier algorithm. Extensive experiments on image classification, human pose estimation, and robust face recovery have confirmed the superiority of our method.
|
19
|
Rolling Guidance Based Scale-Aware Spatial Sparse Unmixing for Hyperspectral Remote Sensing Imagery. REMOTE SENSING 2017. [DOI: 10.3390/rs9121218] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022]
|
20
|
Yu Z, Zhu X, Wong HS, You J, Zhang J, Han G. Distribution-Based Cluster Structure Selection. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:3554-3567. [PMID: 27254876 DOI: 10.1109/tcyb.2016.2569529] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 06/05/2023]
Abstract
The objective of cluster structure ensemble is to find a unified cluster structure from multiple cluster structures obtained from different datasets. Unfortunately, not all the cluster structures contribute to the unified cluster structure. This paper investigates the problem of how to select the suitable cluster structures in the ensemble which will be summarized to a more representative cluster structure. Specifically, the cluster structure is first represented by a mixture of Gaussian distributions, the parameters of which are estimated using the expectation-maximization algorithm. Then, several distribution-based distance functions are designed to evaluate the similarity between two cluster structures. Based on the similarity comparison results, we propose a new approach, which is referred to as the distribution-based cluster structure ensemble (DCSE) framework, to find the most representative unified cluster structure. We then design a new technique, the distribution-based cluster structure selection strategy (DCSSS), to select a subset of cluster structures. Finally, we propose using a distribution-based normalized hypergraph cut algorithm to generate the final result. In our experiments, a nonparametric test is adopted to evaluate the difference between DCSE and its competitors. We adopt 20 real-world datasets obtained from the University of California, Irvine and knowledge extraction based on evolutionary learning repositories, and a number of cancer gene expression profiles to evaluate the performance of the proposed methods. The experimental results show that: 1) DCSE works well on the real-world datasets and 2) DCSE based on DCSSS can further improve the performance of the algorithm.
|
21
|
Wu J, Pan S, Zhu X, Zhang C, Wu X. Positive and Unlabeled Multi-Graph Learning. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:818-829. [PMID: 28113878 DOI: 10.1109/tcyb.2016.2527239] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 06/06/2023]
Abstract
In this paper, we advance graph classification to handle multi-graph learning for complicated objects, where each object is represented as a bag of graphs and the label is only available to each bag but not individual graphs. In addition, when training classifiers, users are only given a handful of positive bags and many unlabeled bags, and the learning objective is to train models to classify previously unseen graph bags with maximum accuracy. To achieve the goal, we propose a positive and unlabeled multi-graph learning (puMGL) framework to first select informative subgraphs to convert graphs into a feature space. To utilize unlabeled bags for learning, puMGL assigns a confidence weight to each bag and dynamically adjusts its weight value to select "reliable negative bags." A number of representative graphs, selected from positive bags and identified reliable negative graph bags, form a "margin graph pool" which serves as the base for deriving subgraph patterns, training graph classifiers, and further updating the bag weight values. A closed-loop iterative process helps discover optimal subgraphs from positive and unlabeled graph bags for learning. Experimental comparisons demonstrate the performance of puMGL for classifying real-world complicated objects.
|
22
|
Pan S, Wu J, Zhu X, Long G, Zhang C. Task Sensitive Feature Exploration and Learning for Multitask Graph Classification. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:744-758. [PMID: 26978839 DOI: 10.1109/tcyb.2016.2526058] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 06/05/2023]
Abstract
Multitask learning (MTL) is commonly used for jointly optimizing multiple learning tasks. To date, all existing MTL methods have been designed for tasks with feature-vector represented instances, but cannot be applied to structured data, such as graphs. More importantly, when carrying out MTL, existing methods mainly focus on exploring overall commonality or disparity between tasks for learning, but cannot explicitly capture task relationships in the feature space, so they are unable to answer important questions, such as what exactly is shared between tasks and what makes one task unique compared with the others. In this paper, we formulate a new multitask graph learning problem, and propose a task sensitive feature exploration and learning algorithm for multitask graph classification. Because graphs do not have features available, we advocate a task sensitive feature exploration and learning paradigm to jointly discover discriminative subgraph features across different tasks. In addition, a feature learning process is carried out to categorize each subgraph feature into one of three categories: (1) common feature; (2) task auxiliary feature; and (3) task specific feature, indicating whether the feature is shared by all tasks, by a subset of tasks, or by only one specific task, respectively. The feature learning and the multiple task learning are iteratively optimized to form a multitask graph classification model with a global optimization goal. Experiments on real-world functional brain analysis and chemical compound categorization demonstrate the algorithm's performance. Results confirm that our method can be used to explicitly capture task correlations and uniqueness in the feature space, and explicitly answer what is shared between tasks and what is unique to a specific task.
|