1. Zhu Y, Yu W, Li X. A Multi-objective transfer learning framework for time series forecasting with Concept Echo State Networks. Neural Netw 2025;186:107272. [PMID: 39999532] [DOI: 10.1016/j.neunet.2025.107272]
Abstract
This paper introduces a novel transfer learning framework for time series forecasting that combines a Concept Echo State Network (CESN) with a multi-objective optimization strategy. Our approach addresses the challenges of feature extraction and knowledge transfer in heterogeneous data environments. By optimizing a CESN for each data source, we extract targeted features that capture the unique characteristics of individual datasets. Additionally, our multi-network architecture enables effective knowledge sharing among the different ESNs, leading to improved forecasting performance. To further enhance efficiency, CESN reduces the need for extensive hyperparameter tuning by optimizing only the concept matrix and the output weights. The proposed framework offers a promising solution for forecasting problems where data is diverse, limited, or missing.
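The abstract above describes CESN only at a high level, and the paper's concept matrix is not reproduced here. As background, the efficiency property it builds on is that an echo state network keeps its reservoir weights fixed and trains only the linear readout, typically by ridge regression. A minimal sketch of that baseline in Python/NumPy (the toy one-step-ahead forecasting task, reservoir size, and all constants are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-step-ahead forecasting task: predict sin(t + 0.1) from sin(t).
t = np.arange(400) * 0.1
u = np.sin(t)            # input sequence
y = np.sin(t + 0.1)      # next-step targets

# Reservoir: fixed random weights, rescaled so the spectral radius is
# below 1 (a common sufficient condition for the echo state property).
n_res = 100
W_in = rng.uniform(-0.5, 0.5, size=(n_res, 1))
W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

# Run the reservoir and collect states after a washout period.
x = np.zeros(n_res)
states, targets = [], []
for i in range(len(u)):
    x = np.tanh(W_in[:, 0] * u[i] + W @ x)
    if i >= 50:          # discard the initial transient
        states.append(x.copy())
        targets.append(y[i])
X = np.array(states)     # (T', n_res)
Y = np.array(targets)    # (T',)

# Only the readout is trained, by ridge regression:
# W_out = (X^T X + lam I)^{-1} X^T Y
lam = 1e-6
W_out = np.linalg.solve(X.T @ X + lam * np.eye(n_res), X.T @ Y)

pred = X @ W_out
rmse = np.sqrt(np.mean((pred - Y) ** 2))
print(f"readout RMSE: {rmse:.4f}")
```

Ridge regression keeps the readout stable even when reservoir states are highly correlated; the CESN of the paper additionally optimizes a concept matrix, which this sketch omits.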
Affiliation(s)
- Yingqin Zhu, CINVESTAV-IPN Departamento de Control Automático, Av. IPN 2508, Mexico City, 07360, Mexico
- Wen Yu, CINVESTAV-IPN Departamento de Control Automático, Av. IPN 2508, Mexico City, 07360, Mexico
- Xiaoou Li, CINVESTAV-IPN Departamento de Computación, Av. IPN 2508, Mexico City, 07360, Mexico
2. Askarizadeh M, Morsali A, Nguyen KK. Resource-Constrained Multisource Instance-Based Transfer Learning. IEEE Trans Neural Netw Learn Syst 2025;36:1029-1043. [PMID: 37930915] [DOI: 10.1109/tnnls.2023.3327248]
Abstract
In today's machine learning (ML), the need for vast amounts of training data has become a significant challenge. Transfer learning (TL) offers a promising solution by leveraging knowledge across different domains/tasks, effectively addressing data scarcity. However, TL faces computational and communication challenges in resource-constrained scenarios, and negative transfer (NT) can arise from certain data distributions. This article focuses on maximizing the accuracy of instance-based TL in multisource resource-constrained environments while mitigating NT, a key concern in TL; previous studies have overlooked the impact of resource consumption when addressing the NT problem. To address these challenges, we introduce an optimization model named multisource resource-constrained optimized TL (MSOPTL), which employs a convex combination of empirical source and target errors while considering feasibility and resource constraints. Moreover, we tighten one of the generalization error upper bounds in the domain adaptation setting by showing that the divergence can be substituted with the Kullback-Leibler (KL) divergence. We use this enhanced upper bound as one of the feasibility constraints of MSOPTL. The suggested model can be applied as a versatile framework for various ML methods. Our approach is extensively validated on a neural network (NN)-based classification problem, demonstrating the efficiency of MSOPTL in achieving the desired trade-offs between TL's benefits and its associated costs. This advancement holds strong potential for enhancing edge artificial intelligence (AI) applications in resource-constrained environments.
3. Pei C, Wu F, Yang M, Pan L, Ding W, Dong J, Huang L, Zhuang X. Multi-Source Domain Adaptation for Medical Image Segmentation. IEEE Trans Med Imaging 2024;43:1640-1651. [PMID: 38133966] [DOI: 10.1109/tmi.2023.3346285]
Abstract
Unsupervised domain adaptation (UDA) aims to mitigate the performance drop of models tested on a target domain, caused by the domain shift between the source and target domains. Most UDA segmentation methods focus on the scenario of a single source domain. In practical situations, however, gold-standard data may be available from multiple sources (domains), and such multi-source training data can provide more information for knowledge transfer; how to utilize it for better domain adaptation remains to be explored. This work investigates multi-source UDA and proposes a new framework for medical image segmentation. First, we employ a multi-level adversarial learning scheme to adapt features at different levels between each source domain and the target, improving segmentation performance. Then, we propose a multi-model consistency loss to transfer the learned multi-source knowledge to the target domain simultaneously. Finally, we validate the proposed framework on two applications, i.e., multi-modality cardiac segmentation and cross-modality liver segmentation. The results show that our method delivers promising performance and compares favorably to state-of-the-art approaches.
4. Tao J, Dan Y, Zhou D. Local domain generalization with low-rank constraint for EEG-based emotion recognition. Front Neurosci 2023;17:1213099. [PMID: 38027525] [PMCID: PMC10662311] [DOI: 10.3389/fnins.2023.1213099]
Abstract
As an important branch of affective computing, emotion recognition based on electroencephalography (EEG) faces a long-standing challenge due to individual diversity. To conquer this challenge, domain adaptation (DA) or domain generalization (DG, i.e., DA without the target domain in the training stage) techniques have been introduced into EEG-based emotion recognition to eliminate the distribution discrepancy between different subjects. Previous DA or DG methods mainly focus on aligning the global distribution shift between the source and target domains, without considering the correlations between the subdomains within the source domain and the target domain of interest. Because ignoring fine-grained distribution information in the source can still limit DG on EEG datasets with multimodal structures, multiple patches (or subdomains) should be reconstructed from the source domain, on which multiple classifiers can be learned collaboratively. Accurately aligning relevant subdomains by uncovering multiple distribution patterns within the source domain is expected to further boost the learning performance of DG/DA. We therefore propose a novel DG method for EEG-based emotion recognition, i.e., Local Domain Generalization with low-rank constraint (LDG). Specifically, the source domain is first partitioned into multiple local domains, each of which contains one positive sample together with its positive neighbors and its k2 negative neighbors. Multiple subject-invariant classifiers on the different subdomains are then co-learned in a unified framework by minimizing a local regression loss with low-rank regularization, accounting for the knowledge shared among local domains. In the inference stage, the learned local classifiers are discriminatively selected according to their importance for adaptation.
Extensive experiments are conducted on two benchmark databases (DEAP and SEED) under two cross-validation evaluation protocols, i.e., cross-subject within-dataset and cross-dataset within-session. The experimental results under 5-fold cross-validation demonstrate the superiority of the proposed method over several state-of-the-art methods.
Affiliation(s)
- Jianwen Tao, Institute of Artificial Intelligence Application, Ningbo Polytechnic, Zhejiang, China
- Yufang Dan, Institute of Artificial Intelligence Application, Ningbo Polytechnic, Zhejiang, China
- Di Zhou, Industrial Technological Institute of Intelligent Manufacturing, Sichuan University of Arts and Science, Dazhou, China
5. Yi C, Chen H, Xu Y, Chen H, Liu Y, Tan H, Yan Y, Yu H. Multicomponent Adversarial Domain Adaptation: A General Framework. IEEE Trans Neural Netw Learn Syst 2023;34:6824-6838. [PMID: 37224350] [DOI: 10.1109/tnnls.2023.3270359]
Abstract
Domain adaptation (DA) aims to transfer knowledge from a source domain to a different but related target domain. The mainstream approach embeds adversarial learning into deep neural networks (DNNs) either to learn domain-invariant features that reduce the domain discrepancy or to generate data that fill in the domain gap. However, these adversarial DA (ADA) approaches mainly consider domain-level data distributions while ignoring the differences among the components contained in different domains. Components that are not related to the target domain are therefore not filtered out, which can cause negative transfer. In addition, it is difficult to make full use of the relevant components shared between the source and target domains to enhance DA. To address these limitations, we propose a general two-stage framework named multicomponent ADA (MCADA). This framework trains the target model by first learning a domain-level model and then fine-tuning that model at the component level. In particular, MCADA constructs a bipartite graph to find the most relevant component in the source domain for each component in the target domain. Since the nonrelevant components are filtered out for each target component, fine-tuning the domain-level model enhances positive transfer. Extensive experiments on several real-world datasets demonstrate that MCADA has significant advantages over state-of-the-art methods.
6. Yao Y, Li X, Zhang Y, Ye Y. Multisource Heterogeneous Domain Adaptation With Conditional Weighting Adversarial Network. IEEE Trans Neural Netw Learn Syst 2023;34:2079-2092. [PMID: 34487497] [DOI: 10.1109/tnnls.2021.3105868]
Abstract
Heterogeneous domain adaptation (HDA) tackles the learning of cross-domain samples with both different probability distributions and feature representations. Most of the existing HDA studies focus on the single-source scenario. In reality, however, it is not uncommon to obtain samples from multiple heterogeneous domains. In this article, we study the multisource HDA problem and propose a conditional weighting adversarial network (CWAN) to address it. The proposed CWAN adversarially learns a feature transformer, a label classifier, and a domain discriminator. To quantify the importance of different source domains, CWAN introduces a sophisticated conditional weighting scheme to calculate the weights of the source domains according to the conditional distribution divergence between the source and target domains. Different from existing weighting schemes, the proposed conditional weighting scheme not only weights the source domains but also implicitly aligns the conditional distributions during the optimization process. Experimental results clearly demonstrate that the proposed CWAN performs much better than several state-of-the-art methods on four real-world datasets.
7. Kumar G, Narducci F, Bakshi S. Knowledge Transfer and Crowdsourcing in Cyber-Physical-Social Systems. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.10.027]
8. Zhao H, Wang H, Fu Y, Wu F, Li X. Memory-Efficient Class-Incremental Learning for Image Classification. IEEE Trans Neural Netw Learn Syst 2022;33:5966-5977. [PMID: 33939615] [DOI: 10.1109/tnnls.2021.3072041]
Abstract
Under memory-resource-limited constraints, class-incremental learning (CIL) usually suffers from the "catastrophic forgetting" problem when the joint classification model is updated on the arrival of newly added classes. To cope with forgetting, many CIL methods transfer the knowledge of old classes by preserving some exemplar samples in a size-constrained memory buffer. To utilize the memory buffer more efficiently, we propose to keep more auxiliary low-fidelity exemplar samples rather than the original high-fidelity ones. Such a memory-efficient exemplar-preserving scheme makes old-class knowledge transfer more effective. However, the low-fidelity exemplars are often distributed in a different domain from that of the original exemplars, i.e., there is a domain shift. To alleviate this problem, we propose a duplet learning scheme that constructs domain-compatible feature extractors and classifiers, which greatly narrows this domain gap. As a result, the low-fidelity auxiliary exemplars can adequately replace the original exemplars at a lower memory cost. In addition, we present a robust classifier adaptation scheme, which further refines the biased classifier (learned with samples carrying distillation label knowledge about old classes) using samples with pure true class labels. Experimental results demonstrate the effectiveness of this work against state-of-the-art approaches. We will release the code, baselines, and training statistics for all models to facilitate future research.
9. Chai Z, Zhao C, Huang B. Multisource-Refined Transfer Network for Industrial Fault Diagnosis Under Domain and Category Inconsistencies. IEEE Trans Cybern 2022;52:9784-9796. [PMID: 34033554] [DOI: 10.1109/tcyb.2021.3067786]
Abstract
Unsupervised cross-domain fault diagnosis has been actively researched in recent years. It learns transferable features that reduce distribution inconsistency between source and target domains without target supervision. Most of the existing cross-domain fault diagnosis approaches are developed based on the consistency assumption of the source and target fault category sets. This assumption, however, is generally challenged in practice, as different working conditions can have different fault category sets. To solve the fault diagnosis problem under both domain and category inconsistencies, a multisource-refined transfer network is proposed in this article. First, a multisource-domain-refined adversarial adaptation strategy is designed to reduce the refined categorywise distribution inconsistency within each source-target domain pair. It avoids the negative transfer trap caused by conventional global-domainwise-forced alignments. Then, a multiple classifier complementation module is developed by complementing and transferring the source classifiers to the target domain to leverage different diagnostic knowledge existing in various sources. Different classifiers are complemented by the similarity scores produced by the adaptation module, and the complemented smooth predictions are used to guide the refined adaptation. Thus, the refined adversarial adaptation and the classifier complementation can benefit from each other in the training stage, yielding target-faults-discriminative and domain-refined-indistinguishable feature representations. Extensive experiments on two cases demonstrate the superiority of the proposed method when domain and category inconsistencies coexist.
10. Shared Dictionary Learning Via Coupled Adaptations for Cross-Domain Classification. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10967-7]
11. Tao J, Dan Y, Zhou D, He S. Robust Latent Multi-Source Adaptation for Encephalogram-Based Emotion Recognition. Front Neurosci 2022;16:850906. [PMID: 35573289] [PMCID: PMC9091911] [DOI: 10.3389/fnins.2022.850906]
Abstract
In practical encephalogram (EEG)-based machine learning, different subjects can be represented by many different EEG patterns, which can, to some extent, degrade the performance of existing subject-independent classifiers obtained from cross-subject datasets. To this end, we present a robust Latent Multi-source Adaptation (LMA) framework for cross-subject/dataset emotion recognition with EEG signals that uncovers multiple domain-invariant latent subspaces. Specifically, by jointly aligning the statistical and semantic distribution discrepancies between each source-target pair, multiple domain-invariant classifiers can be trained collaboratively in a unified framework. The framework fully utilizes the correlated knowledge among multiple sources through a novel low-rank regularization term. Comprehensive experiments on the DEAP and SEED datasets demonstrate the superior or comparable performance of LMA relative to the state of the art in EEG-based emotion recognition.
Affiliation(s)
- Jianwen Tao, Institute of Artificial Intelligence Application, Ningbo Polytechnic, Ningbo, China
- Yufang Dan, Institute of Artificial Intelligence Application, Ningbo Polytechnic, Ningbo, China
- Di Zhou, Industrial Technological Institute of Intelligent Manufacturing, Sichuan University of Arts and Science, Dazhou, China
- Songsong He, Institute of Artificial Intelligence Application, Ningbo Polytechnic, Ningbo, China
12. Zhan S, Sun W, Du C, Zhong W. Diversity-promoting multi-view graph learning for semi-supervised classification. Int J Mach Learn Cybern 2021. [DOI: 10.1007/s13042-021-01370-0]
13. SSDAN: Multi-Source Semi-Supervised Domain Adaptation Network for Remote Sensing Scene Classification. Remote Sens 2021. [DOI: 10.3390/rs13193861]
Abstract
We present a new method for multi-source semi-supervised domain adaptation in remote sensing scene classification. The method consists of a pre-trained convolutional neural network (CNN) model, namely EfficientNet-B3, for the extraction of highly discriminative features, followed by a classification module that learns feature prototypes for each class. Then, the classification module computes a cosine distance between feature vectors of target data samples and the feature prototypes. Finally, the proposed method ends with a Softmax activation function that converts the distances into class probabilities. The feature prototypes are also divided by a temperature parameter to normalize and control the classification module. The whole model is trained on both the unlabeled and labeled target samples. It is trained to predict the correct classes utilizing the standard cross-entropy loss computed over the labeled source and target samples. At the same time, the model is trained to learn domain invariant features using another loss function based on entropy computed over the unlabeled target samples. Unlike the standard cross-entropy loss, the new entropy loss function is computed on the model’s predicted probabilities and does not need the true labels. This entropy loss, called minimax loss, needs to be maximized with respect to the classification module to learn features that are domain-invariant (hence removing the data shift), and at the same time, it should be minimized with respect to the CNN feature extractor to learn discriminative features that are clustered around the class prototypes (in other words reducing intra-class variance). To accomplish these maximization and minimization processes at the same time, we use an adversarial training approach, where we alternate between the two processes. The model combines the standard cross-entropy loss and the new minimax entropy loss and optimizes them jointly. 
The proposed method is tested on four RS scene datasets, namely UC Merced, AID, RESISC45, and PatternNet, using two-source and three-source domain adaptation scenarios. The experimental results demonstrate the strong capability of the proposed method to achieve impressive performance despite using only a few (six in our case) labeled target samples per class. Its performance is already better than several state-of-the-art methods, including RevGrad, ADDA, Siamese-GAN, and MSCN.
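The minimax entropy idea above can be made concrete with a small sketch: logits are temperature-scaled cosine similarities between target features and per-class prototypes, and the loss is the mean entropy of the resulting softmax over unlabeled target samples. The NumPy sketch below computes only that loss (dimensions, prototype values, and the temperature of 0.05 are illustrative assumptions); the adversarial max/min alternation would be realized with a gradient-reversal layer in a deep learning framework:

```python
import numpy as np

def l2_normalize(a, axis=-1):
    return a / np.linalg.norm(a, axis=axis, keepdims=True)

def minimax_entropy_loss(features, prototypes, temperature=0.05):
    """Mean entropy of class probabilities obtained from cosine
    similarity between (unlabeled) target features and class
    prototypes. In adversarial training this quantity is maximized
    w.r.t. the prototype-based classifier and minimized w.r.t. the
    feature extractor."""
    f = l2_normalize(features)                  # (N, D)
    p = l2_normalize(prototypes)                # (C, D)
    logits = f @ p.T / temperature              # cosine similarity / T
    logits -= logits.max(axis=1, keepdims=True) # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return ent.mean()

rng = np.random.default_rng(0)
protos = rng.normal(size=(4, 16))                    # 4 toy classes
near = protos[0] + 0.01 * rng.normal(size=(8, 16))   # confident samples
far = rng.normal(size=(8, 16))                       # ambiguous samples
loss_near = minimax_entropy_loss(near, protos)
loss_far = minimax_entropy_loss(far, protos)
print(loss_near, loss_far)
```

Samples clustered around a prototype yield near-zero entropy, while ambiguous samples yield higher entropy, which is exactly the signal the two players of the minimax game push in opposite directions.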
14. Zhao S, Li B, Xu P, Yue X, Ding G, Keutzer K. MADAN: Multi-source Adversarial Domain Aggregation Network for Domain Adaptation. Int J Comput Vis 2021. [DOI: 10.1007/s11263-021-01479-3]
15. Yang K, Lu J, Wan W, Zhang G. Multi-source transfer regression via source-target pairwise segment. Inf Sci 2021. [DOI: 10.1016/j.ins.2020.09.074]
16. Liu ZG, Huang LQ, Zhou K, Denoeux T. Combination of Transferable Classification With Multisource Domain Adaptation Based on Evidential Reasoning. IEEE Trans Neural Netw Learn Syst 2021;32:2015-2029. [PMID: 32497012] [DOI: 10.1109/tnnls.2020.2995862]
Abstract
In applications of domain adaptation, there may exist multiple source domains, which can provide more or less complementary knowledge for pattern classification in the target domain. To improve classification accuracy, a decision-level combination method is proposed for multisource domain adaptation based on evidential reasoning. The classification results obtained from different source domains usually have different reliabilities/weights, which are calculated according to domain consistency. The multiple classification results are therefore discounted by the corresponding weights under the belief functions framework, and Dempster's rule is then employed to combine these discounted results. To reduce errors, a neighborhood-based cautious decision-making rule is developed to make the class decision based on the combination result. An object is assigned to a singleton class if its neighbors can be (almost) correctly classified; otherwise, it is cautiously committed to the disjunction of several possible classes. By doing this, we can well characterize the partial imprecision of classification and reduce the error risk. A unified utility value is defined to reflect the benefit of such a classification; the cautious decision-making rule achieves the maximum unified utility value because partial imprecision is considered better than an error. Several real data sets are used to test the performance of the proposed method, and the experimental results show that the new method effectively improves classification accuracy compared with other related combination methods.
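The combination scheme described above (reliability discounting followed by Dempster's rule) can be sketched over a small frame of discernment. In the pure-Python sketch below, the mass values and reliability weights are invented for illustration; in the paper the weights come from domain consistency:

```python
from itertools import product

def discount(m, alpha):
    """Shafer discounting: scale each mass by reliability alpha and
    move the remaining 1 - alpha to total ignorance (here taken as
    the union of the focal sets, assumed to cover the frame)."""
    frame = frozenset().union(*m)
    d = {A: alpha * v for A, v in m.items()}
    d[frame] = d.get(frame, 0.0) + (1.0 - alpha)
    return d

def dempster(m1, m2):
    """Dempster's rule: conjunctive combination of two mass
    functions, normalized by the conflict mass."""
    combined = {}
    conflict = 0.0
    for (A, v1), (B, v2) in product(m1.items(), m2.items()):
        C = A & B
        if C:
            combined[C] = combined.get(C, 0.0) + v1 * v2
        else:
            conflict += v1 * v2
    k = 1.0 - conflict
    return {A: v / k for A, v in combined.items()}

a, b = frozenset("a"), frozenset("b")
# Two source-domain classifiers; the first is judged more reliable.
m1 = discount({a: 0.8, b: 0.2}, alpha=0.9)
m2 = discount({a: 0.4, b: 0.6}, alpha=0.5)
m = dempster(m1, m2)
print({tuple(sorted(A)): round(v, 3) for A, v in m.items()})
```

Mass left on the whole frame after discounting is what lets a low-reliability source abstain rather than pull the decision, which is the point of the weighting scheme described in the abstract.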
17. Gao P, Wu W, Li J. Multi-source fast transfer learning algorithm based on support vector machine. Appl Intell 2021;51:8451-8465. [PMID: 34764591] [PMCID: PMC8023540] [DOI: 10.1007/s10489-021-02194-9]
Abstract
In transfer learning, knowledge from a source domain can support training and classification tasks in a target domain that has few available data sets. For the situation where the target domain contains only a small number of unlabeled data sets while multiple source domains contain large numbers of labeled data sets, a new Multi-source Fast Transfer Learning algorithm based on the support vector machine (MultiFTLSVM) is proposed in this paper. Following the idea of multi-source transfer learning, knowledge from several source domains is used to train the target-domain learning task and improve the classification effect. At the same time, a representative subset of the source-domain data is used to speed up training and improve the efficiency of the algorithm. Experimental results on several real data sets show the effectiveness of MultiFTLSVM, which also has certain advantages over the benchmark algorithms.
Affiliation(s)
- Peng Gao, College of Computer Science and Technology, Harbin Engineering University, Harbin, China; Technology Development Center, Heilongjiang Broadcasting Station, Harbin, China
- Weifei Wu, College of Computer Science and Technology, Harbin Engineering University, Harbin, China
- Jingmei Li, College of Computer Science and Technology, Harbin Engineering University, Harbin, China
18. Zhao W, Xu C, Guan Z, Liu Y. Multiview Concept Learning Via Deep Matrix Factorization. IEEE Trans Neural Netw Learn Syst 2021;32:814-825. [PMID: 32275617] [DOI: 10.1109/tnnls.2020.2979532]
Abstract
Multiview representation learning (MVRL) leverages information from multiple views to obtain a common representation summarizing the consistency and complementarity in multiview data. Most previous matrix factorization-based MVRL methods are shallow models that neglect the complex hierarchical information. The recently proposed deep multiview factorization models cannot explicitly capture consistency and complementarity in multiview data. We present the deep multiview concept learning (DMCL) method, which hierarchically factorizes the multiview data, and tries to explicitly model consistent and complementary information and capture semantic structures at the highest abstraction level. We explore two variants of the DMCL framework, DMCL-L and DMCL-N, with respectively linear/nonlinear transformations between adjacent layers. We propose two block coordinate descent-based optimization methods for DMCL-L and DMCL-N. We verify the effectiveness of DMCL on three real-world data sets for both clustering and classification tasks.
19. Wang D, Lu C, Wu J, Liu H, Zhang W, Zhuang F, Zhang H. Softly Associative Transfer Learning for Cross-Domain Classification. IEEE Trans Cybern 2020;50:4709-4721. [PMID: 30703057] [DOI: 10.1109/tcyb.2019.2891577]
Abstract
The main challenge of cross-domain text classification is to train a classifier in a source domain and apply it to a different target domain. Many transfer learning-based algorithms, for example, dual transfer learning and triplex transfer learning, have been proposed for cross-domain classification by detecting a shared low-dimensional feature representation for both the source and target domains. These methods, however, often assume that the word cluster matrices or the cluster association matrices serving as knowledge-transfer bridges are exactly the same across domains, which is unrealistic in real-world applications and can therefore degrade classification performance. In light of this, we propose a softly associative transfer learning algorithm for cross-domain text classification. Specifically, we integrate two non-negative matrix tri-factorizations into a joint optimization framework, with approximate constraints on both the word cluster matrices and the cluster association matrices so as to allow proper diversity in knowledge transfer, and with another approximate constraint on the class labels in source domains in order to handle noisy labels. An iterative algorithm is then proposed to solve the above problem, with its convergence verified theoretically and empirically. Extensive experimental results on various text datasets demonstrate the effectiveness of our algorithm, even in the presence of abundant state-of-the-art competitors.
20. Chen H, Li Y, Su D. Discriminative Cross-Modal Transfer Learning and Densely Cross-Level Feedback Fusion for RGB-D Salient Object Detection. IEEE Trans Cybern 2020;50:4808-4820. [PMID: 31484153] [DOI: 10.1109/tcyb.2019.2934986]
Abstract
This article addresses two key issues in RGB-D salient object detection based on convolutional neural networks (CNNs): 1) how to bridge the gap between the "data-hungry" nature of CNNs and the insufficient labeled training data in the depth modality, and 2) how to take full advantage of the complementary information between the two modalities. To solve the first problem, we model depth-induced saliency detection as a CNN-based cross-modal transfer learning problem. Instead of directly adopting the RGB CNN as initialization, we additionally train a modality classification network (MCNet) to encourage discriminative modality-specific representations by minimizing the modality classification loss. To solve the second problem, we propose a densely cross-level feedback topology, in which the cross-modal complements are combined at each level and then densely fed back to all shallower layers for sufficient cross-level interaction. Compared to traditional two-stream frameworks, the proposed one can better explore, select, and fuse cross-modal, cross-level complements. Experiments show significant and consistent improvements of the proposed CNN framework over other state-of-the-art methods.
21. Yi C, Xu Y, Yu H, Yan Y, Liu Y. Multi-component transfer metric learning for handling unrelated source domain samples. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106132]
|
22
|
A Novel Digital Modulation Recognition Algorithm Based on Deep Convolutional Neural Network. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10031166] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The modulation recognition of digital signals under non-cooperative conditions is an important research topic. With the rapid development of artificial intelligence, deep learning is increasingly applied to modulation recognition. This paper proposes a novel digital signal modulation recognition algorithm, InceptionResnetV2-TA, which combines the InceptionResNetV2 network with transfer adaptation. First, the received signal is preprocessed to generate a constellation diagram. The constellation diagram is then used as the input of the InceptionResNetV2 network to identify different kinds of signals; transfer adaptation is used for feature extraction, and an SVM classifier identifies the modulation mode of the digital signal. Constellation diagrams of three typical signals, Binary Phase Shift Keying (BPSK), Quadrature Phase Shift Keying (QPSK), and 8 Phase Shift Keying (8PSK), were generated for the experiments. At a signal-to-noise ratio (SNR) of 4 dB, InceptionResnetV2-TA achieves recognition rates of 1.0, 0.9966, and 0.9633 for BPSK, QPSK, and 8PSK, respectively, about 3% higher than competing algorithms. Compared with traditional modulation recognition algorithms, the experimental results show that the proposed algorithm achieves higher accuracy for digital signal modulation recognition at low SNR.
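The preprocessing step, producing a noisy constellation diagram from received symbols, can be sketched as a generic M-PSK simulation under an AWGN channel assumption; the function below is illustrative and is not the paper's actual signal chain.

```python
import numpy as np

def psk_constellation(order, n_symbols, snr_db, seed=None):
    """Simulate noisy M-PSK symbols (BPSK: order=2, QPSK: 4, 8PSK: 8).

    Plotting the returned complex I/Q samples on the I/Q plane yields the
    kind of constellation diagram fed to the CNN.
    """
    rng = np.random.default_rng(seed)
    symbols = rng.integers(0, order, size=n_symbols)
    # Unit-energy PSK points spaced evenly on the unit circle.
    clean = np.exp(1j * 2 * np.pi * symbols / order)
    # Complex AWGN scaled so signal power / noise power equals the linear SNR.
    noise_power = 10 ** (-snr_db / 10)
    noise = np.sqrt(noise_power / 2) * (rng.standard_normal(n_symbols)
                                        + 1j * rng.standard_normal(n_symbols))
    return clean + noise
```

At 4 dB SNR the noise power is about 0.4 of the signal power, which is why the 8PSK points (closest together on the circle) are the hardest to separate.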
Collapse
|
23
|
Jiang S, Mao H, Ding Z, Fu Y. Deep Decision Tree Transfer Boosting. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:383-395. [PMID: 30932853 DOI: 10.1109/tnnls.2019.2901273] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Instance transfer approaches consider source and target data together during training and borrow examples from the source domain to augment the training data when labels in the target domain are limited or absent. Among them, boosting-based transfer learning methods (e.g., TrAdaBoost) are the most widely used. When dealing with more complex data, we may consider more complex hypotheses (e.g., a decision tree with deeper layers). However, with a fixed, high hypothesis complexity, TrAdaBoost and its variants may overfit. Even worse, in the transfer learning scenario, a decision tree with deep layers may overfit the differently distributed data in the source domain. In this paper, we propose a new instance transfer learning method, Deep Decision Tree Transfer Boosting (DTrBoost), in which weights are learned and assigned to base learners by minimizing data-dependent learning bounds across both source and target domains in terms of Rademacher complexities. This guarantees that we can learn decision trees with deep layers without overfitting. The theoretical proof and experimental results demonstrate the effectiveness of the proposed method.
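The TrAdaBoost-style instance reweighting that DTrBoost builds on can be sketched as follows. This is a simplified, illustrative update rule, not the authors' exact DTrBoost procedure (which additionally minimizes Rademacher-complexity-based bounds): misclassified source instances are down-weighted as presumably less relevant to the target, while misclassified target instances are up-weighted as in ordinary boosting.

```python
import numpy as np

def tradaboost_weight_update(w_src, w_tgt, err_src, err_tgt, n_rounds):
    """One round of TrAdaBoost-style reweighting (sketch).

    w_src, w_tgt     : current instance weights (1-D arrays)
    err_src, err_tgt : boolean arrays, True where the base learner erred
    n_rounds         : total number of boosting rounds
    """
    # Weighted error rate measured on the target domain only.
    eps = float(np.sum(w_tgt * err_tgt) / np.sum(w_tgt))
    eps = min(eps, 0.499)  # keep the target factor well-defined
    beta_t = eps / (1.0 - eps)                                   # target factor < 1
    beta_s = 1.0 / (1.0 + np.sqrt(2.0 * np.log(len(w_src)) / n_rounds))
    # Source mistakes shrink the weight; target mistakes grow it.
    w_src_new = w_src * beta_s ** err_src.astype(float)
    w_tgt_new = w_tgt * beta_t ** (-err_tgt.astype(float))
    return w_src_new, w_tgt_new
```

Over many rounds, source instances that repeatedly disagree with the target concept are driven toward zero weight, which is the mechanism by which boosting-based instance transfer filters out unhelpful source data.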
Collapse
|
24
|
Ding Z, Shao M, Fu Y. Generative Zero-Shot Learning via Low-Rank Embedded Semantic Dictionary. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019; 41:2861-2874. [PMID: 30176581 DOI: 10.1109/tpami.2018.2867870] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Zero-shot learning for visual recognition, which identifies unseen categories through a shared visual-semantic function that is learned on the seen categories and is expected to adapt well to unseen ones, has recently received considerable research attention. However, the semantic gap between discriminative visual features and their underlying semantics remains the biggest obstacle, because domain disparity usually exists across the seen and unseen classes. To deal with this challenge, we design two-stage generative adversarial networks that enhance the generalizability of the semantic dictionary through low-rank embedding for zero-shot learning. In detail, we formulate a novel framework that simultaneously seeks a two-stage generative model and a semantic dictionary to connect visual features with their semantics under a low-rank embedding. Our first-stage generative model augments additional semantic features for the unseen classes, which are then used in the second stage to generate more discriminative visual features, expanding the seen visual feature space. We are therefore able to seek a better semantic dictionary to constitute the latent basis for the unseen classes based on the augmented semantic and visual data. Finally, our approach captures a variety of visual characteristics from the seen classes that are "ready-to-use" for new classes. Extensive experiments on four zero-shot benchmarks demonstrate that our proposed algorithm outperforms state-of-the-art zero-shot algorithms.
Collapse
|
25
|
Li J, Wu W, Xue D, Gao P. Multi-Source Deep Transfer Neural Network Algorithm. SENSORS 2019; 19:s19183992. [PMID: 31527437 PMCID: PMC6767847 DOI: 10.3390/s19183992] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/26/2019] [Revised: 09/08/2019] [Accepted: 09/12/2019] [Indexed: 12/11/2022]
Abstract
Transfer learning can enhance the classification performance of a target domain with insufficient training data by utilizing knowledge related to the target domain from a source domain. Nowadays, two or more source domains are often available for knowledge transfer, which can improve the performance of learning tasks in the target domain. However, classification performance in the target domain degrades when the probability distributions are mismatched. Recent studies have shown that deep learning can build deep structures that extract more effective features to resist this mismatch. In this paper, we propose a new multi-source deep transfer neural network algorithm, MultiDTNN, based on convolutional neural networks and multi-source transfer learning. In MultiDTNN, joint probability distribution adaptation (JPDA) is used to reduce the mismatch between the source and target domains and thus enhance the transferability of source-domain features in deep neural networks. Then, a convolutional neural network is trained on the datasets of each source domain and the target domain to obtain a set of classifiers. Finally, the designed selection strategy picks the classifier with the smallest classification error on the target domain from this set to assemble the MultiDTNN framework. The effectiveness of the proposed MultiDTNN is verified by comparing it with other state-of-the-art deep transfer learning methods on three datasets.
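The final selection step can be sketched as an argmin over target-domain error across the per-source classifiers; the function below is a hypothetical illustration of that strategy, not the paper's implementation, and assumes a small labeled target set is available for scoring.

```python
import numpy as np

def select_best_transfer_model(models, X_tgt, y_tgt):
    """Pick the per-source-domain classifier with the smallest error on
    labeled target data (sketch of the MultiDTNN selection strategy).

    models : list of callables mapping a sample matrix to predicted labels
    """
    errors = [float(np.mean(m(X_tgt) != y_tgt)) for m in models]
    best = int(np.argmin(errors))  # index of the chosen source-domain model
    return best, errors
```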
Collapse
Affiliation(s)
- Jingmei Li
- College of Computer Science and Technology, Harbin Engineering University, No.145 Nantong Street, Harbin 150001, China.
| | - Weifei Wu
- College of Computer Science and Technology, Harbin Engineering University, No.145 Nantong Street, Harbin 150001, China.
| | - Di Xue
- College of Computer Science and Technology, Harbin Engineering University, No.145 Nantong Street, Harbin 150001, China.
| | - Peng Gao
- College of Computer Science and Technology, Harbin Engineering University, No.145 Nantong Street, Harbin 150001, China.
| |
Collapse
|
26
|
Ding Z, Fu Y. Deep Transfer Low-Rank Coding for Cross-Domain Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:1768-1779. [PMID: 30371396 DOI: 10.1109/tnnls.2018.2874567] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Transfer learning has attracted great attention for facilitating sparsely labeled or unlabeled target learning by leveraging a previously well-established source domain through knowledge transfer. Recent work on transfer learning builds deep architectures to better fight off cross-domain divergences by extracting more effective features. However, generalizability decreases greatly as the domain mismatch enlarges, particularly at the top layers. In this paper, we develop a novel deep transfer low-rank coding scheme based on deep convolutional neural networks, in which we investigate multilayer low-rank coding at the top task-specific layers. Specifically, multilayer common dictionaries shared across the two domains are obtained to bridge the domain gap, so that more enriched domain-invariant knowledge can be captured in a layerwise fashion. With rank minimization on the new codings, our model preserves the global structures across source and target, so similar samples from the two domains tend to gather together for effective knowledge transfer. Furthermore, domain-wise and class-wise adaptation terms are integrated to guide the coding optimization in a semisupervised manner, alleviating the marginal and conditional disparities of the two domains. Experimental results on three visual domain adaptation benchmarks verify the effectiveness of our proposed approach in boosting recognition performance in the target domain, in comparison with other state-of-the-art deep transfer learning methods.
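Rank minimization terms like the one placed on the codings here are typically handled via the nuclear norm, whose proximal operator is singular value thresholding. The sketch below shows that generic operator only, under the standard convex-relaxation assumption; it is not the paper's full solver.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: prox of tau * nuclear norm.

    Shrinks every singular value of M by tau (clipping at zero), which
    drives the result toward lower rank.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # soft-threshold the spectrum
    return (U * s_shrunk) @ Vt
```

Applied iteratively inside a proximal-gradient loop, this is the standard way a low-rank constraint on coding matrices gets enforced in practice.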
Collapse
|
27
|
Li J, Lu K, Huang Z, Zhu L, Shen HT. Heterogeneous Domain Adaptation Through Progressive Alignment. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:1381-1391. [PMID: 30281489 DOI: 10.1109/tnnls.2018.2868854] [Citation(s) in RCA: 72] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
In real-world transfer learning tasks, especially in cross-modal applications, the source domain and the target domain often have different features and distributions, a setting well known as the heterogeneous domain adaptation (HDA) problem. Yet, owing to the difficulty of HDA, existing methods focus on either alleviating the feature discrepancy or mitigating the distribution divergence; in fact, optimizing one can reinforce the other. In this paper, we propose a novel HDA method that optimizes both the feature discrepancy and the distribution divergence in a unified objective function. Specifically, we present progressive alignment, which first learns a new transferable feature space by dictionary-sharing coding and then aligns the distribution gaps on the new space. Unlike previous HDA methods that are limited to specific scenarios, our approach can handle diverse features with arbitrary dimensions. Extensive experiments on various transfer learning tasks, such as image classification, text categorization, and text-to-image recognition, verify the superiority of our method against several state-of-the-art approaches.
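As a minimal illustration of "aligning the distribution gaps on the new space": the squared maximum mean discrepancy (MMD) with a linear kernel reduces to the squared distance between the domain means in the shared feature space. The paper's actual alignment term may differ; this is only the simplest such measure.

```python
import numpy as np

def linear_mmd2(Xs, Xt):
    """Squared MMD with a linear kernel: distance between domain means.

    Xs, Xt : (n_samples, n_features) source and target data, already mapped
             into the shared feature space.
    """
    return float(np.sum((Xs.mean(axis=0) - Xt.mean(axis=0)) ** 2))
```

Minimizing such a term with respect to the learned projections pulls the two domains' distributions together after the feature discrepancy has been resolved.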
Collapse
|
28
|
Cao W, Wu S, Yu Z, Wong HS. Exploring Correlations Among Tasks, Clusters, and Features for Multitask Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:355-368. [PMID: 29994135 DOI: 10.1109/tnnls.2018.2839114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Multitask clustering methods improve the performance of related tasks concurrently by exploring the relationships among tasks via a shared coefficient matrix or a shared feature matrix. However, the divergent effects of individual features on this relationship are seldom considered. To further improve performance, we propose a new multitask clustering approach that explores correlations among tasks, clusters, and features based on the effects of features on clusters. First, a Feature-Cluster (FeaCluster) matrix is introduced to capture the similarity and the distinct task-feature information simultaneously for each task. From the FeaCluster matrix, two affinities are calculated to constitute the interdependencies among tasks: the first is a graphical affinity based on feature-task and task-cluster correlations, while the second is a reconstructive affinity. Here, the feature-task correlation considers the effects of features on tasks, and the task-cluster correlation considers the overall effects of features on clusters. The reconstructive affinity is obtained by minimizing the reconstruction error when representing the FeaCluster matrix of a given task as a linear combination of the others. These interdependencies allow transferring asymmetrically shared information, exploring significant features, and preserving key information when mapping data into the subspace. Experimental results on multiple datasets reveal that the proposed approach outperforms state-of-the-art clustering methods in terms of accuracy and normalized mutual information.
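The reconstructive affinity, representing one task's FeaCluster matrix as a linear combination of the other tasks' matrices, can be sketched as a per-task least-squares problem. The function below is an illustrative simplification (no regularization, plain least squares), not the authors' optimization.

```python
import numpy as np

def reconstructive_affinity(feacluster_mats):
    """Affinity A[t, s]: coefficient of task s when reconstructing task t's
    flattened FeaCluster matrix from all other tasks' matrices (sketch)."""
    F = np.stack([m.ravel() for m in feacluster_mats])  # tasks x entries
    T = len(F)
    A = np.zeros((T, T))
    for t in range(T):
        others = np.delete(np.arange(T), t)
        # Least-squares reconstruction of task t from the remaining tasks.
        coef, *_ = np.linalg.lstsq(F[others].T, F[t], rcond=None)
        A[t, others] = coef
    return A
```

Note the resulting affinity is asymmetric (A[t, s] need not equal A[s, t]), which matches the asymmetric sharing the abstract describes.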
Collapse
|
29
|
Ding Z, Nasrabadi NM, Fu Y. Semi-supervised Deep Domain Adaptation via Coupled Neural Networks. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:5214-5224. [PMID: 29994676 DOI: 10.1109/tip.2018.2851067] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Domain adaptation is a promising technique for addressing target data with limited or no labels by borrowing well-labeled knowledge from auxiliary source data. Recently, researchers have exploited multi-layer structures for discriminative feature learning to reduce the domain discrepancy. However, there have been limited research efforts on simultaneously building a deep structure and a discriminative classifier over both the labeled source and the unlabeled target. In this paper, we propose a semi-supervised deep domain adaptation framework in which the multi-layer feature extractor and a multi-class classifier are jointly learned to benefit from each other. Specifically, we develop a novel semi-supervised class-wise adaptation scheme that fights off the conditional distribution mismatch between the two domains by assigning a probabilistic label to each target sample, i.e., multiple class labels with different probabilities. Furthermore, the multi-class classifier is simultaneously trained on labeled source and unlabeled target samples in a semi-supervised fashion. In this way, the deep structure can alleviate the domain divergence and enhance feature transferability. Experimental evaluations on several standard cross-domain benchmarks verify the superiority of our proposed approach.
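Assigning each unlabeled target sample a probabilistic label, multiple class labels with different probabilities, can be sketched as a softmax over classifier scores. The temperature parameter and this particular rule are assumptions for illustration, not the authors' exact class-wise adaptation.

```python
import numpy as np

def soft_labels(scores, temperature=1.0):
    """Turn per-class scores (n_samples x n_classes) into probabilistic
    labels via a temperature-scaled softmax (illustrative sketch)."""
    z = scores / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)
```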
Collapse
|
30
|
Ding Z, Fu Y. Robust Multiview Data Analysis Through Collective Low-Rank Subspace. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:1986-1997. [PMID: 28436903 DOI: 10.1109/tnnls.2017.2690970] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Multiview data are abundant in real-world applications, since various viewpoints and multiple sensors can represent the data in complementary ways. Conventional multiview learning methods aim to learn multiple view-specific transformations while assuming that the view information of the training and test data is available in advance. However, they fail when there is no prior knowledge of the probe data's view information, since the correct view-specific projections cannot be applied to extract effective feature representations. In this paper, we develop a collective low-rank subspace (CLRS) algorithm to deal with this problem in multiview data analysis. CLRS attempts to reduce the semantic gap across multiple views by seeking a view-free low-rank projection shared by the multiple view-specific transformations. Moreover, we exploit low-rank reconstruction to build a bridge between the view-specific features and the view-free features transformed with the CLRS. Furthermore, a supervised cross-view regularizer is developed to couple the within-class data across different views, making the learned collective subspace more discriminative. This makes our algorithm more flexible in the challenging setting without any prior knowledge of the probe data's view information. To that end, two different experimental settings on several multiview benchmarks are designed to evaluate the proposed approach. Experimental results verify the effectiveness of our proposed method in comparison with state-of-the-art algorithms.
Collapse
|
31
|
Kong Y, Shao M, Li K, Fu Y. Probabilistic Low-Rank Multitask Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:670-680. [PMID: 28060715 DOI: 10.1109/tnnls.2016.2641160] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
In this paper, we consider the problem of learning multiple related tasks simultaneously with the goal of improving the generalization performance of the individual tasks. The key challenge is to effectively exploit the information shared across multiple tasks while preserving the discriminative information of each individual task. To address this, we propose a novel probabilistic model for multitask learning (MTL) that can automatically balance low-rank and sparsity constraints. The former assumes a low-rank structure of the underlying predictive hypothesis space to explicitly capture the relationships among different tasks, while the latter learns the incoherent sparse patterns private to each task. We derive and perform inference via variational Bayesian methods. Experimental results on both regression and classification tasks from real-world applications demonstrate the effectiveness of the proposed method in dealing with MTL problems.
Collapse
|