1
Lu Z, Liu J, Xu M. Identity Model Transformation for boosting performance and efficiency in object detection network. Neural Netw 2025; 184:107098. [PMID: 39778291] [DOI: 10.1016/j.neunet.2024.107098]
Abstract
Modifying the structure of an existing network is a common way to further improve its performance. However, modifying some layers in a network often results in a mismatch with the pre-trained weights, and the subsequent fine-tuning process is time-consuming and resource-inefficient. To address this issue, we propose a novel technique called Identity Model Transformation (IMT), which keeps the outputs before and after the transformation equal through rigorous algebraic transformations. This approach ensures that the original model's performance is preserved when layers are modified. Additionally, IMT significantly reduces the total training time required to achieve optimal results while further enhancing network performance. IMT establishes a bridge for rapid transformation between model architectures, enabling a model to quickly perform analytic continuation and derive a family of tree-like models with better performance. This model family possesses greater potential for optimization improvements than a single model. Extensive experiments across various object detection tasks validated the effectiveness and efficiency of the proposed IMT solution, which saved 94.76% of the time needed to fine-tune the basic model YOLOv4-Rot on the DOTA 1.5 dataset; using the IMT method, we observed stable performance improvements of 9.89%, 6.94%, 2.36%, and 4.86% on the four datasets AI-TOD, DOTA1.5, COCO 2017, and MRSAText, respectively.
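The core idea of output-preserving model modification can be illustrated with a function-preserving layer insertion. The sketch below is not the authors' IMT algebra; it is a minimal PyTorch example (assuming a plain convolutional backbone) of inserting a new 3x3 convolution initialized as an identity mapping, so the network's outputs, and therefore its pre-trained behavior, are unchanged before fine-tuning.

```python
import torch
import torch.nn as nn

def insert_identity_conv(channels: int) -> nn.Conv2d:
    """Create a 3x3 conv that initially acts as an identity mapping.

    Inserting such a layer leaves the network's outputs numerically
    unchanged, so the existing pre-trained weights still match; the new
    layer is then free to learn during a (much shorter) fine-tuning.
    """
    conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=True)
    nn.init.dirac_(conv.weight)   # delta kernel -> identity convolution
    nn.init.zeros_(conv.bias)
    return conv

if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    backbone = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.ReLU())
    before = backbone(x)
    # "Transform" the model by appending an identity-initialized block.
    backbone_ext = nn.Sequential(backbone[0], backbone[1], insert_identity_conv(64))
    after = backbone_ext(x)
    print(torch.allclose(before, after, atol=1e-6))  # True: outputs preserved
```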
Affiliation(s)
- Zhongyuan Lu
- The State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China.
- Jin Liu
- The State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China.
- Miaozhong Xu
- The State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China.
2
Zhao Y, Li S, Zhang R, Liu CH, Cao W, Wang X, Tian S. Semantic Correlation Transfer for Heterogeneous Domain Adaptation. IEEE Trans Neural Netw Learn Syst 2025; 36:4233-4245. [PMID: 36006880] [DOI: 10.1109/tnnls.2022.3199619]
Abstract
Heterogeneous domain adaptation (HDA) is expected to achieve effective knowledge transfer from a label-rich source domain to a heterogeneous target domain with scarce labeled data. Most prior HDA methods strive to align the cross-domain feature distributions by learning domain-invariant representations without considering the intrinsic semantic correlations among categories, which inevitably results in suboptimal adaptation performance across domains. To address this issue, we propose a novel semantic correlation transfer (SCT) method for HDA, which not only matches the marginal and conditional distributions between domains to mitigate the large domain discrepancy, but also transfers the category correlation knowledge underlying the source domain to the target domain by maximizing the pairwise class similarity across source and target. Technically, the domainwise and classwise centroids (prototypes) are first computed and aligned according to the feature embeddings. Then, based on the derived classwise prototypes, we leverage the cosine similarity of each pair of classes in both domains to effectively transfer the supervised source semantic correlation knowledge among different categories to the target domain. As a result, feature transferability and category discriminability can be improved simultaneously during the adaptation process. Comprehensive experiments and ablation studies on standard HDA tasks, such as text-to-image, image-to-image, and text-to-text, have demonstrated the superiority of our proposed SCT against several state-of-the-art HDA methods.
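A minimal sketch of the prototype-similarity idea described above, not the published SCT objective: class prototypes are computed per domain and the C x C cosine-similarity matrices of the two domains are matched. The function names and the MSE matching loss are illustrative choices; in practice the target labels would be pseudo-labels.

```python
import torch
import torch.nn.functional as F

def class_prototypes(feats, labels, num_classes):
    """Mean feature embedding (prototype) per class; assumes every class appears."""
    return torch.stack([feats[labels == c].mean(dim=0) for c in range(num_classes)])

def semantic_correlation_loss(src_protos, tgt_protos):
    """Match the pairwise class-similarity structure across domains:
    the C x C cosine-similarity matrix over source prototypes should
    agree with the one over (pseudo-labeled) target prototypes."""
    src_sim = F.cosine_similarity(src_protos.unsqueeze(1), src_protos.unsqueeze(0), dim=-1)
    tgt_sim = F.cosine_similarity(tgt_protos.unsqueeze(1), tgt_protos.unsqueeze(0), dim=-1)
    return F.mse_loss(tgt_sim, src_sim)

# toy usage with random embeddings
C, d = 4, 16
src_f, src_y = torch.randn(40, d), torch.randint(0, C, (40,))
tgt_f, tgt_y = torch.randn(40, d), torch.randint(0, C, (40,))  # tgt_y would be pseudo-labels
loss = semantic_correlation_loss(class_prototypes(src_f, src_y, C),
                                 class_prototypes(tgt_f, tgt_y, C))
print(loss.item())
```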
3
Tian J, Saddik AE, Xu X, Li D, Cao Z, Shen HT. Intrinsic Consistency Preservation With Adaptively Reliable Samples for Source-Free Domain Adaptation. IEEE Trans Neural Netw Learn Syst 2025; 36:4738-4749. [PMID: 38379234] [DOI: 10.1109/tnnls.2024.3362948]
Abstract
Unsupervised domain adaptation (UDA) aims to alleviate the domain shift by transferring knowledge learned from a labeled source dataset to an unlabeled target domain. Although UDA has seen promising progress recently, it requires access to data from both domains, making it problematic in source-data-absent scenarios. In this article, we investigate a practical task, source-free domain adaptation (SFDA), which alleviates the limitation of the widely studied UDA that source and target data must be acquired simultaneously. In addition, we further study the imbalanced SFDA (ISFDA) problem, which addresses intra-domain class imbalance and inter-domain label shift in SFDA. We make two key observations in SFDA: 1) target data form clusters in the representation space regardless of whether the target data points are aligned with the source classifier and 2) target samples with higher classification confidence are more reliable and show less variation in their classification confidence during adaptation. Motivated by these observations, we propose a unified method, named intrinsic consistency preservation with adaptively reliable samples (ICPR), to jointly cope with SFDA and ISFDA. Specifically, ICPR first encourages intrinsic consistency in the predictions of neighbors for unlabeled samples under weak augmentation (standard flip-and-shift), regardless of their reliability. ICPR then generates strongly augmented views specifically for adaptively selected reliable samples and is trained to enforce the intrinsic consistency between weakly and strongly augmented views of the same image, with respect to both the predictions of their neighbors and their own. Additionally, we propose to use a prototype-like classifier to avoid the classification confusion caused by severe intra-domain class imbalance and inter-domain label shift. We demonstrate the effectiveness and general applicability of ICPR on six benchmarks covering both SFDA and ISFDA tasks. The reproducible code of our proposed ICPR method is available at https://github.com/CFM-MSG/Code_ICPR.
4
Liu R, Wen S, Xing Y. An integrated approach for advanced vehicle classification. PLoS One 2025; 20:e0318530. [PMID: 39965022] [PMCID: PMC11835343] [DOI: 10.1371/journal.pone.0318530]
Abstract
This study addresses the trade-off between receptive field size and computational efficiency in low-level vision. Convolutional neural networks (CNNs) usually expand the receptive field by adding layers or by dilated filtering, which often leads to high computational costs. Although dilated filtering was introduced to reduce the computational burden, the resulting receptive field is only a sparse, checkerboard-like sampling of the input image due to the gridding effect. To better trade off receptive field size against computational efficiency, a new multilevel discrete wavelet CNN model (DWAN) is proposed in this paper. DWAN introduces a four-level discrete wavelet transform into the convolutional neural network architecture and combines it with the Convolutional Block Attention Module (CBAM) to efficiently capture multiscale feature information. By reducing the size of the feature maps in the contracting subnetwork, DWAN achieves wider receptive field coverage while maintaining a smaller computational cost, thus improving the performance and efficiency of visual tasks. In addition, this paper validates the DWAN model on an image classification task targeting fine-grained categories of automobiles. Significant performance gains are observed when training and testing the DWAN architecture that includes CBAM. The DWAN model can identify and accurately classify subtle features and differences in automotive images, yielding better classification results for fine-grained automotive categories. This result further demonstrates the effectiveness and robustness of the DWAN model in vision tasks and lays a solid foundation for generalizing it to practical applications.
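The wavelet component can be made concrete with a single-level Haar transform used as a lossless, receptive-field-friendly downsampling block. This is only a sketch of the generic idea (the four-level transform, CBAM, and the rest of DWAN are omitted); the class name and normalization are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HaarDWT(nn.Module):
    """One level of a 2-D Haar wavelet transform, applied per channel.

    Halves the spatial resolution (like pooling) while keeping all the
    information in four sub-bands (LL, LH, HL, HH): the kind of lossless
    down-sampling a wavelet CNN builds on.
    """
    def __init__(self):
        super().__init__()
        ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
        lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])
        hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])
        hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
        self.register_buffer("kernels", torch.stack([ll, lh, hl, hh]).unsqueeze(1))  # (4,1,2,2)

    def forward(self, x):                                  # x: (N, C, H, W)
        n, c, h, w = x.shape
        x = x.reshape(n * c, 1, h, w)
        out = F.conv2d(x, self.kernels, stride=2)          # (N*C, 4, H/2, W/2)
        return out.reshape(n, c * 4, h // 2, w // 2)       # sub-bands stacked on channels

if __name__ == "__main__":
    dwt = HaarDWT()
    y = dwt(torch.randn(1, 3, 64, 64))
    print(y.shape)   # torch.Size([1, 12, 32, 32])
```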
Affiliation(s)
- Rui Liu
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, Sichuan, China
- Shiyuan Wen
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, Sichuan, China
- Yufei Xing
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, Sichuan, China
5
Lee J, Kang E, Heo DW, Suk HI. Site-Invariant Meta-Modulation Learning for Multisite Autism Spectrum Disorders Diagnosis. IEEE Trans Neural Netw Learn Syst 2024; 35:18062-18075. [PMID: 37708014] [DOI: 10.1109/tnnls.2023.3311195]
Abstract
Large amounts of fMRI data are essential for building generalized predictive models for brain disease diagnosis. To conduct extensive data analysis, it is often necessary to gather data from multiple organizations. However, the site variation inherent in multisite resting-state functional magnetic resonance imaging (rs-fMRI) leads to unfavorable heterogeneity in data distribution, negatively impacting biomarker identification and diagnostic decisions. Several existing approaches alleviate this shift in domain distribution (i.e., the multisite problem). Statistical tuning schemes directly regress site disparity factors out of the data prior to model training; such methods are limited in that the data must be reprocessed, with the variance re-estimated, whenever a new site is added. In model adjustment approaches, domain adaptation (DA) methods adjust the features or models of the source domain according to the target domain during model training; the model parameters therefore inevitably need to be updated using samples from the target site, which greatly limits practical applicability. Meanwhile, domain generalization (DG) aims to create a universal model that can be quickly adapted to multiple domains. In this study, we propose a novel framework for disease diagnosis that alleviates the multisite problem by adaptively calibrating site-specific features into site-invariant features. Specifically, it applies directly to samples from unseen sites without the need for fine-tuning. With a learning-to-learn strategy that learns how to calibrate features under various domain-shift environments, our novel modulation mechanism extracts site-invariant features. In experiments on the Autism Brain Imaging Data Exchange (ABIDE I and II) dataset, we validated the generalization ability of the proposed network by improving diagnostic accuracy on both seen and unseen multisite samples.
6
Ma S, Yuan Z, Wu Q, Huang Y, Hu X, Leung CH, Wang D, Huang Z. Deep Into the Domain Shift: Transfer Learning Through Dependence Regularization. IEEE Trans Neural Netw Learn Syst 2024; 35:14409-14423. [PMID: 37279130] [DOI: 10.1109/tnnls.2023.3279099]
Abstract
Classical domain adaptation methods acquire transferability by regularizing the overall distributional discrepancies between features in the source domain (labeled) and features in the target domain (unlabeled). They often do not differentiate whether the domain differences come from the marginals or the dependence structures. In many business and financial applications, the labeling function usually has different sensitivities to the changes in the marginals versus changes in the dependence structures. Measuring the overall distributional differences will not be discriminative enough in acquiring transferability. Without the needed structural resolution, the learned transfer is less optimal. This article proposes a new domain adaptation approach in which one can measure the differences in the internal dependence structure separately from those in the marginals. By optimizing the relative weights among them, the new regularization strategy greatly relaxes the rigidness of the existing approaches. It allows a learning machine to pay special attention to places where the differences matter the most. Experiments on three real-world datasets show that the improvements are quite notable and robust compared to various benchmark domain adaptation models.
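One way to make the marginal-versus-dependence distinction concrete is to measure the two discrepancies separately and weight them, as sketched below. This is an illustrative decomposition (a per-dimension quantile gap for the marginals and a correlation-matrix gap as a crude stand-in for the dependence structure), not the estimator or copula machinery used in the paper; `lam` is a hypothetical trade-off weight.

```python
import torch

def marginal_gap(xs, xt):
    """Per-feature (marginal) discrepancy: compare sorted 1-D samples
    dimension by dimension, a simple quantile / 1-Wasserstein proxy."""
    n = min(len(xs), len(xt))
    qs = torch.sort(xs, dim=0).values[:n]
    qt = torch.sort(xt, dim=0).values[:n]
    return (qs - qt).abs().mean()

def dependence_gap(xs, xt):
    """Dependence-structure discrepancy: Frobenius distance between
    the two feature-correlation matrices."""
    cs = torch.corrcoef(xs.T)
    ct = torch.corrcoef(xt.T)
    return (cs - ct).norm()

def domain_gap(xs, xt, lam=0.5):
    """Weighted combination; lam trades off marginals vs dependence,
    in the spirit of the dependence-regularization idea above."""
    return lam * marginal_gap(xs, xt) + (1 - lam) * dependence_gap(xs, xt)

xs = torch.randn(500, 5)
A = torch.randn(5, 5)
xt = torch.randn(500, 5) @ A          # target whose dependence structure differs
print(domain_gap(xs, xt, lam=0.3).item())
```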
7
Xie P, Zhao X, He X. Improve the performance of CT-based pneumonia classification via source data reweighting. Sci Rep 2023; 13:9401. [PMID: 37296239] [PMCID: PMC10251339] [DOI: 10.1038/s41598-023-35938-3]
Abstract
Pneumonia is a life-threatening disease. Computed tomography (CT) imaging is broadly used for diagnosing pneumonia. To assist radiologists in accurately and efficiently detecting pneumonia from CT scans, many deep learning methods have been developed. These methods require large amounts of annotated CT scans, which are difficult to obtain due to privacy concerns and high annotation costs. To address this problem, we develop a three-level optimization-based method which leverages CT data from a source domain to mitigate the lack of labeled CT scans in a target domain. Our method automatically identifies and downweights low-quality source CT data examples that are noisy or have a large domain discrepancy with the target data, by minimizing the validation loss of a target model trained on the reweighted source data. On a target dataset with 2218 CT scans and a source dataset with 349 CT images, our method achieves an F1 score of 91.8% in detecting pneumonia and an F1 score of 92.4% in detecting other types of pneumonia, which are significantly better than those achieved by state-of-the-art baseline methods.
Affiliation(s)
- Pengtao Xie
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, USA.
- Xingchen Zhao
- Department of Electrical and Computer Engineering, Northeastern University, Boston, USA.
- Xuehai He
- Department of Computer Science and Engineering, University of California Santa Cruz, Santa Cruz, USA.
8
She Q, Shi X, Fang F, Ma Y, Zhang Y. Cross-subject EEG emotion recognition using multi-source domain manifold feature selection. Comput Biol Med 2023; 159:106860. [PMID: 37080005] [DOI: 10.1016/j.compbiomed.2023.106860]
Abstract
Recent research on emotion recognition suggests that domain adaptation, a form of transfer learning, can solve the cross-subject problem in the affective brain-computer interface (aBCI) field. However, traditional domain adaptation methods perform single-to-single domain transfer or simply merge different source domains into one larger domain to realize the transfer of knowledge, which results in negative transfer. In this study, a multi-source transfer learning framework is proposed to improve the performance of multi-source electroencephalogram (EEG) emotion recognition. The method first uses the data distribution similarity ranking (DDSA) method to select the appropriate source domain for each target domain offline, and reduces data drift between domains through manifold feature mapping on the Grassmann manifold. Meanwhile, the minimum-redundancy maximum-relevance algorithm (mRMR) is employed to select more representative manifold features; the conditional and marginal distribution discrepancies of the manifold features are minimized, and a domain-invariant classifier is then learned via structural risk minimization (SRM). Finally, a weighted fusion criterion is applied to further improve recognition performance. We compared our method with several state-of-the-art domain adaptation techniques on the SEED and DEAP datasets. Results showed that, compared with the conventional MEDA algorithm, the recognition accuracy of our proposed algorithm on the SEED and DEAP datasets improved by 6.74% and 5.34%, respectively. Moreover, compared with TCA, JDA, and other state-of-the-art algorithms, the performance of our proposed method also improved, with the best average accuracy of 86.59% on SEED and 64.40% on DEAP. Our results demonstrate that the proposed multi-source transfer learning framework is more effective and feasible than other state-of-the-art methods in recognizing different emotions by solving the cross-subject problem.
Affiliation(s)
- Qingshan She
- School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China.
- Xinsheng Shi
- School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China.
- Feng Fang
- Department of Biomedical Engineering, University of Houston, Houston, TX, 77204, USA.
- Yuliang Ma
- School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China.
- Yingchun Zhang
- Department of Biomedical Engineering, University of Houston, Houston, TX, 77204, USA.
9
Yao Y, Li X, Zhang Y, Ye Y. Multisource Heterogeneous Domain Adaptation With Conditional Weighting Adversarial Network. IEEE Trans Neural Netw Learn Syst 2023; 34:2079-2092. [PMID: 34487497] [DOI: 10.1109/tnnls.2021.3105868]
Abstract
Heterogeneous domain adaptation (HDA) tackles the learning of cross-domain samples with both different probability distributions and feature representations. Most of the existing HDA studies focus on the single-source scenario. In reality, however, it is not uncommon to obtain samples from multiple heterogeneous domains. In this article, we study the multisource HDA problem and propose a conditional weighting adversarial network (CWAN) to address it. The proposed CWAN adversarially learns a feature transformer, a label classifier, and a domain discriminator. To quantify the importance of different source domains, CWAN introduces a sophisticated conditional weighting scheme to calculate the weights of the source domains according to the conditional distribution divergence between the source and target domains. Different from existing weighting schemes, the proposed conditional weighting scheme not only weights the source domains but also implicitly aligns the conditional distributions during the optimization process. Experimental results clearly demonstrate that the proposed CWAN performs much better than several state-of-the-art methods on four real-world datasets.
10
Chen Y, Zhang H, Wang Y, Peng W, Zhang W, Wu QMJ, Yang Y. D-BIN: A Generalized Disentangling Batch Instance Normalization for Domain Adaptation. IEEE Trans Cybern 2023; 53:2151-2163. [PMID: 34546939] [DOI: 10.1109/tcyb.2021.3110128]
Abstract
Pattern recognition in real-world scenarios is significantly challenged by the variability of visual statistics. Most existing algorithms, which rely on the assumption that training and test data are independent and identically distributed, therefore suffer from poor generalization when inference is performed on unseen test datasets. Although numerous approaches, including domain discriminators and domain-invariant feature learning, have been proposed to alleviate this problem, their data-driven nature and the lack of interpretation of their underlying principles leave researchers and developers uncertain. This dilemma prompts us to rethink the essence of a network's generalization. The observation that visual patterns are no longer discriminative after style transfer inspires us to consider carefully the respective roles of style features and content features. Is the style information related to the domain bias? How can content and style features be effectively disentangled across domains? In this article, we first investigate the effect of feature normalization on domain adaptation. Based on it, we propose a novel normalization module, disentangling batch instance normalization (D-BIN), that adaptively leverages the information propagated through each channel and batch of features. In this module, we explicitly explore domain-specific and domain-invariant feature disentanglement. We employ contrastive learning to encourage images with the same semantics from different domains to have similar content representations while having dissimilar style representations. Furthermore, we construct both self-form and dual-form regularizers that preserve the mutual information (MI) between feature representations of the normalization layer, in order to compensate for the loss of discriminative information and to effectively match the distributions across domains. D-BIN and the constrained term can simply be plugged into state-of-the-art (SOTA) networks to improve their performance. Finally, experiments on domain adaptation and generalization conducted on different datasets demonstrate their effectiveness.
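The normalization idea can be grounded with a generic batch-instance normalization layer, i.e., a learnable per-channel mix of batch statistics and instance (style) statistics. The sketch below is that generic building block only; D-BIN's explicit content/style disentanglement, contrastive objective, and mutual-information regularizers are not reproduced here.

```python
import torch
import torch.nn as nn

class BatchInstanceNorm2d(nn.Module):
    """Per-channel convex mix of BatchNorm (content/domain statistics) and
    InstanceNorm (style statistics); the mixing weight rho is learned.
    A generic batch-instance normalization block of the kind D-BIN extends."""
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels, affine=False, eps=eps)
        self.inorm = nn.InstanceNorm2d(channels, affine=False, eps=eps)
        self.rho = nn.Parameter(torch.full((1, channels, 1, 1), 0.5))
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x):
        rho = self.rho.clamp(0.0, 1.0)
        out = rho * self.bn(x) + (1.0 - rho) * self.inorm(x)
        return out * self.gamma + self.beta

if __name__ == "__main__":
    layer = BatchInstanceNorm2d(16)
    print(layer(torch.randn(8, 16, 14, 14)).shape)
```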
11
Hayes N, Merkurjev E, Wei GW. Integrating transformer and autoencoder techniques with spectral graph algorithms for the prediction of scarcely labeled molecular data. Comput Biol Med 2023; 153:106479. [PMID: 36610214] [PMCID: PMC9868114] [DOI: 10.1016/j.compbiomed.2022.106479]
Abstract
In molecular and biological sciences, experiments are expensive, time-consuming, and often subject to ethical constraints. Consequently, one often faces the challenging task of predicting desirable properties from small data sets or scarcely-labeled data sets. Although transfer learning can be advantageous, it requires the existence of a related large data set. This work introduces three graph-based models incorporating Merriman-Bence-Osher (MBO) techniques to tackle this challenge. Specifically, graph-based modifications of the MBO scheme are integrated with state-of-the-art techniques, including a home-made transformer and an autoencoder, in order to deal with scarcely-labeled data sets. In addition, a consensus technique is detailed. The proposed models are validated using five benchmark data sets. We also provide a thorough comparison to other competing methods, such as support vector machines, random forests, and gradient boosting decision trees, which are known for their good performance on small data sets. The performances of various methods are analyzed using residue-similarity (R-S) scores and R-S indices. Extensive computational experiments and theoretical analysis show that the new models perform very well even when as little as 1% of the data set is used as labeled data.
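For readers unfamiliar with MBO schemes, the sketch below shows a bare-bones graph MBO iteration for scarcely labeled data: a few diffusion steps with a normalized graph Laplacian alternate with a thresholding step while the known labels stay clamped. It assumes a dense RBF similarity graph on toy 2-D data and omits the transformer/autoencoder features and the consensus step described above.

```python
import numpy as np

def graph_mbo(W, labels, mask, n_outer=10, inner=10, dt=0.1):
    """Minimal graph MBO scheme: diffuse the soft label matrix with the
    symmetric normalized Laplacian, then threshold to hard labels,
    clamping the few labeled points throughout."""
    n = W.shape[0]
    k = int(labels.max()) + 1
    dinv = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(n) - (dinv[:, None] * W) * dinv[None, :]   # I - D^{-1/2} W D^{-1/2}
    U = np.zeros((n, k))
    U[mask] = np.eye(k)[labels[mask]]                     # clamp known labels
    for _ in range(n_outer):
        for _ in range(inner):                            # diffusion: U <- U - dt * L U
            U = U - dt * (L @ U)
            U[mask] = np.eye(k)[labels[mask]]
        U = np.eye(k)[U.argmax(axis=1)]                   # threshold to nearest label
        U[mask] = np.eye(k)[labels[mask]]
    return U.argmax(axis=1)

# toy: two Gaussian blobs with only one labeled point per class
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
W = np.exp(-np.square(X[:, None] - X[None, :]).sum(-1) / 2.0)   # dense RBF graph
mask = np.zeros(200, dtype=bool); mask[[0, 100]] = True
print("accuracy:", (graph_mbo(W, y, mask) == y).mean())
```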
Affiliation(s)
- Nicole Hayes
- Department of Mathematics, Michigan State University, MI 48824, USA.
- Ekaterina Merkurjev
- Department of Mathematics, Michigan State University, MI 48824, USA; Department of Computational Mathematics, Science and Engineering, Michigan State University, MI 48824, USA.
- Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA.
12
Wang W, Li H, Ding Z, Nie F, Chen J, Dong X, Wang Z. Rethinking Maximum Mean Discrepancy for Visual Domain Adaptation. IEEE Trans Neural Netw Learn Syst 2023; 34:264-277. [PMID: 34242174] [DOI: 10.1109/tnnls.2021.3093468]
Abstract
Existing domain adaptation approaches often try to reduce the distribution difference between source and target domains while respecting domain-specific discriminative structures, using distribution distances [e.g., maximum mean discrepancy (MMD)] and discriminative distances (e.g., intra-class and inter-class distances). However, they usually consider these losses jointly and trade off their relative importance by estimating parameters empirically. Their relationships to each other remain insufficiently explored, so the losses cannot be manipulated correctly and model performance degrades. To this end, this article theoretically proves two essential facts: 1) minimizing the MMD is equivalent to jointly minimizing the data variance with some implicit weights while, respectively, maximizing the source and target intra-class distances, so that feature discriminability degrades; and 2) the relationship between the intra-class and inter-class distances is such that as one falls, the other rises. Based on this, we propose a novel discriminative MMD with two parallel strategies to correctly restrain the degradation of feature discriminability, or equivalently the expansion of the intra-class distance. Specifically: 1) we directly impose a tradeoff parameter on the intra-class distance that is implicit in the MMD according to fact 1); and 2) we reformulate the inter-class distance with special weights, analogous to the implicit ones in the MMD, so that maximizing it also causes the intra-class distance to fall according to fact 2). Notably, we do not combine the two strategies in one model because of fact 2). Experiments on several benchmark datasets not only verify our theoretical results but also demonstrate that the proposed approach performs substantially better than several state-of-the-art methods. Our preliminary MATLAB code will be available at https://github.com/WWLoveTransfer/.
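To make the quantity under discussion concrete, here is a plain multi-kernel RBF estimate of the squared MMD between two feature batches (a biased V-statistic estimator). It is not the discriminative MMD proposed in the paper, just the standard distance the analysis starts from.

```python
import torch

def rbf_mmd2(x, y, sigmas=(1.0, 2.0, 4.0)):
    """Squared MMD between samples x (n,d) and y (m,d) with a mixture of
    RBF kernels: MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    def kernel(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return sum(torch.exp(-d2 / (2 * s ** 2)) for s in sigmas)

    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

src = torch.randn(128, 32)
tgt = torch.randn(128, 32) + 0.5          # shifted target distribution
print(rbf_mmd2(src, tgt).item())
```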
13
Huang L, Fan J, Zhao W, You Y. A new multi-source Transfer Learning method based on Two-stage Weighted Fusion. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110233]
14
Tang C, He Z, Li Y, Lv J. Zero-Shot Learning via Structure-Aligned Generative Adversarial Network. IEEE Trans Neural Netw Learn Syst 2022; 33:6749-6762. [PMID: 34106859] [DOI: 10.1109/tnnls.2021.3083367]
Abstract
In this article, we propose a structure-aligned generative adversarial network framework to improve zero-shot learning (ZSL) by mitigating the semantic gap, domain shift, and hubness problems. The proposed framework contains two parts: a generative adversarial network with a softmax classifier, and a structure-aligned module. In the first part, the generative adversarial network generates pseudovisual features by guiding the generator and discriminator to play the minimax two-player game together, while the softmax classifier increases the inter-class distance and reduces the intra-class distance; the harmful effects of the domain shift and hubness problems can thereby be mitigated. In the second part, we introduce a structure-aligned module in which the structural consistency between the visual space and the semantic space is learned. By aligning the structure between the visual and semantic spaces, the semantic gap between them can be bridged. Classification performance improves when the structure-aligned visual-semantic embedding space is transferred to the unseen classes. Our framework reformulates ZSL as a standard fully supervised classification task using the pseudovisual features of unseen classes. Extensive experiments conducted on five benchmark datasets demonstrate that the proposed framework significantly outperforms state-of-the-art methods in both conventional and generalized settings.
15
Li J, Du Z, Zhu L, Ding Z, Lu K, Shen HT. Divergence-Agnostic Unsupervised Domain Adaptation by Adversarial Attacks. IEEE Trans Pattern Anal Mach Intell 2022; 44:8196-8211. [PMID: 34478362] [DOI: 10.1109/tpami.2021.3109287]
Abstract
Conventional machine learning algorithms suffer from the problem that a model trained on existing data fails to generalize well to data sampled from other distributions. To tackle this issue, unsupervised domain adaptation (UDA) transfers the knowledge learned from a well-labeled source domain to a different but related target domain where labeled data are unavailable. The majority of existing UDA methods assume that data from the source domain and the target domain are available and complete during training, so the divergence between the two domains can be formulated and minimized. In this paper, we consider a more practical yet challenging UDA setting where either the source domain data or the target domain data are unknown. Conventional UDA methods fail in this setting because the domain divergence is agnostic due to the absence of the source data or the target data. Technically, we investigate UDA from a novel view, adversarial attack, and tackle the divergence-agnostic adaptive learning problem in a unified framework. Specifically, we first motivate our approach by investigating the inherent relationship between UDA and adversarial attacks. We then elaborately design adversarial examples to attack the training model and harness these adversarial examples. We argue that the generalization ability of the model is significantly improved if it can defend against our attack, which in turn improves performance on the target domain. Theoretically, we analyze the generalization bound for our method based on domain adaptation theory. Extensive experimental results on multiple UDA benchmarks under conventional, source-absent, and target-absent UDA settings verify that our method achieves favorable performance compared with previous ones. Notably, this work extends the scope of both domain adaptation and adversarial attack, and is expected to inspire more ideas in the community.
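As a concrete (and much simpler) example of "designing adversarial examples to attack the training model", the sketch below generates FGSM perturbations in PyTorch. The model, epsilon, and loss are illustrative; the paper's divergence-agnostic attack/defense scheme is more elaborate than this.

```python
import torch
import torch.nn as nn

def fgsm_examples(model, x, y, eps=0.03):
    """Generate FGSM adversarial examples: perturb the input along the sign
    of the loss gradient with step size eps."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.detach()

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
x, y = torch.rand(16, 1, 28, 28), torch.randint(0, 10, (16,))
x_adv = fgsm_examples(model, x, y)
print((x_adv - x).abs().max().item())   # roughly eps
```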
16
Transferable feature filtration network for multi-source domain adaptation. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110113]
17
Shojaie M, Cabrerizo M, DeKosky ST, Vaillancourt DE, Loewenstein D, Duara R, Adjouadi M. A transfer learning approach based on gradient boosting machine for diagnosis of Alzheimer’s disease. Front Aging Neurosci 2022; 14:966883. [PMID: 36275004] [PMCID: PMC9581117] [DOI: 10.3389/fnagi.2022.966883]
Abstract
Early detection of Alzheimer’s disease (AD) during the Mild Cognitive Impairment (MCI) stage could enable effective intervention to slow down disease progression. Computer-aided diagnosis of AD relies on a sufficient amount of biomarker data. When this requirement is not fulfilled, transfer learning can be used to transfer knowledge from a source domain with more labeled data than is available in the desired target domain. In this study, an instance-based transfer learning framework is presented based on the gradient boosting machine (GBM). In GBM, a sequence of base learners is built, and each learner focuses on the errors (residuals) of the previous learner. In our transfer learning version of GBM (TrGB), a weighting mechanism based on the residuals of the base learners is defined for the source instances; consequently, instances whose distribution differs from the target data have a lower impact on the target learner. The proposed weighting scheme aims to transfer as much information as possible from the source domain while avoiding negative transfer. The target data in this study were obtained from the Mount Sinai dataset, which was collected and processed in a collaborative 5-year project at the Mount Sinai Medical Center. The Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset was used as the source domain. The experimental results showed that the proposed TrGB algorithm improved classification accuracy by 1.5% and 4.5% for CN vs. MCI and multiclass classification, respectively, compared to conventional methods. Also, using the TrGB model and knowledge transferred from the CN vs. AD classification of the source domain, the average score of early MCI vs. late MCI classification improved by 5%.
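The flavor of residual-based source reweighting inside a boosting loop can be sketched as follows. This is a simplified squared-loss toy, not the published TrGB algorithm: the weight-update rule, `beta`, and the tree settings are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def transfer_gbm(Xs, ys, Xt, yt, n_rounds=50, lr=0.1, beta=0.5):
    """Gradient-boosting loop on pooled source + target data where source
    instances whose residuals stay large are down-weighted each round, so
    poorly matching source data contributes less to later base learners."""
    X = np.vstack([Xs, Xt])
    y = np.concatenate([ys, yt])
    w = np.ones(len(y))
    ns = len(ys)
    pred = np.zeros(len(y))
    trees = []
    for _ in range(n_rounds):
        residual = y - pred
        tree = DecisionTreeRegressor(max_depth=3)
        tree.fit(X, residual, sample_weight=w)
        pred += lr * tree.predict(X)
        trees.append(tree)
        # shrink weights of source instances with large remaining error
        src_err = np.abs(y[:ns] - pred[:ns])
        w[:ns] *= np.exp(-beta * src_err / (src_err.mean() + 1e-12))
    return trees

# toy data: a source domain that only partly matches the target
rng = np.random.default_rng(0)
Xs = rng.normal(size=(200, 5)); ys = Xs @ rng.normal(size=5) + 0.5
Xt = rng.normal(size=(60, 5));  yt = Xt @ rng.normal(size=5)
model = transfer_gbm(Xs, ys, Xt, yt)
print(len(model), "boosting rounds fitted")
```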
Affiliation(s)
- Mehdi Shojaie
- Department of Electrical and Computer Engineering, Center for Advanced Technology and Education, Florida International University, Miami, FL, United States
- *Correspondence: Mehdi Shojaie
- Mercedes Cabrerizo
- Department of Electrical and Computer Engineering, Center for Advanced Technology and Education, Florida International University, Miami, FL, United States
- Steven T. DeKosky
- Fixel Institute for Neurological Disorders, University of Florida, Gainesville, FL, United States
- 1Florida ADRC (Alzheimer’s Disease Research Center), University of Florida, Gainesville, FL, United States
- David E. Vaillancourt
- Fixel Institute for Neurological Disorders, University of Florida, Gainesville, FL, United States
- 1Florida ADRC (Alzheimer’s Disease Research Center), University of Florida, Gainesville, FL, United States
- Department of Applied Physiology and Kinesiology, University of Florida, Gainesville, FL, United States
- David Loewenstein
- 1Florida ADRC (Alzheimer’s Disease Research Center), University of Florida, Gainesville, FL, United States
- Center for Cognitive Neuroscience and Aging, Miller School of Medicine, University of Miami, Miami, FL, United States
- Ranjan Duara
- 1Florida ADRC (Alzheimer’s Disease Research Center), University of Florida, Gainesville, FL, United States
- Wien Center for Alzheimer’s Disease & Memory Disorders, Mount Sinai Medical Center, Miami, FL, United States
- Malek Adjouadi
- Department of Electrical and Computer Engineering, Center for Advanced Technology and Education, Florida International University, Miami, FL, United States
- 1Florida ADRC (Alzheimer’s Disease Research Center), University of Florida, Gainesville, FL, United States
18
Zhou L, Ye M, Zhang D, Zhu C, Ji L. Prototype-Based Multisource Domain Adaptation. IEEE Trans Neural Netw Learn Syst 2022; 33:5308-5320. [PMID: 33852394] [DOI: 10.1109/tnnls.2021.3070085]
Abstract
Unsupervised domain adaptation aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Recently, multisource domain adaptation (MDA) has begun to attract attention; its performance should go beyond simply mixing all source domains together for knowledge transfer. In this article, we propose a novel prototype-based method for MDA. Specifically, because the target domain has no labels, we use prototypes to transfer the semantic category information from the source domains to the target domain. First, a feature extraction network is applied to both source and target domains to obtain extracted features, from which domain-invariant features and domain-specific features are disentangled. Based on these two kinds of features, the so-called inherent class prototypes and domain prototypes are estimated, respectively, and a mapping from prototypes to the extracted feature space is learned in the feature reconstruction process. Thus, class prototypes for all source and target domains can be constructed in the extracted feature space from the domain prototypes and inherent class prototypes. By forcing the extracted features to be close to the corresponding class prototypes for all domains, the feature extraction network is progressively adjusted. In the end, the inherent class prototypes are used as a classifier in the target domain. Our contribution is that, through the inherent class prototypes and domain prototypes, the semantic category information from the source domains is transferred to the target domain by constructing the corresponding class prototypes. In our method, all source and target domains are aligned twice at the feature level, yielding better domain-invariant features and features closer to the class prototypes, respectively. Several experiments on public datasets also prove the effectiveness of our method.
19
Majumdar SS, Jain S, Tourni IC, Mustafin A, Lteif D, Sclaroff S, Saenko K, Bargal SA. Ani-GIFs: A benchmark dataset for domain generalization of action recognition from GIFs. Front Comput Sci 2022. [DOI: 10.3389/fcomp.2022.876846]
Abstract
Deep learning models perform remarkably well for the same task under the assumption that data is always coming from the same distribution. However, this is generally violated in practice, mainly due to the differences in data acquisition techniques and the lack of information about the underlying source of new data. Domain generalization targets the ability to generalize to test data of an unseen domain; while this problem is well-studied for images, such studies are significantly lacking in spatiotemporal visual content—videos and GIFs. This is due to (1) the challenging nature of misalignment of temporal features and the varying appearance/motion of actors and actions in different domains, and (2) spatiotemporal datasets being laborious to collect and annotate for multiple domains. We collect and present the first synthetic video dataset of Animated GIFs for domain generalization, Ani-GIFs, that is used to study the domain gap of videos vs. GIFs, and animated vs. real GIFs, for the task of action recognition. We provide a training and testing setting for Ani-GIFs, and extend two domain generalization baseline approaches, based on data augmentation and explainability, to the spatiotemporal domain to catalyze research in this direction.
20
Azimifar M, Nejatian S, Parvin H, Bagherifard K, Rezaei V. A structure-protecting kernelized semi-supervised space adjustment for classification. J Intell Fuzzy Syst 2022. [DOI: 10.3233/jifs-200224]
Abstract
We introduce a semi-supervised space adjustment framework in this paper. In the introduced framework, the dataset contains two subsets: (a) a training data subset (space-one data, SOD) and (b) a testing data subset (space-two data, STD). Our semi-supervised space adjustment framework learns under three assumptions: (I) all data points in the SOD are labeled, and only a minority of the data points in the STD are labeled (we call the labeled space-two data LSTD); (II) the size of LSTD is very small compared to the size of SOD; and (III) the data of SOD and the data of STD have different distributions. We denote the unlabeled space-two data by ULSTD, which equals STD - LSTD. The aim is to map the training data, i.e., the data from SOD and those from LSTD (note that all labeled data are considered training data, i.e., SOD ∪ LSTD), into a shared space (ShS). The mapped SOD, ULSTD, and LSTD in ShS are named MSOD, MULSTD, and MLSTD, respectively. The proposed method performs this mapping in such a way that the structures of the data points in SOD and MSOD, in STD and MSTD, in ULSTD and MULSTD, and in LSTD and MLSTD are preserved. In the proposed method, the mapping is performed by a principal component analysis transformation on kernelized data. The method seeks a mapping that (a) maintains the neighbors of data points after the mapping and (b) takes advantage of the class labels that are known in STD during the transformation. We then represent and formulate the problem of finding the optimal mapping as a non-linear objective function, transform it into a semidefinite programming (SDP) problem, and solve the optimization problem with an SDP solver. The experiments indicate the superiority of learners trained on data mapped by the proposed approach over learners trained on data mapped by state-of-the-art methods.
Affiliation(s)
- Maryam Azimifar
- Department of Computer Science, Yasooj Branch, Islamic Azad University, Yasooj, IR
- Samad Nejatian
- Department of Electrical Engineering, Yasooj Branch, Islamic Azad University, Yasooj, IR
- Hamid Parvin
- Department of Computer Science, Nourabad Mamasani Branch, Islamic Azad University, Mamasani, IR
- Vahideh Rezaei
- Department of Mathematics, Yasooj Branch, Islamic Azad University, Yasooj, IR
21
Zhang L, Gao X. Transfer Adaptation Learning: A Decade Survey. IEEE Trans Neural Netw Learn Syst 2022; PP:23-44. [PMID: 35727786] [DOI: 10.1109/tnnls.2022.3183326]
Abstract
The world we see is ever-changing: it changes with people, things, and the environment. A domain refers to the state of the world at a certain moment. A research problem is characterized as transfer adaptation learning (TAL) when it requires knowledge correspondence between different moments/domains. TAL aims to build models that can perform tasks in a target domain by learning knowledge from a semantically related but distributionally different source domain. It is an energetic research field of increasing influence and importance, exhibiting an explosive publication trend. This article surveys the advances in TAL methodologies over the past decade, and the technical challenges and essential problems of TAL are observed and discussed with deep insights and new perspectives. Broader families of TAL solutions created by researchers are identified, i.e., instance reweighting adaptation, feature adaptation, classifier adaptation, deep network adaptation, and adversarial adaptation, which go beyond the early semisupervised and unsupervised split. The survey helps researchers rapidly but comprehensively understand and identify the research foundation, research status, theoretical limitations, future challenges, and understudied issues (universality, interpretability, and credibility) to be addressed in the field toward generalizable representation in open-world scenarios.
22
Xu X, Lin K, Gao L, Lu H, Shen HT, Li X. Learning Cross-Modal Common Representations by Private-Shared Subspaces Separation. IEEE Trans Cybern 2022; 52:3261-3275. [PMID: 32780706] [DOI: 10.1109/tcyb.2020.3009004]
Abstract
Due to the inconsistent distributions and representations of different modalities (e.g., images and texts), it is very challenging to correlate such heterogeneous data. A standard solution is to construct one common subspace, where the common representations of different modalities are generated to bridge the heterogeneity gap. Existing methods based on common representation learning mostly adopt a less effective two-stage paradigm: first, generating separate representations for each modality by exploiting the modality-specific properties as the complementary information, and then capturing the cross-modal correlation in the separate representations for common representation learning. Moreover, these methods usually neglect that there may exist interference in the modality-specific properties, that is, the unrelated objects and background regions in images or the noisy words and incorrect sentences in the text. In this article, we hypothesize that explicitly modeling the interference within each modality can improve the quality of common representation learning. To this end, we propose a novel model private-shared subspaces separation (P3S) to explicitly learn different representations that are partitioned into two kinds of subspaces: 1) the common representations that capture the cross-modal correlation in a shared subspace and 2) the private representations that model the interference within each modality in two private subspaces. By employing the orthogonality constraints between the shared subspace and the private subspaces during the one-stage joint learning procedure, our model is able to learn more effective common representations for different modalities in the shared subspace by fully excluding the interference within each modality. Extensive experiments conducted on cross-modal retrieval verify the advantages of our P3S method compared with 15 state-of-the-art methods on four widely used cross-modal datasets.
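The private-shared separation hinges on an orthogonality constraint between the two kinds of representations; a common generic form is to penalize the cross-covariance between shared and private embeddings, as sketched below. This is a stand-in for the constraint described above, not the exact P3S formulation.

```python
import torch

def orthogonality_loss(shared, private):
    """Soft orthogonality between shared and private representations
    (batch x dim): penalize the squared Frobenius norm of their
    cross-covariance so the two subspaces carry non-overlapping information."""
    s = shared - shared.mean(dim=0, keepdim=True)
    p = private - private.mean(dim=0, keepdim=True)
    cross = s.T @ p / s.shape[0]
    return (cross ** 2).sum()

shared = torch.randn(64, 128, requires_grad=True)
private = torch.randn(64, 128, requires_grad=True)
loss = orthogonality_loss(shared, private)
loss.backward()
print(loss.item())
```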
23
Tian J, Han D, Li M, Shi P. A multi-source information transfer learning method with subdomain adaptation for cross-domain fault diagnosis. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108466]
24
Tao J, Dan Y, Zhou D, He S. Robust Latent Multi-Source Adaptation for Encephalogram-Based Emotion Recognition. Front Neurosci 2022; 16:850906. [PMID: 35573289] [PMCID: PMC9091911] [DOI: 10.3389/fnins.2022.850906]
Abstract
In practical encephalogram (EEG)-based machine learning, different subjects can exhibit many different EEG patterns, which, to some extent, degrades the performance of existing subject-independent classifiers obtained from cross-subject datasets. To this end, we present in this paper a robust Latent Multi-source Adaptation (LMA) framework for cross-subject/dataset emotion recognition with EEG signals that uncovers multiple domain-invariant latent subspaces. Specifically, by jointly aligning the statistical and semantic distribution discrepancies between each source-target pair, multiple domain-invariant classifiers can be trained collaboratively in a unified framework. The framework fully utilizes the correlated knowledge among multiple sources through a novel low-rank regularization term. Comprehensive experiments on the DEAP and SEED datasets demonstrate the superior or comparable performance of LMA compared with the state of the art in EEG-based emotion recognition.
Affiliation(s)
- Jianwen Tao
- Institute of Artificial Intelligence Application, Ningbo Polytechnic, Ningbo, China
- Yufang Dan
- Institute of Artificial Intelligence Application, Ningbo Polytechnic, Ningbo, China
- Di Zhou
- Industrial Technological Institute of Intelligent Manufacturing, Sichuan University of Arts and Science, Dazhou, China
- Songsong He
- Institute of Artificial Intelligence Application, Ningbo Polytechnic, Ningbo, China
25
Ren CX, Liu YH, Zhang XW, Huang KK. Multi-Source Unsupervised Domain Adaptation via Pseudo Target Domain. IEEE Trans Image Process 2022; 31:2122-2135. [PMID: 35196236] [DOI: 10.1109/tip.2022.3152052]
Abstract
Multi-source domain adaptation (MDA) aims to transfer knowledge from multiple source domains to an unlabeled target domain. MDA is a challenging task due to the severe domain shift, which exists not only between target and source but also among the diverse sources. Prior studies on MDA either estimate a mixed distribution of source domains or combine multiple single-source models, but few of them delve into the relevant information among diverse source domains. For this reason, we propose a novel MDA approach, termed Pseudo Target for MDA (PTMDA). Specifically, PTMDA maps each group of source and target domains into a group-specific subspace using adversarial learning with a metric constraint, and constructs a series of pseudo target domains correspondingly. Then we align the remaining source domains with the pseudo target domain in the subspace efficiently, which allows the model to exploit additional structured source information through training on the pseudo target domain and improves performance on the real target domain. Besides, to improve the transferability of deep neural networks (DNNs), we replace the traditional batch normalization layer with an effective matching normalization layer, which enforces alignments in latent layers of DNNs and thus yields further improvement. We give a theoretical analysis showing that PTMDA as a whole can reduce the target error bound and leads to a better approximation of the target risk in MDA settings. Extensive experiments demonstrate PTMDA's effectiveness on MDA tasks, as it outperforms state-of-the-art methods in most experimental settings.
26
Zhao S, Yue X, Zhang S, Li B, Zhao H, Wu B, Krishna R, Gonzalez JE, Sangiovanni-Vincentelli AL, Seshia SA, Keutzer K. A Review of Single-Source Deep Unsupervised Visual Domain Adaptation. IEEE Trans Neural Netw Learn Syst 2022; 33:473-493. [PMID: 33095718] [DOI: 10.1109/tnnls.2020.3028503]
Abstract
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks. However, in many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data. To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain. Unfortunately, direct transfer across domains often performs poorly due to the presence of domain shift or dataset bias. Domain adaptation (DA) is a machine learning paradigm that aims to learn a model from a source domain that can perform well on a different (but related) target domain. In this article, we review the latest single-source deep unsupervised DA methods focused on visual tasks and discuss new perspectives for future research. We begin with the definitions of different DA strategies and the descriptions of existing benchmark datasets. We then summarize and compare different categories of single-source unsupervised DA methods, including discrepancy-based methods, adversarial discriminative methods, adversarial generative methods, and self-supervision-based methods. Finally, we discuss future research directions with challenges and possible solutions.
27
Wang F, Li W, Xu D. Cross-Dataset Point Cloud Recognition Using Deep-Shallow Domain Adaptation Network. IEEE Trans Image Process 2021; 30:7364-7377. [PMID: 34255628] [DOI: 10.1109/tip.2021.3092818]
Abstract
In this work, we propose a new two-view domain adaptation network named Deep-Shallow Domain Adaptation Network (DSDAN) for 3D point cloud recognition. Different from the traditional 2D image recognition task, the valuable texture information is often absent in point cloud data, making point cloud recognition a challenging task, especially in the cross-dataset scenario where the training and testing data exhibit a considerable distribution mismatch. In our DSDAN method, we tackle the challenging cross-dataset 3D point cloud recognition task from two aspects. On one hand, we propose a two-view learning framework, such that we can effectively leverage multiple feature representations to improve the recognition performance. To this end, we propose a simple and efficient Bag-of-Points feature method, as a complementary view to the deep representation. Moreover, we also propose a cross view consistency loss to boost the two-view learning framework. On the other hand, we further propose a two-level adaptation strategy to effectively address the domain distribution mismatch issue. Specifically, we apply a feature-level distribution alignment module for each view, and also propose an instance-level adaptation approach to select highly confident pseudo-labeled target samples for adapting the model to the target domain, based on which a co-training scheme is used to integrate the learning and adaptation process on the two views. Extensive experiments on the benchmark dataset show that our newly proposed DSDAN method outperforms the existing state-of-the-art methods for the cross-dataset point cloud recognition task.
28
Zhang W, Xu D, Ouyang W, Li W. Self-Paced Collaborative and Adversarial Network for Unsupervised Domain Adaptation. IEEE Trans Pattern Anal Mach Intell 2021; 43:2047-2061. [PMID: 31880543] [DOI: 10.1109/tpami.2019.2962476]
Abstract
This paper proposes a new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN), which uses the domain-collaborative and domain-adversarial learning strategies for training the neural network. The domain-collaborative learning strategy aims to learn domain specific feature representation to preserve the discriminability for the target domain, while the domain adversarial learning strategy aims to learn domain invariant feature representation to reduce the domain distribution mismatch between the source and target domains. We show that these two learning strategies can be uniformly formulated as domain classifier learning with positive or negative weights on the losses. We then design a collaborative and adversarial training scheme, which automatically learns domain specific representations from lower blocks in CNNs through collaborative learning and domain invariant representations from higher blocks through adversarial learning. Moreover, to further enhance the discriminability in the target domain, we propose Self-Paced CAN (SPCAN), which progressively selects pseudo-labeled target samples for re-training the classifiers. We employ a self-paced learning strategy such that we can select pseudo-labeled target samples in an easy-to-hard fashion. Additionally, we build upon the popular two-stream approach to extend our domain adaptation approach for more challenging video action recognition task, which additionally considers the cooperation between the RGB stream and the optical flow stream. We propose the Two-stream SPCAN (TS-SPCAN) method to select and reweight the pseudo labeled target samples of one stream (RGB/Flow) based on the information from the other stream (Flow/RGB) in a cooperative way. As a result, our TS-SPCAN model is able to exchange the information between the two streams. Comprehensive experiments on different benchmark datasets, Office-31, ImageCLEF-DA and VISDA-2017 for the object recognition task, and UCF101-10 and HMDB51-10 for the video action recognition task, show our newly proposed approaches achieve the state-of-the-art performance, which clearly demonstrates the effectiveness of our proposed approaches for unsupervised domain adaptation.
Collapse
|
29
|
Zhao S, Li B, Xu P, Yue X, Ding G, Keutzer K. MADAN: Multi-source Adversarial Domain Aggregation Network for Domain Adaptation. Int J Comput Vis 2021. [DOI: 10.1007/s11263-021-01479-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
30
|
|
31
|
Tao J, Dan Y. Multi-Source Co-adaptation for EEG-Based Emotion Recognition by Mining Correlation Information. Front Neurosci 2021; 15:677106. [PMID: 34054422 PMCID: PMC8155359 DOI: 10.3389/fnins.2021.677106] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2021] [Accepted: 03/22/2021] [Indexed: 11/17/2022] Open
Abstract
Since each individual subject may present completely different electroencephalogram (EEG) patterns from other subjects, existing subject-independent emotion classifiers trained on data sampled across subjects or datasets generally fail to achieve sound accuracy. In this scenario, the domain adaptation technique can be employed to address this problem, and it has recently received extensive attention due to its effectiveness in cross-distribution learning. Focusing on cross-subject and cross-dataset automated emotion recognition from EEG features, we propose in this article a robust multi-source co-adaptation framework by mining diverse correlation information (MACI) among domains and features, with ℓ2,1-norm and correlation-metric regularization. Specifically, by minimizing the statistical and semantic distribution differences between source and target domains, multiple subject-invariant classifiers can be learned together in a joint framework, which enables MACI to exploit relevant knowledge from multiple sources through the developed correlation metric function. Comprehensive experimental evidence on the DEAP and SEED datasets verifies the superior performance of MACI in EEG-based emotion recognition.
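For reference, the ℓ2,1-norm regularizer named in the abstract is the sum of row-wise ℓ2 norms of a parameter matrix, which promotes row sparsity (joint feature selection across sources). This is a generic sketch rather than the MACI implementation.

```python
import numpy as np

def l21_norm(W):
    # Sum of the l2 norms of the rows of W: encourages entire rows
    # (features) to be zeroed out jointly, i.e. row-sparse selection.
    return np.sum(np.linalg.norm(W, axis=1))

W = np.random.randn(5, 3)
print(l21_norm(W))
```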
Collapse
|
32
|
Liu ZG, Huang LQ, Zhou K, Denoeux T. Combination of Transferable Classification With Multisource Domain Adaptation Based on Evidential Reasoning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:2015-2029. [PMID: 32497012 DOI: 10.1109/tnnls.2020.2995862] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
Abstract
In applications of domain adaptation, there may exist multiple source domains that provide more or less complementary knowledge for pattern classification in the target domain. In order to improve classification accuracy, a decision-level combination method is proposed for multisource domain adaptation based on evidential reasoning. The classification results obtained from different source domains usually have different reliabilities/weights, which are calculated according to domain consistency. Therefore, the multiple classification results are discounted by the corresponding weights under the belief-function framework, and Dempster's rule is then employed to combine the discounted results. In order to reduce errors, a neighborhood-based cautious decision-making rule is developed to make the class decision based on the combination result. An object is assigned to a singleton class if its neighbors can be (almost) correctly classified; otherwise, it is cautiously committed to the disjunction of several possible classes. By doing this, we can characterize the partial imprecision of classification and reduce the risk of error. A unified utility value is defined to reflect the benefit of such a classification. This cautious decision-making rule can achieve the maximum unified utility value because partial imprecision is considered better than an error. Several real data sets are used to test the performance of the proposed method, and the experimental results show that our new method can efficiently improve classification accuracy compared with other related combination methods.
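A compact sketch of the evidential machinery the abstract relies on: Shafer discounting of a mass function by a reliability weight, followed by Dempster's rule of combination. The three-class frame and the reliability values are toy assumptions, not the paper's weighting scheme.

```python
from itertools import product

FRAME = frozenset({"a", "b", "c"})

def discount(mass, alpha):
    # Shafer discounting: scale masses by the source reliability alpha and
    # transfer the remaining 1 - alpha to total ignorance (the whole frame).
    out = {A: alpha * v for A, v in mass.items()}
    out[FRAME] = out.get(FRAME, 0.0) + (1.0 - alpha)
    return out

def dempster(m1, m2):
    # Dempster's rule: conjunctive combination followed by normalization
    # by the conflict mass assigned to the empty set.
    combined, conflict = {}, 0.0
    for (A, vA), (B, vB) in product(m1.items(), m2.items()):
        inter = A & B
        if inter:
            combined[inter] = combined.get(inter, 0.0) + vA * vB
        else:
            conflict += vA * vB
    return {A: v / (1.0 - conflict) for A, v in combined.items()}

# two source-domain classifiers, each discounted by its estimated reliability
m_src1 = discount({frozenset({"a"}): 0.7, FRAME: 0.3}, alpha=0.9)
m_src2 = discount({frozenset({"b"}): 0.6, FRAME: 0.4}, alpha=0.6)
print(dempster(m_src1, m_src2))
```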
Collapse
|
33
|
Wu X, Chen J, Yu F, Yao M, Luo J. Joint Learning of Multiple Latent Domains and Deep Representations for Domain Adaptation. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:2676-2687. [PMID: 31251207 DOI: 10.1109/tcyb.2019.2921559] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
In domain adaptation, the automatic discovery of multiple latent source domains has succeeded by capturing the intrinsic structure underlying the source data. Different from previous works that mainly rely on shallow models for domain discovery, we propose a novel unified framework based on deep neural networks to jointly address latent domain prediction from source data and deep representation learning from both source and target data. Within this framework, an iterative algorithm is proposed to alternate between 1) utilizing a new probabilistic hierarchical clustering method to separate the source domain into latent clusters and 2) training deep neural networks by using the domain membership as the supervision to learn deep representations. The key idea behind this joint learning framework is that good representations can help to improve the prediction accuracy of latent domains and, in turn, domain prediction results can provide useful supervisory information for feature learning. During the training of the deep model, a domain prediction loss, a domain confusion loss, and a task-specific classification loss are effectively integrated to enable the learned feature to distinguish between different latent source domains, transfer between source and target domains, and become semantically meaningful among different classes. Trained in an end-to-end fashion, our framework outperforms the state-of-the-art methods for latent domain discovery, as validated by extensive experiments on both object classification and human action-recognition tasks.
Collapse
|
34
|
Gao P, Wu W, Li J. Multi-source fast transfer learning algorithm based on support vector machine. APPL INTELL 2021; 51:8451-8465. [PMID: 34764591 PMCID: PMC8023540 DOI: 10.1007/s10489-021-02194-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/04/2021] [Indexed: 11/30/2022]
Abstract
In transfer learning, knowledge from a source domain can be used to help train classifiers for the target domain when only limited data are available. Therefore, for the situation where the target domain contains only a small amount of unlabeled data while multiple source domains contain a large amount of labeled data, a new Multi-source Fast Transfer Learning algorithm based on the support vector machine (MultiFTLSVM) is proposed in this paper. Following the idea of multi-source transfer learning, knowledge from multiple source domains is used to train the target-domain learning task and improve the classification effect. At the same time, representative subsets of the source-domain data are used to speed up training and improve the efficiency of the algorithm. Experimental results on several real data sets show the effectiveness of MultiFTLSVM and its advantages over the benchmark algorithms.
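A minimal sketch of the overall recipe under stated assumptions: one SVM is trained per labeled source domain on a representative subset (approximated here by a random subsample), and the resulting models are averaged to label the unlabeled target data. This stands in for, rather than reproduces, the MultiFTLSVM formulation.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def train_source_svms(sources, subsample=200):
    # One SVM per labeled source domain, trained on a representative subset
    # (a random subsample here) to keep training fast.
    models = []
    for X, y in sources:
        idx = rng.choice(len(X), size=min(subsample, len(X)), replace=False)
        models.append(SVC(kernel="rbf", probability=True).fit(X[idx], y[idx]))
    return models

def predict_target(models, X_target):
    # Average the probabilistic outputs of the source SVMs on the unlabeled target.
    probs = np.mean([m.predict_proba(X_target) for m in models], axis=0)
    return probs.argmax(axis=1)

# toy data: two source domains, one unlabeled target domain
sources = [(rng.normal(size=(300, 5)), rng.integers(0, 2, 300)) for _ in range(2)]
X_target = rng.normal(size=(50, 5))
print(predict_target(train_source_svms(sources), X_target)[:10])
```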
Collapse
Affiliation(s)
- Peng Gao
- College of Computer Science and Technology, Harbin Engineering University, Harbin, China
- Technology Development Center, Heilongjiang Broadcasting Station, Harbin, China
| | - Weifei Wu
- College of Computer Science and Technology, Harbin Engineering University, Harbin, China
| | - Jingmei Li
- College of Computer Science and Technology, Harbin Engineering University, Harbin, China
| |
Collapse
|
35
|
Zhang W, Xu D, Zhang J, Ouyang W. Progressive Modality Cooperation for Multi-Modality Domain Adaptation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:3293-3306. [PMID: 33481713 DOI: 10.1109/tip.2021.3052083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
In this work, we propose a new generic multi-modality domain adaptation framework called Progressive Modality Cooperation (PMC) to transfer the knowledge learned from the source domain to the target domain by exploiting multiple modality clues (e.g., RGB and depth) under the multi-modality domain adaptation (MMDA) setting and the more general multi-modality domain adaptation using privileged information (MMDA-PI) setting. Under the MMDA setting, the samples in both domains have all the modalities. Through effective collaboration among multiple modalities, the two newly proposed modules in our PMC select reliable pseudo-labeled target samples by capturing modality-specific information and modality-integrated information, respectively. Under the MMDA-PI setting, some modalities are missing in the target domain. Hence, to better exploit the multi-modality data in the source domain, we further propose the PMC with privileged information (PMC-PI) method, which introduces a new multi-modality data generation (MMG) network. MMG generates the missing modalities in the target domain based on the source domain data by considering both domain distribution mismatch and semantics preservation, which are respectively achieved by using adversarial learning and by conditioning on weighted pseudo semantic class labels. Extensive experiments on three image datasets and eight video datasets for various multi-modality cross-domain visual recognition tasks under both the MMDA and MMDA-PI settings clearly demonstrate the effectiveness of our proposed PMC framework.
Collapse
|
36
|
Xia K, Ni T, Yin H, Chen B. Cross-Domain Classification Model With Knowledge Utilization Maximization for Recognition of Epileptic EEG Signals. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:53-61. [PMID: 32078557 DOI: 10.1109/tcbb.2020.2973978] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Conventional classification models for epileptic EEG signal recognition need sufficient labeled samples for training. In addition, when the training and testing EEG samples are collected from different distributions, for example due to differences in patient groups or acquisition devices, such methods generally cannot perform well. In this paper, a cross-domain classification model with knowledge utilization maximization, called CDC-KUM, is presented, which takes advantage of the global data structure provided by the labeled samples in the related domain and the unlabeled samples in the current domain. After mapping the data into a kernel space, the pairwise-constraint regularization term is combined with the predictive differences of the labeled data in the source domain. Meanwhile, a soft clustering regularization term using quadratic weights and Gini-Simpson diversity is applied to exploit the distribution information of the unlabeled data in the target domain. Experimental results show that the CDC-KUM model outperforms several traditional non-transfer and transfer classification methods for the recognition of epileptic EEG signals.
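For concreteness, the Gini-Simpson diversity that the soft-clustering regularization term builds on can be computed from the soft membership matrix of the unlabeled target samples as below; how the term enters the full CDC-KUM objective (weighting and sign) is not reproduced here.

```python
import numpy as np

def gini_simpson(U, m=2.0):
    # U: soft cluster/class memberships of unlabeled target samples (n x c),
    # rows summing to 1. With the quadratic exponent m = 2, the index
    # 1 - sum_k u_k^2 is large for ambiguous rows and small for confident ones.
    U = np.asarray(U, dtype=float)
    return np.mean(1.0 - np.sum(U ** m, axis=1))

U = np.array([[0.90, 0.05, 0.05],   # confident membership -> small index
              [0.34, 0.33, 0.33]])  # ambiguous membership -> large index
print(gini_simpson(U))
```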
Collapse
|
37
|
Bernhardt M, Vishnevskiy V, Rau R, Goksel O. Training Variational Networks With Multidomain Simulations: Speed-of-Sound Image Reconstruction. IEEE TRANSACTIONS ON ULTRASONICS, FERROELECTRICS, AND FREQUENCY CONTROL 2020; 67:2584-2594. [PMID: 32746211 DOI: 10.1109/tuffc.2020.3010186] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Speed-of-sound (SoS) has been shown to be a potential biomarker for breast cancer imaging, successfully differentiating malignant tumors from benign ones. SoS images can be reconstructed from time-of-flight measurements from ultrasound images acquired using conventional handheld ultrasound transducers. Variational networks (VNs) have recently been shown to be a potential learning-based approach for optimizing inverse problems in image reconstruction. Despite earlier promising results, these methods do not generalize well from simulated to acquired data, due to the domain shift. In this work, we present for the first time a VN solution for a pulse-echo SoS image reconstruction problem using diverging waves with conventional transducers and single-sided tissue access. This is made possible by incorporating simulations with varying complexity into training. We use loop unrolling of gradient descent with momentum, with an exponentially weighted loss on the outputs of each unrolled iteration in order to regularize the training. We learn norms as activation functions, regularized to have smooth forms for robustness to input distribution variations. We evaluate reconstruction quality on ray-based and full-wave simulations as well as on tissue-mimicking phantom data, in comparison with a classical iterative [limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)] optimization of this image reconstruction problem. We show that the proposed regularization techniques combined with multisource domain training yield substantial improvements in the domain adaptation capabilities of VNs, reducing the median root mean squared error (RMSE) by 54% on a wave-based simulation data set compared to the baseline VN. We also show that on data acquired from a tissue-mimicking breast phantom, the proposed VN provides improved reconstruction in 12 ms.
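A toy sketch of the unrolling idea, assuming a simple quadratic data-fidelity term: gradient descent with momentum is unrolled for a fixed number of iterations, and every intermediate output contributes to the training loss, with exponentially smaller weight for earlier iterations. In the paper's VN the step directions and activation functions are learned; here they are fixed constants.

```python
import torch

def unrolled_gd_momentum(x0, grad_fn, target, n_iters=8, lr=0.1, beta=0.9, gamma=0.8):
    # Loop unrolling of gradient descent with momentum: the loss accumulates
    # the reconstruction error of each unrolled iterate, weighted by
    # gamma**(n_iters - 1 - k), so later iterates matter most.
    x, v = x0, torch.zeros_like(x0)
    loss = 0.0
    for k in range(n_iters):
        v = beta * v + grad_fn(x)
        x = x - lr * v
        loss = loss + gamma ** (n_iters - 1 - k) * torch.mean((x - target) ** 2)
    return x, loss

# toy quadratic inverse problem: recover `target` from a random start
target = torch.zeros(16)
x0 = torch.randn(16)
grad_fn = lambda x: 2 * (x - target)   # gradient of ||x - target||^2
x_hat, train_loss = unrolled_gd_momentum(x0, grad_fn, target)
print(train_loss.item(), torch.norm(x_hat - target).item())
```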
Collapse
|
38
|
Wang Q, Artières T, Takerkart S. Inter-subject pattern analysis for multivariate group analysis of functional neuroimaging. A unifying formalization. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2020; 197:105730. [PMID: 32987228 DOI: 10.1016/j.cmpb.2020.105730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/23/2020] [Accepted: 08/27/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND AND OBJECTIVE In medical imaging, population studies have to overcome the differences that exist between individuals to identify invariant image features that can be used for diagnostic purposes. In functional neuroimaging, an appealing solution for identifying neural coding principles that hold at the population level is inter-subject pattern analysis, i.e., learning a predictive model on data from multiple subjects and evaluating its generalization performance on new subjects. Although it has gained popularity in recent years, its widespread adoption is still hampered by the lack of a formal definition in the literature. In this paper, we introduce the first principled formalization of inter-subject pattern analysis targeted at multivariate group analysis of functional neuroimaging. METHODS We propose to frame inter-subject pattern analysis as a multi-source transductive transfer problem, thus grounding it within several well-defined machine learning settings and broadening the spectrum of usable algorithms. We describe two sets of inter-subject brain decoding experiments that use several open datasets: a magneto-encephalography study with 16 subjects and a functional magnetic resonance imaging paradigm with 100 subjects. We assess the relevance of our framework by performing model comparisons in which one brain decoding model exploits our formalization while others do not. RESULTS The first set of experiments demonstrates the superiority of a brain decoder that uses subject-by-subject standardization compared to state-of-the-art models that use other standardization schemes, making the case for the transductive and multi-source components of our formalization. The second set of experiments quantitatively shows that, even after such a transformation, it is more difficult for a brain decoder to generalize to new participants than to new data from participants available in the training phase, thus highlighting the transfer gap that needs to be overcome. CONCLUSION This paper describes the first formalization of inter-subject pattern analysis as a multi-source transductive transfer learning problem. We demonstrate the added value of this formalization using proof-of-concept experiments on several complementary functional neuroimaging datasets. This work should contribute to popularizing inter-subject pattern analysis for functional neuroimaging population studies and pave the road for future methodological innovations.
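A minimal sketch of the decoding protocol that the first result refers to, under toy data assumptions: each subject's features are standardized independently before pooling, and the decoder is evaluated with leave-one-subject-out cross-validation (training subjects acting as the multiple source domains).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def standardize_per_subject(X, subjects):
    # z-score each subject's features independently before pooling,
    # so that inter-subject offsets do not dominate the decoder.
    Xs = X.copy().astype(float)
    for s in np.unique(subjects):
        idx = subjects == s
        Xs[idx] = (Xs[idx] - Xs[idx].mean(axis=0)) / (Xs[idx].std(axis=0) + 1e-8)
    return Xs

def leave_one_subject_out(X, y, subjects):
    # Inter-subject pattern analysis: train on all-but-one subject and
    # test on the held-out subject, averaged over subjects.
    accs = []
    for s in np.unique(subjects):
        test = subjects == s
        clf = LogisticRegression(max_iter=1000).fit(X[~test], y[~test])
        accs.append(clf.score(X[test], y[test]))
    return float(np.mean(accs))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.integers(0, 2, 200)
subjects = np.repeat(np.arange(10), 20)
print(leave_one_subject_out(standardize_per_subject(X, subjects), y, subjects))
```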
Collapse
Affiliation(s)
- Qi Wang
- Institut de Neurosciences de la Timone UMR 7289 Aix-Marseille Université, CNRS Faculté de Médecine, 27 boulevard Jean Moulin, Marseille 13005, France; Laboratoire d'Informatique et Systèmes UMR 7020 Aix-Marseille Université, CNRS, Ecole Centrale de Marseille Faculté des Sciences, 163 avenue de Luminy, Case 901, Marseille 13009, France
| | - Thierry Artières
- Laboratoire d'Informatique et Systèmes UMR 7020 Aix-Marseille Université, CNRS, Ecole Centrale de Marseille Faculté des Sciences, 163 avenue de Luminy, Case 901, Marseille 13009, France
| | - Sylvain Takerkart
- Institut de Neurosciences de la Timone UMR 7289 Aix-Marseille Université, CNRS Faculté de Médecine, 27 boulevard Jean Moulin, Marseille 13005, France.
| |
Collapse
|
39
|
Wang J, Zhang L, Wang Q, Chen L, Shi J, Chen X, Li Z, Shen D. Multi-Class ASD Classification Based on Functional Connectivity and Functional Correlation Tensor via Multi-Source Domain Adaptation and Multi-View Sparse Representation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:3137-3147. [PMID: 32305905 DOI: 10.1109/tmi.2020.2987817] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Resting-state functional magnetic resonance imaging (rs-fMRI) reflects the functional activity of brain regions through blood-oxygen-level dependent (BOLD) signals. To date, many computer-aided diagnosis methods based on rs-fMRI have been developed for Autism Spectrum Disorder (ASD). These methods are mostly binary classification approaches that determine whether a subject is an ASD patient or not. However, the disorder often consists of several sub-categories, which are complex and thus still confusing to many automatic classification methods. Besides, existing methods usually focus on the functional connectivity (FC) features in grey matter regions, which account for only a small portion of the rs-fMRI data. Recently, the possibility of revealing connectivity information in the white matter regions of rs-fMRI has drawn considerable attention. To this end, we propose to use patch-based functional correlation tensor (PBFCT) features extracted from rs-fMRI in white matter, in addition to the traditional FC features from gray matter, to develop a novel multi-class ASD diagnosis method in this work. Our method has two stages. Specifically, in the first stage of multi-source domain adaptation (MSDA), the source subjects belonging to multiple clinical centers (thus called source domains) are all transformed into the same target feature space, so that each subject in the target domain can be linearly reconstructed from the transformed source subjects. In the second stage of multi-view sparse representation (MVSR), a multi-view classifier for multi-class ASD diagnosis is developed by jointly using both views of the FC and PBFCT features. The experimental results on the ABIDE dataset verify the effectiveness of our method, which is capable of accurately classifying each subject into the respective ASD sub-category.
Collapse
|
40
|
Wu H, Yan Y, Ng MK, Wu Q. Domain-attention Conditional Wasserstein Distance for Multi-source Domain Adaptation. ACM T INTEL SYST TEC 2020. [DOI: 10.1145/3391229] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
Multi-source domain adaptation has received considerable attention due to its effectiveness in leveraging knowledge from multiple related sources with different distributions to enhance learning performance. One of the fundamental challenges in multi-source domain adaptation is how to determine the amount of knowledge transferred from each source domain to the target domain. To address this issue, we propose a new algorithm, called Domain-attention Conditional Wasserstein Distance (DCWD), to learn transferred weights that evaluate the relatedness between the source and target domains. In DCWD, we design a new conditional Wasserstein distance objective function that takes the label information into consideration to measure the distance between a given source domain and the target domain. We also develop an attention scheme to compute the transferred weights of different source domains based on their conditional Wasserstein distances to the target domain. The transferred weights can then be used to reweight the source data to determine their importance in knowledge transfer. We conduct comprehensive experiments on several real-world data sets, and the results demonstrate the effectiveness and efficiency of the proposed method.
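The attention scheme over source domains can be sketched as a softmax over negative distances, so that sources closer to the target (in conditional Wasserstein distance) receive larger transferred weights; the temperature parameter below is an assumption for illustration.

```python
import numpy as np

def transfer_weights(distances, temperature=1.0):
    # Attention over source domains: a smaller distance to the target
    # yields a larger transferred weight (softmax of -distance).
    scores = -np.asarray(distances, dtype=float) / temperature
    scores -= scores.max()                 # numerical stability
    w = np.exp(scores)
    return w / w.sum()

# e.g. three source domains with estimated distances to the target
print(transfer_weights([0.8, 0.3, 1.5]))
```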
Collapse
Affiliation(s)
- Hanrui Wu
- South China University of Technology, Guangzhou, China
| | - Yuguang Yan
- The University of Hong Kong, Hong Kong, China
| | | | - Qingyao Wu
- South China University of Technology, Guangzhou, China
| |
Collapse
|
41
|
Li J, Qiu S, Shen YY, Liu CL, He H. Multisource Transfer Learning for Cross-Subject EEG Emotion Recognition. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:3281-3293. [PMID: 30932860 DOI: 10.1109/tcyb.2019.2904052] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Electroencephalogram (EEG) has been widely used in emotion recognition due to its high temporal resolution and reliability. Since individual differences in EEG are large, emotion recognition models cannot be shared across persons, and new labeled data must be collected to train personal models for new users. In some applications, we hope to acquire models for new persons as fast as possible and to reduce the amount of labeled data required. To achieve this goal, we propose a multisource transfer learning method in which existing persons are the sources and the new person is the target. The target data are divided into calibration sessions for training and subsequent sessions for testing. The first stage of the method is source selection, aimed at locating appropriate sources. The second is style transfer mapping, which reduces the EEG differences between the target and each source. We use a small amount of labeled data from the calibration sessions to conduct source selection and style transfer. Finally, we integrate the source models to recognize emotions in the subsequent sessions. The experimental results show that the three-category classification accuracy on the SEED benchmark improves by 12.72% compared with the non-transfer method. Our method facilitates the fast deployment of emotion recognition models by reducing reliance on large amounts of labeled data, which is of practical significance especially in fast-deployment scenarios.
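A minimal sketch of the source-selection stage under toy assumptions: source subjects are ranked by the distance between their mean feature vector and the mean of the target calibration data, and the closest k are retained. The actual selection criterion and the style transfer mapping used in the paper are not reproduced here.

```python
import numpy as np

def select_sources(source_feats, target_calib, k=3):
    # Source selection: rank source subjects by how close their mean
    # feature vector is to the target calibration mean, keep the k closest.
    target_mean = target_calib.mean(axis=0)
    dists = np.array([np.linalg.norm(X.mean(axis=0) - target_mean)
                      for X in source_feats])
    order = np.argsort(dists)
    return order[:k], dists[order[:k]]

rng = np.random.default_rng(0)
sources = [rng.normal(loc=i * 0.3, size=(100, 16)) for i in range(8)]
target_calib = rng.normal(loc=0.5, size=(20, 16))   # few labeled calibration trials
print(select_sources(sources, target_calib))
```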
Collapse
|
42
|
Wang Z, Du B, Guo Y. Domain Adaptation With Neural Embedding Matching. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:2387-2397. [PMID: 31536022 DOI: 10.1109/tnnls.2019.2935608] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Domain adaptation aims to exploit the supervision knowledge in a source domain for learning prediction models in a target domain. In this article, we propose a novel representation-learning-based domain adaptation method, the neural embedding matching (NEM) method, to transfer information from the source domain to a target domain where labeled data are scarce. The proposed approach induces an intermediate common representation space for both domains with a neural network model, while matching the embeddings of data from the two domains in this common representation space. The embedding matching is based on two fundamental assumptions: a cross-domain pair of instances will be close to each other in the embedding space if they belong to the same class, and the local geometry of the data can be maintained in the embedding space. These assumptions are encoded via objectives from metric learning and graph embedding techniques to regularize and learn the semisupervised neural embedding model. We also provide a generalization bound analysis for the proposed domain adaptation method. Meanwhile, a progressive learning strategy is proposed to gradually improve the generalization ability of the neural network. Experiments are conducted on a number of benchmark data sets, and the results demonstrate that the proposed method outperforms several state-of-the-art domain adaptation methods and that the progressive learning strategy is promising.
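A rough sketch of the first embedding-matching assumption stated above, using a contrastive-style loss: cross-domain pairs of the same class are pulled together while pairs from different classes are pushed at least a margin apart. The margin value is an assumption, and in the semi-supervised setting only a few target labels (or pseudo-labels) would be available.

```python
import torch

def embedding_matching_loss(emb_src, y_src, emb_tgt, y_tgt, margin=1.0):
    # Pull cross-domain pairs of the same class together and push pairs of
    # different classes at least `margin` apart in the shared embedding space.
    d = torch.cdist(emb_src, emb_tgt)                  # pairwise distances
    same = (y_src[:, None] == y_tgt[None, :]).float()
    pull = (same * d ** 2).sum() / same.sum().clamp(min=1)
    push = ((1 - same) * torch.clamp(margin - d, min=0) ** 2).sum() / (1 - same).sum().clamp(min=1)
    return pull + push

emb_s, emb_t = torch.randn(32, 8), torch.randn(16, 8)
y_s, y_t = torch.randint(0, 4, (32,)), torch.randint(0, 4, (16,))
print(embedding_matching_loss(emb_s, y_s, emb_t, y_t).item())
```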
Collapse
|
43
|
Fu W, Parvin H, Mahmoudi MR, Tuan BA, Pho KH. A linear space adjustment by mapping data into an intermediate space and keeping low level data structures. J EXP THEOR ARTIF IN 2020. [DOI: 10.1080/0952813x.2020.1764634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Affiliation(s)
- Weiqing Fu
- School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai City, China
| | - Hamid Parvin
- Department of Computer Science, Nourabad Mamasani Branch, Islamic Azad University, Mamasani, Iran
- Young Researchers and Elite Club, Nourabad Mamasani Branch, Islamic Azad University, Mamasani, Iran
| | - Mohammad Reza Mahmoudi
- Institute of Research and Development, Duy Tan University, Da Nang, Vietnam
- Department of Statistics, Faculty of Science, Fasa University, Fasa, Fars, Iran
| | - Bui Anh Tuan
- Department of Mathematics Education, Teachers College, Can Tho University, Can Tho City, Vietnam
| | - Kim-Hung Pho
- Fractional Calculus, Optimization and Algebra Research Group, Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam
| |
Collapse
|
44
|
Yang L, Zhong P. Robust adaptation regularization based on within-class scatter for domain adaptation. Neural Netw 2020; 124:60-74. [PMID: 31982674 DOI: 10.1016/j.neunet.2020.01.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2019] [Revised: 12/08/2019] [Accepted: 01/09/2020] [Indexed: 11/17/2022]
Abstract
In many practical applications, the assumption that the distributions of the training and test data are identical is rarely valid, which results in a rapid decline in performance. To address this problem, domain adaptation strategies have been developed in recent years. In this paper, we propose a novel unsupervised domain adaptation method, referred to as Robust Adaptation Regularization based on Within-Class Scatter (WCS-RAR), to simultaneously optimize the regularized loss, the within-class scatter, the joint distribution between domains, and the manifold consistency. On the one hand, to make the model robust against outliers, we adopt an ℓ2,1-norm-based loss function, by virtue of its row sparsity, instead of the widely used ℓ2-norm-based squared loss or hinge loss to determine the residual. On the other hand, to preserve the structural knowledge of the source data within the same class and to strengthen the discriminative ability of the classifier, we incorporate the minimum within-class scatter into the domain adaptation process. Lastly, to efficiently solve the resulting optimization problem, we extend the form of the Representer Theorem through the kernel trick and thus derive an elegant solution for the proposed model. Extensive comparison experiments with state-of-the-art methods on multiple benchmark data sets demonstrate the superiority of the proposed method.
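For reference, the two ingredients named above can be written down directly: the within-class scatter matrix of the source data, and the ℓ2,1-norm of a residual matrix (sum of row-wise ℓ2 norms), whose row sparsity provides the robustness to outlying samples. This is a generic sketch, not the WCS-RAR solver.

```python
import numpy as np

def within_class_scatter(X, y):
    # S_w = sum over classes of the scatter of samples around their class mean;
    # minimizing trace(W^T S_w W) keeps projected same-class samples compact.
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        diff = X[y == c] - X[y == c].mean(axis=0)
        Sw += diff.T @ diff
    return Sw

def l21_loss(residual):
    # Row-wise l2 norms summed: outlying samples (rows) are down-weighted
    # compared with a squared loss.
    return np.sum(np.linalg.norm(residual, axis=1))

rng = np.random.default_rng(0)
X, y = rng.normal(size=(60, 4)), rng.integers(0, 3, 60)
print(np.trace(within_class_scatter(X, y)), l21_loss(X - X.mean(axis=0)))
```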
Collapse
Affiliation(s)
- Liran Yang
- College of Information and Electrical Engineering, China Agricultural University, Beijing, 100083, China
| | - Ping Zhong
- College of Science, China Agricultural University, Beijing, 100083, China.
| |
Collapse
|
45
|
Zhang P, Huang J, Zhou Z, Chen Z, Shang J, Niu C, Yang Z. Joint category-level and discriminative feature learning networks for unsupervised domain adaptation. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2019. [DOI: 10.3233/jifs-191136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Affiliation(s)
- Pengyu Zhang
- School of Electronic and Information Engineering, South China University of Technology, GuangZhou, China
| | - Junchu Huang
- School of Electronic and Information Engineering, South China University of Technology, GuangZhou, China
| | - Zhiheng Zhou
- School of Electronic and Information Engineering, South China University of Technology, GuangZhou, China
| | - Zengqun Chen
- School of Electronic and Information Engineering, South China University of Technology, GuangZhou, China
| | - Junyuan Shang
- School of Electronic and Information Engineering, South China University of Technology, GuangZhou, China
| | - Chang Niu
- School of Electronic and Information Engineering, South China University of Technology, GuangZhou, China
| | - Zhiwei Yang
- China Information and Communication Research Institute, GuangZhou, China
| |
Collapse
|
46
|
Han N, Wu J, Fang X, Xie S, Zhan S, Xie K, Li X. Latent Elastic-Net Transfer Learning. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:2820-2833. [PMID: 31751275 DOI: 10.1109/tip.2019.2952739] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Subspace-learning-based transfer learning methods commonly seek a common subspace in which the discrepancy between the source and target domains is reduced, and the final classification is also performed in this subspace. However, minimum discrepancy does not guarantee the best classification performance, so the common subspace may not be the most discriminative. In this paper, we propose a latent elastic-net transfer learning (LET) method that simultaneously learns a latent subspace and a discriminative subspace. Specifically, data from different domains can be well interlaced in the latent subspace by minimizing the Maximum Mean Discrepancy (MMD). Since the latent subspace decouples inputs and outputs, a more compact data representation is obtained for discriminative subspace learning. Based on the latent subspace, we further propose a low-rank-constrained matrix elastic-net regression to learn another subspace in which the intrinsic intra-class structural correlations of data from different domains are well captured. In doing so, a better discriminative alignment is guaranteed, and LET finally learns a discriminative subspace for classification. Experiments on visual domain adaptation tasks show the superiority of the proposed LET method.
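The MMD term used to interlace the two domains in the latent subspace can be sketched with an RBF kernel as below (the biased estimator that includes diagonal terms); the kernel bandwidth is an assumption.

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    # Squared Maximum Mean Discrepancy with an RBF kernel (biased estimator):
    # MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(100, 5))   # source data projected into the latent subspace
Xt = rng.normal(0.5, 1.0, size=(100, 5))   # target data projected into the latent subspace
print(rbf_mmd2(Xs, Xt))
```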
Collapse
|
47
|
Ji D, Jiang Y, Qian P, Wang S. A Novel Doubly Reweighting Multisource Transfer Learning Framework. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE 2019. [DOI: 10.1109/tetci.2018.2868326] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
48
|
Li J, Wu W, Xue D, Gao P. Multi-Source Deep Transfer Neural Network Algorithm. SENSORS 2019; 19:s19183992. [PMID: 31527437 PMCID: PMC6767847 DOI: 10.3390/s19183992] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/26/2019] [Revised: 09/08/2019] [Accepted: 09/12/2019] [Indexed: 12/11/2022]
Abstract
Transfer learning can enhance the classification performance of a target domain with insufficient training data by utilizing knowledge related to the target domain from a source domain. Nowadays, it is common to have two or more source domains available for knowledge transfer, which can further improve the performance of learning tasks in the target domain. However, classification performance in the target domain degrades due to the mismatch of probability distributions. Recent studies have shown that deep learning can build deep structures that extract more effective features to resist this mismatch. In this paper, we propose a new multi-source deep transfer neural network algorithm, MultiDTNN, based on convolutional neural networks and multi-source transfer learning. In MultiDTNN, joint probability distribution adaptation (JPDA) is used to reduce the mismatch between the source and target domains and to enhance the transferability of source-domain features in the deep neural networks. Then, a convolutional neural network is trained on the datasets of each source and the target domain to obtain a set of classifiers. Finally, the designed selection strategy selects the classifier with the smallest classification error on the target domain from this set to assemble the MultiDTNN framework. The effectiveness of the proposed MultiDTNN is verified by comparing it with other state-of-the-art deep transfer learning methods on three datasets.
Collapse
Affiliation(s)
- Jingmei Li
- College of Computer Science and Technology, Harbin Engineering University, No.145 Nantong Street, Harbin 150001, China.
| | - Weifei Wu
- College of Computer Science and Technology, Harbin Engineering University, No.145 Nantong Street, Harbin 150001, China.
| | - Di Xue
- College of Computer Science and Technology, Harbin Engineering University, No.145 Nantong Street, Harbin 150001, China.
| | - Peng Gao
- College of Computer Science and Technology, Harbin Engineering University, No.145 Nantong Street, Harbin 150001, China.
| |
Collapse
|
49
|
Pirbonyeh A, Rezaie V, Parvin H, Nejatian S, Mehrabi M. A linear unsupervised transfer learning by preservation of cluster-and-neighborhood data organization. Pattern Anal Appl 2019; 22:1149-1160. [DOI: 10.1007/s10044-018-0753-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2017] [Accepted: 09/25/2018] [Indexed: 10/28/2022]
|
50
|
|