1
Tan L, Peng Z, Song Y, Liu X, Jiang H, Liu S, Wu W, Xiang Z. Unsupervised Domain Adaptation Method Based on Relative Entropy Regularization and Measure Propagation. ENTROPY (BASEL, SWITZERLAND) 2025; 27:426. [PMID: 40282661 PMCID: PMC12025361 DOI: 10.3390/e27040426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2025] [Revised: 04/10/2025] [Accepted: 04/13/2025] [Indexed: 04/29/2025]
Abstract
This paper presents a novel unsupervised domain adaptation (UDA) framework that integrates information-theoretic principles to mitigate distributional discrepancies between source and target domains. The proposed method incorporates two key components: (1) relative entropy regularization, which leverages Kullback-Leibler (KL) divergence to align the predicted label distribution of the target domain with a reference distribution derived from the source domain, thereby reducing prediction uncertainty; and (2) measure propagation, a technique that transfers probability mass from the source domain to generate pseudo-measures (estimated probabilistic representations) for the unlabeled target domain. This dual mechanism enhances both global feature alignment and semantic consistency across domains. Extensive experiments on benchmark datasets (OfficeHome and DomainNet) demonstrate that the proposed approach consistently outperforms state-of-the-art methods, particularly in scenarios with significant domain shifts. These results confirm the robustness, scalability, and theoretical grounding of our framework, offering a new perspective on the fusion of information theory and domain adaptation.
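The relative entropy term described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the direction of the divergence (target marginal against reference), the use of a batch-averaged prediction, and every name and number below are assumptions made only for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def mean_prediction(batch_probs):
    """Average the per-sample class probabilities over a batch."""
    n = len(batch_probs)
    num_classes = len(batch_probs[0])
    return [sum(row[k] for row in batch_probs) / n for k in range(num_classes)]

# Hypothetical softmax outputs for three unlabeled target samples.
target_probs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.8, 0.1, 0.1]]
# Hypothetical reference distribution, e.g. class frequencies seen in the source domain.
reference = [0.4, 0.3, 0.3]

# Assumed form of the regularizer: KL between the target marginal and the reference.
target_marginal = mean_prediction(target_probs)
kl_penalty = kl_divergence(target_marginal, reference)
print(f"target marginal = {target_marginal}, KL penalty = {kl_penalty:.4f}")
```

The penalty is zero when the averaged target prediction matches the reference and grows as the two drift apart, which is the behavior a KL-based alignment term relies on.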
Affiliation(s)
- Lianghao Tan
  - Department of Computer Science, Arizona State University, Tempe, AZ 85281, USA
- Zhuo Peng
  - Department of Computer Science, Arizona State University, Tempe, AZ 85281, USA
- Yongjia Song
  - Department of Language Science, University of California, Irvine, CA 92697, USA
- Xiaoyi Liu
  - Department of Computer Science, Arizona State University, Tempe, AZ 85281, USA
- Huangqi Jiang
  - Department of Computer Science, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Shubing Liu
  - Department of Computer Science, North Carolina at Chapel Hill, Orange, GA 27599, USA
- Weixi Wu
  - Department of Computer Science, New York University, Brooklyn, NY 10003, USA
- Zhiyuan Xiang
  - Department of Computer Science, University of California, San Diego, CA 92093, USA
2
Chen J, Liu L, Deng W, Liu Z, Liu Y, Wei Y, Liu Y. Refining Pseudo Labeling via Multi-Granularity Confidence Alignment for Unsupervised Cross Domain Object Detection. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2025; PP:279-294. [PMID: 40030753 DOI: 10.1109/tip.2024.3522807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Most state-of-the-art object detection methods generalize poorly because of the domain shift between training and testing datasets. To address this challenge, unsupervised cross domain object detection learns an object detector for an unlabeled target domain by transferring knowledge from an annotated source domain. Promising results have been achieved via Mean Teacher; however, pseudo labeling, which is the bottleneck of mutual learning, remains to be further explored. In this study, we find that confidence misalignment of the predictions, including category-level overconfidence, instance-level task confidence inconsistency, and image-level confidence misfocusing, injects noisy pseudo labels into the training process and leads to suboptimal performance. To address this issue, we present a novel general framework termed Multi-Granularity Confidence Alignment Mean Teacher (MGCAMT) for unsupervised cross domain object detection, which alleviates confidence misalignment across the category, instance, and image levels simultaneously to refine pseudo labeling for better teacher-student learning. Specifically, to align confidence with accuracy at the category level, we propose Classification Confidence Alignment (CCA), which models category uncertainty based on Evidential Deep Learning (EDL) and filters out labels with incorrect categories via an uncertainty-aware selection strategy. Furthermore, we design Task Confidence Alignment (TCA) to mitigate the instance-level misalignment between classification and localization by enabling each classification feature to adaptively identify the optimal feature for regression. Finally, we develop imagery Focusing Confidence Alignment (FCA), which adopts another way of pseudo label learning: we use the original outputs from the Mean Teacher network for supervised learning without label assignment to achieve a balanced perception of the image's spatial layout. When these three procedures are integrated into a single framework, they benefit one another and improve the final performance from a cooperative learning perspective. Extensive experiments across multiple scenarios demonstrate that our method outperforms large foundational models and surpasses other state-of-the-art approaches by a large margin.
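As a rough illustration of how Evidential Deep Learning yields a per-prediction uncertainty that can gate pseudo labels, the sketch below uses the standard EDL quantities (Dirichlet parameters equal to evidence plus one, and uncertainty equal to the number of classes divided by the Dirichlet strength). The fixed threshold rule and the function names are hypothetical; the abstract does not specify the selection strategy used in CCA.

```python
def edl_summary(evidence):
    """Turn non-negative per-class evidence into expected probabilities and uncertainty."""
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]   # Dirichlet parameters
    strength = sum(alpha)                 # Dirichlet strength S
    probs = [a / strength for a in alpha] # expected class probabilities
    uncertainty = num_classes / strength  # u = K / S, in (0, 1]
    return probs, uncertainty

def keep_pseudo_label(evidence, max_uncertainty=0.5):
    """Hypothetical selection rule: keep the argmax label only if uncertainty is low."""
    probs, u = edl_summary(evidence)
    label = max(range(len(probs)), key=probs.__getitem__)
    return (label if u <= max_uncertainty else None), u

# Illustrative evidence vectors for two detections over three categories.
confident = keep_pseudo_label([9.0, 0.0, 0.0])  # strong evidence for class 0
ambiguous = keep_pseudo_label([1.0, 1.0, 0.0])  # little total evidence
print(confident, ambiguous)
```

With no evidence at all the uncertainty equals 1, so a detection is only accepted once the network has accumulated enough evidence, regardless of how peaked the expected probabilities look.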
3
Li X, Lan C, Wei G, Chen Z. Semantic-Aware Message Broadcasting for Efficient Unsupervised Domain Adaptation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2024; 33:5340-5353. [PMID: 39115993 DOI: 10.1109/tip.2024.3437212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/10/2024]
Abstract
Vision transformers have demonstrated great potential in a wide range of vision tasks. However, they inevitably generalize poorly when a distribution shift occurs at test time (i.e., on out-of-distribution data). To mitigate this issue, we propose a novel method, Semantic-aware Message Broadcasting (SAMB), which enables more informative and flexible feature alignment for unsupervised domain adaptation (UDA). In particular, we study the attention module in the vision transformer and notice that an alignment space using one global class token lacks flexibility: the token exchanges information with all image tokens in the same manner and ignores the rich semantics of different regions. In this paper, we aim to improve the richness of the alignment features by enabling semantic-aware adaptive message broadcasting. Specifically, we introduce a set of learned group tokens as nodes to aggregate the global information from all image tokens, while encouraging different group tokens to adaptively focus their message broadcasting on different semantic regions. In this way, our message broadcasting encourages the group tokens to learn more informative and diverse information for effective domain alignment. Moreover, we systematically study the effects of adversarial-based feature alignment (ADA) and pseudo-label-based self-training (PST) on UDA. We find that a simple two-stage training strategy that combines ADA and PST can further improve the adaptation capability of the vision transformer. Extensive experiments on DomainNet, OfficeHome, and VisDA-2017 demonstrate the effectiveness of our methods for UDA.
4
Pei J, Men A, Liu Y, Zhuang X, Chen Q. Evidential Multi-Source-Free Unsupervised Domain Adaptation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:5288-5305. [PMID: 38315607 DOI: 10.1109/tpami.2024.3361978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2024]
Abstract
Multi-Source-Free Unsupervised Domain Adaptation (MSFUDA) requires aggregating knowledge from multiple source models and adapting it to the target domain. Two challenges remain: 1) suboptimal coarse-grained (domain-level) aggregation of multiple source models, and 2) risky semantics propagation based on local structures. In this article, we propose an evidential learning method for MSFUDA, in which we formulate two uncertainties, Evidential Prediction Uncertainty (EPU) and Evidential Adjacency-Consistent Uncertainty (EAU), to address these two challenges respectively. The former, EPU, captures the uncertainty of a sample fitted to a source model, which can suggest the preferences of target samples for different source models. Based on this, we develop an EPU-Based Multi-Source Aggregation module to achieve fine-grained, instance-level source knowledge aggregation. The latter, EAU, provides a robust measure of consistency among adjacent samples in the target domain. Utilizing this, we develop an EAU-Guided Local Structure Mining module to ensure the trustworthy propagation of semantics. The two modules are integrated into the Evidential Aggregation and Adaptation Framework (EAAF), and we demonstrate that this framework achieves state-of-the-art performance on three MSFUDA benchmarks.
5
Ma H, Lin X, Yu Y. I2F: A Unified Image-to-Feature Approach for Domain Adaptive Semantic Segmentation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:1695-1710. [PMID: 37015559 DOI: 10.1109/tpami.2022.3229207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Unsupervised domain adaptation (UDA) for semantic segmentation is a promising task that frees people from heavy annotation work. However, domain discrepancies in low-level image statistics and high-level contexts compromise segmentation performance on the target domain. A key idea for tackling this problem is to perform image-level and feature-level adaptation jointly. Unfortunately, the existing literature lacks such unified approaches for UDA tasks. This paper proposes a novel UDA pipeline for semantic segmentation that unifies image-level and feature-level adaptation. Concretely, for image-level domain shifts, we propose a global photometric alignment module and a global texture alignment module that align images in the source and target domains in terms of image-level properties. For feature-level domain shifts, we perform global manifold alignment by projecting pixel features from both domains onto the feature manifold of the source domain; we further regularize category centers in the source domain through a category-oriented triplet loss and perform target domain consistency regularization over augmented target domain images. Experimental results demonstrate that our pipeline significantly outperforms previous methods. In the commonly tested GTA5 → Cityscapes task, our proposed method using Deeplab V3+ as the backbone surpasses previous SOTA by 8%, achieving 58.2% in mIoU.
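To make the idea of a triplet loss over category centers concrete, here is a minimal sketch using the generic triplet form max(0, d(anchor, positive) - d(anchor, negative) + margin). Treating a pixel feature as the anchor, its own category center as the positive, and the nearest other center as the negative is an assumption for illustration, as are the margin and the toy two-dimensional features; the paper's exact category-oriented formulation is not given in the abstract.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def center_triplet_loss(feature, label, centers, margin=1.0):
    """Triplet loss with category centers as positive and negative (assumed setup)."""
    pos = euclidean(feature, centers[label])
    # Hardest negative: the closest center belonging to a different category.
    neg = min(euclidean(feature, c) for k, c in centers.items() if k != label)
    return max(0.0, pos - neg + margin)

# Hypothetical 2-D category centers for two classes.
centers = {0: [0.0, 0.0], 1: [4.0, 0.0]}

loss_easy = center_triplet_loss([0.5, 0.0], 0, centers)  # close to its own center
loss_hard = center_triplet_loss([2.5, 0.0], 0, centers)  # closer to the wrong center
print(loss_easy, loss_hard)
```

A feature that already sits well inside its own category contributes no loss, while one that drifts toward another category's center is penalized in proportion to how far it has crossed the margin.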
6
Pei J, Jiang Z, Men A, Chen L, Liu Y, Chen Q. Uncertainty-Induced Transferability Representation for Source-Free Unsupervised Domain Adaptation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:2033-2048. [PMID: 37030696 DOI: 10.1109/tip.2023.3258753] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/19/2023]
Abstract
Source-free unsupervised domain adaptation (SFUDA) aims to learn a target domain model using unlabeled target data and the knowledge of a well-trained source domain model. Most previous SFUDA works focus on inferring semantics of target data based on the source knowledge. Without measuring the transferability of the source knowledge, these methods insufficiently exploit the source knowledge, and fail to identify the reliability of the inferred target semantics. However, existing transferability measurements require either source data or target labels, which are infeasible in SFUDA. To this end, firstly, we propose a novel Uncertainty-induced Transferability Representation (UTR), which leverages uncertainty as the tool to analyse the channel-wise transferability of the source encoder in the absence of the source data and target labels. The domain-level UTR unravels how transferable the encoder channels are to the target domain and the instance-level UTR characterizes the reliability of the inferred target semantics. Secondly, based on the UTR, we propose a novel Calibrated Adaption Framework (CAF) for SFUDA, including i) the source knowledge calibration module that guides the target model to learn the transferable source knowledge and discard the non-transferable one, and ii) the target semantics calibration module that calibrates the unreliable semantics. With the help of the calibrated source knowledge and the target semantics, the model adapts to the target domain safely and ultimately better. We verified the effectiveness of our method using experimental results and demonstrated that the proposed method achieves state-of-the-art performances on the three SFUDA benchmarks. Code is available at https://github.com/SPIresearch/UTR.
7
He QQ, Siu SWI, Si YW. Attentive recurrent adversarial domain adaptation with Top-k pseudo-labeling for time series classification. APPL INTELL 2022. [DOI: 10.1007/s10489-022-04176-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
8
Discriminative transfer feature learning based on robust-centers. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.05.042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
9
Zhang J, Li C, Yin Y, Zhang J, Grzegorzek M. Applications of artificial neural networks in microorganism image analysis: a comprehensive review from conventional multilayer perceptron to popular convolutional neural network and potential visual transformer. Artif Intell Rev 2022; 56:1013-1070. [PMID: 35528112 PMCID: PMC9066147 DOI: 10.1007/s10462-022-10192-7] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Indexed: 02/02/2023]
Abstract
Microorganisms are widely distributed in people's daily living environment. They play an essential role in environmental pollution control, disease prevention and treatment, and food and drug production. Analyzing microorganisms is essential for making full use of them. Conventional analysis methods are laborious and time-consuming, so automatic image analysis based on artificial neural networks has been introduced to improve the process. However, automatic microorganism image analysis faces many challenges, such as the need for robust algorithms across varied application settings, indistinct features and a tendency toward under-segmentation caused by the image characteristics, and the variety of analysis tasks. Therefore, we conduct this review to comprehensively discuss the characteristics of microorganism image analysis based on artificial neural networks. In this review, the background and motivation are introduced first. Then, the development of artificial neural networks and representative networks are presented. After that, the papers related to microorganism image analysis based on classical and deep neural networks are reviewed from the perspectives of different tasks. Finally, the methodology analysis and potential directions are discussed.
Affiliation(s)
- Jinghua Zhang
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
  - Institute for Medical Informatics, University of Luebeck, Luebeck, Germany
- Chen Li
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Yimin Yin
  - School of Mathematics and Statistics, Hunan First Normal University, Changsha, China
- Jiawei Zhang
  - Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Marcin Grzegorzek
  - Institute for Medical Informatics, University of Luebeck, Luebeck, Germany