1. Cui C, Liu Z, Gong S, Zhu L, Zhang C, Liu H. When Adversarial Training Meets Prompt Tuning: Adversarial Dual Prompt Tuning for Unsupervised Domain Adaptation. IEEE Trans Image Process 2025; 34:1427-1440. [PMID: 40031795] [DOI: 10.1109/tip.2025.3541868]
Abstract
Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain, where only unlabeled samples are available. To this end, adversarial training is widely used in conventional UDA methods to reduce the discrepancy between source and target domains. Recently, prompt tuning has emerged as an efficient way to adapt large pre-trained vision-language models like CLIP to a variety of downstream tasks. In this paper, we present a novel method named Adversarial DuAl Prompt Tuning (ADAPT) for UDA, which employs text prompts and visual prompts to guide CLIP simultaneously. Rather than simply performing a joint optimization of text prompts and visual prompts, we integrate text prompt tuning and visual prompt tuning into a collaborative framework where they engage in an adversarial game: text prompt tuning focuses on distinguishing between source and target images, whereas visual prompt tuning seeks to align source and target domains. Unlike most existing adversarial training-based UDA approaches, ADAPT does not require explicit domain discriminators for domain alignment. Instead, the objective is effectively achieved at both global and category levels through modeling the joint probability distribution of images on domains and categories. Extensive experiments on four benchmark datasets demonstrate the effectiveness of our ADAPT method for UDA. We have released our code at https://github.com/Liuziyi1999/ADAPT.
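The alignment objective above is achieved by modeling the joint probability distribution of images over domains and categories. As a hedged illustration of that idea (a toy sketch, not the released ADAPT code; the function names and dimensions are assumptions), a single softmax over domain-category pairs yields a joint distribution whose marginals give the global (domain-level) and category-level views:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a flat list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def joint_domain_category(logits, n_domains, n_categories):
    """Interpret a flat (n_domains * n_categories)-way logit vector as a
    joint distribution P(domain, category) and return its marginals."""
    flat = softmax(logits)
    joint = [flat[d * n_categories:(d + 1) * n_categories]
             for d in range(n_domains)]
    p_domain = [sum(row) for row in joint]  # global (domain-level) view
    p_category = [sum(joint[d][c] for d in range(n_domains))
                  for c in range(n_categories)]  # category-level view
    return joint, p_domain, p_category
```

Because both marginals come from one joint distribution, a single classifier can in principle serve domain discrimination and category prediction at once, without a separate domain discriminator.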
2. Jiang X, Yang Y, Su T, Xiao K, Lu L, Wang W, Guo C, Shao L, Wang M, Jiang D. Unsupervised domain adaptation based on feature and edge alignment for femur X-ray image segmentation. Comput Med Imaging Graph 2024; 116:102407. [PMID: 38880065] [DOI: 10.1016/j.compmedimag.2024.102407]
Abstract
The gold standard for diagnosing osteoporosis is bone mineral density (BMD) measurement by dual-energy X-ray absorptiometry (DXA). However, various factors in the imaging process cause domain shifts in DXA images, which lead to incorrect bone segmentation. Research shows that poor bone segmentation is one of the prime reasons for inaccurate BMD measurement, severely affecting diagnosis and treatment plans for osteoporosis. In this paper, we propose a Multi-feature Joint Discriminative Domain Adaptation (MDDA) framework to improve segmentation performance and network generalization on domain-shifted images. The proposed method learns domain-invariant features between the source and target domains from the perspectives of multi-scale features and edges, and is evaluated on real data from multi-center datasets. Compared with other state-of-the-art methods, the feature prior from the source domain and the edge prior enable MDDA to achieve the best domain adaptation performance and generalization. It also performs well in domain adaptation tasks on small datasets, even with only 5 or 10 images. MDDA thus provides an accurate bone segmentation tool for BMD measurement based on DXA imaging.
Collapse
Affiliation(s)
- Xiaoming Jiang: Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Post and Telecommunications, Chongqing, China
- Yongxin Yang: Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Post and Telecommunications, Chongqing, China
- Tong Su: Department of Sports Medicine, Peking University Third Hospital, Institute of Sports Medicine of Peking University, Beijing Key Laboratory of Sports Injuries, Engineering Research Center of Sports Trauma Treatment Technology and Devices, Ministry of Education, No. 49 North Garden Road, Beijing, China
- Kai Xiao: Department of Foot and Ankle Surgery, Wuhan Fourth Hospital, Wuhan, Hubei, China
- LiDan Lu: Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Post and Telecommunications, Chongqing, China
- Wei Wang: Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Post and Telecommunications, Chongqing, China
- Changsong Guo: National Health Commission Capacity Building and Continuing Education Center, Beijing, China
- Lizhi Shao: Chinese Academy of Sciences Key Laboratory of Molecular Imaging, Institute of Automation, Beijing 100190, China
- Mingjing Wang: School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou 325000, China
- Dong Jiang: Department of Sports Medicine, Peking University Third Hospital, Institute of Sports Medicine of Peking University, Beijing Key Laboratory of Sports Injuries, Engineering Research Center of Sports Trauma Treatment Technology and Devices, Ministry of Education, No. 49 North Garden Road, Beijing, China
3. Liu W, Ni Z, Chen Q, Ni L. Attention-Guided Partial Domain Adaptation for Automated Pneumonia Diagnosis From Chest X-Ray Images. IEEE J Biomed Health Inform 2023; 27:5848-5859. [PMID: 37695960] [DOI: 10.1109/jbhi.2023.3313886]
Abstract
Deep neural networks (DNNs) supported by multicenter large-scale chest X-ray (CXR) datasets can efficiently perform tasks such as disease identification, lesion segmentation, and report generation. However, non-negligible inter-domain heterogeneity caused by differences in equipment, ethnic groups, and scanning protocols may lead to dramatic degradation in model performance. Unsupervised domain adaptation (UDA) methods help alleviate this cross-domain discrepancy for subsequent analysis. Nevertheless, they may be prone to: 1) spatial negative transfer, i.e., misaligning non-transferable regions that carry inadequate knowledge; and 2) semantic negative transfer, i.e., failing to extend to scenarios where the label spaces of the source and target domains are only partially shared. In this work, we propose a classification-based framework named the attention-guided partial domain adaptation (AGPDA) network to overcome these two negative-transfer challenges. AGPDA is composed of two key modules: 1) a region attention discrimination block (RADB) that generates fine-grained attention values via lightweight region-wise multi-adversarial networks; and 2) a residual feature recalibration block (RFRB) trained with a class-weighted maximum mean discrepancy (MMD) loss to down-weight irrelevant source samples. Extensive experiments on two publicly available CXR datasets, containing a total of 8598 pneumonia (viral, bacterial, and COVID-19) cases and 7163 non-pneumonia or healthy cases, demonstrate the superior performance of AGPDA. On three partial transfer tasks in particular, AGPDA increases accuracy, sensitivity, and F1 score by 4.35%, 4.05%, and 1.78%, respectively, compared with recent strong baselines.
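The class-weighted MMD idea above can be illustrated with a toy linear-kernel version (a hedged sketch only; the paper's actual loss uses kernel embeddings and learned class weights, and all names here are hypothetical):

```python
def weighted_mmd_sq(source, src_weights, target):
    """Squared MMD with a linear kernel, with per-sample weights on the
    source: down-weighted samples contribute less to the source mean,
    mimicking the down-weighting of irrelevant source samples."""
    wsum = sum(src_weights)
    dim = len(source[0])
    mu_s = [sum(w * x[k] for w, x in zip(src_weights, source)) / wsum
            for k in range(dim)]  # weighted source mean embedding
    mu_t = [sum(x[k] for x in target) / len(target)
            for k in range(dim)]  # unweighted target mean embedding
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))
```

With a linear kernel this reduces to the squared distance between the (weighted) domain means; RBF kernels give the richer distribution matching normally used in practice.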
4. Gholami B, El-Khamy M, Song KB. Latent Feature Disentanglement for Visual Domain Generalization. IEEE Trans Image Process 2023; 32:5751-5763. [PMID: 37831569] [DOI: 10.1109/tip.2023.3321511]
Abstract
Despite remarkable success in a variety of computer vision applications, deep learning is well known to fail catastrophically when presented with out-of-distribution data, where there are usually style differences between the training and test images. Toward addressing this challenge, we consider the domain generalization problem, wherein predictors are trained on data drawn from a family of related training (source) domains and then evaluated on a distinct, unseen test domain. Naively training a model on the aggregate set of data (pooled from all source domains) has been shown to perform suboptimally, since the information learned by that model may be domain-specific and generalize imperfectly to test domains. Data augmentation is an effective approach to this problem, but its application has been limited to enforcing invariance to simple transformations such as rotation and brightness changes. Such perturbations do not necessarily cover plausible real-world variations that preserve the semantics of the input (such as a change in image style). In this paper, taking advantage of multiple source domains, we propose a novel approach to express and formalize robustness to these kinds of real-world image perturbations. The three key ideas underlying our formulation are (1) leveraging disentangled representations of the images to define different factors of variation, (2) generating perturbed images by changing such factors in the representations of the images, and (3) enforcing the learner (classifier) to be invariant to such changes in the images. We use image-to-image translation models to demonstrate the efficacy of this approach. Based on this, we propose a domain-invariant regularization (DIR) loss function that enforces invariant prediction of targets (class labels) across domains, which yields improved generalization performance. We demonstrate the effectiveness of our approach on several widely used domain generalization datasets, on all of which our results are competitive with the state of the art.
5. Wang X, She B, Shi Z, Sun S, Qin F. Partial adversarial domain adaptation by dual-domain alignment for fault diagnosis of rotating machines. ISA Trans 2023; 136:455-467. [PMID: 36513542] [DOI: 10.1016/j.isatra.2022.11.021]
Abstract
Domain adaptation (DA) techniques have succeeded in solving the domain shift problem for fault diagnosis (FD) under the assumption that the target domain (TD) and source domain (SD) share identical label spaces. However, when the SD label space subsumes that of the TD, heterogeneity occurs; this is the partial domain adaptation (PDA) problem. In this paper, we propose a dual-domain alignment approach for partial adversarial DA (DDA-PADA) for FD, comprising (1) traditional domain-adversarial neural network (DANN) modules (feature extractors, feature classifiers, and a domain discriminator); (2) an SD alignment (SDA) module based on aligning SD features extracted in two stages; and (3) a cross-domain alignment (CDA) module based on aligning the SD and TD features extracted in the second stage. Specifically, SDA and CDA are implemented by a unilateral feature alignment approach, which maintains the feature consistency of the SD and attempts to mitigate cross-domain variation by correcting the feature distribution of the TD, achieving feature alignment from a dual-domain perspective. Thus, DDA-PADA can effectively align the SD and TD without affecting the feature distribution of the SD. Experimental results on two rotating-machinery datasets show that DDA-PADA performs satisfactorily on PDA problems, and further analyses validate its advantages.
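The DANN modules referenced above are conventionally built around a gradient reversal layer (GRL). A generic sketch of the standard GRL behavior (an illustration of the common technique, not this paper's code; names are hypothetical):

```python
def grl_forward(features):
    # Forward pass: the GRL is the identity, features flow unchanged
    # from the feature extractor to the domain discriminator.
    return list(features)

def grl_backward(upstream_grad, lam=1.0):
    # Backward pass: the gradient is multiplied by -lam before reaching
    # the feature extractor, so the extractor is pushed to CONFUSE the
    # domain discriminator while the discriminator itself learns normally.
    return [-lam * g for g in upstream_grad]
```

The scalar `lam` trades off the adversarial (domain-confusion) signal against the task loss and is often ramped up during training.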
Collapse
Affiliation(s)
- Xuan Wang: Department of Weaponry Engineering, Naval University of Engineering, Wuhan 430000, China
- Bo She: Department of Weaponry Engineering, Naval University of Engineering, Wuhan 430000, China
- Zhangsong Shi: Department of Weaponry Engineering, Naval University of Engineering, Wuhan 430000, China
- Shiyan Sun: Department of Weaponry Engineering, Naval University of Engineering, Wuhan 430000, China
- Fenqi Qin: 713 Research Institute of China Shipbuilding, Zhengzhou 450000, China
6. Feng Y, Wang Z, Xu X, Wang Y, Fu H, Li S, Zhen L, Lei X, Cui Y, Sim Zheng Ting J, Ting Y, Zhou JT, Liu Y, Siow Mong Goh R, Heng Tan C. Contrastive domain adaptation with consistency match for automated pneumonia diagnosis. Med Image Anal 2023; 83:102664. [PMID: 36332357] [DOI: 10.1016/j.media.2022.102664]
Abstract
Pneumonia can be difficult to diagnose because its symptoms are highly variable and its radiographic signs often resemble those of other illnesses such as a cold or influenza. Deep neural networks have shown promising performance in automated pneumonia diagnosis from chest X-ray radiography, enabling mass screening and early intervention to reduce severe cases and the death toll. However, they usually require many well-labelled chest X-ray images for training to achieve high diagnostic accuracy. To reduce the need for training data and annotation resources, we propose a novel method called Contrastive Domain Adaptation with Consistency Match (CDACM). It transfers knowledge from different but relevant datasets to the unlabelled small-size target dataset and improves the semantic quality of the learnt representations. Specifically, we design a conditional domain adversarial network that exploits discriminative information conveyed in the predictions to mitigate the domain gap between the source and target datasets. Furthermore, because the target dataset is small, we construct a feature cloud for each target sample and leverage contrastive learning to extract more discriminative features. Lastly, we propose adaptive feature cloud expansion to push the decision boundary into a low-density area. Unlike most existing transfer learning methods, which aim only to mitigate the domain gap, our method simultaneously addresses the domain gap and the data deficiency of the target dataset. The conditional domain adaptation and feature cloud generation are learned jointly to extract discriminative features in an end-to-end manner. In addition, the adaptive feature cloud expansion improves the model's generalisation ability in the target domain. Extensive experiments on pneumonia and COVID-19 diagnosis tasks demonstrate that our method outperforms several state-of-the-art unsupervised domain adaptation approaches, verifying the effectiveness of CDACM for automated pneumonia diagnosis using chest X-ray imaging.
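Contrastive learning of the kind mentioned above is typically built on an InfoNCE-style objective. A minimal generic sketch (not CDACM's exact feature-cloud formulation; the names and temperature value are assumptions):

```python
import math

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss: small when the anchor is similar to the positive and
    dissimilar from the negatives (cosine similarity, temperature tau)."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    # Similarity logits: positive pair first, then all negatives.
    logits = [cos(anchor, positive) / tau] + \
             [cos(anchor, n) / tau for n in negatives]
    # Cross-entropy against the positive, computed stably in log space.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)
```

Minimizing this pulls each target feature toward its positives (e.g., samples of its feature cloud) and pushes it away from negatives, sharpening class boundaries.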
Collapse
Affiliation(s)
- Yangqin Feng: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Zizhou Wang: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Xinxing Xu: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Yan Wang: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Huazhu Fu: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Shaohua Li: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Liangli Zhen: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Xiaofeng Lei: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Yingnan Cui: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Jordan Sim Zheng Ting: Department of Diagnostic Radiology, Tan Tock Seng Hospital (TTSH), Singapore 308433, Singapore
- Yonghan Ting: Department of Diagnostic Radiology, Tan Tock Seng Hospital (TTSH), Singapore 308433, Singapore
- Joey Tianyi Zhou: Centre for Frontier AI Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Yong Liu: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Rick Siow Mong Goh: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Cher Heng Tan: Department of Diagnostic Radiology, Tan Tock Seng Hospital (TTSH), Singapore 308433, Singapore; Lee Kong Chian School of Medicine, Singapore 308232, Singapore
7. Li Y, Zhang Y, Yang C. Unsupervised domain adaptation with Joint Adversarial Variational AutoEncoder. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109065]
8. Wang M, Deng W. Adaptive Face Recognition Using Adversarial Information Network. IEEE Trans Image Process 2022; 31:4909-4921. [PMID: 35839179] [DOI: 10.1109/tip.2022.3189830]
Abstract
In many real-world applications, face recognition models often degenerate when the training data (source domain) differ from the testing data (target domain). To alleviate this mismatch, caused by factors such as pose and skin tone, using pseudo-labels generated by clustering algorithms is an effective strategy in unsupervised domain adaptation. However, clustering always misses some hard positive samples. Supervision on pseudo-labeled samples attracts them towards their prototypes, creating an intra-domain gap between pseudo-labeled samples and the remaining unlabeled samples within the target domain, which results in a lack of discrimination in face recognition. In this paper, considering the particularity of face recognition, we propose a novel adversarial information network (AIN) to address this issue. First, a novel adversarial mutual information (MI) loss is proposed to alternately minimize MI with respect to the target classifier and maximize MI with respect to the feature extractor. In this min-max manner, the positions of target prototypes are adaptively modified, making unlabeled images easier to cluster and mitigating the intra-domain gap. Second, to assist the adversarial MI loss, we utilize a graph convolution network to predict linkage likelihoods between target data and generate pseudo-labels. It leverages valuable information in the context of nodes and achieves more reliable results. The proposed method is evaluated under two scenarios, i.e., domain adaptation across poses and image conditions, and domain adaptation across faces with different skin tones. Extensive experiments show that AIN successfully improves cross-domain generalization and offers a new state of the art on the RFW dataset.
9. Unpaired multi-modal tumor segmentation with structure adaptation. Appl Intell 2022. [DOI: 10.1007/s10489-022-03610-4]
10. Wang M, Deng W, Liu CL. Unsupervised Structure-Texture Separation Network for Oracle Character Recognition. IEEE Trans Image Process 2022; 31:3137-3150. [PMID: 35420984] [DOI: 10.1109/tip.2022.3165989]
Abstract
Oracle bone script is the earliest-known Chinese writing system, dating to the Shang dynasty, and is precious to archeology and philology. However, real-world scanned oracle data are rare and few experts are available for annotation, which makes the automatic recognition of scanned oracle characters a challenging task. We therefore explore unsupervised domain adaptation to transfer knowledge from handprinted oracle data, which are easy to acquire, to the scanned domain. We propose a structure-texture separation network (STSN), an end-to-end learning framework for joint disentanglement, transformation, adaptation, and recognition. First, STSN disentangles features into structure (glyph) and texture (noise) components with generative models, and then aligns handprinted and scanned data in the structure feature space so that the negative influence of severe noise can be avoided during adaptation. Second, transformation is achieved by swapping the learned textures across domains, and a classifier is trained to predict the labels of the transformed scanned characters. This not only guarantees absolute separation but also enhances the discriminative ability of the learned features. Extensive experiments on the Oracle-241 dataset show that STSN outperforms other adaptation methods and successfully improves recognition performance on scanned data, even when they are contaminated by long burial and careless excavation.
11. Song P, Jadan HV, Howe CL, Foust AJ, Dragotti PL. Light-Field Microscopy for Optical Imaging of Neuronal Activity: When Model-Based Methods Meet Data-Driven Approaches. IEEE Signal Process Mag 2022; 39:58-72. [PMID: 35261535] [PMCID: PMC7612478] [DOI: 10.1109/msp.2021.3123557]
Abstract
Understanding how networks of neurons process information is one of the key challenges in modern neuroscience. A necessary step toward this goal is the ability to observe the dynamics of large populations of neurons over a large area of the brain. Light-field microscopy (LFM), a type of scanless microscope, is a particularly attractive candidate for high-speed three-dimensional (3D) imaging: it captures volumetric information in a single snapshot, allowing volumetric imaging at video frame rates. Specific features of imaging neuronal activity with LFM call for novel machine learning approaches that fully exploit priors embedded in physics and optics models. Signal processing theory and wave-optics theory can play a key role in filling this gap, contributing computational methods with enhanced interpretability and generalization that integrate model-driven and data-driven approaches. This paper provides a comprehensive survey of state-of-the-art computational methods for LFM, with a focus on model-based and data-driven approaches.
Collapse
Affiliation(s)
- Pingfan Song: Department of Engineering, University of Cambridge, Cambridge, CB2 1PZ, UK
- Herman Verinaz Jadan: Department of Electronic and Electrical Engineering, Imperial College London, London, SW7 2AZ, UK
- Carmel L. Howe: Department of Bioengineering and Center for Neurotechnology, Imperial College London, London, SW7 2AZ, UK
- Amanda J. Foust: Department of Bioengineering and Center for Neurotechnology, Imperial College London, London, SW7 2AZ, UK
- Pier Luigi Dragotti: Department of Electronic and Electrical Engineering, Imperial College London, London, SW7 2AZ, UK
12. Zhou K, Yang Y, Qiao Y, Xiang T. Domain Adaptive Ensemble Learning. IEEE Trans Image Process 2021; 30:8008-8018. [PMID: 34534081] [DOI: 10.1109/tip.2021.3112012]
Abstract
The problem of generalizing deep neural networks from multiple source domains to a target domain is studied under two settings: when unlabeled target data are available, it is a multi-source unsupervised domain adaptation (UDA) problem; otherwise, it is a domain generalization (DG) problem. We propose a unified framework termed domain adaptive ensemble learning (DAEL) to address both. A DAEL model is composed of a CNN feature extractor shared across domains and multiple classifier heads, each trained to specialize in a particular source domain. Each such classifier is an expert for its own domain but a non-expert for the others. DAEL learns these experts collaboratively so that, when forming an ensemble, they can leverage complementary information from each other and be more effective on an unseen target domain. To this end, each source domain is used in turn as a pseudo-target domain, with its own expert providing the supervisory signal to the ensemble of non-experts learned from the other sources. To deal with unlabeled target data under the UDA setting, where no real expert exists, DAEL uses pseudo-labels to supervise the ensemble learning. Extensive experiments on three multi-source UDA datasets and two DG datasets show that DAEL improves the state of the art on both problems, often by significant margins.
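In the simplest view, forming the ensemble of per-domain expert heads amounts to averaging their class-probability outputs; a toy sketch (illustrative only, not the DAEL training procedure, and the function name is an assumption):

```python
def ensemble_predict(head_probs):
    """Average the class-probability outputs of several expert heads.
    head_probs: one probability vector per source-domain expert."""
    n_heads = len(head_probs)
    n_classes = len(head_probs[0])
    return [sum(h[c] for h in head_probs) / n_heads
            for c in range(n_classes)]
```

The interesting part of such methods is the training signal (each expert supervising the ensemble of non-experts), but inference reduces to this kind of aggregation.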
13. Deng W, Liao Q, Zhao L, Guo D, Kuang G, Hu D, Liu L. Joint Clustering and Discriminative Feature Alignment for Unsupervised Domain Adaptation. IEEE Trans Image Process 2021; 30:7842-7855. [PMID: 34506283] [DOI: 10.1109/tip.2021.3109530]
Abstract
Unsupervised domain adaptation (UDA) aims to learn a classifier for an unlabeled target domain by leveraging knowledge from a labeled source domain with a different but related distribution. Many existing approaches learn a domain-invariant representation space by directly matching the marginal distributions of the two domains. However, they neglect to explore the underlying discriminative features of the target data and to align cross-domain discriminative features, which may lead to suboptimal performance. To tackle these two issues simultaneously, this paper presents a Joint Clustering and Discriminative Feature Alignment (JCDFA) approach for UDA, which naturally unifies the mining of discriminative features and the alignment of class-discriminative features in a single framework. Specifically, to mine the intrinsic discriminative information of the unlabeled target data, JCDFA jointly learns a shared encoding representation for two tasks: supervised classification of labeled source data, and discriminative clustering of unlabeled target data, where classification on the source domain guides the clustering of the target domain to locate the object categories. We then conduct cross-domain discriminative feature alignment by separately optimizing two new metrics: 1) an extended supervised contrastive learning, i.e., semi-supervised contrastive learning; and 2) an extended maximum mean discrepancy (MMD), i.e., conditional MMD, explicitly minimizing intra-class dispersion and maximizing inter-class separation. When these two procedures, discriminative feature mining and alignment, are integrated into one framework, they benefit from each other from a cooperative learning perspective. Experiments on four real-world benchmarks (Office-31, ImageCLEF-DA, Office-Home, and VisDA-C) demonstrate that JCDFA obtains remarkable margins over state-of-the-art domain adaptation methods. Comprehensive ablation studies also verify the importance of each key component and the effectiveness of combining the two learning strategies in one framework.
14. Feng Z, Xu C, Tao D. Open-Set Hypothesis Transfer With Semantic Consistency. IEEE Trans Image Process 2021; 30:6473-6484. [PMID: 34224354] [DOI: 10.1109/tip.2021.3093393]
Abstract
Unsupervised open-set domain adaptation (UODA) is a realistic problem in which unlabeled target data contain unknown classes. Prior methods rely on the coexistence of source and target domain data to perform domain alignment, which greatly limits their applicability when source domain data are restricted due to privacy concerns. In this paper, we address the challenging hypothesis transfer setting for UODA, where data from the source domain are no longer available during adaptation on the target domain. Specifically, we propose to use pseudo-labels and a novel consistency regularization on target data, where conventional formulations fail in this open-set setting. First, our method discovers confident predictions on the target domain and performs classification with pseudo-labels. Then we enforce the model to output consistent and definite predictions on semantically similar transformed inputs, discovering all latent class semantics. As a result, unlabeled data can be classified into discriminative classes coinciding with either source classes or unknown classes. We theoretically prove that, under perfect semantic transformation, the proposed consistency objective can recover the information of true labels in the prediction. Experimental results show that our model outperforms state-of-the-art methods on UODA benchmarks.
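Selecting confident predictions as pseudo-labels, as described above, is commonly done by thresholding the maximum class probability; a generic sketch (the names and threshold value are assumptions, not this paper's exact selection rule):

```python
def select_pseudo_labels(prob_rows, threshold=0.9):
    """Return (sample_index, predicted_class) pairs for samples whose
    highest class probability meets the confidence threshold."""
    selected = []
    for i, probs in enumerate(prob_rows):
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((i, probs.index(confidence)))
    return selected
```

Samples below the threshold are left unlabeled; in open-set settings they are candidates for the unknown classes rather than being forced into source classes.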
15. Zhou H, Azzam M, Zhong J, Liu C, Wu S, Wong HS. Knowledge Exchange Between Domain-Adversarial and Private Networks Improves Open Set Image Classification. IEEE Trans Image Process 2021; 30:5807-5818. [PMID: 34138710] [DOI: 10.1109/tip.2021.3088642]
Abstract
Both target-specific and domain-invariant features can facilitate Open Set Domain Adaptation (OSDA). To exploit these features, we propose a Knowledge Exchange (KnowEx) model which jointly trains two complementary constituent networks: (1) a Domain-Adversarial Network (DAdvNet) learning the domain-invariant representation, through which the supervision in source domain can be exploited to infer the class information of unlabeled target data; (2) a Private Network (PrivNet) exclusive for target domain, which is beneficial for discriminating between instances from known and unknown classes. The two constituent networks exchange training experience in the learning process. Toward this end, we exploit an adversarial perturbation process against DAdvNet to regularize PrivNet. This enhances the complementarity between the two networks. At the same time, we incorporate an adaptation layer into DAdvNet to address the unreliability of the PrivNet's experience. Therefore, DAdvNet and PrivNet are able to mutually reinforce each other during training. We have conducted thorough experiments on multiple standard benchmarks to verify the effectiveness and superiority of KnowEx in OSDA.
|
16
|
Liu KC, Chan M, Kuo HC, Hsieh CY, Huang HY, Chan CT, Tsao Y. Domain-Adaptive Fall Detection Using Deep Adversarial Training. IEEE Trans Neural Syst Rehabil Eng 2021; 29:1243-1251. [PMID: 34133280 DOI: 10.1109/tnsre.2021.3089685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Fall detection (FD) systems are important assistive technologies for healthcare that can detect emergency fall events and alert caregivers. However, it is not easy to obtain large-scale annotated fall events across various sensor specifications or sensor positions when implementing accurate FD systems. Moreover, knowledge obtained through machine learning has been restricted to tasks in the same domain, and the mismatch between different domains may hinder the performance of FD systems. Cross-domain knowledge transfer is therefore very beneficial for machine-learning-based FD systems, as it allows a reliable FD model to be trained with well-labeled data in new environments. In this study, we propose domain-adaptive fall detection (DAFD) using deep adversarial training (DAT) to tackle cross-domain problems such as cross-position and cross-configuration adaptation. The proposed DAFD transfers knowledge from the source domain to the target domain by minimizing the domain discrepancy, thereby avoiding mismatch problems. The experimental results show that the average F1-score improvement of DAFD over a conventional FD model without domain-adaptation training ranges from 1.5% to 7% in the cross-position scenario and from 3.5% to 12% in the cross-configuration scenario. These results demonstrate that the proposed DAFD successfully handles cross-domain problems and achieves better detection performance.
|
17
|
Chen J, Wu X, Duan L, Chen L. Sequential Instance Refinement for Cross-Domain Object Detection in Images. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:3970-3984. [PMID: 33769933 DOI: 10.1109/tip.2021.3066904] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Cross-domain object detection in images has attracted increasing attention in recent years; it aims to adapt a detection model learned from existing labeled images (source domain) to newly collected unlabeled ones (target domain). Existing methods usually address the cross-domain object detection problem through direct feature alignment between the source and target domains at the image level, the instance level (i.e., region proposals), or both. However, we have observed that directly aligning the features of all object instances from the two domains often results in negative transfer, due to the existence of (1) outlier target instances containing confusing objects that belong to no category of the source domain and are thus hard for detectors to capture, and (2) low-relevance source instances that are statistically very different from target instances even though their contained objects are from the same category. With this in mind, we propose a reinforcement-learning-based method, coined sequential instance refinement, in which two agents learn to progressively refine both source and target instances by taking sequential actions to remove outlier target instances and low-relevance source instances step by step. Extensive experiments on several benchmark datasets demonstrate the superior performance of our method over existing state-of-the-art baselines for cross-domain object detection.
|
18
|
Development and validation of a Brazilian sign language database for human gesture recognition. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-05802-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
19
|
Jiao Y, Yao H, Xu C. SAN: Selective Alignment Network for Cross-Domain Pedestrian Detection. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:2155-2167. [PMID: 33471752 DOI: 10.1109/tip.2021.3049948] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Cross-domain pedestrian detection, which has attracted much attention, assumes that the training and test images are drawn from different data distributions. Existing methods focus on aligning the descriptions of whole candidate instances between the source and target domains. Since there are large visual differences among candidate instances, aligning whole candidate instances between the two domains cannot overcome the inter-instance difference. Compared with aligning whole candidate instances, we consider aligning each type of instance separately to be more reasonable. We therefore propose a novel Selective Alignment Network (SAN) for cross-domain pedestrian detection, which consists of three components: a Base Detector, an Image-Level Adaptation Network, and an Instance-Level Adaptation Network. The Image-Level and Instance-Level Adaptation Networks can be regarded as global-level and local-level alignments, respectively. Similar to Faster R-CNN, the Base Detector, composed of a Feature module, an RPN module, and a Detection module, is used to train a robust pedestrian detector on the annotated source data. Given the image description extracted by the Feature module, the Image-Level Adaptation Network aligns the image description with an adversarial domain classifier. Given the candidate proposals generated by the RPN module, the Instance-Level Adaptation Network first clusters the source candidate proposals into several groups according to their visual features and thus generates a pseudo-label for each candidate proposal. After generating the pseudo-labels, we align the source and target domains by iteratively maximizing and minimizing the discrepancy between the predictions of two classifiers. Extensive evaluations on several benchmarks demonstrate the effectiveness of the proposed approach for cross-domain pedestrian detection.
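The discrepancy that the abstract describes as being alternately maximized and minimized is commonly measured as the mean absolute difference between the class-probability outputs of the two classifiers. A minimal sketch of that measure follows; the function name and the L1 form are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def classifier_discrepancy(p1, p2):
    """Mean absolute difference between two classifiers' class-probability
    outputs for the same inputs. In an MCD-style scheme, this quantity is
    maximized w.r.t. the classifiers (to expose ambiguous target samples)
    and minimized w.r.t. the feature extractor (to align the domains)."""
    return float(np.mean(np.abs(np.asarray(p1) - np.asarray(p2))))
```

The value is zero when the two classifiers agree exactly and grows as their predictions diverge.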
|
20
|
Li W, Huan W, Hou B, Tian Y, Zhang Z, Song A. Can Emotion be Transferred? – A Review on Transfer Learning for EEG-Based Emotion Recognition. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2021.3098842] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
21
|
Tian L, Tang Y, Hu L, Ren Z, Zhang W. Domain Adaptation by Class Centroid Matching and Local Manifold Self-Learning. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:9703-9718. [PMID: 33079662 DOI: 10.1109/tip.2020.3031220] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Domain adaptation has been a fundamental technology for transferring knowledge from a source domain to a target domain. The key issue in domain adaptation is how to reduce the distribution discrepancy between the two domains in a proper way, such that they can be treated indifferently for learning. In this paper, we propose a novel domain adaptation approach that thoroughly explores the data distribution structure of the target domain. Specifically, we regard the samples within the same cluster in the target domain as a whole rather than as individuals, and assign pseudo-labels to each target cluster by class centroid matching. Moreover, to exploit the manifold structure information of the target data more thoroughly, we introduce a local manifold self-learning strategy into our method to adaptively capture the inherent local connectivity of target samples. An efficient iterative optimization algorithm is designed to solve the objective function, with a theoretical convergence guarantee. In addition to unsupervised domain adaptation, we extend our method to the semi-supervised scenario, including both homogeneous and heterogeneous settings, in a direct but elegant way. Extensive experiments on seven benchmark datasets validate the significant superiority of our method in both unsupervised and semi-supervised settings.
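The class-centroid-matching step above, assigning each target cluster the label of its nearest source class centroid, can be sketched as follows. This is a rough NumPy illustration under assumed conventions (Euclidean distance, one feature vector per target cluster, hypothetical function name), not the authors' implementation:

```python
import numpy as np

def centroid_match(source_feats, source_labels, target_cluster_feats):
    """Assign each target cluster the label of the nearest source class centroid.

    source_feats:         (n, d) source feature vectors
    source_labels:        (n,)   source class labels
    target_cluster_feats: (k, d) one representative vector per target cluster
    """
    classes = np.unique(source_labels)
    # Per-class mean of source features -> one centroid per class.
    centroids = np.stack([source_feats[source_labels == c].mean(axis=0)
                          for c in classes])
    # Euclidean distance from every target cluster to every class centroid.
    d = np.linalg.norm(target_cluster_feats[:, None, :] - centroids[None, :, :],
                       axis=2)
    return classes[d.argmin(axis=1)]
```

Every sample inside a cluster then inherits its cluster's pseudo-label, which is what lets the method treat clusters as wholes rather than labeling individual samples.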
|
22
|
Chen S, Harandi M, Jin X, Yang X. Domain Adaptation by Joint Distribution Invariant Projections. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:8264-8277. [PMID: 32755860 DOI: 10.1109/tip.2020.3013167] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Domain adaptation addresses the learning problem where the training data are sampled from a source joint distribution (source domain), while the test data are sampled from a different target joint distribution (target domain). Because of this joint-distribution mismatch, a discriminative classifier naively trained on the source domain often generalizes poorly to the target domain. In this paper, we present a Joint Distribution Invariant Projections (JDIP) approach to solve this problem. The proposed approach exploits linear projections to directly match the source and target joint distributions under the L2-distance. Since traditional kernel density estimators for distribution estimation tend to become less reliable as the dimensionality increases, we propose a least-squares method to estimate the L2-distance without estimating the two joint distributions, leading to a quadratic problem with an analytic solution. Furthermore, we introduce a kernel version of JDIP to account for inherent nonlinearity in the data. We show that the proposed learning problems can be naturally cast as optimization problems defined on the product of Riemannian manifolds. To be comprehensive, we also establish an error bound that theoretically explains how our method works and contributes to reducing the target-domain generalization error. Extensive empirical evidence demonstrates the benefits of our approach over state-of-the-art domain adaptation methods on several visual datasets.
|
23
|
C2DAN: An Improved Deep Adaptation Network with Domain Confusion and Classifier Adaptation. SENSORS 2020; 20:s20123606. [PMID: 32604859 PMCID: PMC7349586 DOI: 10.3390/s20123606] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/03/2020] [Revised: 06/20/2020] [Accepted: 06/23/2020] [Indexed: 11/17/2022]
Abstract
Deep neural networks have been successfully applied to domain adaptation, which uses the labeled data of a source domain to supply useful information for a target domain. The Deep Adaptation Network (DAN) is one such efficient framework; it utilizes the Multi-Kernel Maximum Mean Discrepancy (MK-MMD) to align feature distributions in a reproducing kernel Hilbert space. However, DAN does not perform very well in feature-level transfer, and its assumption that the source and target domains share classifiers is too strict for many adaptation scenarios. In this paper, we further improve the adaptability of DAN by incorporating Domain Confusion (DC) and Classifier Adaptation (CA). To achieve this, we propose a novel domain adaptation method named C2DAN. Our approach first enables Domain Confusion by using a domain discriminator for adversarial training. For Classifier Adaptation, a residual block is added to the source-domain classifier to learn the difference between the source and target classifiers. Beyond validating our framework on the standard domain adaptation dataset Office-31, we also introduce and evaluate the Comprehensive Cars (CompCars) dataset, and the experimental results demonstrate the effectiveness of the proposed C2DAN framework.
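The MK-MMD statistic mentioned above combines MMD estimates over several kernel bandwidths. A minimal biased-estimator sketch in NumPy is shown below; the bandwidth values and the uniform averaging over kernels are illustrative assumptions, not the weighting scheme used by DAN:

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # Gram matrix of the Gaussian RBF kernel exp(-gamma * ||x - y||^2).
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def mk_mmd(X, Y, gammas=(0.5, 1.0, 2.0)):
    """Biased multi-kernel MMD^2 estimate between samples X and Y,
    averaged uniformly over a set of RBF bandwidths."""
    vals = []
    for g in gammas:
        kxx = rbf_kernel(X, X, g).mean()
        kyy = rbf_kernel(Y, Y, g).mean()
        kxy = rbf_kernel(X, Y, g).mean()
        vals.append(kxx + kyy - 2.0 * kxy)
    return float(np.mean(vals))
```

The estimate is zero when the two samples are identical and grows as the source and target feature distributions separate, which is why minimizing it encourages domain alignment.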
|