1
Xu T, Dan J. EHM: Exploring dynamic alignment and hierarchical clustering in unsupervised domain adaptation via high-order moment-guided contrastive learning. Neural Netw 2025; 185:107188. [PMID: 39884175 DOI: 10.1016/j.neunet.2025.107188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2024] [Revised: 12/22/2024] [Accepted: 01/15/2025] [Indexed: 02/01/2025]
Abstract
Unsupervised domain adaptation (UDA) aims to annotate unlabeled target domain samples using transferable knowledge learned from the labeled source domain. Optimal transport (OT) is a widely adopted probability metric in transfer learning for quantifying domain discrepancy. However, many existing OT-based UDA methods employ an entropic regularization term to solve the OT optimization problem, inevitably resulting in a biased estimation of domain discrepancy. Furthermore, to achieve precise alignment of class distributions, numerous UDA methods employ deep features for guiding contrastive learning, overlooking the loss of discriminative information. Additionally, prior studies frequently use a conditional entropy regularization term to cluster unlabeled target samples, which may guide the model toward optimizing in the wrong direction. To address these issues, this paper proposes a new UDA framework called EHM, which employs a Dynamic Domain Alignment (DDA) strategy, a Reliable High-order Contrastive Alignment (RHCA) strategy, and a Trustworthy Hierarchical Clustering (THC) strategy. Specifically, DDA leverages a dynamically adjusted Sinkhorn divergence to measure domain discrepancy, effectively eliminating the biased estimation issue. RHCA conducts contrastive learning in a high-order moment space, significantly enhancing the representation power of transferable features and reducing the domain discrepancy at the class level. Moreover, THC integrates multi-view information to guide unlabeled samples towards robust clustering. Extensive experiments on various benchmarks demonstrate the effectiveness of EHM.
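To illustrate the bias the abstract refers to, the following is a minimal sketch of the standard (static) Sinkhorn divergence between two small point clouds. It is not the authors' dynamically adjusted variant, whose details the abstract does not give. The function names, the squared Euclidean cost, uniform sample weights, and the values of `eps` and `n_iters` are all assumptions chosen for illustration.

```python
import math


def _sq_dist(p, q):
    # Squared Euclidean ground cost (an assumed choice of cost function).
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def _logsumexp(values):
    # Numerically stable log(sum(exp(v))) for a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def entropic_ot(x, y, eps=0.5, n_iters=200):
    """Entropy-regularized OT cost between uniform point clouds x and y.

    Runs log-domain Sinkhorn updates on the dual potentials f and g and
    returns the dual value <a, f> + <b, g>.
    """
    n, m = len(x), len(y)
    log_a, log_b = -math.log(n), -math.log(m)
    cost = [[_sq_dist(p, q) for q in y] for p in x]
    f, g = [0.0] * n, [0.0] * m
    for _ in range(n_iters):
        f = [-eps * _logsumexp([log_b + (g[j] - cost[i][j]) / eps
                                for j in range(m)]) for i in range(n)]
        g = [-eps * _logsumexp([log_a + (f[i] - cost[i][j]) / eps
                                for i in range(n)]) for j in range(m)]
    return sum(f) / n + sum(g) / m


def sinkhorn_divergence(x, y, eps=0.5, n_iters=200):
    # Debiased form: subtract the two self-transport terms so that the
    # divergence of a point cloud with itself is zero.
    return (entropic_ot(x, y, eps, n_iters)
            - 0.5 * entropic_ot(x, x, eps, n_iters)
            - 0.5 * entropic_ot(y, y, eps, n_iters))


source = [(0.0, 0.0), (1.0, 0.0)]
target = [(3.0, 0.0), (4.0, 0.0)]
print("entropic OT, source vs itself:", entropic_ot(source, source))
print("Sinkhorn divergence, source vs itself:", sinkhorn_divergence(source, source))
print("Sinkhorn divergence, source vs target:", sinkhorn_divergence(source, target))
```

The entropic cost of a point cloud against itself is strictly positive, which is the bias in question; the divergence removes it by subtracting the self-transport terms.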
Affiliation(s)
- Tengyue Xu
- School of Management, Zhejiang University, Hangzhou, 310058, China.
- Jun Dan
- College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, 310027, China.
2
Zhang C, Zheng H, You X, Zheng Y, Gu Y. PASS: Test-Time Prompting to Adapt Styles and Semantic Shapes in Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2025; 44:1853-1865. [PMID: 40030683 DOI: 10.1109/tmi.2024.3521463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Test-time adaptation (TTA) has emerged as a promising paradigm to handle the domain shifts at test time for medical images from different institutions without using extra training data. However, existing TTA solutions for segmentation tasks suffer from 1) dependency on modifying the source training stage and access to source priors or 2) lack of emphasis on shape-related semantic knowledge that is crucial for segmentation tasks. Recent research on visual prompt learning achieves source-relaxed adaptation by extended parameter space but still neglects the full utilization of semantic features, thus motivating our work on knowledge-enriched deep prompt learning. Beyond the general concern of image style shifts, we reveal that shape variability is another crucial factor causing the performance drop. To address this issue, we propose a TTA framework called PASS (Prompting to Adapt Styles and Semantic shapes), which jointly learns two types of prompts: the input-space prompt to reformulate the style of the test image to fit into the pretrained model and the semantic-aware prompts to bridge high-level shape discrepancy across domains. Instead of naively imposing a fixed prompt, we introduce an input decorator to generate the self-regulating visual prompt conditioned on the input data. To retrieve the knowledge representations and customize target-specific shape prompts for each test sample, we propose a cross-attention prompt modulator, which performs interaction between target representations and an enriched shape prompt bank. Extensive experiments demonstrate the superior performance of PASS over state-of-the-art methods on multiple medical image segmentation datasets. The code is available at https://github.com/EndoluminalSurgicalVision-IMR/PASS.
3
Zhang Y, Chen S, Jiang W, Zhang Y, Lu J, Kwok JT. Domain-guided conditional diffusion model for unsupervised domain adaptation. Neural Netw 2025; 184:107031. [PMID: 39778293 DOI: 10.1016/j.neunet.2024.107031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2024] [Revised: 11/17/2024] [Accepted: 12/06/2024] [Indexed: 01/11/2025]
Abstract
Limited transferability hinders the performance of a well-trained deep learning model when applied to new application scenarios. Recently, Unsupervised Domain Adaptation (UDA) has achieved significant progress in addressing this issue via learning domain-invariant features. However, the performance of existing UDA methods is constrained by the possibly large domain shift and limited target domain data. To alleviate these issues, we propose a Domain-guided Conditional Diffusion Model (DCDM), which generates high-fidelity target domain samples, making the transfer from source domain to target domain easier. DCDM introduces class information to control labels of the generated samples, and a domain classifier to guide the generated samples towards the target domain. Extensive experiments on various benchmarks demonstrate that DCDM brings a large performance improvement to UDA.
Affiliation(s)
- Yulong Zhang
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China.
- Shuhao Chen
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, Guangdong, China.
- Weisen Jiang
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, Guangdong, China; Department of Computer Science and Engineering, Hong Kong University of Science and Technology, 999077, Hong Kong, China.
- Yu Zhang
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, Guangdong, China.
- Jiangang Lu
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China.
- James T Kwok
- Department of Computer Science and Engineering, Hong Kong University of Science and Technology, 999077, Hong Kong, China.
4
Zhang Y, Guo J, Yue H, Zheng S, Liu C. Illumination-Guided progressive unsupervised domain adaptation for low-light instance segmentation. Neural Netw 2025; 183:106958. [PMID: 39637826 DOI: 10.1016/j.neunet.2024.106958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Revised: 09/12/2024] [Accepted: 11/22/2024] [Indexed: 12/07/2024]
Abstract
Due to limited photons, low-light environments pose significant challenges for computer vision tasks. Unsupervised domain adaptation offers a potential solution, but struggles with domain misalignment caused by inadequate utilization of features at different stages. To address this, we propose an Illumination-Guided Progressive Unsupervised Domain Adaptation method, called IPULIS, for low-light instance segmentation by progressively exploring the alignment of features at image-, instance-, and pixel-levels between normal- and low-light conditions under illumination guidance. This is achieved through: (1) an Illumination-Guided Domain Discriminator (IGD) for image-level feature alignment using retinex-derived illumination maps, (2) a Foreground Focus Module (FFM) incorporating global information with local center features to facilitate instance-level feature alignment, and (3) a Contour-aware Domain Discriminator (CAD) for pixel-level feature alignment by matching contour vertex features from a contour-based model. By progressively deploying these modules, IPULIS achieves precise feature alignment, leading to high-quality instance segmentation. Experimental results demonstrate that our IPULIS achieves state-of-the-art performance on real-world low-light dataset LIS.
Affiliation(s)
- Yi Zhang
- School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China.
- Jichang Guo
- School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China.
- Huihui Yue
- School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore.
- Sida Zheng
- School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China.
- Chonghao Liu
- School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China.
5
Loh J, Dudchenko L, Viga J, Gemmeke T. Towards Hardware Supported Domain Generalization in DNN-Based Edge Computing Devices for Health Monitoring. IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS 2025; 19:5-15. [PMID: 38913533 DOI: 10.1109/tbcas.2024.3418085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
Deep neural network (DNN) models have shown remarkable success in many real-world scenarios, such as object detection and classification. Unfortunately, these models are not yet widely adopted in health monitoring due to exceptionally high requirements for model robustness and deployment in highly resource-constrained devices. In particular, the acquisition of biosignals, such as electrocardiogram (ECG), is subject to large variations between training and deployment, necessitating domain generalization (DG) for robust classification quality across sensors and patients. The continuous monitoring of ECG also requires the execution of DNN models in convenient wearable devices, which is achieved by specialized ECG accelerators with small form factor and ultra-low power consumption. However, combining DG capabilities with ECG accelerators remains a challenge. This article provides a comprehensive overview of ECG accelerators and DG methods and discusses the implication of the combination of both domains, such that multi-domain ECG monitoring is enabled with emerging algorithm-hardware co-optimized systems. Within this context, an approach based on correction layers is proposed to deploy DG capabilities on the edge. Here, the DNN fine-tuning for unknown domains is limited to a single layer, while the remaining DNN model remains unmodified. Thus, computational complexity (CC) for DG is reduced with minimal memory overhead compared to conventional fine-tuning of the whole DNN model. The DNN model-dependent CC is reduced by more than 2.5 compared to DNN fine-tuning at an average increase of F1 score by more than 20 % on the generalized target domain. In summary, this article provides a novel perspective on robust DNN classification on the edge for health monitoring applications.
6
Sun Y, Shi G, Dong W, Li X, Dong L, Xie X. Local Uncertainty Energy Transfer for Active Domain Adaptation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2025; PP:816-827. [PMID: 40031161 DOI: 10.1109/tip.2025.3530788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Active Domain Adaptation (ADA) improves knowledge transfer efficiency from the labeled source domain to the unlabeled target domain by querying labels for a few selected target samples. However, most existing active sampling methods ignore the local uncertainty of neighbors in the target domain, making it easier to pick out anomalous samples that are detrimental to the model. To address this problem, we present a new approach to active domain adaptation called Local Uncertainty Energy Transfer (LUET), which integrates active learning of local uncertainty confusion and energy transfer alignment constraints into a unified framework. First, in the active learning module, uncertain, difficult, and representative samples from the target domain are selected through local uncertainty energy selection and entropy-weighted class confusion selection. The active learning strategy based on local uncertainty energy avoids selecting anomalous samples in the target domain. Second, for the discrimination issue caused by domain shift, we use a global and local energy-transfer alignment constraint module to eliminate the domain gap and improve accuracy. Finally, we use negative log-likelihood loss for supervised learning on the source domain and the queried samples. With the introduction of sample-based energy metrics, the active learning strategy is more closely coupled with the domain alignment. Experiments on multiple domain adaptation datasets demonstrate that LUET achieves outstanding results and outperforms existing state-of-the-art approaches.
7
Zhang Z, Liu Z, Ning L, Martin A, Xiong J. Representation of Imprecision in Deep Neural Networks for Image Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1199-1212. [PMID: 37948150 DOI: 10.1109/tnnls.2023.3329712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]
Abstract
Quantification and reduction of uncertainty in deep-learning techniques have received much attention, but how to characterize the imprecision caused by such uncertainty has been largely ignored. In some tasks, we prefer to obtain an imprecise result because we are unwilling or unable to bear the cost of an error. For this purpose, we investigate the representation of imprecision in deep-learning (RIDL) techniques based on the theory of belief functions (TBF). First, the labels of some training images are reconstructed using the learning mechanism of neural networks to characterize the imprecision in the training set. In the process, a label assignment rule is proposed to reassign one or more labels to each training image. An image assigned multiple labels may lie in an overlapping region of different categories from the feature perspective, or its original label may be wrong. Second, those images with multiple labels are rechecked. As a result, the imprecision (multiple labels) caused by the original labeling errors is corrected, while the imprecision caused by insufficient knowledge is retained. Images with multiple labels are called imprecise ones, and they are considered to belong to meta-categories, each the union of some specific categories. Third, the deep network model is retrained based on the reconstructed training set, and the test images are then classified. Finally, test images that cannot be distinguished among specific categories are assigned to meta-categories to characterize the imprecision in the results. Experiments based on several prominent networks have shown that RIDL can improve accuracy (AC) and reasonably represent imprecision in both the training and testing sets.
8
Wang Y, Zheng W, Li Q, Chen S. Dual-Correction-Adaptation Network for Noisy Knowledge Transfer. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1081-1091. [PMID: 37856271 DOI: 10.1109/tnnls.2023.3322390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2023]
Abstract
Unsupervised domain adaptation (UDA) promotes target learning via a single-directional transfer from a label-rich source domain to an unlabeled target domain, while the reverse adaptation from target to source has not yet been jointly considered. In real teaching practice, a teacher helps students learn and also benefits from the students, and such a virtuous cycle inspires us to explore dual-directional transfer between domains. In fact, target pseudo-labels predicted by the source commonly involve noise due to model bias; moreover, the source domain usually contains innate noise, which inevitably aggravates target noise, leading to noise amplification. Transfer from target to source exploits target knowledge to rectify the adaptation, consequently enabling better source transfer and forming a virtuous transfer circle. To this end, we propose a dual-correction-adaptation network (DualCAN), in which adaptation and correction cycle between domains, such that learning in both domains can be boosted gradually. To the best of our knowledge, this is the first naive attempt at dual-directional adaptation. Empirical results validate DualCAN with remarkable performance gains, particularly for extremely noisy tasks (e.g., approximately +10% on Office-31 with 40% label corruption).
9
Ge C, Huang R, Xie M, Lai Z, Song S, Li S, Huang G. Domain Adaptation via Prompt Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1160-1170. [PMID: 37943650 DOI: 10.1109/tnnls.2023.3327962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]
Abstract
Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain, where only unlabeled samples are given. Current UDA approaches learn domain-invariant features by aligning source and target feature spaces through statistical discrepancy minimization or adversarial training. However, these constraints could lead to the distortion of semantic feature structures and loss of class discriminability. In this article, we introduce a novel prompt learning paradigm for UDA, named domain adaptation via prompt learning (DAPrompt). In contrast to prior works, our approach learns the underlying label distribution for target domain rather than aligning domains. The main idea is to embed domain information into prompts, a form of representation generated from natural language, which is then used to perform classification. This domain information is shared only by images from the same domain, thereby dynamically adapting the classifier according to each domain. By adopting this paradigm, we show that our model not only outperforms previous methods on several cross-domain benchmarks but also is very efficient to train and easy to implement.
10
Li L, Lu T, Sun Y, Gao Y, Yan C, Hu Z, Huang Q. Progressive Decision Boundary Shifting for Unsupervised Domain Adaptation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:274-285. [PMID: 39120988 DOI: 10.1109/tnnls.2024.3431283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/11/2024]
Abstract
Unsupervised domain adaptation (UDA) is attracting more attention from researchers for boosting the task-specific generalization on target domain. It focuses on addressing the domain shift between the labeled source domain and the unlabeled target domain. Recent biclassifier-based UDA models perform category-level alignment to reduce domain shift, and meanwhile, self-training is used for improving the discriminability of target instances. However, the error accumulation problem of instances with high semantic uncertainty may cause discriminability degradation and category-level misalignment. To solve this issue, we design the progressive decision boundary shifting algorithm, where stable category information of target instances is explored for learning a discriminability structure on target domain. Specifically, we first model the semantic uncertainty of instances by progressively shifting decision boundaries of category. Then, we introduce the uncertainty decoupling in a contrastive manner, where the discriminative information is learned from the source domain for instance with low semantic uncertainty. Furthermore, we minimize the predictive entropy of instances with high semantic uncertainty to reduce their prediction confidence. Extensive experiments on three popular datasets show that our model outperforms the current state-of-the-art (SOTA) UDA methods.
11
Lee J, Kang E, Heo DW, Suk HI. Site-Invariant Meta-Modulation Learning for Multisite Autism Spectrum Disorders Diagnosis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:18062-18075. [PMID: 37708014 DOI: 10.1109/tnnls.2023.3311195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/16/2023]
Abstract
Large amounts of fMRI data are essential to building generalized predictive models for brain disease diagnosis. In order to conduct extensive data analysis, it is often necessary to gather data from multiple organizations. However, the site variation inherent in multisite resting-state functional magnetic resonance imaging (rs-fMRI) leads to unfavorable heterogeneity in data distribution, negatively impacting the identification of biomarkers and the diagnostic decision. Several existing methods have alleviated this shift of domain distribution (i.e., the multisite problem). Statistical tuning schemes directly regress out site disparity factors from the data prior to model training. Such methods are limited in that the data must be reprocessed through variance estimation each time a site is added. Among the model adjustment approaches, domain adaptation (DA) methods adjust the features or models of the source domain according to the target domain during model training. Thus, model parameters inevitably need to be updated according to the samples of a target site, which greatly limits practical applicability. Meanwhile, domain generalization (DG) aims to create a universal model that can be quickly adapted to multiple domains. In this study, we propose a novel framework for disease diagnosis that alleviates the multisite problem by adaptively calibrating site-specific features into site-invariant features. Specifically, it applies directly to samples from unseen sites without the need for fine-tuning. With a learning-to-learn strategy that learns how to calibrate the features under various domain shift environments, our novel modulation mechanism extracts site-invariant features. In our experiments on the Autism Brain Imaging Data Exchange (ABIDE I and II) dataset, we validated the generalization ability of the proposed network by improving diagnostic accuracy in both seen and unseen multisite samples.
12
Li Z, Zhang R, Tong L, Zeng Y, Gao Y, Yang K, Yan B. A cross-attention swin transformer network for EEG-based subject-independent cognitive load assessment. Cogn Neurodyn 2024; 18:3805-3819. [PMID: 39712125 PMCID: PMC11655798 DOI: 10.1007/s11571-024-10160-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 06/21/2024] [Accepted: 08/03/2024] [Indexed: 12/24/2024] [Open Access]
Abstract
EEG signals play a crucial role in assessing cognitive load, which is a key element in ensuring the secure operation of human-computer interaction systems. However, the variability of EEG signals across different subjects poses a challenge in applying a pre-trained cognitive load assessment model to new subjects. Moreover, previous domain adaptation research has primarily focused on developing complex network architectures to learn more domain-invariant features, overlooking the noise introduced by pseudo-labels and the challenges posed by domain migration problems. Therefore, this study proposes a novel cross-attention swin-transformer network for cross-subject cognitive load assessment, which achieves inter-domain feature alignment through parameter sharing in the cross-attention mechanism without using pseudo-labels, and utilizes maximum mean discrepancy (MMD) to measure the difference between the feature distributions of the source and target domains, further promoting feature alignment between domains. This method aims to leverage the advantages of the cross-attention mechanism and MMD to better mitigate individual differences among subjects in cross-subject cognitive workload assessment. To validate the classification performance of the proposed network, two datasets, from an image recognition task and an N-back task, were employed for testing. Results show that the proposed model outperformed advanced methods, with cross-subject classification results of 88.13% and 81.27% on the local and public datasets, respectively. The ablation experiment results reveal that using either the cross-attention mechanism or the MMD strategy alone improves cross-subject classification performance by 2.11% and 2.95% on the local dataset, respectively. Furthermore, a comparison of the EEG feature distribution differences between all subjects before and after network training showed a significant reduction in feature distribution differences between subjects, further confirming the network's effectiveness in minimizing inter-subject differences. Supplementary Information: The online version contains supplementary material available at 10.1007/s11571-024-10160-7.
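As a rough illustration of the MMD term mentioned in the abstract above, the sketch below computes a biased estimate of squared MMD between two small sets of feature vectors. The Gaussian kernel, the `bandwidth` value, and the function names are assumptions made for this example; the abstract does not state which kernel or estimator the authors use.

```python
import math


def gaussian_kernel(p, q, bandwidth=1.0):
    # RBF kernel between two feature vectors (kernel choice is an assumption).
    sq = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared maximum mean discrepancy."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in xs for q in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Toy "source subject" and "target subject" features; the target set is the
# source set shifted along the first axis.
source_feats = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
target_feats = [(x + 2.0, y) for x, y in source_feats]
print("MMD^2, identical sets:", mmd_squared(source_feats, source_feats))
print("MMD^2, shifted sets:  ", mmd_squared(source_feats, target_feats))
```

Used as a training loss, a quantity like this would be minimized so that source and target feature distributions move closer together.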
Affiliation(s)
- Zhongrui Li
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Rongkai Zhang
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Li Tong
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Ying Zeng
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Yuanlong Gao
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Kai Yang
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Bin Yan
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, China
13
Liu K, Zhang J. Development of a Cost-Efficient and Glaucoma-Specialized OD/OC Segmentation Model for Varying Clinical Scenarios. SENSORS (BASEL, SWITZERLAND) 2024; 24:7255. [PMID: 39599032 PMCID: PMC11597940 DOI: 10.3390/s24227255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2024] [Revised: 10/31/2024] [Accepted: 11/11/2024] [Indexed: 11/29/2024]
Abstract
Most existing optic disc (OD) and cup (OC) segmentation models are biased to the dominant size and easy class (normal class), resulting in suboptimal performances on glaucoma-confirmed samples. Thus, these models are not optimal choices for assisting in tracking glaucoma progression and prognosis. Moreover, fully supervised models employing annotated glaucoma samples can achieve superior performances, although they are restricted by the high cost of collecting and annotating the glaucoma samples. Therefore, in this paper, we are dedicated to developing a glaucoma-specialized model by exploiting low-cost annotated normal fundus images, while adapting to various common scenarios in clinical practice. We employ a contrastive learning and domain adaptation-based model by exploiting shared knowledge from normal samples. To capture glaucoma-related features, we utilize a Gram matrix to encode style information and the domain adaptation strategy to encode domain information, followed by narrowing the style and domain gaps between normal and glaucoma samples by contrastive and adversarial learning, respectively. To validate the efficacy of our proposed model, we conducted experiments utilizing two public datasets to mimic various common scenarios. The results demonstrate the superior performance of our proposed model across multiple scenarios, showcasing its proficiency in both the segmentation- and glaucoma-related metrics. In summary, our study illustrates a concerted effort to target confirmed glaucoma samples, mitigating the inherent bias issue in most existing models. Moreover, we propose an annotation-efficient strategy that exploits low-cost, normal-labeled fundus samples, mitigating the economic- and labor-related burdens of employing a fully supervised strategy. Simultaneously, our approach demonstrates its adaptability across various scenarios, highlighting its potential utility in both assisting in the monitoring of glaucoma progression and assessing glaucoma prognosis.
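The abstract above mentions using a Gram matrix to encode style information. The sketch below shows one common way such a matrix is formed from a feature map: channel-by-channel inner products over spatial positions. The layout (a list of channels, each a flat list of activations), the normalization by channels times positions, and the function names are assumptions for illustration, not details taken from the paper.

```python
def gram_matrix(feature_map):
    """Channel-by-channel Gram matrix of a feature map.

    feature_map: list of C channels, each a flat list of H*W activations.
    Returns a C x C nested list, normalized by C * H * W (assumed convention).
    """
    channels = len(feature_map)
    positions = len(feature_map[0])
    scale = channels * positions
    return [[sum(a * b for a, b in zip(feature_map[i], feature_map[j])) / scale
             for j in range(channels)]
            for i in range(channels)]


def style_distance(fmap_a, fmap_b):
    # Squared Frobenius distance between the two Gram matrices.
    ga, gb = gram_matrix(fmap_a), gram_matrix(fmap_b)
    return sum((x - y) ** 2
               for row_a, row_b in zip(ga, gb)
               for x, y in zip(row_a, row_b))


# Two channels over four spatial positions.
fmap = [[1.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, 0.0, 1.0]]
# The same activations with spatial positions permuted.
shuffled = [[row[k] for k in (3, 1, 0, 2)] for row in fmap]
print(gram_matrix(fmap))
print(style_distance(fmap, shuffled))
```

Because the Gram matrix sums over positions, it discards spatial layout and keeps only channel co-activation statistics, which is why it is often treated as a style descriptor.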
Affiliation(s)
- Kai Liu
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China;
- Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing 100083, China
- Department of Computer Science, City University of Hong Kong, Hong Kong 98121, China
- Jicong Zhang
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China;
- Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing 100083, China
- Hefei Innovation Research Institute, Beihang University, Hefei 230012, China
14
He P, Jiao L, Shang R, Liu X, Liu F, Yang S, Zhang X, Wang S. A Patch Diversity Transformer for Domain Generalized Semantic Segmentation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:14138-14150. [PMID: 37279122 DOI: 10.1109/tnnls.2023.3274760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Domain generalization (DG) is one of the critical issues for deep learning in unknown domains. How to effectively represent domain-invariant context (DIC) is a difficult problem that DG needs to solve. Transformers have shown the potential to learn generalized features, owing to their powerful ability to learn global context. In this article, a novel method named patch diversity Transformer (PDTrans) is proposed to improve DG for scene segmentation by learning global multidomain semantic relations. Specifically, patch photometric perturbation (PPP) is proposed to improve the representation of multiple domains in the global context information, which helps the Transformer learn the relationship between multiple domains. Besides, patch statistics perturbation (PSP) is proposed to model the feature statistics of patches under different domain shifts, which enables the model to encode domain-invariant semantic features and improve generalization. PPP and PSP help to diversify the source domain at the patch level and feature level. PDTrans learns context across diverse patches and takes advantage of self-attention to improve DG. Extensive experiments demonstrate the tremendous performance advantages of PDTrans over state-of-the-art DG methods.
Collapse
|
15
|
Ma S, Yuan Z, Wu Q, Huang Y, Hu X, Leung CH, Wang D, Huang Z. Deep Into the Domain Shift: Transfer Learning Through Dependence Regularization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:14409-14423. [PMID: 37279130 DOI: 10.1109/tnnls.2023.3279099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Classical domain adaptation methods acquire transferability by regularizing the overall distributional discrepancies between features in the source domain (labeled) and features in the target domain (unlabeled). They often do not differentiate whether the domain differences come from the marginals or the dependence structures. In many business and financial applications, the labeling function usually has different sensitivities to the changes in the marginals versus changes in the dependence structures. Measuring the overall distributional differences will not be discriminative enough in acquiring transferability. Without the needed structural resolution, the learned transfer is less optimal. This article proposes a new domain adaptation approach in which one can measure the differences in the internal dependence structure separately from those in the marginals. By optimizing the relative weights among them, the new regularization strategy greatly relaxes the rigidness of the existing approaches. It allows a learning machine to pay special attention to places where the differences matter the most. Experiments on three real-world datasets show that the improvements are quite notable and robust compared to various benchmark domain adaptation models.
|
16
|
Chen Z, Pan Y, Ye Y, Wang Z, Xia Y. TriLA: Triple-Level Alignment Based Unsupervised Domain Adaptation for Joint Segmentation of Optic Disc and Optic Cup. IEEE J Biomed Health Inform 2024; 28:5497-5508. [PMID: 38805331 DOI: 10.1109/jbhi.2024.3406447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/30/2024]
Abstract
Cross-domain joint segmentation of optic disc and optic cup on fundus images is essential, yet challenging, for effective glaucoma screening. Although many unsupervised domain adaptation (UDA) methods have been proposed, these methods can hardly achieve complete domain alignment, leading to suboptimal performance. In this paper, we propose a triple-level alignment (TriLA) model to address this issue by aligning the source and target domains at the input level, feature level, and output level simultaneously. At the input level, a learnable Fourier domain adaptation (LFDA) module is developed to learn the cut-off frequency adaptively for frequency-domain translation. At the feature level, we disentangle the style and content features and align them in the corresponding feature spaces using consistency constraints. At the output level, we design a segmentation consistency constraint to emphasize the segmentation consistency across domains. The proposed model is trained on the RIGA+ dataset and widely evaluated on six different UDA scenarios. Our comprehensive results not only demonstrate that the proposed TriLA substantially outperforms other state-of-the-art UDA methods in joint segmentation of optic disc and optic cup, but also suggest the effectiveness of the triple-level alignment strategy.
|
17
|
Liu J, Jiao G. Cross-domain additive learning of new knowledge rather than replacement. Biomed Eng Lett 2024; 14:1137-1146. [PMID: 39220031 PMCID: PMC11362399 DOI: 10.1007/s13534-024-00399-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2023] [Revised: 01/10/2024] [Accepted: 05/27/2024] [Indexed: 09/04/2024] Open
Abstract
In clinical scenarios, for reasons such as patient privacy, information protection and data migration, the source-domain data is often inaccessible when domain adaptation is needed, and only the model pre-trained on the source domain is available. Existing solutions for this type of problem tend to forget the rich task experience previously learned on the source domain after adapting, which means that the model simply overfits the target-domain data when adapting and does not learn robust features that facilitate real task decisions. We address this problem by exploring the particular application of source-free domain adaptation in medical image segmentation and propose a two-stage additive source-free adaptation framework. We generalize the domain-invariant features by constraining the core pathological structure and semantic consistency between different perspectives. And we reduce the segmentation generated by locating and filtering elements that may have errors through Monte-Carlo uncertainty estimation. We conduct comparison experiments with some other methods on a cross-device polyp segmentation dataset and a cross-modal brain tumor segmentation dataset. The results in both the target and source domains verify that the proposed method can effectively solve the domain offset problem and that the model retains its dominance on the source domain after learning new knowledge of the target domain. This work provides valuable exploration for achieving additive learning on the target and source domains in the absence of source data and offers new ideas and methods for adaptation research in the field of medical image segmentation.
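The Monte-Carlo uncertainty filtering mentioned in this abstract can be illustrated with a generic sketch. This is not the authors' procedure: it assumes several stochastic forward passes (for example with dropout left on) that each yield a per-pixel foreground probability, and the function name `filter_uncertain` and the threshold `var_threshold` are hypothetical choices made only for illustration.

```python
from statistics import mean, pvariance


def filter_uncertain(mc_probs, var_threshold=0.02):
    """Build pseudo-labels from Monte-Carlo passes, dropping uncertain pixels.

    mc_probs: list of T passes; each pass is a list of per-pixel foreground
    probabilities. Returns one entry per pixel: 1, 0, or None (filtered out).
    The variance threshold is an illustrative assumption, not a published value.
    """
    n_pixels = len(mc_probs[0])
    labels = []
    for i in range(n_pixels):
        samples = [single_pass[i] for single_pass in mc_probs]
        if pvariance(samples) > var_threshold:
            # The passes disagree too much: treat the pixel as unreliable.
            labels.append(None)
        else:
            labels.append(1 if mean(samples) >= 0.5 else 0)
    return labels


# Three hypothetical stochastic passes over three pixels.
passes = [
    [0.90, 0.10, 0.20],
    [0.95, 0.05, 0.80],
    [0.85, 0.15, 0.50],
]
pseudo = filter_uncertain(passes)
```

Here the first two pixels receive stable labels, while the third is filtered out because its predictions vary widely across passes.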
Affiliation(s)
- Jiahao Liu
- College of Computer Science, Hengyang Normal University, Hengyang, 421008 China
| | - Ge Jiao
- College of Computer Science, Hengyang Normal University, Hengyang, 421008 China
|
18
|
Jing T, Xia H, Hamm J, Ding Z. Marginalized Augmented Few-Shot Domain Adaptation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:12459-12469. [PMID: 37037243 DOI: 10.1109/tnnls.2023.3263176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Domain adaptation (DA) has recently drawn a lot of attention, as it facilitates unlabeled target learning by borrowing knowledge from an external source domain. Most existing DA solutions seek to align feature representations between the labeled source and unlabeled target data. However, the scarcity of target data easily results in negative transfer, as it misleads the cross DA to the dominance of the source. To address the challenging few-shot domain adaptation (FSDA) problem, in this article, we propose a novel marginalized augmented FSDA (MAF) approach to address the cross-domain distribution disparity and insufficiency of target data simultaneously. On the one hand, cross-domain continuity augmentation (CCA) synthesizes abundant intermediate patterns across domains leading to a continuous domain-invariant latent space. On the other hand, sufficient source-supervised semantic augmentation (SSA) is explored to progressively diversify the conditional distribution within and across domains. Moreover, the proposed augmentation strategies are implemented efficiently via an expected transferable cross-entropy (CE) loss over the augmented distribution instead of explicit data synthesis, and minimizing the upper bound of the expected loss introduces negligible extra computing cost. Experimentally, our method outperforms the state of the art in various FSDA benchmarks, which demonstrates the effectiveness and contribution of our work. Our source code is provided at https://github.com/scottjingtt/MAF.git.
|
19
|
Liu X, Dai C, Liu J, Yuan Y. Effects of Exercise on the Inter-Session Accuracy of sEMG-Based Hand Gesture Recognition. Bioengineering (Basel) 2024; 11:811. [PMID: 39199769 PMCID: PMC11351745 DOI: 10.3390/bioengineering11080811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2024] [Revised: 08/05/2024] [Accepted: 08/08/2024] [Indexed: 09/01/2024] Open
Abstract
Surface electromyography (sEMG) is commonly used as an interface in human-machine interaction systems due to its high signal-to-noise ratio and easy acquisition. It can intuitively reflect the motion intentions of users and is thus widely applied in gesture recognition systems. However, wearable sEMG-based gesture recognition systems are susceptible to changes in environmental noise, electrode placement, and physiological characteristics. This could result in significant performance degradation of the model in inter-session scenarios, bringing a poor experience to users. Currently, for noise from environmental changes and electrode shifts caused by variation in wearing, numerous studies have proposed various data-augmentation methods and highly generalized networks to improve inter-session gesture recognition accuracy. However, few studies have considered the impact of individual physiological states. In this study, we assumed that user exercise could cause changes in muscle conditions, leading to variations in sEMG features and subsequently affecting the recognition accuracy of the model. To verify our hypothesis, we collected sEMG data from 12 participants performing the same gesture tasks before and after exercise, and then used Linear Discriminant Analysis (LDA) for gesture classification. For the non-exercise group, the inter-session accuracy declined only by 2.86%, whereas that of the exercise group decreased by 13.53%. This finding proves that exercise is indeed a critical factor contributing to the decline in inter-session model performance.
Affiliation(s)
- Xiangyu Liu
- College of Publishing, University of Shanghai for Science and Technology, Shanghai 200093, China;
| | - Chenyun Dai
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200241, China;
| | - Jionghui Liu
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
| | - Yangyang Yuan
- School of Information Science and Technology, Fudan University, Shanghai 200433, China
|
20
|
Kong Z, Zhang W, Liu F, Luo W, Liu H, Shen L, Ramachandra R. Taming Self-Supervised Learning for Presentation Attack Detection: De-Folding and De-Mixing. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:10639-10650. [PMID: 37027593 DOI: 10.1109/tnnls.2023.3243229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Biometric systems are vulnerable to presentation attacks (PAs) performed using various PA instruments (PAIs). Even though there are numerous PA detection (PAD) techniques based on both deep learning and hand-crafted features, the generalization of PAD for unknown PAI is still a challenging problem. In this work, we empirically prove that the initialization of the PAD model is a crucial factor for generalization, which is rarely discussed in the community. Based on such observation, we proposed a self-supervised learning-based method, denoted as DF-DM. Specifically, DF-DM is based on a global-local view coupled with de-folding and de-mixing to derive the task-specific representation for PAD. During de-folding, the proposed technique will learn region-specific features to represent samples in a local pattern by explicitly minimizing the generative loss. While de-mixing drives detectors to obtain the instance-specific features with global information for more comprehensive representation by minimizing the interpolation-based consistency. Extensive experimental results show that the proposed method can achieve significant improvements in terms of both face and fingerprint PAD in more complicated and hybrid datasets when compared with the state-of-the-art methods. When training in CASIA-FASD and Idiap Replay-Attack, the proposed method can achieve an 18.60% equal error rate (EER) in OULU-NPU and MSU-MFSD, exceeding the baseline performance by 9.54%. The source code of the proposed technique is available at https://github.com/kongzhecn/dfdm.
|
21
|
Liu J, Zhao J, Xiao J, Zhao G, Xu P, Yang Y, Gong S. Unsupervised domain adaptation multi-level adversarial learning-based crossing-domain retinal vessel segmentation. Comput Biol Med 2024; 178:108759. [PMID: 38917530 DOI: 10.1016/j.compbiomed.2024.108759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2023] [Revised: 06/12/2024] [Accepted: 06/13/2024] [Indexed: 06/27/2024]
Abstract
BACKGROUND The retinal vasculature, a crucial component of the human body, mirrors various illnesses such as cardiovascular disease, glaucoma, and retinopathy. Accurate segmentation of retinal vessels in funduscopic images is essential for diagnosing and understanding these conditions. However, existing segmentation models often struggle with images from different sources, making accurate segmentation in crossing-source fundus images challenging. METHODS To address the crossing-source segmentation issues, this paper proposes a novel Multi-level Adversarial Learning and Pseudo-label Denoising-based Self-training Framework (MLAL&PDSF). Expanding on our previously proposed Multiscale Context Gating with Breakpoint and Spatial Dual Attention Network (MCG&BSA-Net), MLAL&PDSF introduces a multi-level adversarial network that operates at both the feature and image layers to align distributions between the target and source domains. Additionally, it employs a distance comparison technique to refine pseudo-labels generated during the self-training process. By comparing the distance between the pseudo-labels and the network predictions, the framework identifies and corrects inaccuracies, thus enhancing the accuracy of the fine vessel segmentation. RESULTS We have conducted extensive validation and comparative experiments on the CHASEDB1, STARE, and HRF datasets to evaluate the efficacy of the MLAL&PDSF. The evaluation metrics included the area under the operating characteristic curve (AUC), sensitivity (SE), specificity (SP), accuracy (ACC), and balanced F-score (F1). The performance results from unsupervised domain adaptive segmentation are remarkable: for DRIVE to CHASEDB1, results are AUC: 0.9806, SE: 0.7400, SP: 0.9737, ACC: 0.9874, and F1: 0.8851; for DRIVE to STARE, results are AUC: 0.9827, SE: 0.7944, SP: 0.9651, ACC: 0.9826, and F1: 0.8326. 
CONCLUSION These results demonstrate the effectiveness and robustness of MLAL&PDSF in achieving accurate segmentation results from crossing-domain retinal vessel datasets. The framework lays a solid foundation for further advancements in cross-domain segmentation and enhances the diagnosis and understanding of related diseases.
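For readers unfamiliar with the metrics listed in this abstract, the sketch below shows the standard definitions of sensitivity (SE), specificity (SP), accuracy (ACC), and F1 computed from pixel-level confusion counts. The function name `segmentation_metrics` and the example counts are hypothetical and are not taken from the paper; AUC is omitted because it requires the full score distribution rather than a single confusion matrix.

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Return SE, SP, ACC, and F1 from pixel-level confusion counts.

    Assumes both classes are present and at least one pixel is predicted
    positive, so that no denominator is zero.
    """
    se = tp / (tp + fn)                    # sensitivity (recall on vessel pixels)
    sp = tn / (tn + fp)                    # specificity (recall on background pixels)
    acc = (tp + tn) / (tp + fp + tn + fn)  # overall pixel accuracy
    precision = tp / (tp + fp)
    f1 = 2 * precision * se / (precision + se)
    return {"SE": se, "SP": sp, "ACC": acc, "F1": f1}


# Hypothetical counts for a 1000-pixel image, for illustration only.
metrics = segmentation_metrics(tp=80, fp=10, tn=890, fn=20)
```

Because ACC is a weighted average of SE and SP (weighted by the share of vessel and background pixels), it always lies between the two.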
Affiliation(s)
- Jinping Liu
- College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, 410081, China.
| | - Junqi Zhao
- College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, 410081, China.
| | - Jingri Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, 410081, China.
| | - Gangjin Zhao
- College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, 410081, China.
| | - Pengfei Xu
- College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, 410081, China.
| | - Yimei Yang
- School of Mathematics and Statistics, Hunan Normal University, Changsha, Hunan, 410081, China; College of Computer and Artificial Intelligence (Software College), Huaihua University, Huaihua, Hunan, 418000, China.
| | - Subo Gong
- Department of Geriatrics, The Second Xiangya Hospital of Central South University, Changsha, 410011, China.
|
22
|
Oza P, Sindagi VA, Vs V, Patel VM. Unsupervised Domain Adaptation of Object Detectors: A Survey. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:4018-4040. [PMID: 37030853 DOI: 10.1109/tpami.2022.3217046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Recent advances in deep learning have led to the development of accurate and efficient models for various computer vision applications such as classification, segmentation, and detection. However, learning highly accurate models relies on the availability of large-scale annotated datasets. Due to this, model performance drops drastically when evaluated on label-scarce datasets having visually distinct images, termed as domain adaptation problem. There are a plethora of works to adapt classification and segmentation models to label-scarce target dataset through unsupervised domain adaptation. Considering that detection is a fundamental task in computer vision, many recent works have focused on developing novel domain adaptive detection techniques. Here, we describe in detail the domain adaptation problem for detection and present an extensive survey of the various methods. Furthermore, we highlight strategies proposed and the associated shortcomings. Subsequently, we identify multiple aspects of the problem that are most promising for future research. We believe that this survey shall be valuable to the pattern recognition experts working in the fields of computer vision, biometrics, medical imaging, and autonomous navigation by introducing them to the problem, and familiarizing them with the current status of the progress while providing promising directions for future research.
|
23
|
Lu N, Xiao H, Ma Z, Yan T, Han M. Domain Adaptation With Self-Supervised Learning and Feature Clustering for Intelligent Fault Diagnosis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:7657-7670. [PMID: 36378787 DOI: 10.1109/tnnls.2022.3219896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Domain adaptation promotes the progress of intelligent fault diagnosis in industrial scenarios: abundant labeled samples are no longer necessary, and an identical distribution between the training and testing datasets is no longer a prerequisite for intelligent fault diagnosis. However, two issues arise as a result: feature learning in the domain adaptation framework tends to be biased toward the source domain, and unreliable pseudolabeling seriously impacts conditional domain adaptation. In this article, a new domain adaptation approach with self-supervised learning and feature clustering (DASSL-FC) is proposed, which tries to alleviate these issues through unbiased feature learning and a pseudolabel updating strategy. Taking different transformation methods as the pretext, the transformed data and its pretext train a neural network in an SSL way. As to pseudolabeling, clusters are taken as auxiliary information to correct the network-predicted labels in terms of the "strong cluster" rule. Then, the updated pseudolabels and their confidence are used to further estimate the conditional distribution discrepancy and its confidence weight. To verify the effectiveness of the proposed method, experiments are conducted in intraplatform and interplatform settings that simulate practical scenarios.
|
24
|
Xu Y, Cao H, Mao K, Chen Z, Xie L, Yang J. Aligning Correlation Information for Domain Adaptation in Action Recognition. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6767-6778. [PMID: 36256722 DOI: 10.1109/tnnls.2022.3212909] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Domain adaptation (DA) approaches address domain shift and enable networks to be applied to different scenarios. Although various image DA approaches have been proposed in recent years, there is limited research toward video DA. This is partly due to the complexity in adapting the different modalities of features in videos, which includes the correlation features extracted as long-range dependencies of pixels across spatiotemporal dimensions. The correlation features are highly associated with action classes and proven their effectiveness in accurate video feature extraction through the supervised action recognition task. Yet correlation features of the same action would differ across domains due to domain shift. Therefore, we propose a novel adversarial correlation adaptation network (ACAN) to align action videos by aligning pixel correlations. ACAN aims to minimize the distribution of correlation information, termed as pixel correlation discrepancy (PCD). Additionally, video DA research is also limited by the lack of cross-domain video datasets with larger domain shifts. We, therefore, introduce a novel HMDB-ARID dataset with a larger domain shift caused by a larger statistical difference between domains. This dataset is built in an effort to leverage current datasets for dark video classification. Empirical results demonstrate the state-of-the-art performance of our proposed ACAN for both existing and the new video DA datasets.
|
25
|
Yang C, Liu Q, Liu Y, Cheung YM. Transfer Dynamic Latent Variable Modeling for Quality Prediction of Multimode Processes. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6061-6074. [PMID: 37079407 DOI: 10.1109/tnnls.2023.3265762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
Quality prediction is beneficial to intelligent inspection, advanced process control, operation optimization, and product quality improvements of complex industrial processes. Most of the existing work obeys the assumption that training samples and testing samples follow similar data distributions. The assumption is, however, not true for practical multimode processes with dynamics. In practice, traditional approaches mostly establish a prediction model using the samples from the principal operating mode (POM) with abundant samples. The model is inapplicable to other modes with a few samples. In view of this, this article will propose a novel dynamic latent variable (DLV)-based transfer learning approach, called transfer DLV regression (TDLVR), for quality prediction of multimode processes with dynamics. The proposed TDLVR can not only derive the dynamics between process variables and quality variables in the POM but also extract the co-dynamic variations among process variables between the POM and the new mode. This can effectively overcome data marginal distribution discrepancy and enrich the information of the new mode. To make full use of the available labeled samples from the new mode, an error compensation mechanism is incorporated into the established TDLVR, termed compensated TDLVR (CTDLVR), to adapt to the conditional distribution discrepancy. Empirical studies show the efficacy of the proposed TDLVR and CTDLVR methods in several case studies, including numerical simulation examples and two real-industrial process examples.
|
26
|
Luo SH, Zhao XJ, Cao MF, Xu J, Wang WL, Lu XY, Huang QT, Yue XX, Liu GK, Yang L, Ren B, Tian ZQ. Signal2signal: Pushing the Spatiotemporal Resolution to the Limit by Single Chemical Hyperspectral Imaging. Anal Chem 2024; 96:6550-6557. [PMID: 38642045 DOI: 10.1021/acs.analchem.3c04609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/22/2024]
Abstract
There is growing interest in developing a high-performance self-supervised denoising algorithm for real-time chemical hyperspectral imaging. With a good understanding of the working function of the zero-shot Noise2Noise-based denoising algorithm, we developed a self-supervised Signal2Signal (S2S) algorithm for real-time denoising with a single chemical hyperspectral image. Owing to the accurate distinction and capture of the weak signal from the random fluctuating noise, S2S displays excellent denoising performance, even for the hyperspectral image with a spectral signal-to-noise ratio (SNR) as low as 1.12. Under this condition, both the image clarity and the spatial resolution could be significantly improved and present an almost identical pattern with a spectral SNR of 7.87. The feasibility of real-time denoising during imaging was well demonstrated, and S2S was applied to monitor the photoinduced exfoliation of transition metal dichalcogenide, which is hard to accomplish by confocal Raman spectroscopy. In general, the real-time denoising capability of S2S offers an easy way toward in situ/in vivo/operando research with much improved spatial and temporal resolution. S2S is open-source at https://github.com/3331822w/Signal2signal and will be accessible online at https://ramancloud.xmu.edu.cn/tutorial.
Affiliation(s)
- Si-Heng Luo
- State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
| | - Xiao-Jiao Zhao
- State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Mao-Feng Cao
- State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Jing Xu
- State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
| | - Wei-Li Wang
- State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
| | - Xin-Yu Lu
- State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Qiu-Ting Huang
- State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
| | - Xia-Xia Yue
- State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Guo-Kun Liu
- State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
| | - Liu Yang
- State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Bin Ren
- State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Zhong-Qun Tian
- State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
|
27
|
Guo Y, Liu J, Wu Y, Jiang X, Wang Y, Meng L, Liu X, Shu F, Dai C, Chen W. sEMG-Based Inter-Session Hand Gesture Recognition via Domain Adaptation with Locality Preserving and Maximum Margin. Int J Neural Syst 2024; 34:2450010. [PMID: 38369904 DOI: 10.1142/s0129065724500102] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/20/2024]
Abstract
Surface electromyography (sEMG)-based gesture recognition can achieve high intra-session performance. However, the inter-session performance of gesture recognition decreases sharply due to the shift in data distribution. Therefore, developing a robust model to minimize the data distribution difference is crucial to improving the user experience. In this work, based on the inter-session gesture recognition task, we propose a novel algorithm called locality preserving and maximum margin criterion (LPMM). The LPMM algorithm integrates three main modules, including domain alignment, pseudo-label selection, and iteration result selection. Domain alignment is designed to preserve the neighborhood structure of the feature and minimize the overlap of different classes. The pseudo-label selection and iteration result selection can avoid the decrease in accuracy caused by mislabeled samples. The proposed algorithm was evaluated on two of the most widely used EMG databases. It achieves a mean accuracy of 98.46% and 71.64%, respectively, which is superior to state-of-the-art domain adaptation methods.
Affiliation(s)
- Yao Guo
- School of Information Science and Technology, Fudan University, Shanghai, P. R. China
| | - Jiayan Liu
- School of Information Science and Technology, Fudan University, Shanghai, P. R. China
| | - Yonglin Wu
- School of Information Science and Technology, Fudan University, Shanghai, P. R. China
| | - Xinyu Jiang
- School of Information Science and Technology, Fudan University, Shanghai, P. R. China
| | - Yalin Wang
- School of Information Science and Technology, Fudan University, Shanghai, P. R. China
| | - Long Meng
- School of Information Science and Technology, Fudan University, Shanghai, P. R. China
| | - Xiangyu Liu
- College of Communication and Art Design, University of Shanghai for Science and Technology, Shanghai, P. R. China
| | - Feng Shu
- Academy for Engineering and Technology, Fudan University, Shanghai, P. R. China
| | - Chenyun Dai
- School of Information Science and Technology, Fudan University, Shanghai, P. R. China
| | - Wei Chen
- School of Information Science and Technology, Fudan University, Shanghai, P. R. China
|
28
|
Zhu H, Wu Y, Yang G, Song R, Yu J, Zhang J. Electronic Nose Drift Suppression Based on Smooth Conditional Domain Adversarial Networks. SENSORS (BASEL, SWITZERLAND) 2024; 24:1319. [PMID: 38400477 PMCID: PMC10892276 DOI: 10.3390/s24041319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/15/2024] [Revised: 01/31/2024] [Accepted: 02/12/2024] [Indexed: 02/25/2024]
Abstract
Anti-drift is a new and serious challenge in the field related to gas sensors. Gas sensor drift causes the probability distribution of the measured data to be inconsistent with the probability distribution of the calibrated data, which leads to the failure of the original classification algorithm. In order to make the probability distributions of the drifted data and the regular data consistent, we introduce the Conditional Adversarial Domain Adaptation Network (CDAN)+ Sharpness Aware Minimization (SAM) optimizer-a state-of-the-art deep transfer learning method.The core approach involves the construction of feature extractors and domain discriminators designed to extract shared features from both drift and clean data. These extracted features are subsequently input into a classifier, thereby amplifying the overall model's generalization capabilities. The method boasts three key advantages: (1) Implementation of semi-supervised learning, thereby negating the necessity for labels on drift data. (2) Unlike conventional deep transfer learning methods such as the Domain-adversarial Neural Network (DANN) and Wasserstein Domain-adversarial Neural Network (WDANN), it accommodates inter-class correlations. (3) It exhibits enhanced ease of training and convergence compared to traditional deep transfer learning networks. Through rigorous experimentation on two publicly available datasets, we substantiate the efficiency and effectiveness of our proposed anti-drift methodology when juxtaposed with state-of-the-art techniques.
Affiliation(s)
- Jianwei Zhang
- School of Control Science and Engineering, Dalian University of Technology, Dalian 116000, China; (H.Z.); (Y.W.); (G.Y.); (R.S.); (J.Y.)
|
29
|
Sun Y, Dong W, Li X, Dong L, Shi G, Xie X. TransVQA: Transferable Vector Quantization Alignment for Unsupervised Domain Adaptation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2024; 33:856-866. [PMID: 38231815 DOI: 10.1109/tip.2024.3352392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Unsupervised domain adaptation (UDA) aims to transfer knowledge from the labeled source domain to the unlabeled target domain. Most existing domain adaptation methods are based on convolutional neural networks (CNNs) to learn cross-domain invariant features. Inspired by the success of transformer architectures and their superiority to CNNs, we propose to combine the transformer with UDA to improve their generalization properties. In this paper, we present a novel model named Transferable Vector Quantization Alignment for Unsupervised Domain Adaptation (TransVQA), which integrates the Transferable transformer-based feature extractor (Trans), vector quantization domain alignment (VQA), and mutual information weighted maximization confusion matrix (MIMC) of intra-class discrimination into a unified domain adaptation framework. First, TransVQA uses the transformer to extract more accurate features in different domains for classification. Second, TransVQA, based on the vector quantization alignment module, uses a two-step alignment method to align the extracted cross-domain features and solve the domain shift problem. The two-step alignment includes global alignment via vector quantization and intra-class local alignment via pseudo-labels. Third, for the intra-class feature discrimination problem caused by the fuzzy alignment of different domains, we use the MIMC module to constrain the target domain output and increase the accuracy of pseudo-labels. The experiments on several datasets of domain adaptation show that TransVQA can achieve excellent performance and outperform existing state-of-the-art methods.
|
30
|
Wu S, Shu L, Song Z, Xu X. SFDA: Domain Adaptation With Source Subject Fusion Based on Multi-Source and Single-Target Fall Risk Assessment. IEEE Trans Neural Syst Rehabil Eng 2023; 31:4907-4920. [PMID: 38032785 DOI: 10.1109/tnsre.2023.3337861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]
Abstract
In cross-subject fall risk classification based on plantar pressure, a challenge is that data from different subjects carry significant subject-specific information. Thus, models with insufficient generalization ability cannot perform well on new subjects, which limits their application in daily life. To solve this problem, domain adaptation methods are applied to reduce the gap between the source and target domains. However, these methods focus on the distributions of the source and target domains but ignore the potential correlation among multiple source subjects, which degrades domain adaptation performance. In this paper, we propose a novel method named domain adaptation with subject fusion (SFDA) for fall risk assessment, greatly improving the cross-subject assessment ability. Specifically, SFDA simultaneously carries out source-target adaptation and multiple source subject fusion through a domain adversarial module to reduce the source-target gap and the distribution distance among source subjects of the same class. Consequently, target samples can learn more task-specific features from source subjects to improve the generalization ability. Experimental results show that SFDA achieved mean accuracies of 79.17% and 73.66% based on two backbones in a cross-subject classification manner, outperforming the state-of-the-art methods on a continuous plantar pressure dataset. This study demonstrates the effectiveness of SFDA and provides a novel tool for implementing cross-subject and few-gait fall risk assessment.
|
31
|
Zheng Z, Hu Y, Bin Y, Xu X, Yang Y, Shen HT. Composition-Aware Image Steganography Through Adversarial Self-Generated Supervision. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:9451-9465. [PMID: 35679383 DOI: 10.1109/tnnls.2022.3175627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Steganography is an important and prevailing information hiding tool to perform secret message transmission in an open environment. Existing steganography methods mainly fall into two categories: predefined rule-based and data-driven methods. The former is susceptible to statistical attacks, while the latter adopts deep convolutional neural networks to improve security. However, deep learning-based methods suffer from perceptible artifacts or are vulnerable to deep steganalysis. In this article, we introduce a novel composition-aware image steganography (CAIS) to guarantee both visual security and resistance to deep steganalysis through self-generated supervision. The key innovation is an adversarial composition estimation module, which integrates the rule-based composition method and a generative adversarial network to help synthesize more natural steganographic images. We first perform a rule-based image blending method to obtain infinitely many synthetic data-label pairs. Then, we utilize an adversarial composition estimation branch to recognize the message feature pattern from the composite image based on these self-generated data-label pairs. Through the adversarial training, we force the steganography function to synthesize steganographic images that can fool the composition estimation network. Thus, the proposed CAIS can achieve better information hiding and higher security to resist deep steganalysis. Furthermore, an effective global-and-part checking is designed to alleviate visual artifacts caused by hiding secret information. We conduct a comprehensive analysis of CAIS from various aspects (e.g., security and robustness) to verify the superior performance of the proposed method. Comprehensive experimental results on three large-scale widely used datasets have demonstrated the superior performance of our CAIS compared with several state-of-the-art approaches.
|
32
|
Chen S, Hong Z, Harandi M, Yang X. Domain Neural Adaptation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:8630-8641. [PMID: 35259116 DOI: 10.1109/tnnls.2022.3151683] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Domain adaptation is concerned with the problem of generalizing a classification model to a target domain with little or no labeled data, by leveraging the abundant labeled data from a related source domain. The source and target domains possess different joint probability distributions, making it challenging for model generalization. In this article, we introduce domain neural adaptation (DNA): an approach that exploits a nonlinear deep neural network to 1) match the source and target joint distributions in the network activation space and 2) learn the classifier in an end-to-end manner. Specifically, we employ the relative chi-square divergence to compare the two joint distributions, and show that the divergence can be estimated via seeking the maximal value of a quadratic functional over the reproducing kernel Hilbert space. The analytic solution to this maximization problem enables us to explicitly express the divergence estimate as a function of the neural network mapping. We optimize the network parameters to minimize the estimated joint distribution divergence and the classification loss, yielding a classification model that generalizes well to the target domain. Empirical results on several visual datasets demonstrate that our solution is statistically better than its competitors.
|
33
|
Li Y, Huang J, Lu S, Zhang Z, Lu G. Cross-Domain Facial Expression Recognition via Contrastive Warm up and Complexity-Aware Self-Training. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:5438-5450. [PMID: 37773906 DOI: 10.1109/tip.2023.3318955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2023]
Abstract
Unsupervised cross-domain Facial Expression Recognition (FER) aims to transfer the knowledge from a labeled source domain to an unlabeled target domain. Existing methods strive to reduce the discrepancy between the source and target domains, but cannot effectively explore the abundant semantic information of the target domain due to the absence of target labels. To this end, we propose a novel framework via Contrastive Warm up and Complexity-aware Self-Training (namely CWCST), which facilitates source knowledge transfer and target semantic learning jointly. Specifically, we formulate a contrastive warm up strategy via features, momentum features, and learnable category centers to concurrently learn discriminative representations and narrow the domain gap, which benefits domain adaptation by generating more accurate target pseudo labels. Moreover, to deal with the inevitable noise in pseudo labels, we develop complexity-aware self-training with a label selection module based on prediction entropy, which iteratively generates pseudo labels and adaptively chooses the reliable ones for training, ultimately yielding effective target semantics exploration. Furthermore, by jointly using the two mentioned components, our framework makes effective use of the source knowledge and target semantic information through source-target co-training. In addition, our framework can be easily incorporated into other baselines with consistent performance improvements. Extensive experimental results on seven databases show the superior performance of the proposed method against various baselines.
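The label selection idea mentioned in this abstract (keeping only pseudo labels whose prediction entropy is low) can be illustrated with a minimal sketch. This is not the CWCST implementation: the function names and the fixed threshold `max_entropy` are assumptions made for illustration, whereas the abstract describes an adaptive selection rule whose details are not given here.

```python
import math

def softmax(logits):
    """Convert raw scores to a probability vector (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def prediction_entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_pseudo_labels(batch_logits, max_entropy):
    """Return (index, label) pairs for samples whose entropy is at most max_entropy.

    max_entropy is a hypothetical fixed threshold used only for this sketch.
    """
    selected = []
    for i, logits in enumerate(batch_logits):
        probs = softmax(logits)
        if prediction_entropy(probs) <= max_entropy:
            label = max(range(len(probs)), key=probs.__getitem__)
            selected.append((i, label))
    return selected

batch = [
    [4.0, 0.0, 0.0],  # confident prediction, low entropy
    [0.1, 0.0, 0.1],  # near-uniform prediction, high entropy
    [0.0, 0.0, 5.0],  # confident prediction, low entropy
]
pseudo = select_pseudo_labels(batch, max_entropy=0.5)
```

In this toy batch only the first and third samples pass the threshold; the near-uniform second sample is left out of the pseudo-labeled set.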
|
34
|
Zhang Y, Zhang Y, Guo W, Cai X, Yuan X. Learning Disentangled Representation for Multimodal Cross-Domain Sentiment Analysis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:7956-7966. [PMID: 35188893 DOI: 10.1109/tnnls.2022.3147546] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Multimodal cross-domain sentiment analysis aims at transferring domain-invariant sentiment information across datasets to address the insufficiency of labeled data. Existing adaptation methods achieve good performance by reducing the discrepancies in characteristics of multiple modalities. However, the expressive styles of different datasets also contain domain-specific information, which hinders the adaptation performance. In this article, we propose a disentangled sentiment representation adversarial network (DiSRAN) to reduce the domain shift of expressive styles for multimodal cross-domain sentiment analysis. Specifically, we first align the multiple modalities and obtain the joint representation through a cross-modality attention layer. Then, we disentangle sentiment information from the multimodal joint representation that contains domain-specific expressive style by adversarial training. The obtained sentiment representation is domain-invariant, which can better facilitate the sentiment information transfer between different domains. Experimental results on two multimodal cross-domain sentiment analysis tasks demonstrate that the proposed method performs favorably against state-of-the-art approaches.
|
35
|
Yang J, Zhou Y, Zhao Y, Lu W, Gao X. MetaMP: Metalearning-Based Multipatch Image Aesthetics Assessment. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:5716-5728. [PMID: 35580097 DOI: 10.1109/tcyb.2022.3169017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Image aesthetics assessment (IAA) is a subjective and complex task. The aesthetics of different themes vary greatly in content and aesthetic results, whether they are in the same aesthetic community or not. In aesthetic evaluation tasks, a pretrained network with direct fine-tuning may not be able to quickly adapt to tasks on various themes. This article introduces a metalearning-based multipatch (MetaMP) IAA method to adapt to various thematic tasks quickly. The network is trained based on metalearning to obtain content-oriented aesthetic expression. In addition, we design a complete-information patch selection scheme and a multipatch (MP) network to make the fine details fit the overall impression. Experimental results demonstrate the superiority of the proposed method in comparison with the state-of-the-art models based on aesthetic visual analysis (AVA) benchmark datasets. In addition, the evaluation of the dataset shows the effectiveness of our metalearning training model, which not only improves MetaMP assessment accuracy but also provides valuable guidance for network initialization of IAA.
|
36
|
Cheng Y, Wei F, Bao J, Chen D, Zhang W. ADPL: Adaptive Dual Path Learning for Domain Adaptation of Semantic Segmentation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:9339-9356. [PMID: 37027611 DOI: 10.1109/tpami.2023.3248294] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/19/2023]
Abstract
To alleviate the need for large-scale pixel-wise annotations, domain adaptation for semantic segmentation trains segmentation models on synthetic data (source) with computer-generated annotations, which can then be generalized to segment realistic images (target). Recently, self-supervised learning (SSL) combined with image-to-image translation has shown great effectiveness in adaptive segmentation. The most common practice is to perform SSL along with image translation to well align a single domain (source or target). However, in this single-domain paradigm, unavoidable visual inconsistency raised by image translation may affect subsequent learning. In addition, pseudo labels generated by a single segmentation model aligned in either the source or target domain may not be accurate enough for SSL. In this paper, based on the observation that domain adaptation frameworks performed in the source and target domain are almost complementary, we propose a novel adaptive dual path learning (ADPL) framework to alleviate visual inconsistency and promote pseudo-labeling by introducing two interactive single-domain adaptation paths aligned in the source and target domain respectively. To fully explore the potential of this dual-path design, novel technologies such as dual path image translation (DPIT), dual path adaptive segmentation (DPAS), dual path pseudo label generation (DPPLG) and Adaptive ClassMix are proposed. The inference of ADPL is extremely simple: only one segmentation model in the target domain is employed. Our ADPL outperforms the state-of-the-art methods by large margins on GTA5 → Cityscapes, SYNTHIA → Cityscapes and GTA5 → BDD100K scenarios. Code and models are available at https://github.com/royee182/DPL.
|
37
|
Yan K, Guo X, Ji Z, Zhou X. Deep Transfer Learning for Cross-Species Plant Disease Diagnosis Adapting Mixed Subdomains. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:2555-2564. [PMID: 34914593 DOI: 10.1109/tcbb.2021.3135882] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
A deep transfer learning framework adapting mixed subdomains is proposed for cross-species plant disease diagnosis. Most existing deep transfer learning studies focus on knowledge transfer between highly correlated domains. These methods may fail to deal with domains that are poorly correlated. In this study, mixed domain images were generated from source and target image groups for improving the correlation between the mixed domain (training dataset) and the target domain (testing dataset). A subdomain alignment mechanism is employed to transfer knowledge from the mixed domain to the target domain. The proposed framework captures the fine-grained information more effectively. Extensive experiments were conducted and prove that the proposed method produces a more effective result compared with existing deep transfer learning technologies for poorly related subdomains.
|
38
|
Ji Y, Gao Y, Bao R, Li Q, Liu D, Sun Y, Ye Y. Prediction of COVID-19 Patients' Emergency Room Revisit using Multi-Source Transfer Learning. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS 2023; 2023:138-144. [PMID: 38486663 PMCID: PMC10939709 DOI: 10.1109/ichi57859.2023.00028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2024]
Abstract
The coronavirus disease 2019 (COVID-19) has led to a global pandemic of significant severity. In addition to its high level of contagiousness, COVID-19 can have a heterogeneous clinical course, ranging from asymptomatic carriers to severe and potentially life-threatening health complications. Many patients have to revisit the emergency room (ER) within a short time after discharge, which significantly increases the workload for medical staff. Early identification of such patients is crucial for helping physicians focus on treating life-threatening cases. In this study, we obtained Electronic Health Records (EHRs) of 3,210 encounters from 13 affiliated ERs within the University of Pittsburgh Medical Center between March 2020 and January 2021. We leveraged a Natural Language Processing technique, ScispaCy, to extract clinical concepts and used the 1001 most frequent concepts to develop 7-day revisit models for COVID-19 patients in ERs. The research data we collected were obtained from 13 ERs, which may have distributional differences that could affect the model development. To address this issue, we employed a classic deep transfer learning method called the Domain Adversarial Neural Network (DANN) and evaluated different modeling strategies, including the Multi-DANN algorithm (which considers the source differences), the Single-DANN algorithm (which doesn't consider the source differences), and three baseline methods: using only source data, using only target data, and using a mixture of source and target data. Results showed that the Multi-DANN models outperformed the Single-DANN models and baseline models in predicting revisits of COVID-19 patients to the ER within 7 days after discharge (median AUROC = 0.8 vs. 0.5). Notably, the Multi-DANN strategy effectively addressed the heterogeneity among multiple source domains and improved the adaptation of source data to the target domain. 
Moreover, the high performance of Multi-DANN models indicates that EHRs are informative for developing a prediction model to identify COVID-19 patients who are very likely to revisit an ER within 7 days after discharge.
Affiliation(s)
- Yuelyu Ji
- Department of Information Science, School of Computing and Information, University of Pittsburgh, Pittsburgh, USA
- Yuhe Gao
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, USA
- Runxue Bao
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, USA
- Qi Li
- School of Business, State University of New York at New Paltz, New Paltz, USA
- Disheng Liu
- Department of Information Science, School of Computing and Information, University of Pittsburgh, Pittsburgh, USA
- Yiming Sun
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, USA
- Ye Ye
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, USA
|
39
|
Wang S, Wang B, Zhang Z, Heidari AA, Chen H. Class-Aware Sample Reweighting Optimal Transport for Multi-source Domain Adaptation. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.12.048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
|
40
|
Yang H, Chen C, Jiang M, Liu Q, Cao J, Heng PA, Dou Q. DLTTA: Dynamic Learning Rate for Test-Time Adaptation on Cross-Domain Medical Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:3575-3586. [PMID: 35839185 DOI: 10.1109/tmi.2022.3191535] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/15/2023]
Abstract
Test-time adaptation (TTA) has become an increasingly important topic for efficiently tackling the cross-domain distribution shift at test time for medical images from different institutions. Previous TTA methods have a common limitation of using a fixed learning rate for all the test samples. Such a practice would be sub-optimal for TTA, because test data may arrive sequentially, so the scale of the distribution shift may change frequently. To address this problem, we propose a novel dynamic learning rate adjustment method for test-time adaptation, called DLTTA, which dynamically modulates the amount of weight update for each test image to account for the differences in their distribution shift. Specifically, our DLTTA is equipped with a memory bank based estimation scheme to effectively measure the discrepancy of a given test sample. Based on this estimated discrepancy, a dynamic learning rate adjustment strategy is then developed to achieve a suitable degree of adaptation for each test sample. The effectiveness and general applicability of our DLTTA are extensively demonstrated on three tasks including retinal optical coherence tomography (OCT) segmentation, histopathological image classification, and prostate 3D MRI segmentation. Our method achieves effective and fast test-time adaptation with consistent performance improvement over current state-of-the-art test-time adaptation methods. Code is available at https://github.com/med-air/DLTTA.
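To make the idea of a per-sample learning rate concrete, the sketch below scales a base learning rate by a discrepancy estimate computed against a small memory bank of recent features. It is a toy stand-in, not the DLTTA code from the linked repository: the class name, the Euclidean-distance discrepancy, and the clipping parameter `max_scale` are all assumptions, since the abstract does not specify the estimator or the adjustment rule.

```python
from collections import deque
import math

class DiscrepancyScaledLR:
    """Toy per-sample learning-rate rule (hypothetical, not the DLTTA implementation)."""

    def __init__(self, base_lr, bank_size, max_scale):
        self.base_lr = base_lr
        self.max_scale = max_scale           # assumed cap on the scaling factor
        self.bank = deque(maxlen=bank_size)  # most recent feature vectors

    def discrepancy(self, feature):
        """Euclidean distance from the feature to the mean of the memory bank."""
        if not self.bank:
            return 0.0  # nothing to compare against yet
        dim = len(feature)
        mean = [sum(f[d] for f in self.bank) / len(self.bank) for d in range(dim)]
        return math.sqrt(sum((feature[d] - mean[d]) ** 2 for d in range(dim)))

    def step(self, feature):
        """Return the learning rate for this sample, then store its feature."""
        scale = min(self.discrepancy(feature), self.max_scale)
        self.bank.append(list(feature))
        return self.base_lr * scale

rule = DiscrepancyScaledLR(base_lr=1e-3, bank_size=4, max_scale=2.0)
features = ([0.0, 0.0], [0.0, 0.0], [3.0, 4.0], [0.3, 0.4])
lrs = [rule.step(f) for f in features]
```

In this sketch, samples that match the bank get a learning rate of zero, while the outlying third sample gets the capped maximum. A real system would likely use a nonzero floor and a learned or calibrated mapping from discrepancy to learning rate.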
|
41
|
Zhao S, Yao X, Yang J, Jia G, Ding G, Chua TS, Schuller BW, Keutzer K. Affective Image Content Analysis: Two Decades Review and New Perspectives. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:6729-6751. [PMID: 34214034 DOI: 10.1109/tpami.2021.3094362] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Images can convey rich semantics and induce various emotions in viewers. Recently, with the rapid advancement of emotional intelligence and the explosive growth of visual data, extensive research efforts have been dedicated to affective image content analysis (AICA). In this survey, we will comprehensively review the development of AICA in the recent two decades, especially focusing on the state-of-the-art methods with respect to three main challenges - the affective gap, perception subjectivity, and label noise and absence. We begin with an introduction to the key emotion representation models that have been widely employed in AICA and description of available datasets for performing evaluation with quantitative comparison of label noise and dataset bias. We then summarize and compare the representative approaches on (1) emotion feature extraction, including both handcrafted and deep features, (2) learning methods on dominant emotion recognition, personalized emotion prediction, emotion distribution learning, and learning from noisy data or few labels, and (3) AICA based applications. Finally, we discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.
|
42
|
Liu ZG, Ning LB, Zhang ZW. A New Progressive Multisource Domain Adaptation Network With Weighted Decision Fusion. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:1062-1072. [PMID: 35675250 DOI: 10.1109/tnnls.2022.3179805] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]
Abstract
Multisource unsupervised domain adaptation (MUDA) is an important and challenging topic for target classification with the assistance of labeled data in source domains. When we have several labeled source domains, it is difficult to map all source domains and the target domain into a common feature space for classifying the targets well. In this article, a new progressive multisource domain adaptation network (PMSDAN) is proposed to further improve the classification performance. PMSDAN mainly consists of two steps for distribution alignment. First, the multiple source domains are integrated as one auxiliary domain to match the distribution with the target domain. By doing this, we can generally reduce the distribution discrepancy between each source domain and the target domain, as well as the discrepancy between different source domains. It can efficiently explore useful knowledge from the integrated source domain. Second, to mine auxiliary knowledge from each source domain as much as possible, the distribution of the target domain is separately aligned with that of each source domain. A weighted fusion method is employed to combine the multiple classification results for making the final decision. In the optimization of domain adaptation, weighted hybrid maximum mean discrepancy (WHMMD) is proposed, and it considers both the interclass and intraclass discrepancies. The effectiveness of the proposed PMSDAN is demonstrated in experiments comparing it with some state-of-the-art methods.
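The weighted fusion step described in this abstract can be pictured as a weighted average of the class-probability vectors produced by the per-source classifiers. The sketch below assumes that form; the function name `fuse_decisions` and the example weights are hypothetical, and the abstract does not say how PMSDAN actually computes its weights.

```python
def fuse_decisions(prob_vectors, weights):
    """Weighted average of per-source class-probability vectors.

    Weights are normalized to sum to 1; how they are chosen is an
    assumption left outside this sketch.
    """
    if len(prob_vectors) != len(weights):
        raise ValueError("need one weight per source classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(prob_vectors[0])
    fused = [0.0] * n_classes
    for probs, w in zip(prob_vectors, weights):
        for c in range(n_classes):
            fused[c] += (w / total) * probs[c]
    return fused

# Outputs of three hypothetical source-specific classifiers for one target sample.
source_outputs = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.5, 0.2],
]
source_weights = [0.6, 0.2, 0.2]  # illustrative reliability weights
fused = fuse_decisions(source_outputs, source_weights)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

With these weights the fused decision favors class 0, whereas equal weights on the same inputs would favor class 1, which shows why the choice of weights matters for the final decision.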
|
43
|
Spatial Alignment for Unsupervised Domain Adaptive Single-Stage Object Detection. SENSORS 2022; 22:s22093253. [PMID: 35590943 PMCID: PMC9102984 DOI: 10.3390/s22093253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Revised: 04/15/2022] [Accepted: 04/21/2022] [Indexed: 12/02/2022]
Abstract
Domain adaptation methods are proposed to improve the performance of object detection in new domains without additional annotation costs. Recently, domain adaptation methods based on adversarial learning to align source and target domain image distributions are effective. However, for object detection tasks, image-level alignment enforces the alignment of non-transferable background regions, which affects the performance of important target regions. Therefore, how to balance the alignment of background and target remains a challenge. In addition, the current research with good effect is based on two-stage detectors, and there are relatively few studies on single-stage detectors. To address these issues, in this paper, we propose a selective domain adaptation framework for the spatial alignment of a single-stage detector. The framework can identify the background and target and pay different attention to them. On the premise that the single-stage detector does not generate region suggestions, it can achieve domain feature alignment and reduce the influence of the background, enabling transfer between different domains. We validate the effectiveness of our method for weather discrepancy, camera angles, synthetic to real-world, and real images to artistic images. Extensive experiments on four representative adaptation tasks show that the method effectively improves the performance of single-stage object detectors in different domains while maintaining good scalability.
|
44
|
Transferring model structure in Bayesian transfer learning for Gaussian process regression. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108875] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
|
45
|
Automatic Fish Age Determination across Different Otolith Image Labs Using Domain Adaptation. FISHES 2022. [DOI: 10.3390/fishes7020071] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/27/2023]
Abstract
The age determination of fish is fundamental to marine resource management. This task is commonly done by analysis of otoliths performed manually by human experts. Otolith images from Greenland halibut acquired by the Institute of Marine Research (Norway) were recently used to train a convolutional neural network (CNN) for automatically predicting fish age, opening the way for requiring less human effort and availability of expertise by means of deep learning (DL). In this study, we demonstrate that applying a CNN model trained on images from one lab (in Norway) does not lead to a suitable performance when predicting fish ages from otolith images from another lab (in Iceland) for the same species. This is due to a problem known as dataset shift, where the source data, i.e., the dataset the model was trained on, has different characteristics from the dataset at test stage, here denoted as target data. We further demonstrate that we can handle this problem by using domain adaptation, such that an existing model trained in the source domain is adapted to perform well in the target domain, without requiring extra annotation effort. We investigate four different approaches: (i) simple adaptation via image standardization, (ii) adversarial generative adaptation, (iii) adversarial discriminative adaptation and (iv) self-supervised adaptation. The results show that the performance varies substantially between the methods, with adversarial discriminative and self-supervised adaptations being the best approaches. Without using a domain adaptation approach, the root mean squared error (RMSE) and coefficient of variation (CV) on the Icelandic dataset are as high as 5.12 years and 28.6%, respectively, whereas by using the self-supervised domain adaptation, the RMSE and CV are reduced to 1.94 years and 11.1%. We conclude that careful consideration must be given before DL-based predictors are applied to perform large scale inference. Despite that, domain adaptation is a promising solution for handling problems of dataset shift across image labs.
|
46
|
Data-Driven Geothermal Reservoir Modeling: Estimating Permeability Distributions by Machine Learning. GEOSCIENCES 2022. [DOI: 10.3390/geosciences12030130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Numerical modeling for geothermal reservoir engineering is a crucial process for evaluating reservoir performance and developing strategies for future development. The governing equations in geothermal reservoir models contain several constitutive parameters, and each parameter must be assigned to a large number of simulation grid cells. Thus, the number of parameter combinations to be estimated is almost limitless. Although several inverse analysis algorithms have been developed, determining the constitutive parameters in a reservoir model is still a matter of trial-and-error estimation in actual practice and is largely based on the experience of the analyst. Several parameters control the hydrothermal processes in geothermal reservoir modeling. In this study, as an initial challenge, we focus on permeability, which is one of the most important parameters for the modeling. We propose a machine-learning-based method to estimate permeability distributions using measurable data. A large training dataset was prepared with a geothermal reservoir simulator capable of calculating pressure and temperature distributions in the natural state for different permeability distributions. Several machine learning algorithms (i.e., linear regression, ridge regression, Lasso regression, support vector regression (SVR), multilayer perceptron (MLP), random forest, gradient boosting, and the k-nearest neighbor algorithm) were applied to learn the relationship between the permeability and the pressure and temperature distributions. A comparison of feature importance and estimation scores showed that random forest using pressure differences as feature variables provided the best estimation (a training score of 0.979 and a test score of 0.789). Since it was trained independently of the grids and locations, this model is expected to generalize. It was also found that estimation is possible to some extent, even for different heat source conditions.
This study is a successful demonstration of the first step toward new data-driven geothermal reservoir engineering, which will be developed and enhanced with knowledge from information science.
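As a rough illustration of how a training score and a test score can be compared for one of the listed algorithms, the sketch below fits a k-nearest neighbor regressor on synthetic data using only the Python standard library. It is not the authors' pipeline: the two synthetic features merely stand in for pressure differences, the target stands in for permeability, and the score is assumed to be the coefficient of determination (R²), which the abstract does not state. The names `knn_predict` and `r2_score` are hypothetical helpers.

```python
import math
import random


def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): two features, noisy linear target.
rng = random.Random(0)
features = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(200)]
targets = [2.0 * a - b + rng.gauss(0, 0.1) for a, b in features]

# Hold out the last 50 samples as a test set.
train_x, test_x = features[:150], features[150:]
train_y, test_y = targets[:150], targets[150:]

train_r2 = r2_score(train_y, [knn_predict(train_x, train_y, x, 3) for x in train_x])
test_r2 = r2_score(test_y, [knn_predict(train_x, train_y, x, 3) for x in test_x])
print(f"training score: {train_r2:.3f}, test score: {test_r2:.3f}")
```

A gap between the two scores, like the 0.979 versus 0.789 reported above, indicates how much of the fit carries over to data the model has not seen.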
|
47
|
Baffour AA, Qin Z, Geng J, Ding Y, Deng F, Qin Z. Generic network for domain adaptation based on self-supervised learning and deep clustering. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.12.099] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/01/2022]
|
48
|
Hou J, Ding X, Deng JD, Cranefield S. Deep adversarial transition learning using cross-grafted generative stacks. Neural Netw 2022; 149:172-183. [DOI: 10.1016/j.neunet.2022.02.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2021] [Revised: 01/09/2022] [Accepted: 02/09/2022] [Indexed: 10/19/2022]
|
49
|
Alo UR, Nkwo FO, Nweke HF, Achi II, Okemiri HA. Non-Pharmaceutical Interventions against COVID-19 Pandemic: Review of Contact Tracing and Social Distancing Technologies, Protocols, Apps, Security and Open Research Directions. SENSORS (BASEL, SWITZERLAND) 2021; 22:280. [PMID: 35009822 PMCID: PMC8749862 DOI: 10.3390/s22010280] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/02/2021] [Revised: 12/13/2021] [Accepted: 12/14/2021] [Indexed: 12/17/2022]
Abstract
The COVID-19 pandemic has dealt a devastating blow to the majority of the world's population. Millions of people have been infected, while hundreds of thousands have died of the disease, throwing many families into mourning and other psychological torment. It has also crippled the economies of many countries, leading to job losses, high inflation, and dwindling Gross Domestic Product (GDP). Social distancing and contact tracing are the major technology-based non-pharmaceutical public health intervention strategies adopted for combating the disease. These technologies have been deployed by different countries around the world as an effective and efficient means of maintaining appropriate distance, tracking the transmission pattern of the disease, and identifying those at high risk of infecting others. This paper aims to synthesize the research efforts on contact tracing and social distancing to minimize the spread of COVID-19. The paper critically and comprehensively reviews contact tracing technologies, protocols, and mobile applications (apps) that were recently developed and deployed against the coronavirus disease. Furthermore, the paper discusses social distancing technologies, appropriate methods to maintain distances, regulations, isolation/quarantine, and interaction strategies. In addition, the paper highlights different security/privacy vulnerabilities identified in contact tracing and social distancing technologies and solutions against these vulnerabilities. It also examines the strengths and weaknesses of the various technologies with respect to their application in contact tracing and social distancing. Finally, the paper proposes recommendations and open research directions in contact tracing and social distancing that could assist researchers, developers, and governments in implementing new technological methods to combat COVID-19.
Affiliation(s)
- Uzoma Rita Alo
  - Department of Computer Science and Informatics, Alex Ekwueme Federal University, Ndufu-Alike, Ikwo P.M.B 1010, Abakaliki 480211, Ebonyi State, Nigeria
- Friday Onwe Nkwo
  - Department of Computer Science and Informatics, Alex Ekwueme Federal University, Ndufu-Alike, Ikwo P.M.B 1010, Abakaliki 480211, Ebonyi State, Nigeria
- Henry Friday Nweke
  - Centre for Research in Machine Learning, Artificial Intelligence and Network Systems, Computer Science Department, Ebonyi State University, P.M.B 053, Abakaliki 480211, Ebonyi State, Nigeria
- Ifeanyi Isaiah Achi
  - Department of Computer Science and Informatics, Alex Ekwueme Federal University, Ndufu-Alike, Ikwo P.M.B 1010, Abakaliki 480211, Ebonyi State, Nigeria
- Henry Anayo Okemiri
  - Department of Computer Science and Informatics, Alex Ekwueme Federal University, Ndufu-Alike, Ikwo P.M.B 1010, Abakaliki 480211, Ebonyi State, Nigeria
|
50
|
Wang S, Dong D, Li H, Feng C, Wang Y, Tian J. Cross-Phase Adversarial Domain Adaptation for Deep Disease-free Survival Prediction with Gastric Cancer CT Images. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:3501-3504. [PMID: 34891994 DOI: 10.1109/embc46164.2021.9631004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Predicting gastric cancer disease-free survival (DFS) and identifying patients likely to be at high risk are imperative for more appropriate clinical treatment plans. Compared with CT-based radiomics studies that adopt linear Cox proportional hazards models, deep neural networks can perform nonlinear transformations and investigate complex associations of image features with prognosis. Exploring shared information between post-contrast CT (with better visual enhancement) and pre-contrast CT (with few side effects and contraindications) is another challenge. In this work, a cross-phase adversarial domain adaptation (CPADA) framework is proposed to adapt a deep DFS prediction network (DDFS-Net) from the arterial phase to the pre-contrast phase. The DDFS-Net is designed for feature learning and trained by optimizing the average negative log of the Cox partial likelihood. The CPADA maps the feature space of the pre-contrast phase (target) to that of the arterial phase (source) in an adversarial manner by measuring the Wasserstein distance. The proposed methods are evaluated on a dataset of 249 gastric cancer patients using the concordance index, receiver operating characteristic curves, and Kaplan-Meier survival curves. The results demonstrate that DDFS-Net outperforms linear survival analysis methods, and that CPADA works better than supervised learning and direct transfer schemes. Clinical Relevance: This work enables preoperative DFS prediction and risk stratification in gastric cancer. It is feasible and effective to infer a patient's risk of failure from a pre-contrast CT image using DDFS-Net adapted by CPADA.
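The training objective named in the abstract, the average negative log of the Cox partial likelihood, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the DDFS-Net implementation: the function name `neg_log_cox_partial_likelihood` is hypothetical, the risk scores would in practice come from the network, ties are handled by including every sample with an equal or later time in the risk set, and the average is taken over observed events.

```python
import math


def neg_log_cox_partial_likelihood(risk, time, event):
    """Average negative log Cox partial likelihood over observed events.

    risk  : predicted log-risk score per sample
    time  : follow-up time per sample
    event : 1 if the event was observed, 0 if the sample was censored
    """
    total = 0.0
    n_events = 0
    for i in range(len(risk)):
        if not event[i]:
            continue  # censored samples contribute only through risk sets
        # Risk set: every sample still under observation at time[i].
        at_risk = [risk[j] for j in range(len(risk)) if time[j] >= time[i]]
        # Numerically stable log-sum-exp over the risk set.
        m = max(at_risk)
        log_sum = m + math.log(sum(math.exp(r - m) for r in at_risk))
        total += risk[i] - log_sum
        n_events += 1
    if n_events == 0:
        raise ValueError("at least one observed event is required")
    return -total / n_events


# Toy data (assumption): three patients, all with observed events.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
loss_concordant = neg_log_cox_partial_likelihood([2.0, 1.0, 0.0], times, events)
loss_reversed = neg_log_cox_partial_likelihood([0.0, 1.0, 2.0], times, events)
print(loss_concordant < loss_reversed)
```

Assigning higher risk to patients whose events occur earlier lowers the loss, which is the ordering a survival network is trained to reproduce.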
|