1. Lai H, Luo Y, Li B, Lu J, Yuan J. Bilateral Proxy Federated Domain Generalization for Privacy-Preserving Medical Image Diagnosis. IEEE J Biomed Health Inform 2025; 29:2784-2797. [PMID: 39259622] [DOI: 10.1109/jbhi.2024.3456440]
Abstract
Contemporary domain generalization methods have proven effective in aiding the generalized diagnosis of medical images by jointly optimizing over multi-source data. However, the centralized training paradigm these approaches employ becomes infeasible when data cannot be shared across domains owing to the high privacy of medical data. Although existing federated domain generalization methods attempt to address this issue, simultaneously attaining strict privacy protection and satisfactory generalization on out-of-distribution data remains a persistent challenge. In this paper, we propose a novel approach called the Bilateral Proxy Framework (BPF) to tackle this problem. BPF leverages client-side proxies to enable strictly privacy-preserving communication with the server and to ensure smoother, more stable convergence of local models through mutual distillation. Meanwhile, the server-side proxy adopts a distance-based strategy and a parameter moving average scheme, which enhance the stability and robustness of the global model, particularly by averting abrupt parameter changes that could cause fluctuations or overfitting. Through these advancements, our framework enhances the generalization capability of the global model, enabling more accurate and reliable medical image diagnosis in federated settings. The effectiveness of our method is demonstrated by superior performance over state-of-the-art methods on medical image diagnosis tasks with both simulated and real-world distribution shifts.
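The abstract gives no implementation details for the server-side proxy; as a rough sketch only, a distance-based aggregation smoothed by a parameter moving average could look like the following (the inverse-distance weighting rule, function name, and momentum value are assumptions, not the authors' code):

```python
import torch

def aggregate_with_moving_average(global_params, client_params_list,
                                  ema_momentum=0.9, eps=1e-8):
    """Hypothetical server-side update: weight each client inversely by its
    parameter distance to the global model, then smooth the result with an
    exponential moving average (EMA)."""
    # Flatten each parameter set to measure distance to the global model.
    flat_global = torch.cat([p.flatten() for p in global_params])
    dists = []
    for client in client_params_list:
        flat_client = torch.cat([p.flatten() for p in client])
        dists.append(torch.norm(flat_client - flat_global))
    # Closer clients get larger weights (inverse-distance weighting).
    weights = torch.stack([1.0 / (d + eps) for d in dists])
    weights = weights / weights.sum()
    # Weighted average of client parameters, then EMA toward that average;
    # the EMA term is what damps abrupt parameter changes between rounds.
    new_params = []
    for i, g in enumerate(global_params):
        avg = sum(w * client[i] for w, client in zip(weights, client_params_list))
        new_params.append(ema_momentum * g + (1.0 - ema_momentum) * avg)
    return new_params
```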
2. Cheng Z, Liu M, Yan C, Wang S. Dynamic domain generalization for medical image segmentation. Neural Netw 2025; 184:107073. [PMID: 39733701] [DOI: 10.1016/j.neunet.2024.107073]
Abstract
Domain Generalization-based Medical Image Segmentation (DGMIS) aims to enhance the robustness of segmentation models on unseen target domains by learning from fully annotated data across multiple source domains. Despite the progress made by traditional DGMIS methods, they still face several challenges. First, most DGMIS approaches rely on static models to perform inference on unseen target domains, lacking the ability to dynamically adapt to samples from different target domains. Second, current DGMIS methods often use Fourier transforms to simulate target domain styles from a global perspective, but relying solely on global transformations for data augmentation fails to fully capture the complexity and local details of the target domains. To address these issues, we propose a Dynamic Domain Generalization (DDG) method for medical image segmentation, which improves the generalization capability of models on unseen target domains by dynamically adjusting model parameters and effectively simulating target domain styles. Specifically, we design a Dynamic Position Transfer (DPT) module that decouples model parameters into static and dynamic components while incorporating positional encoding information to enable efficient feature representation and dynamic adaptation to target domain characteristics. Additionally, we introduce a Global-Local Fourier Random Transformation (GLFRT) module, which jointly considers both global and local style information of the samples. By using a random style selection strategy, this module enhances sample diversity while controlling training costs. Experimental results demonstrate that our method outperforms state-of-the-art approaches on several public medical image datasets, achieving average Dice score improvements of 0.58%, 0.76%, and 0.76% on the Fundus dataset (1060 retinal images), Prostate dataset (1744 T2-weighted MRI scans), and SCGM dataset (551 MRI image slices), respectively. The code is available online (https://github.com/ZMC-IIIM/DDG-Med).
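The paper's GLFRT module combines global and local Fourier-based style randomization; the sketch below illustrates only the global component, amplitude-spectrum mixing between a content and a style image (the function name and the `beta`/`alpha` parameters are illustrative assumptions):

```python
import numpy as np

def fourier_amplitude_mix(content_img, style_img, beta=0.01, alpha=0.5):
    """Swap part of the low-frequency amplitude spectrum of `content_img`
    with that of `style_img`, keeping the phase. Low-frequency amplitude
    carries 'style' (global appearance); phase carries structure."""
    fft_c = np.fft.fft2(content_img, axes=(0, 1))
    fft_s = np.fft.fft2(style_img, axes=(0, 1))
    amp_c, pha_c = np.abs(fft_c), np.angle(fft_c)
    amp_s = np.abs(fft_s)
    # Mix amplitudes only inside a small centered low-frequency square.
    amp_c = np.fft.fftshift(amp_c, axes=(0, 1))
    amp_s = np.fft.fftshift(amp_s, axes=(0, 1))
    h, w = content_img.shape[:2]
    b = int(min(h, w) * beta)
    ch, cw = h // 2, w // 2
    amp_c[ch - b:ch + b, cw - b:cw + b] = (
        alpha * amp_s[ch - b:ch + b, cw - b:cw + b]
        + (1 - alpha) * amp_c[ch - b:ch + b, cw - b:cw + b])
    amp_c = np.fft.ifftshift(amp_c, axes=(0, 1))
    mixed = np.fft.ifft2(amp_c * np.exp(1j * pha_c), axes=(0, 1))
    return np.real(mixed)

aug = fourier_amplitude_mix(np.random.rand(256, 256, 3),
                            np.random.rand(256, 256, 3))
```

Applying the same mixing to local patches (GLFRT's local branch) would capture the finer-grained style details the abstract mentions.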
Affiliation(s)
- Zhiming Cheng, School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China.
- Mingxia Liu, Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States.
- Chenggang Yan, School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China.
- Shuai Wang, School of Cyberspace, Hangzhou Dianzi University, Hangzhou, 310018, China; Suzhou Research Institute of Shandong University, Suzhou, 215123, China.
3. Zhang Z, Li Y, Shin BS. Enhancing generalization of medical image segmentation via game theory-based domain selection. J Biomed Inform 2025; 164:104802. [PMID: 40049504] [DOI: 10.1016/j.jbi.2025.104802]
Abstract
Medical image segmentation models often fail to generalize well to new datasets due to substantial variability in imaging conditions, anatomical differences, and patient demographics. Conventional domain generalization (DG) methods focus on learning domain-agnostic features but often overlook the importance of maintaining performance balance across different domains, leading to suboptimal results. To address these issues, we propose a novel approach using game theory to model the training process as a zero-sum game, aiming for a Nash equilibrium to enhance adaptability and robustness against domain shifts. Specifically, our adaptive domain selection method, guided by the Beta distribution and optimized via reinforcement learning, dynamically adjusts to the variability across different domains, thus improving model generalization. We conducted extensive experiments on benchmark datasets for polyp segmentation, optic cup/optic disc (OC/OD) segmentation, and prostate segmentation. Our method achieved an average Dice score improvement of 1.75% compared with other methods, demonstrating the effectiveness of our approach in enhancing the generalization performance of medical image segmentation models.
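The abstract describes Beta-distribution-guided, reinforcement-style domain selection without implementation details; a minimal bandit-flavored sketch of the idea (the class name, Thompson-sampling rule, and reward definition are assumptions) might be:

```python
import numpy as np

rng = np.random.default_rng(0)

class BetaDomainSelector:
    """Hypothetical bandit-style selector: each source domain keeps a
    Beta(a, b) belief, and domains that yield higher reward (e.g., a larger
    drop in validation loss) are sampled more often."""

    def __init__(self, num_domains):
        self.a = np.ones(num_domains)
        self.b = np.ones(num_domains)

    def select(self):
        # Thompson sampling: draw one value per domain, train on the argmax.
        samples = rng.beta(self.a, self.b)
        return int(np.argmax(samples))

    def update(self, domain, reward):
        # `reward` in [0, 1]; each outcome sharpens the Beta posterior.
        self.a[domain] += reward
        self.b[domain] += 1.0 - reward

selector = BetaDomainSelector(num_domains=3)
d = selector.select()           # pick a source domain for this iteration
selector.update(d, reward=0.7)  # e.g., normalized improvement on held-out data
```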
Affiliation(s)
- Zuyu Zhang, Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
- Yan Li, Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, Republic of Korea.
- Byeong-Seok Shin, Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, Republic of Korea.
4. Li W, Zhang Y, Zhou H, Yang W, Xie Z, He Y. CLMS: Bridging domain gaps in medical imaging segmentation with source-free continual learning for robust knowledge transfer and adaptation. Med Image Anal 2025; 100:103404. [PMID: 39616943] [DOI: 10.1016/j.media.2024.103404]
Abstract
Deep learning shows promise for medical image segmentation but suffers performance declines when applied to diverse healthcare sites due to data discrepancies among the different sites. Translating deep learning models to new clinical environments is challenging, especially when the original source data used for training is unavailable due to privacy restrictions. Source-free domain adaptation (SFDA) aims to adapt models to new unlabeled target domains without requiring access to the original source data. However, existing SFDA methods face challenges such as error propagation, misalignment of visual and structural features, and inability to preserve source knowledge. This paper introduces Continual Learning Multi-Scale domain adaptation (CLMS), an end-to-end SFDA framework integrating multi-scale reconstruction, continual learning, and style alignment to bridge domain gaps across medical sites using only unlabeled target data or publicly available data. Compared to the current state-of-the-art methods, CLMS consistently and significantly achieved top performance for different tasks, including prostate MRI segmentation (improved Dice of 10.87%), colonoscopy polyp segmentation (improved Dice of 17.73%), and plus disease classification from retinal images (improved AUC of 11.19%). Crucially, CLMS preserved source knowledge for all the tasks, avoiding catastrophic forgetting. CLMS demonstrates a promising solution for translating deep learning models to new clinical imaging domains towards safe, reliable deployment across diverse healthcare settings.
Affiliation(s)
- Weilu Li, Yun Zhang, Hao Zhou, Wenhan Yang, Zhi Xie, and Yao He: State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China.
5. Tang Y, Lyu T, Jin H, Du Q, Wang J, Li Y, Li M, Chen Y, Zheng J. Domain adaptive noise reduction with iterative knowledge transfer and style generalization learning. Med Image Anal 2024; 98:103327. [PMID: 39191093] [DOI: 10.1016/j.media.2024.103327]
Abstract
Low-dose computed tomography (LDCT) denoising faces significant challenges in practical imaging scenarios. Supervised methods encounter difficulties in real-world settings because no paired data are available for training; moreover, when applied to datasets with varying noise patterns, they may suffer decreased performance owing to the domain gap. Conversely, unsupervised methods do not require paired data and can be trained directly on real-world data, but they often perform worse than supervised methods. Addressing this issue requires leveraging the strengths of both supervised and unsupervised methods. In this paper, we propose a novel domain adaptive noise reduction framework (DANRF), which integrates both knowledge transfer and style generalization learning to effectively tackle the domain gap problem. Specifically, an iterative knowledge transfer method with knowledge distillation is adopted to train the target model using unlabeled target data and a pre-trained source model trained with paired simulation data. Meanwhile, we introduce the mean teacher mechanism to update the source model, enabling it to adapt to the target domain. Furthermore, an iterative style generalization learning process is designed to enrich the style diversity of the training dataset. We evaluate the performance of our approach through experiments conducted on multi-source datasets. The results demonstrate the feasibility and effectiveness of our proposed DANRF model in multi-source LDCT image processing tasks. Given its hybrid nature, which combines the advantages of supervised and unsupervised learning, and its ability to bridge domain gaps, our approach is well suited for improving practical low-dose CT imaging in clinical settings. Code for our proposed approach is publicly available at https://github.com/tyfeiii/DANRF.
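The mean teacher mechanism mentioned here is standard: the teacher's parameters track an exponential moving average (EMA) of the student's. A minimal sketch (the momentum value and the stand-in network are illustrative, not the paper's configuration):

```python
import copy
import torch

@torch.no_grad()
def update_mean_teacher(student, teacher, momentum=0.999):
    """EMA update used by mean-teacher schemes: the (source) teacher drifts
    slowly toward the student being trained on target-domain data."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

# Usage sketch: the teacher starts as a frozen copy of the pre-trained source model.
student = torch.nn.Conv2d(1, 1, 3)   # stand-in for the denoising network
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)
update_mean_teacher(student, teacher, momentum=0.999)
```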
Affiliation(s)
- Yufei Tang, School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Tianling Lyu, Research Center of Augmented Intelligence, Zhejiang Lab, Hangzhou, 310000, China.
- Haoyang Jin, School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Qiang Du, School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Jiping Wang, School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Yunxiang Li, Nanovision Technology Co., Ltd., Beiqing Road, Haidian District, Beijing, 100094, China.
- Ming Li, School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Yang Chen, Laboratory of Image Science and Technology, School of Computer Science and Engineering, Southeast University, Nanjing, 210096, China.
- Jian Zheng, School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China; Shandong Laboratory of Advanced Biomaterials and Medical Devices in Weihai, Weihai, 264200, China.
6. Fu W, Hu H, Li X, Guo R, Chen T, Qian X. A Generalizable Causal-Invariance-Driven Segmentation Model for Peripancreatic Vessels. IEEE Trans Med Imaging 2024; 43:3794-3806. [PMID: 38739508] [DOI: 10.1109/tmi.2024.3400528]
Abstract
Segmenting peripancreatic vessels in CT, including the superior mesenteric artery (SMA), the coeliac artery (CA), and the partial portal venous system (PPVS), is crucial for preoperative resectability analysis in pancreatic cancer. However, the clinical applicability of vessel segmentation methods is impeded by low generalizability on multi-center data, mainly attributable to wide variations in image appearance, namely the spurious correlation factor. We therefore propose a causal-invariance-driven generalizable segmentation model for peripancreatic vessels. It incorporates interventions at both the image and feature levels to guide the model to capture causal information by enforcing consistency across datasets, thus enhancing generalization performance. Specifically, a contrast-driven image intervention strategy is first proposed to construct image-level interventions by generating images with various contrast-related appearances and seeking invariant causal features. Second, a feature intervention strategy is designed in which various patterns of feature bias across different centers are simulated to pursue invariant prediction. The proposed model achieved high DSC scores (79.69%, 82.62%, and 83.10%) for the three vessels on a cross-validation set containing 134 cases, and its generalizability was further confirmed on three independent test sets of 233 cases. Overall, the proposed method provides an accurate and generalizable segmentation model for peripancreatic vessels and offers a promising paradigm for increasing the generalizability of segmentation models from a causality perspective. Our source code will be released at https://github.com/SJTUBME-QianLab/PC_VesselSeg.
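A contrast-driven image-level intervention can be approximated by sampling contrast-related appearance perturbations of the same anatomy; the sketch below is a simplified stand-in for the paper's strategy (the function name and parameter ranges are assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)

def contrast_intervention(img, gamma_range=(0.7, 1.4), scale_range=(0.8, 1.2)):
    """Generate a contrast-perturbed view of a normalized ([0, 1]) CT slice.
    Training on several such views while enforcing consistent predictions
    pushes the model toward contrast-invariant (causal) features."""
    gamma = rng.uniform(*gamma_range)   # nonlinear intensity remapping
    scale = rng.uniform(*scale_range)   # linear contrast stretch
    out = np.clip(img, 0.0, 1.0) ** gamma
    out = (out - out.mean()) * scale + out.mean()
    return np.clip(out, 0.0, 1.0)

# Several appearance variants of one slice, used as consistency targets.
views = [contrast_intervention(np.random.rand(256, 256)) for _ in range(4)]
```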
7. Liao S, Peng T, Chen H, Lin T, Zhu W, Shi F, Chen X, Xiang D. Dual-Spatial Domain Generalization for Fundus Lesion Segmentation in Unseen Manufacturer's OCT Images. IEEE Trans Biomed Eng 2024; 71:2789-2799. [PMID: 38662563] [DOI: 10.1109/tbme.2024.3393453]
Abstract
OBJECTIVE: Optical Coherence Tomography (OCT) images provide non-invasive visualization of fundus lesions; however, scanners from different OCT manufacturers vary considerably, which often causes models to deteriorate on unseen OCT scanners due to domain shift.
METHODS: To produce the T-styles of the potential target domain, an Orthogonal Style Space Reparameterization (OSSR) method is proposed to apply orthogonal constraints in the latent orthogonal style space to the sampled marginal styles. To leverage the high-level features of multi-source domains and the potential T-styles in the graph semantic space, a Graph Adversarial Network (GAN) is constructed to align the generated samples with the source domain samples. To align features with the same label based on semantic features in the graph semantic space, Graph Semantic Alignment (GSA) is performed to focus on the shape and morphological differences between lesions and their surrounding regions.
RESULTS: Comprehensive experiments were performed on two OCT image datasets. Compared to state-of-the-art methods, the proposed method achieves better segmentation.
CONCLUSION: The proposed fundus lesion segmentation method can be trained with labeled OCT images from multiple manufacturers' scanners and tested on an unseen manufacturer's scanner with better domain generalization.
SIGNIFICANCE: The proposed method can be used in routine clinical settings when an OCT image from an unseen manufacturer's scanner is available for a patient.
8. Huang L, Zhang N, Yi Y, Zhou W, Zhou B, Dai J, Wang J. SAMCF: Adaptive global style alignment and multi-color spaces fusion for joint optic cup and disc segmentation. Comput Biol Med 2024; 178:108639. [PMID: 38878394] [DOI: 10.1016/j.compbiomed.2024.108639]
Abstract
The optic cup (OC) and optic disc (OD) are two critical structures in retinal fundus images, and their relative positions and sizes are essential for effectively diagnosing eye diseases. With the success of deep learning in computer vision, deep learning-based segmentation models have been widely used for joint optic cup and disc segmentation. However, three prominent issues impact segmentation performance. First, significant differences among datasets collected from various institutions, protocols, and devices lead to performance degradation. Second, we find that images with only RGB information struggle to counteract the interference caused by brightness variations, limiting color representation capability. Finally, existing methods typically ignore edge perception and struggle to obtain clear and smooth edge segmentation results. To address these drawbacks, we propose a novel framework based on Style Alignment and Multi-Color Fusion (SAMCF) for joint OC and OD segmentation. Initially, we introduce a domain generalization method to generate uniformly styled images without damaging image content, mitigating domain shift. Next, based on multiple color spaces, we propose a feature extraction and fusion network that handles brightness variation interference and improves color representation capability. Lastly, an edge-aware loss is designed to generate fine edge segmentation results. Experiments on three public datasets, DGS, RIM, and REFUGE, demonstrate that SAMCF achieves superior performance to existing state-of-the-art methods. Moreover, SAMCF exhibits remarkable generalization ability across multiple retinal fundus image datasets.
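As a rough illustration of multi-color-space fusion at the input level (the paper fuses features from multiple color spaces inside the network; the channel stacking below and its rescaling constants are simplifying assumptions):

```python
import numpy as np
from skimage import color

def multi_color_space_stack(rgb):
    """Stack RGB, HSV, and CIELAB representations channel-wise so the
    network sees brightness-robust color cues (a simplified stand-in for
    SAMCF's multi-color-space feature extraction and fusion)."""
    rgb = rgb.astype(np.float64) / 255.0 if rgb.dtype == np.uint8 else rgb
    hsv = color.rgb2hsv(rgb)          # hue/saturation decouple brightness
    lab = color.rgb2lab(rgb) / 100.0  # rough rescale of the L*a*b* channels
    return np.concatenate([rgb, hsv, lab], axis=-1)  # H x W x 9 input

fused = multi_color_space_stack(np.random.rand(512, 512, 3))
```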
Affiliation(s)
- Longjun Huang, School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China.
- Ningyi Zhang, School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China.
- Yugen Yi, School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China.
- Wei Zhou, College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China.
- Bin Zhou, School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China.
- Jiangyan Dai, School of Computer Engineering, Weifang University, 261061, China.
- Jianzhong Wang, College of Information Science and Technology, Northeast Normal University, Changchun, 130117, China.
9. Zhou C, Wang J, Xiang S, Liu F, Huang H, Qian D. A Simple Normalization Technique Using Window Statistics to Improve the Out-of-Distribution Generalization on Medical Images. IEEE Trans Med Imaging 2024; 43:2086-2097. [PMID: 38224511] [DOI: 10.1109/tmi.2024.3353800]
Abstract
Because data scarcity and data heterogeneity are prevalent in medical imaging, well-trained Convolutional Neural Networks (CNNs) using previous normalization methods may perform poorly when deployed to a new site. However, a reliable model for real-world clinical applications should generalize well both on in-distribution (IND) and out-of-distribution (OOD) data (e.g., data from a new site). In this study, we present a novel normalization technique called window normalization (WIN) to improve model generalization on heterogeneous medical images, offering a simple yet effective alternative to existing normalization methods. Specifically, WIN perturbs the normalizing statistics with the local statistics computed within a window. This feature-level augmentation technique regularizes models well and significantly improves their OOD generalization. Leveraging this advantage, we propose a novel self-distillation method called WIN-WIN, which can be implemented with two forward passes and a consistency constraint, serving as a simple extension to existing methods. Extensive experimental results on six tasks and 24 datasets demonstrate the generality and effectiveness of our methods.
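WIN's core operation, normalizing a feature map with statistics from a randomly placed window, is simple enough to sketch directly (the window size, placement rule, and function name are assumptions; the paper defines the exact perturbation scheme):

```python
import torch

def window_norm(x, win_frac=0.5, eps=1e-5):
    """Normalize a feature map with the mean/variance computed inside a
    randomly placed window instead of the full spatial extent; perturbing
    the normalizing statistics acts as feature-level augmentation."""
    n, c, h, w = x.shape
    wh, ww = max(1, int(h * win_frac)), max(1, int(w * win_frac))
    top = torch.randint(0, h - wh + 1, (1,)).item()
    left = torch.randint(0, w - ww + 1, (1,)).item()
    window = x[:, :, top:top + wh, left:left + ww]
    mu = window.mean(dim=(2, 3), keepdim=True)   # per-sample, per-channel
    var = window.var(dim=(2, 3), keepdim=True, unbiased=False)
    return (x - mu) / torch.sqrt(var + eps)

feat = torch.randn(2, 64, 32, 32)
out = window_norm(feat)  # at test time one would fall back to full-map stats
```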
10. Wei H, Shi P, Miao J, Zhang M, Bai G, Qiu J, Liu F, Yuan W. CauDR: A causality-inspired domain generalization framework for fundus-based diabetic retinopathy grading. Comput Biol Med 2024; 175:108459. [PMID: 38701588] [DOI: 10.1016/j.compbiomed.2024.108459]
Abstract
Diabetic retinopathy (DR) is the most common diabetic complication and usually leads to retinal damage, vision loss, and even blindness. A computer-aided DR grading system has a significant impact on helping ophthalmologists with rapid screening and diagnosis. Recent advances in fundus photography have precipitated the development of novel retinal imaging cameras and their subsequent implementation in clinical practice. However, most deep learning-based algorithms for DR grading demonstrate limited generalization across domains. This inferior performance stems from variance in imaging protocols and devices inducing domain shifts. We posit that declining model performance between domains arises from learning spurious correlations in the data. Incorporating do-operations from causality analysis into model architectures may mitigate this issue and improve generalizability. Specifically, a novel universal structural causal model (SCM) was proposed to analyze spurious correlations in fundus imaging. Building on this, a causality-inspired diabetic retinopathy grading framework named CauDR was developed to eliminate spurious correlations and achieve more generalizable DR diagnostics. Furthermore, existing datasets were reorganized into a 4DR benchmark for the domain generalization scenario. Results demonstrate the effectiveness and state-of-the-art (SOTA) performance of CauDR.
Affiliation(s)
- Hao Wei, Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Peilun Shi, Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Juzheng Miao, Department of Computer Science Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Mingqin Zhang, Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Guitao Bai, Department of Ophthalmology, Zigong First People's Hospital, ZiGong, China.
- Jianing Qiu, Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Furui Liu, Zhejiang Lab, Hangzhou, Zhejiang, China.
- Wu Yuan, Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
11. Zhi Y, Bie H, Wang J, Ren L. Masked autoencoders with generalizable self-distillation for skin lesion segmentation. Med Biol Eng Comput 2024. [PMID: 38653880] [DOI: 10.1007/s11517-024-03086-z]
Abstract
In the field of skin lesion image segmentation, accurate identification and partitioning of diseased regions is of vital importance for in-depth analysis of skin cancer. Self-supervised learning, such as the masked autoencoder (MAE), has emerged as a potent force in the medical imaging domain: it autonomously learns and extracts latent features from unlabeled data, yielding pre-trained models that greatly assist downstream tasks. To encourage pre-trained models to learn the global structural and local detail information inherent in dermoscopy images more comprehensively, we introduce a Teacher-Student architecture, named TEDMAE, that incorporates a self-distillation mechanism and learns holistic image feature information to improve the generalizable global knowledge learning of the student MAE model. To make the learned image features suitable for unknown test images, two optimization strategies are proposed to enhance the generalizability of the global features learned by the teacher model and thereby the overall generalization capability of the pre-trained models: Exterior Conversion Augmentation (EC) utilizes random convolutional kernels and linear interpolation to transform the input image into one with the same shape but altered intensities and textures, while Dynamic Feature Generation (DF) employs a nonlinear attention mechanism for feature merging, enhancing the expressive power of the features. Experimental results on three public skin disease datasets, ISIC2019, ISIC2017, and PH2, indicate that our proposed TEDMAE method outperforms several similar approaches. Specifically, TEDMAE demonstrated the best segmentation and generalization performance on the ISIC2017 and PH2 datasets, with Dice scores reaching 82.1% and 91.2%, respectively; the best Jaccard values were 72.6% and 84.5%, while the best HD95% values were 13.0% and 8.9%, respectively.
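The EC augmentation as described, a random convolution followed by linear interpolation with the input, can be sketched as follows (the kernel size, rescaling step, and blending rule are assumptions beyond what the abstract states):

```python
import torch
import torch.nn.functional as F

def exterior_conversion(x, alpha=None):
    """Sketch of an EC-style augmentation: filter the image with a freshly
    sampled random convolution kernel, then linearly interpolate with the
    original so shape is preserved while intensity/texture change."""
    k = torch.randn(x.size(1), x.size(1), 3, 3) * 0.5  # random 3x3 kernel
    y = F.conv2d(x, k, padding=1)
    # Rescale the filtered image back to the input's intensity range.
    y = (y - y.mean()) / (y.std() + 1e-6) * x.std() + x.mean()
    if alpha is None:
        alpha = torch.rand(1).item()
    return alpha * x + (1.0 - alpha) * y

img = torch.rand(4, 3, 224, 224)  # a batch of dermoscopy images
aug = exterior_conversion(img)
```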
Affiliation(s)
- Yichen Zhi, Hongxia Bie, Jiali Wang, and Lihan Ren: Department of Intelligent Media Computing Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, People's Republic of China.
12. Gao Y, Ma C, Guo L, Liu G, Zhang X, Ji X. Adversarial learning-based domain adaptation algorithm for intracranial artery stenosis detection on multi-source datasets. Comput Biol Med 2024; 170:108001. [PMID: 38280254] [DOI: 10.1016/j.compbiomed.2024.108001]
Abstract
Intracranial arterial stenosis (ICAS) is characterized by pathological narrowing or occlusion of the inner lumen of intracranial blood vessels. Because the retina can indirectly reflect cerebrovascular disease, retinal fundus images (RFI) serve as valuable noninvasive and easily accessible screening tools for early detection and diagnosis of ICAS. This paper introduces an adversarial learning-based domain adaptation algorithm (ALDA) specifically designed for ICAS detection in multi-source datasets, with the primary objective of achieving accurate detection and enhanced generalization based on RFI. Given the limitations of traditional algorithms in meeting accuracy and generalization requirements, ALDA overcomes these challenges by leveraging RFI datasets from multiple sources and employing adversarial learning to facilitate feature representation sharing and distinguishability learning. To evaluate ALDA, we conducted experimental validation on multi-source datasets, comparing its results with those of other deep learning algorithms on the ICAS detection task, and further validated its potential for detecting diabetic retinopathy. The experimental results clearly demonstrate the significant improvements achieved by ALDA. By leveraging information from diverse datasets, ALDA learns feature representations with enhanced generalizability and robustness, making it a reliable auxiliary diagnostic tool for clinicians and thereby facilitating the prevention and treatment of cerebrovascular diseases.
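Adversarial domain adaptation of this kind is commonly implemented with a gradient reversal layer between a shared feature extractor and a domain classifier; the sketch below shows that generic mechanism, not ALDA's specific architecture (all layer sizes and names are placeholders):

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

# The feature extractor is trained to fool the domain classifier, so the
# shared representation stops encoding which RFI dataset a sample came from.
feature_net = nn.Sequential(nn.Linear(64 * 64, 128), nn.ReLU())
domain_head = nn.Linear(128, 3)              # e.g., three source datasets

x = torch.randn(8, 64 * 64)                  # hypothetical fundus patches
z = feature_net(x)
domain_logits = domain_head(GradReverse.apply(z, 1.0))
loss = nn.functional.cross_entropy(domain_logits, torch.randint(0, 3, (8,)))
loss.backward()  # reversed gradients push features toward domain invariance
```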
Affiliation(s)
- Yuan Gao, Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, 100191, Beijing, China; Department of Ophthalmology, Xuanwu Hospital, Capital Medical University, 100053, Beijing, China.
- Chenbin Ma, Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, 100191, Beijing, China; Shen Yuan Honors College, Beihang University, 100191, Beijing, China.
- Lishuang Guo, Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, 100191, Beijing, China.
- Guiyou Liu, Department of Ophthalmology, Beijing Tiantan Hospital, Capital Medical University, 100050, Beijing, China.
- Xuxiang Zhang, Beijing Institute for Brain Disorders, Capital Medical University, 100069, Beijing, China.
- Xunming Ji, Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, 100191, Beijing, China.
13. Bo ZH, Guo Y, Lyu J, Liang H, He J, Deng S, Xu F, Lou X, Dai Q. Relay learning: a physically secure framework for clinical multi-site deep learning. NPJ Digit Med 2023; 6:204. [PMID: 37925578] [PMCID: PMC10625523] [DOI: 10.1038/s41746-023-00934-4]
Abstract
Big data serves as the cornerstone for constructing real-world deep learning systems across various domains. In medicine and healthcare, a single clinical site lacks sufficient data, necessitating the involvement of multiple sites. Unfortunately, concerns regarding data security and privacy hinder the sharing and reuse of data across sites, and existing approaches to multi-site clinical learning depend heavily on the security of the network firewall and system implementation. To address this issue, we propose Relay Learning, a secure deep-learning framework that physically isolates clinical data from external intruders while still leveraging the benefits of multi-site big data. We demonstrate the efficacy of Relay Learning on three medical tasks involving different diseases and anatomical structures: retinal fundus structure segmentation, mediastinal tumor diagnosis, and brain midline localization. We evaluate Relay Learning against alternative solutions through multi-site validation and external validation. Incorporating a total of 41,038 medical images from 21 medical hosts, including 7 external hosts, with non-uniform distributions, we observe significant performance improvements with Relay Learning across all three tasks: an average performance increase of 44.4% for retinal fundus segmentation, 24.2% for mediastinal tumor diagnosis, and 36.7% for brain midline localization. Remarkably, Relay Learning even outperforms central learning on external test sets, while keeping data sovereignty local without cross-site network connections. We anticipate that Relay Learning will revolutionize clinical multi-site collaboration and reshape the landscape of healthcare in the future.
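The relay topology itself, the model traveling sequentially between physically isolated sites with no cross-site network connection, can be sketched in a few lines (this omits the paper's safeguards against forgetting and its deployment mechanics; names and hyperparameters are assumptions):

```python
import torch

def relay_train(model, site_loaders, epochs_per_site=1, lr=1e-4):
    """Sketch of the relay idea: the model itself travels from site to site
    and is fine-tuned locally; raw data never leaves any hospital, and only
    the model checkpoint is handed on to the next site."""
    for loader in site_loaders:  # visit sites sequentially
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs_per_site):
            for xb, yb in loader:
                opt.zero_grad()
                loss = torch.nn.functional.cross_entropy(model(xb), yb)
                loss.backward()
                opt.step()
        # In practice the checkpoint would be exported to physical media here.
    return model

# Toy usage with two synthetic "sites":
from torch.utils.data import DataLoader, TensorDataset
model = torch.nn.Linear(10, 2)
sites = [DataLoader(TensorDataset(torch.randn(32, 10),
                                  torch.randint(0, 2, (32,))), batch_size=8)
         for _ in range(2)]
relay_train(model, sites)
```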
Affiliation(s)
- Zi-Hao Bo, School of Software, Tsinghua University, Beijing, China; BNRist, Tsinghua University, Beijing, China.
- Yuchen Guo, BNRist, Tsinghua University, Beijing, China.
- Jinhao Lyu, Department of Radiology, Chinese PLA General Hospital / Chinese PLA Medical School, Beijing, China.
- Hengrui Liang, Department of Thoracic Oncology and Surgery, China State Key Laboratory of Respiratory Disease & National Clinical Research Center for Respiratory Disease, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.
- Jianxing He, Department of Thoracic Oncology and Surgery, China State Key Laboratory of Respiratory Disease & National Clinical Research Center for Respiratory Disease, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.
- Shijie Deng, Department of Radiology, The 921st Hospital of Chinese PLA, Changsha, China.
- Feng Xu, School of Software, Tsinghua University, Beijing, China; BNRist, Tsinghua University, Beijing, China.
- Xin Lou, Department of Radiology, Chinese PLA General Hospital / Chinese PLA Medical School, Beijing, China.
- Qionghai Dai, BNRist, Tsinghua University, Beijing, China; Department of Automation, Tsinghua University, Beijing, China.
14. Zhang Z, Li Y, Shin BS. Learning generalizable visual representation via adaptive spectral random convolution for medical image segmentation. Comput Biol Med 2023; 167:107580. [PMID: 39491380] [DOI: 10.1016/j.compbiomed.2023.107580]
Abstract
Medical image segmentation models often fail to generalize well when applied to new datasets, hindering their usage in clinical practice. Existing random-convolution-based domain generalization approaches, which randomize the convolutional kernel weights in the initial layers of CNN models, have shown promise in improving model generalizability. Nevertheless, the indiscriminate introduction of high-frequency noise during early feature extraction may pollute critical fine details and degrade the model's performance on new datasets. To mitigate this problem, we propose an adaptive spectral random convolution (ASRConv) module designed to selectively randomize low-frequency features while avoiding the introduction of high-frequency artifacts. Unlike prior art, ASRConv dynamically generates convolution kernel weights, enabling more effective control over feature frequencies than randomized kernels. Specifically, ASRConv achieves this selective randomization through a novel weight generation module conditioned on random noise inputs. An adversarial domain augmentation strategy guides the weight generation module in adaptively suppressing high-frequency noise during training, allowing ASRConv to improve feature diversity and reduce overfitting to specific domains. Extensive experimental results show that our proposed ASRConv method consistently outperforms state-of-the-art methods, with average DSC improvements of 3.07% and 1.18% on fundus and polyp datasets, respectively. We also qualitatively demonstrate the robustness of our model against domain distribution shifts. All these results validate the effectiveness of the proposed ASRConv in learning domain-invariant representations for robust medical image segmentation.
Affiliation(s)
- Zuyu Zhang, Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, South Korea.
- Yan Li, Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, South Korea.
- Byeong-Seok Shin, Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, South Korea.
15. Kang Y, Zhao X, Zhang Y, Li H, Wang G, Cui L, Xing Y, Feng J, Yang L. Improving domain generalization performance for medical image segmentation via random feature augmentation. Methods 2023; 218:149-157. [PMID: 37572767] [DOI: 10.1016/j.ymeth.2023.08.003]
Abstract
Deep convolutional neural networks (DCNNs) have shown remarkable performance in medical image segmentation tasks. However, medical images frequently exhibit distribution discrepancies due to variations in scanner vendors, operators, and image quality, which pose significant challenges to the robustness of trained models when applied to unseen clinical data. To address this issue, domain generalization methods have been developed to enhance the generalization ability of DCNNs. Feature space-based data augmentation methods have proven effective for improving domain generalization, but they often rely on prior knowledge or assumptions, which can limit the diversity of source domain data. In this study, we propose a novel random feature augmentation (RFA) method to diversify source domain data at the feature level without prior knowledge. Specifically, our RFA method perturbs domain-specific information while preserving domain-invariant information, thereby adequately diversifying the source domain data. Furthermore, we propose a dual-branch invariant synergistic learning strategy to capture domain-invariant information from the augmented features of RFA, enabling DCNNs to learn a more generalized representation. We evaluate our proposed method on two challenging medical image segmentation tasks, optic cup/disc segmentation on fundus images and prostate segmentation on MRI images. Extensive experimental results demonstrate the superior performance of our method over state-of-the-art domain generalization methods.
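RFA perturbs domain-specific feature statistics while preserving domain-invariant content; a common way to realize this, shown below as an assumption-laden sketch rather than the authors' exact scheme, is to re-inject randomly shifted channel-wise means and standard deviations:

```python
import torch

def random_feature_augmentation(x, std=0.1, eps=1e-6):
    """Perturb per-channel feature statistics (the domain-specific 'style')
    with random noise while keeping the normalized feature (the content)
    intact -- a sketch of statistics-level feature augmentation."""
    mu = x.mean(dim=(2, 3), keepdim=True)
    sigma = x.std(dim=(2, 3), keepdim=True) + eps
    normalized = (x - mu) / sigma                 # domain-invariant part
    # Randomly shift/scale the statistics to fake unseen domain styles.
    new_mu = mu * (1 + torch.randn_like(mu) * std)
    new_sigma = sigma * (1 + torch.randn_like(sigma) * std)
    return normalized * new_sigma + new_mu

feat = torch.randn(4, 32, 64, 64)
aug_feat = random_feature_augmentation(feat)  # fed to the second branch
```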
Affiliation(s)
- Yuxin Kang, School of Information Science and Technology, Northwest University, Xi'an, 710127, China.
- Xuan Zhao, School of Information Science and Technology, Northwest University, Xi'an, 710127, China.
- Yu Zhang, Department of Medical Oncology, Harbin Medical University Cancer Hospital, Harbin, 150081, China.
- Hansheng Li, School of Information Science and Technology, Northwest University, Xi'an, 710127, China.
- Guan Wang, Department of Neurosurgery, Xi'an People's Hospital, Xi'an, 710004, China.
- Lei Cui, School of Information Science and Technology, Northwest University, Xi'an, 710127, China.
- Yaqiong Xing, School of Information Science and Technology, Northwest University, Xi'an, 710127, China.
- Jun Feng, School of Information Science and Technology, Northwest University, Xi'an, 710127, China.
- Lin Yang, School of Information Science and Technology, Northwest University, Xi'an, 710127, China.
16. Gu R, Wang G, Lu J, Zhang J, Lei W, Chen Y, Liao W, Zhang S, Li K, Metaxas DN, Zhang S. CDDSA: Contrastive domain disentanglement and style augmentation for generalizable medical image segmentation. Med Image Anal 2023; 89:102904. [PMID: 37506556] [DOI: 10.1016/j.media.2023.102904]
Abstract
Generalization to previously unseen images with potential domain shifts is essential for clinically applicable medical image segmentation. Disentangling domain-specific and domain-invariant features is key for Domain Generalization (DG), yet existing DG methods struggle to achieve effective disentanglement. To address this problem, we propose an efficient framework called Contrastive Domain Disentanglement and Style Augmentation (CDDSA) for generalizable medical image segmentation. First, a disentanglement network decomposes the image into a domain-invariant anatomical representation and a domain-specific style code; the former is used for segmentation unaffected by domain shift, and the disentanglement is regularized by a decoder that combines the anatomical representation and style code to reconstruct the original image. Second, to achieve better disentanglement, a contrastive loss is proposed to encourage style codes from the same domain to be compact and those from different domains to be divergent. Finally, to further improve generalizability, we propose a style augmentation strategy to synthesize images with various unseen styles in real time while maintaining anatomical information. Comprehensive experiments on a public multi-site fundus image dataset and an in-house multi-site Nasopharyngeal Carcinoma Magnetic Resonance Image (NPC-MRI) dataset show that the proposed CDDSA achieves remarkable generalizability across different domains and outperforms several state-of-the-art methods in generalizable segmentation. Code is available at https://github.com/HiLab-git/DAG4MIA.
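The style-code contrastive term, compact within a domain and divergent across domains, can be sketched with an InfoNCE-style loss over style vectors (the temperature, masking details, and function name are assumptions):

```python
import torch
import torch.nn.functional as F

def style_contrastive_loss(style_codes, domain_labels, tau=0.1):
    """Pull style codes from the same domain together and push codes from
    different domains apart (a simplified stand-in for the paper's
    contrastive disentanglement term)."""
    z = F.normalize(style_codes, dim=1)
    sim = z @ z.t() / tau                         # cosine similarities
    n = z.size(0)
    mask_self = torch.eye(n, dtype=torch.bool)
    same = domain_labels.unsqueeze(0) == domain_labels.unsqueeze(1)
    pos_mask = (same & ~mask_self).float()
    logits = sim.masked_fill(mask_self, float('-inf'))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Average log-probability of picking a same-domain code as the positive.
    loss = -(log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()

codes = torch.randn(8, 16)                 # style codes from the encoder
domains = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])
loss = style_contrastive_loss(codes, domains)
```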
Affiliation(s)
- Ran Gu, School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China.
- Guotai Wang, School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI Lab, Shanghai, China.
- Jiangshan Lu, School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China.
- Jingyang Zhang, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; School of Biomedical Engineering, ShanghaiTech University, Shanghai, China.
- Wenhui Lei, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China.
- Yinan Chen, SenseTime Research, Shanghai, China; West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China.
- Wenjun Liao, Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China.
- Shichuan Zhang, Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China.
- Kang Li, West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China.
- Dimitris N Metaxas, Department of Computer Science, Rutgers University, Piscataway, NJ 08854, USA.
- Shaoting Zhang, School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; SenseTime Research, Shanghai, China; Shanghai AI Lab, Shanghai, China.
17. Wei X, Li H, Zhu T, Li W, Li Y, Sui R. Deep Learning with Automatic Data Augmentation for Segmenting Schisis Cavities in the Optical Coherence Tomography Images of X-Linked Juvenile Retinoschisis Patients. Diagnostics (Basel) 2023; 13:3035. [PMID: 37835778] [PMCID: PMC10572414] [DOI: 10.3390/diagnostics13193035]
Abstract
X-linked juvenile retinoschisis (XLRS) is an inherited disorder characterized by retinal schisis cavities, which can be observed in optical coherence tomography (OCT) images. Monitoring disease progression necessitates accurate segmentation and quantification of these cavities, yet current manual methods are time-consuming and yield subjective interpretations, highlighting the need for automated and precise solutions. We employed five state-of-the-art deep learning models (U-Net, U-Net++, Attention U-Net, Residual U-Net, and TransUNet) for the task, leveraging a dataset of 1500 OCT images from 30 patients. To enhance the models' performance, we utilized data augmentation strategies optimized via deep reinforcement learning. The deep learning models achieved human-equivalent accuracy in segmenting schisis cavities, with U-Net++ surpassing the others by attaining an accuracy of 0.9927 and a Dice coefficient of 0.8568. With reinforcement-learning-based automatic data augmentation, deep learning segmentation models provide a robust and precise method for automated segmentation of schisis cavities in OCT images. These findings are a promising step toward enhancing clinical evaluation and treatment planning for XLRS.
Affiliation(s)
- Ruifang Sui, Department of Ophthalmology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, No. 1, Shuai Fu Yuan, Beijing 100730, China.
18. Shi M, Lokhande A, Fazli MS, Sharma V, Tian Y, Luo Y, Pasquale LR, Elze T, Boland MV, Zebardast N, Friedman DS, Shen LQ, Wang M. Artifact-Tolerant Clustering-Guided Contrastive Embedding Learning for Ophthalmic Images in Glaucoma. IEEE J Biomed Health Inform 2023; 27:4329-4340. [PMID: 37347633] [PMCID: PMC10560582] [DOI: 10.1109/jbhi.2023.3288830]
Abstract
Ophthalmic images, along with their derivatives like retinal nerve fiber layer (RNFL) thickness maps, play a crucial role in detecting and monitoring eye diseases such as glaucoma. For computer-aided diagnosis of eye diseases, the key technique is to automatically extract meaningful features from ophthalmic images that can reveal the biomarkers (e.g., RNFL thinning patterns) associated with functional vision loss. However, representation learning from ophthalmic images that links structural retinal damage with human vision loss is non-trivial mostly due to large anatomical variations between patients. This challenge is further amplified by the presence of image artifacts, commonly resulting from image acquisition and automated segmentation issues. In this paper, we present an artifact-tolerant unsupervised learning framework called EyeLearn for learning ophthalmic image representations in glaucoma cases. EyeLearn includes an artifact correction module to learn representations that optimally predict artifact-free images. In addition, EyeLearn adopts a clustering-guided contrastive learning strategy to explicitly capture the affinities within and between images. During training, images are dynamically organized into clusters to form contrastive samples, which encourage learning similar or dissimilar representations for images in the same or different clusters, respectively. To evaluate EyeLearn, we use the learned representations for visual field prediction and glaucoma detection with a real-world dataset of glaucoma patient ophthalmic images. Extensive experiments and comparisons with state-of-the-art methods confirm the effectiveness of EyeLearn in learning optimal feature representations from ophthalmic images.
19. Hua K, Fang X, Tang Z, Cheng Y, Yu Z. DCAM-NET: A novel domain generalization optic cup and optic disc segmentation pipeline with multi-region and multi-scale convolution attention mechanism. Comput Biol Med 2023; 163:107076. [PMID: 37379616] [DOI: 10.1016/j.compbiomed.2023.107076]
Abstract
Fundus images are an essential basis for diagnosing ocular diseases, and convolutional neural networks have shown promising results in accurate fundus image segmentation. However, differences between the training data (source domain) and the testing data (target domain) significantly affect final segmentation performance. This paper proposes a novel framework named DCAM-NET for fundus domain generalization segmentation, which substantially improves the generalization of the segmentation model to target domain data and enhances the extraction of detailed information from source domain data, effectively overcoming the poor performance of models in cross-domain segmentation. To enhance adaptability to target domain data, this paper proposes a multi-scale attention mechanism module (MSA) that operates at the feature extraction level: it extracts different attribute features and passes them to the corresponding scale attention modules, further capturing critical features in the channel, position, and spatial regions. The MSA module also integrates the characteristics of the self-attention mechanism; it can capture dense context information, and its aggregation of multi-feature information effectively enhances the generalization of the model on unknown domain data. In addition, this paper proposes the multi-region weight fusion convolution module (MWFC), which is essential for accurately extracting feature information from source domain data. Fusing multiple region weights and convolutional kernel weights on the image enhances the model's adaptability to information at different image locations, deepens the model's capacity and depth, and strengthens its ability to learn from multiple regions of the source domain. Our experiments on fundus data for cup/disc segmentation show that introducing the MSA and MWFC modules effectively improves segmentation on unknown domains, and the proposed method performs significantly better than current domain generalization methods for optic cup/disc segmentation.
Affiliation(s)
- Kaiwen Hua, School of Computer Science and Engineering, Anhui University of Science and Technology, 232001, Huainan, Anhui, China.
- Xianjin Fang, School of Computer Science and Engineering, Anhui University of Science and Technology, 232001, Huainan, Anhui, China.
- Zhiri Tang, Academy for Engineering and Technology, Fudan University, 200433, Shanghai, China.
- Ying Cheng, School of Artificial Intelligence Academy, Anhui University of Science and Technology, 232001, Huainan, Anhui, China.
- Zekuan Yu, Academy for Engineering and Technology, Fudan University, 200433, Shanghai, China.
Collapse
|
20
|
Shi P, Qiu J, Abaxi SMD, Wei H, Lo FPW, Yuan W. Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation. Diagnostics (Basel) 2023; 13:1947. [PMID: 37296799 PMCID: PMC10252742 DOI: 10.3390/diagnostics13111947] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2023] [Revised: 05/26/2023] [Accepted: 05/31/2023] [Indexed: 06/12/2023] Open
Abstract
Medical image analysis plays an important role in clinical diagnosis. In this paper, we examine the recent Segment Anything Model (SAM) on medical images and report both quantitative and qualitative zero-shot segmentation results on nine medical image segmentation benchmarks, covering various imaging modalities, such as optical coherence tomography (OCT), magnetic resonance imaging (MRI), and computed tomography (CT), as well as different applications, including dermatology, ophthalmology, and radiology. These benchmarks are representative and commonly used in model development. Our experimental results indicate that while SAM presents remarkable segmentation performance on images from the general domain, its zero-shot segmentation ability remains restricted on out-of-distribution images, e.g., medical images. In addition, SAM exhibits inconsistent zero-shot segmentation performance across different unseen medical domains; for certain structured targets, e.g., blood vessels, its zero-shot segmentation failed completely. In contrast, simple fine-tuning with a small amount of data can lead to remarkable improvements in segmentation quality, showing the great potential and feasibility of using fine-tuned SAM for accurate medical image segmentation in precision diagnostics. Our study indicates the versatility of generalist vision foundation models on medical imaging and their great potential to achieve the desired performance through fine-tuning, eventually addressing the challenges of accessing large and diverse medical datasets in support of clinical diagnostics.
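A zero-shot evaluation of this kind can be reproduced with the public segment-anything package; below is a minimal prompting sketch in which the checkpoint path, image file, and prompt coordinates are placeholders rather than the paper's setup.

```python
# Minimal zero-shot prompting sketch with the public segment-anything API.
# The checkpoint path, image file, and prompt point are placeholders.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")  # placeholder path
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("fundus.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt SAM with a single foreground point (e.g., inside the optic disc).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),
    point_labels=np.array([1]),          # 1 = foreground point
    multimask_output=True)               # SAM proposes several candidate masks
best_mask = masks[np.argmax(scores)]
```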
Collapse
Affiliation(s)
- Peilun Shi, Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Jianing Qiu, Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China; Department of Computing, Imperial College London, London SW7 2AZ, UK
- Sai Mu Dalike Abaxi, Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Hao Wei, Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Frank P.-W. Lo, Hamlyn Centre, Department of Surgery and Cancer, Imperial College London, London SW7 2AZ, UK
- Wu Yuan, Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
Collapse
|
21
|
Zhou K, Liu Z, Qiao Y, Xiang T, Loy CC. Domain Generalization: A Survey. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:4396-4415. [PMID: 35914036 DOI: 10.1109/tpami.2022.3195549] [Citation(s) in RCA: 50] [Impact Index Per Article: 25.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce. This is because most learning algorithms strongly rely on the i.i.d. assumption on source/target data, which is often violated in practice due to domain shift. Domain generalization (DG) aims to achieve OOD generalization by using only source data for model learning. Over the last ten years, research in DG has made great progress, leading to a broad spectrum of methodologies, e.g., those based on domain alignment, meta-learning, data augmentation, or ensemble learning, to name a few; DG has also been studied in various application areas, including computer vision, speech recognition, natural language processing, medical imaging, and reinforcement learning. In this paper, we provide, for the first time, a comprehensive literature review of DG that summarizes the developments of the past decade. Specifically, we first cover the background by formally defining DG and relating it to relevant fields such as domain adaptation and transfer learning. We then conduct a thorough review of existing methods and theories, and conclude the survey with insights and discussion of future research directions.
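The formalization the survey refers to can be stated compactly; the following is a common textbook-style statement of the DG objective, not a quotation from the paper.

```latex
% Common formalization of domain generalization (DG): given K source domains,
% learn a predictor f that minimizes risk on an unseen target domain T.
\[
  \min_{f}\; \mathbb{E}_{(x,y)\sim P_{T}}\big[\ell(f(x), y)\big]
  \quad \text{with } f \text{ trained only on } \{P_{S_k}\}_{k=1}^{K},
  \; P_T \neq P_{S_k}\ \forall k.
\]
```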
Collapse
|
22
|
Hu S, Liao Z, Zhang J, Xia Y. Domain and Content Adaptive Convolution Based Multi-Source Domain Generalization for Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:233-244. [PMID: 36155434 DOI: 10.1109/tmi.2022.3210133] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]
Abstract
The domain gap, caused mainly by variable medical image quality, poses a major obstacle between training a segmentation model in the lab and applying it to unseen clinical data. Domain generalization methods have been proposed to address this issue; however, they usually rely on static convolutions and are therefore less flexible. In this paper, we propose a multi-source domain generalization model based on domain and content adaptive convolution (DCAC) for the segmentation of medical images across different modalities. Specifically, we design a domain adaptive convolution (DAC) module and a content adaptive convolution (CAC) module and incorporate both into an encoder-decoder backbone. In the DAC module, a dynamic convolutional head is conditioned on the predicted domain code of the input so that the model can adapt to unseen target domains. In the CAC module, a dynamic convolutional head is conditioned on global image features so that the model can adapt to each test image. We evaluated the DCAC model against a baseline and four state-of-the-art domain generalization methods on prostate segmentation, COVID-19 lesion segmentation, and optic cup/optic disc segmentation tasks. Our results indicate that the proposed DCAC model outperforms all competing methods on each segmentation task and demonstrate the effectiveness of the DAC and CAC modules. Code is available at https://git.io/DCAC.
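To make the idea of a dynamic convolutional head concrete, here is a minimal sketch of kernels generated per sample from a predicted domain code; the controller layout, 1x1 kernel size, and tensor shapes are assumptions, not the DCAC implementation.

```python
# Sketch of a dynamic convolutional head whose kernels are generated from a
# predicted domain code. The controller layout and shapes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicHead(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, code_dim: int):
        super().__init__()
        # Controller maps the domain code to the weights of a 1x1 convolution.
        self.controller = nn.Linear(code_dim, out_ch * in_ch + out_ch)
        self.in_ch, self.out_ch = in_ch, out_ch

    def forward(self, feat: torch.Tensor, domain_code: torch.Tensor):
        """feat: (B, C, H, W); domain_code: (B, D), e.g. softmax over domains."""
        params = self.controller(domain_code)        # (B, C_out*C_in + C_out)
        w, b = params.split([self.out_ch * self.in_ch, self.out_ch], dim=1)
        out = []
        for i in range(feat.size(0)):                # per-sample dynamic kernels
            wi = w[i].view(self.out_ch, self.in_ch, 1, 1)
            out.append(F.conv2d(feat[i:i + 1], wi, bias=b[i]))
        return torch.cat(out, dim=0)
```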
Collapse
|
23
|
Zhang X, Song J, Wang C, Zhou Z. Convolutional autoencoder joint boundary and mask adversarial learning for fundus image segmentation. Front Hum Neurosci 2022; 16:1043569. [PMID: 36561837 PMCID: PMC9765310 DOI: 10.3389/fnhum.2022.1043569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2022] [Accepted: 11/18/2022] [Indexed: 12/12/2022] Open
Abstract
The precise segmentation of the optic cup (OC) and the optic disc (OD) is important for glaucoma screening. In recent years, medical image segmentation based on convolutional neural networks (CNNs) has achieved remarkable results. However, many traditional CNN methods do not consider the cross-domain problem, i.e., generalization to datasets from different domains. In this paper, we propose a novel unsupervised domain-adaptive segmentation architecture called CAE-BMAL. First, we enhance the source domain with a convolutional autoencoder to improve the generalization ability of the model. Then, we introduce an adversarial learning-based boundary discrimination branch to reduce the impact of complex environments during segmentation. Finally, we evaluate the proposed method on three datasets: Drishti-GS, RIM-ONE-r3, and REFUGE. In these evaluations, the method outperforms most state-of-the-art approaches in accuracy and generalization. We further evaluate cup-to-disc ratio estimation based on the OD and OC segmentations, which indicates the method's effectiveness for glaucoma discrimination.
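Since the evaluation relies on the cup-to-disc ratio, here is a minimal sketch of computing the vertical cup-to-disc ratio from binary OC/OD masks; the mask format is an assumption, and this is not the paper's evaluation code.

```python
# Minimal sketch: vertical cup-to-disc ratio (vCDR) from predicted binary
# masks, as used for glaucoma assessment. Assumes numpy masks where nonzero
# pixels belong to the structure.
import numpy as np

def vertical_cdr(cup_mask: np.ndarray, disc_mask: np.ndarray) -> float:
    def vertical_extent(mask: np.ndarray) -> int:
        rows = np.where(mask.any(axis=1))[0]         # rows containing the structure
        return int(rows.max() - rows.min() + 1) if rows.size else 0
    disc_h = vertical_extent(disc_mask)
    return vertical_extent(cup_mask) / disc_h if disc_h else 0.0
```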
Collapse
Affiliation(s)
- Xu Zhang, Department of Computer Science and Technology, Chongqing University of Posts and Technology, Chongqing, China; Key Laboratory of Tourism Multisource Data Perception and Decision, Ministry of Culture and Tourism, Chongqing, China
- Jiaqi Song, Department of Computer Science and Technology, Chongqing University of Posts and Technology, Chongqing, China; Key Laboratory of Tourism Multisource Data Perception and Decision, Ministry of Culture and Tourism, Chongqing, China
- Chengrui Wang, Chongqing Telecom System Integration Co., Ltd., Chongqing, China
- Zhen Zhou, Tianjin Eye Hospital, Tianjin, China; Tianjin Key Laboratory of Ophthalmology and Vision Science, Tianjin, China; Nankai University Affiliated Eye Hospital, Tianjin, China; Clinical College of Ophthalmology, Tianjin Medical University, Tianjin, China (correspondence: Zhen Zhou)
Collapse
|
24
|
Lyu J, Zhang Y, Huang Y, Lin L, Cheng P, Tang X. AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:3699-3711. [PMID: 35862336 DOI: 10.1109/tmi.2022.3193146] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Convolutional neural networks have been widely applied to medical image segmentation and have achieved considerable performance. However, the performance may be significantly affected by the domain gap between training data (source domain) and testing data (target domain). To address this issue, we propose a data manipulation based domain generalization method, called Automated Augmentation for Domain Generalization (AADG). Our AADG framework can effectively sample data augmentation policies that generate novel domains and diversify the training set from an appropriate search space. Specifically, we introduce a novel proxy task maximizing the diversity among multiple augmented novel domains as measured by the Sinkhorn distance in a unit sphere space, making automated augmentation tractable. Adversarial training and deep reinforcement learning are employed to efficiently search the objectives. Quantitative and qualitative experiments on 11 publicly-accessible fundus image datasets (four for retinal vessel segmentation, four for optic disc and cup (OD/OC) segmentation and three for retinal lesion segmentation) are comprehensively performed. Two OCTA datasets for retinal vasculature segmentation are further involved to validate cross-modality generalization. Our proposed AADG exhibits state-of-the-art generalization performance and outperforms existing approaches by considerable margins on retinal vessel, OD/OC and lesion segmentation tasks. The learned policies are empirically validated to be model-agnostic and can transfer well to other models. The source code is available at https://github.com/CRazorback/AADG.
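The diversity objective is built on the Sinkhorn distance; the following sketch measures this distance between two batches of augmented-domain features using the POT library. The unit-sphere projection, squared-Euclidean cost, and regularization strength are assumptions, not AADG's exact objective.

```python
# Minimal sketch of measuring diversity between two augmented novel domains
# with the Sinkhorn distance, via the POT library (pip install pot). The cost
# and regularization choices are illustrative assumptions.
import numpy as np
import ot

def sinkhorn_diversity(feat_a: np.ndarray, feat_b: np.ndarray,
                       reg: float = 0.1) -> float:
    """feat_a, feat_b: (N, D) features from two augmented novel domains."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)  # unit sphere
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    M = ot.dist(a, b)                       # pairwise squared-Euclidean costs
    wa = np.full(len(a), 1.0 / len(a))      # uniform marginals
    wb = np.full(len(b), 1.0 / len(b))
    return float(ot.sinkhorn2(wa, wb, M, reg))
```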
Collapse
|
25
|
Classification of diabetic retinopathy with feature selection over deep features using nature-inspired wrapper methods. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
26
|
Qin X, Wang J, Chen Y, Lu W, Jiang X. Domain Generalization for Activity Recognition via Adaptive Feature Fusion. ACM T INTEL SYST TEC 2022. [DOI: 10.1145/3552434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/16/2022]
Abstract
Human activity recognition requires building a generalizable model from training datasets in the hope of achieving good performance on test datasets. However, in real applications, the training and testing datasets may have totally different distributions due to various factors such as different body shapes, acting styles, and habits, which damages the model's generalization performance. While such a distribution gap can be reduced by existing domain adaptation approaches, they typically assume that the test data can be accessed in the training stage, which is not realistic. In this paper, we consider a more practical and challenging scenario: domain-generalized activity recognition (DGAR), where the test dataset cannot be accessed during training. To this end, we propose Adaptive Feature Fusion for Activity Recognition (AFFAR), a domain generalization approach that learns to fuse domain-invariant and domain-specific representations to improve the model's generalization performance. AFFAR takes the best of both worlds: domain-invariant representations enhance transferability across domains, while domain-specific representations preserve the model's discriminative power on each domain. Extensive experiments on three public HAR datasets show its effectiveness. Furthermore, we apply AFFAR to a real application, the diagnosis of children's Attention Deficit Hyperactivity Disorder (ADHD), which also demonstrates the superiority of our approach.
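As an illustration of fusing domain-invariant and domain-specific representations, here is a minimal sketch with one shared branch, one branch per source domain, and learned softmax fusion weights; the branch layout and gating are assumptions, not the AFFAR architecture.

```python
# Minimal sketch of adaptive fusion of a domain-invariant branch with several
# domain-specific branches. The softmax gating weights are an assumption.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, feat_dim: int, n_domains: int, n_classes: int):
        super().__init__()
        self.invariant = nn.Linear(feat_dim, feat_dim)    # shared across domains
        self.specific = nn.ModuleList(                    # one branch per source domain
            [nn.Linear(feat_dim, feat_dim) for _ in range(n_domains)])
        self.gate = nn.Parameter(torch.zeros(n_domains))  # learned fusion weights
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate, dim=0)
        fused = self.invariant(x)
        for wk, branch in zip(weights, self.specific):
            fused = fused + wk * branch(x)
        return self.classifier(fused)
```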
Collapse
Affiliation(s)
- Xin Qin, Beijing Key Lab. of Mobile Computing and Pervasive Devices, Inst. of Computing Tech., CAS, University of Chinese Academy of Sciences, China
- Yiqiang Chen, Beijing Key Lab. of Mobile Computing and Pervasive Devices, Inst. of Computing Tech., CAS, University of Chinese Academy of Sciences; Pengcheng Laboratory, Shenzhen, China
- Wang Lu, Beijing Key Lab. of Mobile Computing and Pervasive Devices, Inst. of Computing Tech., CAS, University of Chinese Academy of Sciences, China
- Xinlong Jiang, Beijing Key Lab. of Mobile Computing and Pervasive Devices, Inst. of Computing Tech., CAS, China
Collapse
|
27
|
Robust color medical image segmentation on unseen domain by randomized illumination enhancement. Comput Biol Med 2022; 145:105427. [DOI: 10.1016/j.compbiomed.2022.105427] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2022] [Revised: 02/26/2022] [Accepted: 03/18/2022] [Indexed: 11/19/2022]
|
28
|
Shi C, Zhang J, Zhang X, Shen M, Chen H, Wang L. A recurrent skip deep learning network for accurate image segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103533] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|
29
|
Li K, Yu L, Heng PA. Towards Reliable Cardiac Image Segmentation: Assessing Image-level and Pixel-level Segmentation Quality via Self-reflective References. Med Image Anal 2022; 78:102426. [DOI: 10.1016/j.media.2022.102426] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2021] [Revised: 02/25/2022] [Accepted: 03/18/2022] [Indexed: 11/24/2022]
|
30
|
Xiong H, Liu S, Sharan RV, Coiera E, Berkovsky S. Weak label based Bayesian U-Net for optic disc segmentation in fundus images. Artif Intell Med 2022; 126:102261. [DOI: 10.1016/j.artmed.2022.102261] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2021] [Revised: 01/18/2022] [Accepted: 02/20/2022] [Indexed: 01/27/2023]
|
31
|
Vinayaki VD, Kalaiselvi R. Multithreshold Image Segmentation Technique Using Remora Optimization Algorithm for Diabetic Retinopathy Detection from Fundus Images. Neural Process Lett 2022; 54:2363-2384. [PMID: 35095328 PMCID: PMC8784591 DOI: 10.1007/s11063-021-10734-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 12/24/2021] [Indexed: 12/21/2022]
Abstract
One of the most common complications of diabetes mellitus is diabetic retinopathy (DR), which produces lesions on the retina. A novel framework for DR detection and classification is proposed in this study. The proposed work includes four stages: pre-processing, segmentation, feature extraction, and classification. First, image pre-processing is performed; then the Multi-threshold-based Remora Optimization (MTRO) algorithm performs vessel segmentation. Feature extraction and classification are carried out using a Region-based Convolutional Neural Network (R-CNN) with the Wild Geese Algorithm (WGA). Finally, the proposed R-CNN with WGA effectively classifies the different stages of DR: Non-DR, Mild DR, Moderate DR, Severe DR, and Proliferative DR. The experimental images were collected from the DRIVE database, and the proposed framework exhibited superior DR detection performance. Compared to existing methods such as the fully convolutional deep neural network (FCDNN), genetic-search feature selection (GSFS), convolutional neural networks (CNN), and other deep learning (DL) techniques, the proposed R-CNN with WGA achieved 95.42% accuracy, 93.10% specificity, 93.20% sensitivity, and a 98.28% F-score.
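To show the flavor of the multithreshold segmentation step, here is a minimal sketch using multi-Otsu thresholding from scikit-image as a simple stand-in for the MTRO optimizer; the file name and class count are placeholders, and this is not the Remora algorithm itself.

```python
# Multithreshold segmentation sketch: multi-Otsu picks several gray-level
# thresholds, partitioning the image into labeled regions. A simple stand-in
# for optimizer-searched thresholds; file name and class count are placeholders.
import numpy as np
from skimage import io
from skimage.filters import threshold_multiotsu

image = io.imread("fundus.png", as_gray=True)       # placeholder file name
thresholds = threshold_multiotsu(image, classes=4)  # 3 thresholds -> 4 regions
regions = np.digitize(image, bins=thresholds)       # label map in {0, 1, 2, 3}
```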
Collapse
Affiliation(s)
- V. Desika Vinayaki, Department of Computer Science and Engineering, Noorul Islam Centre for Higher Education, Kumaracoil, India
- R. Kalaiselvi, Department of Computer Science and Engineering, Noorul Islam Centre for Higher Education, Kumaracoil, India
Collapse
|
32
|
Domain generalization on medical imaging classification using episodic training with task augmentation. Comput Biol Med 2021; 141:105144. [PMID: 34971982 DOI: 10.1016/j.compbiomed.2021.105144] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2021] [Revised: 12/12/2021] [Accepted: 12/13/2021] [Indexed: 12/22/2022]
Abstract
Medical imaging datasets usually exhibit domain shift due to variations in scanner vendors, imaging protocols, etc., which raises concerns about the generalization capacity of machine learning models. Domain generalization (DG), which aims to learn a model from multiple source domains such that it can be directly generalized to unseen test domains, is particularly promising for the medical imaging community. To address DG, model-agnostic meta-learning (MAML) has recently been introduced, which transfers knowledge from previous training tasks to facilitate the learning of novel testing tasks. However, in clinical practice there are usually only a few annotated source domains available, which limits the variety of training tasks that can be generated and thus increases the risk of overfitting to the training tasks. In this paper, we propose a novel DG scheme of episodic training with task augmentation for medical imaging classification. Based on meta-learning, we develop an episodic training paradigm that transfers knowledge from simulated training tasks to the real testing task of DG. Motivated by the limited number of source domains in real-world medical deployment, we identify this unique task-level overfitting and propose task augmentation to increase the variety of generated training tasks and thereby alleviate it. With the established learning framework, we further exploit a novel meta-objective to regularize the deep embedding of the training domains. To validate the effectiveness of the proposed method, we perform experiments on histopathological images and abdominal CT images.
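A single episode of this kind of meta-learning can be sketched as follows, assuming PyTorch 2.x (for torch.func.functional_call), a MAML-style inner update, and cross-entropy loss; the learning rate and domain sampling are illustrative, not the paper's procedure.

```python
# Sketch of one episode of episodic training across source domains: adapt on
# meta-train domains, then evaluate the adapted weights on a held-out
# meta-test domain that simulates the unseen target.
import random
import torch
import torch.nn.functional as F

def episode_step(model, domain_batches, inner_lr=1e-3):
    """domain_batches: dict mapping domain name -> (inputs, labels) tensors."""
    domains = list(domain_batches)
    meta_test = random.choice(domains)                  # simulated unseen domain
    meta_train = [d for d in domains if d != meta_test]

    # Inner step: adapt a fast copy of the weights on a meta-train domain.
    fast = {n: p.clone() for n, p in model.named_parameters()}
    x, y = domain_batches[random.choice(meta_train)]
    loss = F.cross_entropy(torch.func.functional_call(model, fast, (x,)), y)
    grads = torch.autograd.grad(loss, list(fast.values()), create_graph=True)
    fast = {n: p - inner_lr * g for (n, p), g in zip(fast.items(), grads)}

    # Outer objective: the adapted weights should work on the held-out domain.
    xt, yt = domain_batches[meta_test]
    return F.cross_entropy(torch.func.functional_call(model, fast, (xt,)), yt)
```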
Collapse
|
33
|
Enayet A, Sukthankar G. Learning a Generalizable Model of Team Conflict from Multiparty Dialogues. INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING 2021. [DOI: 10.1142/s1793351x21400110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Good communication is indubitably the foundation of effective teamwork. Over time, teams develop their own communication styles and often exhibit entrainment, a conversational phenomenon in which humans synchronize their linguistic choices. Conversely, teams may experience conflict due to either personal incompatibility or differing viewpoints. We tackle the problem of predicting team conflict from embeddings learned from multiparty dialogues, such that teams with similar post-task conflict scores lie close to one another in vector space. Embeddings were extracted from three types of features: (1) dialogue acts, (2) sentiment polarity, and (3) syntactic entrainment. Machine learning models often suffer from domain shift; one advantage of encoding semantic features is their adaptability across multiple domains. To provide intuition about how well the different embeddings generalize to other goal-oriented teamwork dialogues, we test the effectiveness of models trained on the Teams corpus on two other datasets. Unlike syntactic entrainment, both dialogue-act and sentiment embeddings are effective for identifying team conflict. Our results show that dialogue-act-based embeddings have the potential to generalize better than sentiment- and entrainment-based embeddings. These findings have potential ramifications for the development of conversational agents that facilitate teaming.
Collapse
Affiliation(s)
- Ayesha Enayet, Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Gita Sukthankar, Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
Collapse
|
34
|
Nomura Y, Hanaoka S, Takenaga T, Nakao T, Shibata H, Miki S, Yoshikawa T, Watadani T, Hayashi N, Abe O. Preliminary study of generalized semiautomatic segmentation for 3D voxel labeling of lesions based on deep learning. Int J Comput Assist Radiol Surg 2021; 16:1901-1913. [PMID: 34652606 DOI: 10.1007/s11548-021-02504-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2021] [Accepted: 09/17/2021] [Indexed: 11/28/2022]
Abstract
PURPOSE: The three-dimensional (3D) voxel labeling of lesions requires significant effort from radiologists during the development of computer-aided detection software. To reduce the time required for 3D voxel labeling, we aimed to develop a generalized semiautomatic segmentation method based on deep learning via a data augmentation-based domain generalization framework. In this study, we investigated whether a generalized semiautomatic segmentation model trained on two types of lesion can segment previously unseen types of lesion.
METHODS: We targeted lung nodules in chest CT images, liver lesions in hepatobiliary-phase images of Gd-EOB-DTPA-enhanced MR imaging, and brain metastases in contrast-enhanced MR images. For each lesion, the 32 × 32 × 32 isotropic volume of interest (VOI) around the lesion's center of gravity was extracted and input into a 3D U-Net model to define the label of the lesion. For each type of target lesion, we compared five types of data augmentation and two types of input data.
RESULTS: For all target lesions, the highest Dice coefficients among the training patterns were obtained when combining the existing data augmentation-based domain generalization framework with random monochrome inversion and when using the resized VOI as the input image. The Dice coefficients were 0.639 ± 0.124 for the lung nodules, 0.660 ± 0.137 for the liver lesions, and 0.727 ± 0.115 for the brain metastases.
CONCLUSIONS: Our generalized semiautomatic segmentation model could label three previously unseen types of lesion with contrasts that differ from their surroundings. In addition, using the resized VOI as the input image enables adaptation to various lesion sizes, even when the size distribution differs between the training and test sets.
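The VOI extraction described in METHODS can be sketched in a few lines; the constant padding at volume borders is an assumption, not necessarily the authors' handling.

```python
# Minimal sketch of extracting a 32x32x32 volume of interest (VOI) around a
# lesion's center of gravity from a 3D volume. Border handling via constant
# padding is an assumption.
import numpy as np
from scipy import ndimage

def extract_voi(volume: np.ndarray, lesion_mask: np.ndarray,
                size: int = 32) -> np.ndarray:
    """volume, lesion_mask: 3D arrays; lesion_mask must contain the lesion."""
    center = np.round(ndimage.center_of_mass(lesion_mask)).astype(int)
    half = size // 2
    padded = np.pad(volume, [(half, half)] * 3, mode='constant')
    z, y, x = center + half                 # coordinates shifted by the padding
    return padded[z - half:z + half, y - half:y + half, x - half:x + half]
```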
Collapse
Affiliation(s)
- Yukihiro Nomura, Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan; Center for Frontier Medical Engineering, Chiba University, 1-33 Yayoi-cho, Inage-ku, Chiba, 263-8522, Japan
- Shouhei Hanaoka, Department of Radiology, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Tomomi Takenaga, Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Takahiro Nakao, Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Hisaichi Shibata, Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Soichiro Miki, Department of Radiology, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Takeharu Yoshikawa, Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Takeyuki Watadani, Department of Radiology, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Naoto Hayashi, Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Osamu Abe, Department of Radiology, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
Collapse
|
35
|
Lei H, Liu W, Xie H, Zhao B, Yue G, Lei B. Unsupervised Domain Adaptation Based Image Synthesis and Feature Alignment for Joint Optic Disc and Cup Segmentation. IEEE J Biomed Health Inform 2021; 26:90-102. [PMID: 34061755 DOI: 10.1109/jbhi.2021.3085770] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Due to the discrepancy between different devices used for fundus image collection, a well-trained neural network is usually unsuitable for a new dataset. To solve this problem, unsupervised domain adaptation strategies have attracted a lot of attention. In this paper, we propose an unsupervised domain adaptation method based on image synthesis and feature alignment (ISFA) to segment the optic disc and cup in fundus images. A GAN-based image synthesis (IS) mechanism, together with the boundary information of the optic disc and cup, is utilized to generate target-like query images, which serve as an intermediate latent space between source domain and target domain images to alleviate the domain shift problem. Specifically, we use content and style feature alignment (CSFA) to ensure feature consistency among source domain images, target-like query images, and target domain images. Adversarial learning is used to extract domain-invariant features for output-level feature alignment (OLFA). To enhance the representation of domain-invariant boundary structure information, we introduce an edge attention module (EAM) for the low-level feature maps. Finally, we train our proposed method on the training set of the REFUGE challenge dataset and test it on the Drishti-GS and RIM-ONE_r3 datasets. On the Drishti-GS dataset, our method achieves about a 3% Dice improvement in optic cup segmentation over the next best method. We comprehensively discuss the robustness of our method for small-dataset domain adaptation. The experimental results also demonstrate the effectiveness of our method. Our code is available at https://github.com/thinkobj/ISFA.
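To illustrate the output-level adversarial alignment step (OLFA), here is a minimal sketch with a small convolutional domain discriminator over softmax segmentation outputs; the discriminator architecture and loss weighting are assumptions, and the authors' actual code is at the linked repository.

```python
# Sketch of output-level adversarial feature alignment: a discriminator judges
# whether softmax segmentation outputs come from source or target images, and
# the segmentation network is rewarded when target outputs fool it.
import torch
import torch.nn as nn
import torch.nn.functional as F

discriminator = nn.Sequential(                 # small patch-level discriminator
    nn.Conv2d(2, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1))

def alignment_losses(src_logits, tgt_logits):
    """src/tgt_logits: (B, 2, H, W) segmentation outputs for disc and cup."""
    d_src = discriminator(torch.softmax(src_logits, 1).detach())
    d_tgt = discriminator(torch.softmax(tgt_logits, 1).detach())
    d_loss = (F.binary_cross_entropy_with_logits(d_src, torch.ones_like(d_src)) +
              F.binary_cross_entropy_with_logits(d_tgt, torch.zeros_like(d_tgt)))
    # Generator term: push target outputs toward the source distribution.
    g_tgt = discriminator(torch.softmax(tgt_logits, 1))
    g_loss = F.binary_cross_entropy_with_logits(g_tgt, torch.ones_like(g_tgt))
    return d_loss, g_loss
```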
Collapse
|