1. Lin Q, Guo S, Zhang H, Gao Z. Causal recurrent intervention for cross-modal cardiac image segmentation. Comput Med Imaging Graph 2025;123:102549. [PMID: 40279865] [DOI: 10.1016/j.compmedimag.2025.102549]
Abstract
Cross-modal cardiac image segmentation is essential for cardiac disease analysis. In diagnosis, it enables clinicians to leverage specific imaging modalities to obtain more precise information about cardiac structure and function and to detect potential signs of disease. For instance, cardiovascular pathologies such as myocardial infarction and congenital heart defects require precise cross-modal characterization to guide clinical decisions. The growing adoption of cross-modal segmentation in clinical research underscores its technical value, yet annotating cardiac images with multiple slices is time-consuming and labor-intensive, making it difficult to meet clinical and deep learning demands. To reduce the need for labels, cross-modal approaches could leverage general knowledge from multiple modalities. However, implementing a cross-modal method remains challenging due to cross-domain confounding. This challenge arises from the intricate effects of modality and view alterations between images, including inconsistent high-dimensional features. The confounding complicates the causality between the observation (image) and the prediction (label), thereby weakening the domain-invariant representation. Existing disentanglement methods face difficulties in addressing the confounding due to their insufficient depiction of the relationship between latent factors. This paper proposes the causal recurrent intervention (CRI) method to overcome the above challenge. It establishes a structural causal model that allows individual domains to maintain causal consistency through interventions. The CRI method integrates diverse high-dimensional variations into a single causal relationship by embedding image slices into a sequence. It further distinguishes stable and dynamic factors from the sequence, subsequently separating the stable factor into modal and view factors and establishing causal connections between them. It then learns the dynamic factor and the view factor from the observation to obtain the label. Experimental results on cross-modal cardiac images from 1697 examples show that the CRI method delivers promising cross-modal cardiac image segmentation performance.
Affiliations
- Qixin Lin: School of Biomedical Engineering, Sun Yat-sen University, Shenzhen, China.
- Saidi Guo: School of Cyberspace Security, Zhengzhou University, Zhengzhou, China.
- Heye Zhang: School of Biomedical Engineering, Sun Yat-sen University, Shenzhen, China; Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China.
- Zhifan Gao: School of Biomedical Engineering, Sun Yat-sen University, Shenzhen, China.
2. Cai Z, Xin J, You C, Shi P, Dong S, Dvornek NC, Zheng N, Duncan JS. Style mixup enhanced disentanglement learning for unsupervised domain adaptation in medical image segmentation. Med Image Anal 2025;101:103440. [PMID: 39764933] [DOI: 10.1016/j.media.2024.103440]
Abstract
Unsupervised domain adaptation (UDA) has shown impressive performance in cross-modality medical segmentation, improving the generalizability of models to tackle the domain shift problem. However, most existing UDA approaches depend on high-quality image translation with diversity constraints to explicitly augment the potential data diversity, which makes it hard to ensure semantic consistency and to capture domain-invariant representations. In this paper, free of image translation and diversity constraints, we propose a novel Style Mixup Enhanced Disentanglement Learning (SMEDL) method for UDA medical image segmentation to further improve domain generalization and enhance domain-invariant learning ability. Firstly, our method adopts disentangled style mixup to implicitly generate style-mixed domains with diverse styles in the feature space through a convex combination of disentangled style factors, which can effectively improve model generalization. Meanwhile, we further introduce pixel-wise consistency regularization to ensure the effectiveness of style-mixed domains and provide domain consistency guidance. Secondly, we introduce dual-level domain-invariant learning, including intra-domain contrastive learning and inter-domain adversarial learning, to mine the underlying domain-invariant representation under both intra- and inter-domain variations. We have conducted comprehensive experiments to evaluate our method on two public cardiac datasets and one brain dataset. Experimental results demonstrate that our proposed method achieves superior performance compared to state-of-the-art methods for UDA medical image segmentation.
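The abstract leaves the mixup mechanics implicit. As a rough illustration, style mixup can be read as a convex combination of instance-level style statistics (in the spirit of MixStyle); the sketch below is that generic operation under stated assumptions, not the paper's actual SMEDL module, and the function name and hyperparameters are illustrative.

```python
import torch

def style_mixup(feat: torch.Tensor, alpha: float = 0.3) -> torch.Tensor:
    """Illustrative style mixup: convex combination of per-instance style statistics.

    feat: feature map of shape (B, C, H, W). Channel-wise mean/std act as the
    "style factor"; the normalized residual acts as the "content factor".
    """
    b = feat.size(0)
    mu = feat.mean(dim=(2, 3), keepdim=True)            # style: channel means
    sigma = feat.std(dim=(2, 3), keepdim=True) + 1e-6   # style: channel stds
    content = (feat - mu) / sigma                       # style-normalized content

    perm = torch.randperm(b, device=feat.device)        # pair each sample with another
    lam = torch.distributions.Beta(alpha, alpha).sample((b, 1, 1, 1)).to(feat.device)
    mu_mix = lam * mu + (1 - lam) * mu[perm]            # convex combination of styles
    sigma_mix = lam * sigma + (1 - lam) * sigma[perm]
    return content * sigma_mix + mu_mix                 # re-style the content
```

Applied inside an encoder during training, this implicitly generates style-mixed domains in feature space without any image translation.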
Affiliations
- Zhuotong Cai: National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi, China; Department of Biomedical Engineering, Yale University, New Haven, CT, USA.
- Jingmin Xin: National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi, China.
- Chenyu You: Department of Electrical Engineering, Yale University, New Haven, CT, USA.
- Peiwen Shi: National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi, China.
- Siyuan Dong: Department of Electrical Engineering, Yale University, New Haven, CT, USA.
- Nicha C Dvornek: Department of Biomedical Engineering, Yale University, New Haven, CT, USA.
- Nanning Zheng: National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi, China.
- James S Duncan: Department of Electrical Engineering, Yale University, New Haven, CT, USA; Department of Biomedical Engineering, Yale University, New Haven, CT, USA.
3. Wang Y, Meng C, Tang Z, Bai X, Ji P, Bai X. Unsupervised Domain Adaptation for Cross-Modality Cerebrovascular Segmentation. IEEE J Biomed Health Inform 2025;29:2871-2884. [PMID: 40030830] [DOI: 10.1109/jbhi.2024.3523103]
Abstract
Cerebrovascular segmentation from time-of-flight magnetic resonance angiography (TOF-MRA) and computed tomography angiography (CTA) provides supportive information for diagnosing and planning the treatment of multiple intracranial vascular diseases. Different imaging modalities visualize the cerebral vasculature according to distinct principles, which makes annotation expensive and degrades performance when deep learning models are trained and deployed across modalities. In this paper, we propose CereTS, an unsupervised domain adaptation framework that performs translation and segmentation of cross-modality unpaired cerebral angiography. Treating the commonality of vascular structures and stylistic textures as domain-invariant and domain-specific features, respectively, CereTS adopts a multi-level domain alignment pattern that includes an image-level cyclic geometric consistency constraint, a patch-level masked contrastive constraint, and a feature-level semantic perception constraint to shrink the domain discrepancy while preserving the consistency of vascular structures. Experiments on a publicly available TOF-MRA dataset and a private CTA dataset show that CereTS outperforms current state-of-the-art methods by a large margin.
4. Beizaee F, Lodygensky GA, Adamson CL, Thompson DK, Cheong JLY, Spittle AJ, Anderson PJ, Desrosiers C, Dolz J. Harmonizing flows: Leveraging normalizing flows for unsupervised and source-free MRI harmonization. Med Image Anal 2025;101:103483. [PMID: 39919411] [DOI: 10.1016/j.media.2025.103483]
Abstract
Lack of standardization and various intrinsic parameters for magnetic resonance (MR) image acquisition result in heterogeneous images across different sites and devices, which adversely affects the generalization of deep neural networks. To alleviate this issue, this work proposes a novel unsupervised harmonization framework that leverages normalizing flows to align MR images, thereby emulating the distribution of a source domain. The proposed strategy comprises three key steps. Initially, a normalizing flow network is trained to capture the distribution characteristics of the source domain. Then, we train a shallow harmonizer network to reconstruct images from the source domain via their augmented counterparts. Finally, during inference, the harmonizer network is updated to ensure that the output images conform to the learned source domain distribution, as modeled by the normalizing flow network. Our approach, which is unsupervised, source-free, and task-agnostic, is assessed in the context of both adult and neonatal cross-domain brain MRI segmentation, as well as neonatal brain age estimation, demonstrating its generalizability across tasks and population demographics. The results underscore its superior performance compared to existing methodologies. The code is available at https://github.com/farzad-bz/Harmonizing-Flows.
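The third step, updating the harmonizer against the flow's density estimate at inference time, can be summarized in a few lines. This is a hedged sketch: `flow` and `harmonizer` are assumed objects (the flow exposing a `log_prob` method, as in common normalizing-flow libraries), not the authors' actual API.

```python
import torch

def adapt_harmonizer(harmonizer, flow, target_batch, steps=100, lr=1e-4):
    """Update the harmonizer so its outputs score highly under the source-domain flow.

    flow: a normalizing flow trained on source-domain images, kept frozen;
    harmonizer: a shallow image-to-image network updated at inference time.
    """
    for p in flow.parameters():
        p.requires_grad_(False)                  # the flow only scores; it is never updated
    flow.eval()
    opt = torch.optim.Adam(harmonizer.parameters(), lr=lr)
    for _ in range(steps):
        harmonized = harmonizer(target_batch)
        nll = -flow.log_prob(harmonized).mean()  # pull outputs toward the source density
        opt.zero_grad()
        nll.backward()                           # gradients reach the harmonizer only
        opt.step()
    return harmonizer
```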
Affiliations
- Farzad Beizaee: LIVIA, ÉTS, Montreal, Quebec, Canada; ILLS, McGill - ETS - Mila - CNRS - Université Paris-Saclay - CentraleSupelec, Canada; CHU Sainte-Justine, University of Montreal, Montreal, Canada.
- Gregory A Lodygensky: CHU Sainte-Justine, University of Montreal, Montreal, Canada; Canadian Neonatal Brain Platform, Montreal, Canada.
- Chris L Adamson: Murdoch Children's Research Institute, Parkville, Victoria, Australia.
- Deanne K Thompson: Murdoch Children's Research Institute, Parkville, Victoria, Australia; School of Psychological Sciences, Monash University, Clayton, Victoria, Australia; Department of Paediatrics, The University of Melbourne, Victoria, Australia.
- Jeanie L Y Cheong: Murdoch Children's Research Institute, Parkville, Victoria, Australia; Department of Paediatrics, The University of Melbourne, Victoria, Australia; The Royal Women's Hospital, Melbourne, Parkville, Victoria, Australia; Department of Obstetrics and Gynaecology, The University of Melbourne, Victoria, Australia.
- Alicia J Spittle: Murdoch Children's Research Institute, Parkville, Victoria, Australia; The Royal Women's Hospital, Melbourne, Parkville, Victoria, Australia; Department of Physiotherapy, The University of Melbourne, Victoria, Australia.
- Peter J Anderson: Murdoch Children's Research Institute, Parkville, Victoria, Australia; School of Psychological Sciences, Monash University, Clayton, Victoria, Australia.
- Christian Desrosiers: LIVIA, ÉTS, Montreal, Quebec, Canada; ILLS, McGill - ETS - Mila - CNRS - Université Paris-Saclay - CentraleSupelec, Canada.
- Jose Dolz: LIVIA, ÉTS, Montreal, Quebec, Canada; ILLS, McGill - ETS - Mila - CNRS - Université Paris-Saclay - CentraleSupelec, Canada.
5. Chen R, Yang J, Xiong H, Xu R, Feng Y, Wu J, Liu Z. Cross-center Model Adaptive Tooth segmentation. Med Image Anal 2025;101:103443. [PMID: 39778266] [DOI: 10.1016/j.media.2024.103443]
Abstract
Automatic 3-dimensional tooth segmentation on intraoral scans (IOS) plays a pivotal role in computer-aided orthodontic treatment. In practice, deploying existing well-trained models at different medical centers suffers from two main problems: (1) data distribution shifts between existing and new centers cause significant performance degradation; (2) data in the existing center(s) is usually not permitted to be shared, and annotating additional data in the new center(s) is time-consuming and expensive, making re-training or fine-tuning infeasible. In this paper, we propose a framework for Cross-center Model Adaptive Tooth segmentation (CMAT) to alleviate these issues. CMAT takes the trained model(s) from the source center(s) as input and adapts them to different target centers, without data transmission or additional annotations. CMAT is applicable to three cross-center scenarios: source-data-free, multi-source-data-free, and test-time. The model adaptation in CMAT is realized by a tooth-level prototype alignment module, a progressive pseudo-labeling transfer module, and a tooth-prior regularized information maximization module. Experiments under three cross-center scenarios on two datasets show that CMAT consistently surpasses existing baselines. The effectiveness is further verified with extensive ablation studies and statistical analysis, demonstrating its applicability for privacy-preserving model adaptive tooth segmentation in real-world digital dentistry.
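The abstract names a tooth-level prototype alignment module without detail. A common realization of prototype alignment is masked average pooling per class plus a similarity loss; the sketch below shows that generic mechanism under stated assumptions, not CMAT's actual module.

```python
import torch
import torch.nn.functional as F

def class_prototypes(features: torch.Tensor, labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Masked average pooling: one prototype vector per tooth class.

    features: (B, C, N) per-point features; labels: (B, N) class indices.
    """
    protos = []
    for c in range(num_classes):
        mask = (labels == c).unsqueeze(1).float()               # (B, 1, N)
        denom = mask.sum(dim=(0, 2)).clamp(min=1.0)             # avoid division by zero
        protos.append((features * mask).sum(dim=(0, 2)) / denom)
    return torch.stack(protos)                                  # (num_classes, C)

def prototype_alignment_loss(source_protos, target_features, target_pseudo_labels, num_classes):
    """Pull target prototypes (built from pseudo-labels) toward frozen source prototypes."""
    target_protos = class_prototypes(target_features, target_pseudo_labels, num_classes)
    return 1.0 - F.cosine_similarity(source_protos, target_protos, dim=1).mean()
```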
Affiliations
- Ruizhe Chen: Stomatology Hospital Affiliated to Zhejiang University of Medicine, Zhejiang University, Hangzhou, 310016, China; ZJU-Angelalign R&D Center for Intelligence Healthcare, ZJU-UIUC Institute, Zhejiang University, Haining, 314400, China; Zhejiang Key Laboratory of Medical Imaging Artificial Intelligence, Zhejiang University, Hangzhou, 310058, China.
- Jianfei Yang: School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore, 639798, Singapore.
- Huimin Xiong: ZJU-Angelalign R&D Center for Intelligence Healthcare, ZJU-UIUC Institute, Zhejiang University, Haining, 314400, China; Zhejiang Key Laboratory of Medical Imaging Artificial Intelligence, Zhejiang University, Hangzhou, 310058, China.
- Ruiling Xu: ZJU-Angelalign R&D Center for Intelligence Healthcare, ZJU-UIUC Institute, Zhejiang University, Haining, 314400, China.
- Yang Feng: Angelalign Technology Inc., Shanghai, 200433, China.
- Jian Wu: Zhejiang Key Laboratory of Medical Imaging Artificial Intelligence, Zhejiang University, Hangzhou, 310058, China; State Key Laboratory of Transvascular Implantation Devices of The Second Affiliated Hospital, School of Medicine and School of Public Health, Zhejiang University, Hangzhou, 310058, China.
- Zuozhu Liu: Stomatology Hospital Affiliated to Zhejiang University of Medicine, Zhejiang University, Hangzhou, 310016, China; ZJU-Angelalign R&D Center for Intelligence Healthcare, ZJU-UIUC Institute, Zhejiang University, Haining, 314400, China; Zhejiang Key Laboratory of Medical Imaging Artificial Intelligence, Zhejiang University, Hangzhou, 310058, China.
6. Qian X, Shao HC, Li Y, Lu W, Zhang Y. Histogram matching-enhanced adversarial learning for unsupervised domain adaptation in medical image segmentation. Med Phys 2025. [PMID: 40102198] [DOI: 10.1002/mp.17757]
Abstract
BACKGROUND: Unsupervised domain adaptation (UDA) seeks to mitigate the performance degradation of deep neural networks when applied to new, unlabeled domains by leveraging knowledge from source domains. In medical image segmentation, prevailing UDA techniques often utilize adversarial learning to address domain shifts for cross-modality adaptation. Current research on adversarial learning tends to adopt increasingly complex models and loss functions, making the training process highly intricate and less stable and robust. Furthermore, most methods have focused primarily on segmentation accuracy while neglecting the associated confidence levels and uncertainties.
PURPOSE: To develop a simple yet effective UDA method based on histogram matching-enhanced adversarial learning (HMeAL-UDA), and to provide comprehensive uncertainty estimations of the model predictions.
METHODS: Aiming to bridge the domain gap while reducing model complexity, we developed a novel adversarial learning approach to align multi-modality features. The method, termed HMeAL-UDA, integrates a plug-and-play histogram matching strategy to mitigate domain-specific image style biases across modalities. We employed adversarial learning to constrain the model in the prediction space, enabling it to focus on domain-invariant features during segmentation. Moreover, we quantified the model's prediction confidence using Monte Carlo (MC) dropout to assess two voxel-level uncertainty estimates of the segmentation results, which were subsequently aggregated into a volume-level uncertainty score, providing an overall measure of the model's reliability. The proposed method was evaluated on three public datasets (Combined Healthy Abdominal Organ Segmentation [CHAOS], Beyond the Cranial Vault [BTCV], and Abdominal Multi-Organ Segmentation Challenge [AMOS]) and one in-house clinical dataset (UTSW). We used 30 MRI scans (20 from the CHAOS dataset and 10 from the in-house dataset) and 30 CT scans from the BTCV dataset for UDA-based, cross-modality liver segmentation. Additionally, 240 CT scans and 60 MRI scans from the AMOS dataset were utilized for cross-modality multi-organ segmentation. The training and testing sets for each modality were split with ratios of approximately 4:1 to 3:1.
RESULTS: Extensive experiments on cross-modality medical image segmentation demonstrated the superiority of HMeAL-UDA over two state-of-the-art approaches. HMeAL-UDA achieved a mean (± s.d.) Dice similarity coefficient (DSC) of 91.34% ± 1.23% and an HD95 of 6.18 ± 2.93 mm for cross-modality (CT to MRI) adaptation of abdominal multi-organ segmentation, and a DSC of 87.13% ± 3.67% with an HD95 of 2.48 ± 1.56 mm for segmentation adaptation in the opposite direction (MRI to CT). The results approach, and in some cases outperform, those of supervised methods trained with ground-truth labels in the target domain. In addition, we provide a comprehensive assessment of the model's uncertainty, which can help with understanding segmentation reliability to guide clinical decisions.
CONCLUSION: HMeAL-UDA provides a powerful segmentation tool to address cross-modality domain shifts, with the potential to generalize to other deep learning applications in medical imaging.
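Two ingredients of this abstract are standard enough to sketch: histogram matching as a plug-and-play style-bias reducer, and MC-dropout uncertainty. Assumptions: scikit-image's `match_histograms` stands in for the paper's matching strategy, and `model` is any segmentation network containing dropout layers.

```python
import numpy as np
import torch
from skimage.exposure import match_histograms

def match_to_source(target_img: np.ndarray, source_reference: np.ndarray) -> np.ndarray:
    """Map the target image's intensity histogram onto a source-domain reference image."""
    return match_histograms(target_img, source_reference)

def mc_dropout(model: torch.nn.Module, x: torch.Tensor, n_samples: int = 20):
    """Voxel-wise predictive mean and variance from repeated stochastic forward passes.

    Note: model.train() also switches BatchNorm to train mode; real implementations
    usually re-enable only the dropout layers.
    """
    model.train()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1) for _ in range(n_samples)])
    return probs.mean(dim=0), probs.var(dim=0)
```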
Affiliations
- Xiaoxue Qian, Hua-Chieh Shao, Yunxiang Li, Weiguo Lu, You Zhang: Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, Texas, USA.
7. Han K, Lou Q, Lu F. A semi-supervised domain adaptation method with scale-aware and global-local fusion for abdominal multi-organ segmentation. J Appl Clin Med Phys 2025;26:e70008. [PMID: 39924943] [PMCID: PMC11905256] [DOI: 10.1002/acm2.70008]
Abstract
BACKGROUND: Abdominal multi-organ segmentation remains a challenging task. Semi-supervised domain adaptation (SSDA) has emerged as an innovative solution, but SSDA frameworks based on UNet struggle to capture multi-scale and global information.
PURPOSE: Our work aimed to propose a novel SSDA method that achieves more accurate abdominal multi-organ segmentation with limited labeled target domain data, with a superior ability to capture multi-scale features and to integrate local and global information effectively.
METHODS: The proposed network is based on UNet. In the encoder, a scale-aware module with domain-specific batch normalization (SAD) is integrated to adaptively extract multi-scale features and to achieve better generalization across the source and target domains. In the bottleneck, a global-local fusion (GLF) module captures and integrates both local and global information. Both modules are integrated into the self-ensembling mean-teacher (SE-MT) framework to enhance the model's capability to learn features common to the source and target domains.
RESULTS: To validate the performance of the proposed model, we evaluated it on the public CHAOS and BTCV datasets. For CHAOS, the proposed method obtains an average DSC of 88.97% and an ASD of 1.12 mm with only 20% labeled target data. For BTCV, it achieves an average DSC of 88.95% and an ASD of 1.13 mm with 20% labeled target data. Compared with state-of-the-art methods, DSC and ASD improved by at least 0.72% and 0.33 mm on CHAOS, and 1.29% and 0.06 mm on BTCV, respectively. Ablation studies were also conducted to verify the contribution of each component; the proposed method achieves a DSC improvement of 3.17% over the baseline with 20% labeled target data.
CONCLUSION: The proposed SSDA method for abdominal multi-organ segmentation has a powerful ability to extract multi-scale and global features, significantly improving segmentation accuracy and robustness.
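Domain-specific batch normalization, part of the SAD module above, is a known construct: each domain keeps its own normalization statistics while the other layers are shared. A minimal sketch (the module name and two-domain default are illustrative, not from the paper):

```python
import torch
import torch.nn as nn

class DomainSpecificBN2d(nn.Module):
    """One BatchNorm branch per domain; all other layers remain shared."""

    def __init__(self, num_features: int, num_domains: int = 2):
        super().__init__()
        self.bns = nn.ModuleList(nn.BatchNorm2d(num_features) for _ in range(num_domains))

    def forward(self, x: torch.Tensor, domain: int) -> torch.Tensor:
        # domain 0 = source, domain 1 = target (by convention, not from the paper)
        return self.bns[domain](x)
```

During training, source and target batches are routed through their own branch so that each domain's feature statistics stay uncontaminated by the other's.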
Affiliations
- Kexin Han, Qiong Lou, Fang Lu: School of Science, Zhejiang University of Science and Technology, Hangzhou, China.
8. Zhao S, Sun Q, Yang J, Yuan Y, Huang Y, Li Z. Structure preservation constraints for unsupervised domain adaptation intracranial vessel segmentation. Med Biol Eng Comput 2025;63:609-627. [PMID: 39432222] [DOI: 10.1007/s11517-024-03195-9]
Abstract
Unsupervised domain adaptation (UDA) has received interest as a means to alleviate the burden of data annotation. Nevertheless, existing UDA segmentation methods exhibit performance degradation in fine intracranial vessel segmentation tasks due to the problem of structure mismatch in the image synthesis procedure. To improve the image synthesis quality and the segmentation performance, a novel UDA segmentation method with structure preservation approaches, named StruP-Net, is proposed. The StruP-Net employs adversarial learning for image synthesis and utilizes two domain-specific segmentation networks to enhance the semantic consistency between real images and synthesized images. Additionally, two distinct structure preservation approaches, feature-level structure preservation (F-SP) and image-level structure preservation (I-SP), are proposed to alleviate the problem of structure mismatch in the image synthesis procedure. The F-SP, composed of two domain-specific graph convolutional networks (GCN), focuses on providing feature-level constraints to enhance the structural similarity between real images and synthesized images. Meanwhile, the I-SP imposes constraints on structure similarity based on perceptual loss. The cross-modality experimental results from magnetic resonance angiography (MRA) images to computed tomography angiography (CTA) images indicate that StruP-Net achieves better segmentation performance compared with other state-of-the-art methods. Furthermore, high inference efficiency demonstrates the clinical application potential of StruP-Net. The code is available at https://github.com/Mayoiuta/StruP-Net.
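The I-SP constraint rests on a perceptual loss, which is standard and easy to sketch: distances between frozen VGG feature maps of the real and synthesized images. The layer choice and L1 distance below are assumptions, not the paper's configuration.

```python
import torch
import torch.nn.functional as F
from torchvision import models

class PerceptualLoss(torch.nn.Module):
    """Image-level structure constraint: L1 distance between frozen VGG16 features."""

    def __init__(self, layer_idx: int = 16):  # features[:16] ends at relu3_3
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.features = vgg.features[:layer_idx].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)

    def forward(self, real: torch.Tensor, synthesized: torch.Tensor) -> torch.Tensor:
        # Single-channel angiography slices (B, 1, H, W) are repeated to 3 channels for VGG.
        real3 = real.repeat(1, 3, 1, 1)
        syn3 = synthesized.repeat(1, 3, 1, 1)
        return F.l1_loss(self.features(syn3), self.features(real3))
```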
Affiliations
- Sizhe Zhao, Qi Sun, Jinzhu Yang, Yuliang Yuan, Yan Huang: Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China.
- Zhiqing Li: The First Affiliated Hospital of China Medical University, Shenyang, Liaoning, China.
9. Chen W, Ye Q, Guo L, Wu Q. Unsupervised cross-modality domain adaptation via source-domain labels guided contrastive learning for medical image segmentation. Med Biol Eng Comput 2025. [PMID: 39939403] [DOI: 10.1007/s11517-025-03312-2]
Abstract
Unsupervised domain adaptation (UDA) offers a promising approach to enhance discriminative performance on target domains by utilizing domain adaptation techniques. These techniques enable models to leverage knowledge from the source domain to adjust to the feature distribution of the target domain. This paper proposes a unified domain adaptation framework that carries out cross-modality medical image segmentation from two perspectives: image and feature. To achieve image alignment, the loss function of Fourier-based Contrastive Style Augmentation (FCSA) has been fine-tuned to increase the impact of style change, improving system robustness. For feature alignment, a module called Source-domain Labels Guided Contrastive Learning (SLGCL) has been designed to encourage the target domain to align features of different classes with those of the source domain. In addition, a generative adversarial network has been incorporated to ensure consistency in spatial layout and local context in the generated image space. To our knowledge, our method is the first attempt to utilize source domain class intensity information to guide target domain class intensity information for feature alignment in an unsupervised domain adaptation setting. Extensive experiments conducted on a public whole heart image segmentation task demonstrate that our proposed method outperforms state-of-the-art UDA methods for medical image segmentation.
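Fourier-based style augmentation, the core operation underlying FCSA, typically swaps low-frequency amplitude spectra between images while keeping phase, so anatomy is preserved while appearance changes. A minimal single-channel sketch of that operation (the `beta` band size is illustrative; the paper's exact loss tuning is not reproduced here):

```python
import numpy as np

def fourier_style_transfer(content_img: np.ndarray, style_img: np.ndarray, beta: float = 0.05) -> np.ndarray:
    """Swap the low-frequency amplitude spectrum of a 2D image with another image's.

    Low-frequency amplitudes carry "style" (contrast, intensity layout), while
    phase retains structure, so the output keeps the content anatomy with the
    style image's appearance.
    """
    f_c = np.fft.fftshift(np.fft.fft2(content_img))
    f_s = np.fft.fftshift(np.fft.fft2(style_img))
    amp_c, pha_c = np.abs(f_c), np.angle(f_c)
    amp_s = np.abs(f_s)

    h, w = content_img.shape
    bh, bw = max(1, int(h * beta)), max(1, int(w * beta))  # size of the swapped band
    cy, cx = h // 2, w // 2
    amp_c[cy - bh:cy + bh, cx - bw:cx + bw] = amp_s[cy - bh:cy + bh, cx - bw:cx + bw]

    f_new = np.fft.ifftshift(amp_c * np.exp(1j * pha_c))   # recombine with original phase
    return np.real(np.fft.ifft2(f_new))
```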
Affiliations
- Wenshuang Chen, Qi Ye, Lihua Guo, Qi Wu: School of Electronic and Information Engineering, South China University of Technology, Wushan Road 381, Guangzhou, Guangdong, 510641, China.
10. Chen Z, Bian Y, Shen E, Fan L, Zhu W, Shi F, Shao C, Chen X, Xiang D. Moment-Consistent Contrastive CycleGAN for Cross-Domain Pancreatic Image Segmentation. IEEE Trans Med Imaging 2025;44:422-435. [PMID: 39167524] [DOI: 10.1109/tmi.2024.3447071]
Abstract
CT and MR are currently the most common imaging techniques for pancreatic cancer diagnosis. Accurate segmentation of the pancreas in CT and MR images can provide significant help in the diagnosis and treatment of pancreatic cancer. Traditional supervised segmentation methods require large amounts of labeled CT and MR training data, whose annotation is usually time-consuming and laborious. Meanwhile, due to domain shift, traditional segmentation networks are difficult to deploy across datasets from different imaging modalities. Cross-domain segmentation can utilize labeled source domain data to assist unlabeled target domains in solving the above problems. In this paper, a cross-domain pancreas segmentation algorithm is proposed based on Moment-Consistent Contrastive Cycle Generative Adversarial Networks (MC-CCycleGAN). MC-CCycleGAN is a style transfer network in which the encoder of the generator extracts features from real and style-transferred images; a contrastive loss constrains feature extraction so that structural features of the input images are fully extracted during style transfer while redundant style features are eliminated. Multi-order central moments of the pancreas are proposed to describe its anatomy in high dimensions, and a contrastive loss is also proposed to constrain moment consistency, so as to maintain the consistency of pancreatic structure and shape before and after style transfer. A multi-teacher knowledge distillation framework is proposed to transfer the knowledge from multiple teachers to a single student, improving the robustness and performance of the student network. The experimental results demonstrate the superiority of our framework over state-of-the-art domain adaptation methods.
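Multi-order central moments of a segmentation mask are straightforward to compute. The sketch below shows scale-normalized central moments of a binary pancreas mask, which is one plausible reading of the descriptor mentioned above; the paper's exact definition is not given in the abstract.

```python
import numpy as np

def central_moments_2d(mask: np.ndarray, max_order: int = 4) -> np.ndarray:
    """Scale-normalized central moments eta_pq of a binary mask, for 2 <= p+q <= max_order.

    Higher-order moments summarize shape beyond centroid and spread, giving a
    translation- and scale-invariant description of the organ's anatomy.
    """
    assert mask.any(), "expects a non-empty mask"
    ys, xs = np.nonzero(mask)
    m00 = float(len(xs))                  # zeroth moment = area in pixels
    cx, cy = xs.mean(), ys.mean()         # centroid
    moments = []
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            if p + q < 2:
                continue                  # mu_00, mu_10, mu_01 carry no shape information
            mu = ((xs - cx) ** p * (ys - cy) ** q).sum()
            moments.append(mu / m00 ** (1 + (p + q) / 2))  # scale normalization
    return np.array(moments)
```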
11. Salle G, Andrade-Miranda G, Conze PH, Boussion N, Bert J, Visvikis D, Jaouen V. Cross-Modal Tumor Segmentation Using Generative Blending Augmentation and Self-Training. IEEE Trans Biomed Eng 2025;72:370-380. [PMID: 38557627] [DOI: 10.1109/tbme.2024.3384014]
Abstract
OBJECTIVES: Data scarcity and domain shifts lead to biased training sets that do not accurately represent deployment conditions. A related practical problem is cross-modal image segmentation, where the objective is to segment unlabelled images using previously labelled datasets from other imaging modalities.
METHODS: We propose a cross-modal segmentation method based on conventional image synthesis boosted by a new data augmentation technique called Generative Blending Augmentation (GBA). GBA leverages a SinGAN model to learn representative generative features from a single training image to realistically diversify tumor appearances. This way, we compensate for image synthesis errors, subsequently improving the generalization power of a downstream segmentation model. The proposed augmentation is further combined with an iterative self-training procedure leveraging pseudo labels at each pass.
RESULTS: The proposed solution ranked first for vestibular schwannoma (VS) segmentation during the validation and test phases of the MICCAI CrossMoDA 2022 challenge, with the best mean Dice similarity and average symmetric surface distance measures.
CONCLUSION AND SIGNIFICANCE: Local contrast alteration of tumor appearances and iterative self-training with pseudo labels are likely to lead to performance improvements in a variety of segmentation contexts.
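The self-training half of this pipeline is generic enough to sketch: predict on unlabeled target images, keep confident pseudo labels, retrain, and repeat. The confidence threshold and ignore index below are illustrative choices; GBA itself (the SinGAN component) is omitted.

```python
import torch
import torch.nn.functional as F

def self_training_round(model, unlabeled_batches, optimizer, conf_thresh: float = 0.9):
    """One pass of pseudo-label self-training; low-confidence voxels are ignored."""
    model.eval()
    pseudo = []
    with torch.no_grad():
        for x in unlabeled_batches:
            probs = torch.softmax(model(x), dim=1)
            conf, label = probs.max(dim=1)
            label[conf < conf_thresh] = 255       # 255 = ignore index (arbitrary choice)
            pseudo.append((x, label))
    model.train()
    for x, y in pseudo:
        loss = F.cross_entropy(model(x), y, ignore_index=255)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```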
12. Wang S, Liu L, Wang J, Peng X, Liu B. MSR-UNet: enhancing multi-scale and long-range dependencies in medical image segmentation. PeerJ Comput Sci 2024;10:e2563. [PMID: 39650414] [PMCID: PMC11623095] [DOI: 10.7717/peerj-cs.2563]
Abstract
Transformer-based technology has attracted widespread attention in medical image segmentation. Due to the diversity of organs, effective modeling of multi-scale information and establishing long-range dependencies between pixels are crucial for successful medical image segmentation. However, most studies rely on a fixed single-scale window for modeling, which ignores the potential impact of window size on performance. This limitation can hinder window-based models' ability to fully explore multi-scale and long-range relationships within medical images. To address this issue, we propose a multi-scale reconfiguration self-attention (MSR-SA) module that accurately models multi-scale information and long-range dependencies in medical images. The MSR-SA module first divides the attention heads into multiple groups, each assigned an ascending dilation rate. These groups are then uniformly split into several non-overlapping local windows. Using dilated sampling, we gather the same number of keys to obtain both long-range and multi-scale information. Finally, dynamic information fusion is achieved by integrating features from the sampling points at corresponding positions across different windows. Based on the MSR-SA module, we propose a multi-scale reconfiguration U-Net (MSR-UNet) framework for medical image segmentation. Experiments on the Synapse and Automated Cardiac Diagnosis Challenge (ACDC) datasets show that MSR-UNet achieves satisfactory segmentation results. The code is available at https://github.com/davidsmithwj/MSR-UNet (DOI: 10.5281/zenodo.13969855).
Affiliations
- Shuai Wang: School of Computer Science and Technology, Huaibei Normal University, Huaibei, China.
- Lei Liu: School of Computer Science and Technology, Huaibei Normal University, Huaibei, China; Huaibei Key Laboratory of Digital Multimedia Intelligent Information Processing, Huaibei, China.
- Jun Wang: College of Electronic and Information Engineering, Hebei University, Baoding, China.
- Xinyue Peng: School of Computer Science and Technology, Huaibei Normal University, Huaibei, China.
- Baosen Liu: Huaibei People's Hospital, Huaibei, China.
13. Chen X, Pang Y, Yap PT, Lian J. Multi-scale anatomical regularization for domain-adaptive segmentation of pelvic CBCT images. Med Phys 2024;51:8804-8813. [PMID: 39225652] [PMCID: PMC11672636] [DOI: 10.1002/mp.17378]
Abstract
BACKGROUND: Cone beam computed tomography (CBCT) image segmentation is crucial in prostate cancer radiotherapy, enabling precise delineation of the prostate gland for accurate treatment planning and delivery. However, the poor quality of CBCT images poses challenges in clinical practice, making annotation difficult due to factors such as image noise, low contrast, and organ deformation.
PURPOSE: The objective of this study is to create a segmentation model for the label-free target domain (CBCT), leveraging valuable insights derived from the label-rich source domain (CT). This goal is achieved by addressing the domain gap across diverse domains through a cross-modality medical image segmentation framework.
METHODS: Our approach introduces a multi-scale domain adaptive segmentation method, performing domain adaptation simultaneously at both the image and feature levels. The primary innovation lies in a novel multi-scale anatomical regularization approach, which (i) aligns the target domain feature space with the source domain feature space at multiple spatial scales simultaneously, and (ii) exchanges information across different scales to fuse knowledge from multi-scale perspectives.
RESULTS: Quantitative and qualitative experiments were conducted on pelvic CBCT segmentation tasks. The training dataset comprises 40 unpaired CBCT-CT images with only the CT images annotated. The validation and testing datasets consist of 5 and 10 CT images, respectively, all with annotations. The experimental results demonstrate the superior performance of our method compared to other state-of-the-art cross-modality medical image segmentation methods. The Dice similarity coefficient (DSC) for CBCT image segmentation is 74.6% ± 9.3%, and the average symmetric surface distance (ASSD) is 3.9 ± 1.8 mm. Statistical analysis confirms the statistical significance of the improvements achieved by our method.
CONCLUSIONS: Our method exhibits superiority in pelvic CBCT image segmentation compared to its counterparts.
Affiliations
- Xu Chen: College of Computer Science and Technology, Huaqiao University, Xiamen, Fujian, China; Key Laboratory of Computer Vision and Machine Learning (Huaqiao University), Fujian Province University, Xiamen, Fujian, China; Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen, Fujian, China.
- Yunkui Pang: Department of Computer Science, University of North Carolina, Chapel Hill, North Carolina, USA.
- Pew-Thian Yap: Department of Radiology, University of North Carolina, Chapel Hill, North Carolina, USA; Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, North Carolina, USA.
- Jun Lian: Department of Radiation Oncology, University of North Carolina, Chapel Hill, North Carolina, USA.
14. Tang Y, Lyu T, Jin H, Du Q, Wang J, Li Y, Li M, Chen Y, Zheng J. Domain adaptive noise reduction with iterative knowledge transfer and style generalization learning. Med Image Anal 2024;98:103327. [PMID: 39191093] [DOI: 10.1016/j.media.2024.103327]
Abstract
Low-dose computed tomography (LDCT) denoising faces significant challenges in practical imaging scenarios. Supervised methods encounter difficulties in real-world settings because no paired data are available for training; moreover, when applied to datasets with varying noise patterns, they may experience decreased performance owing to the domain gap. Conversely, unsupervised methods do not require paired data and can be trained directly on real-world data, but they often exhibit inferior performance compared to supervised methods. To address this issue, it is necessary to leverage the strengths of both. In this paper, we propose a novel domain adaptive noise reduction framework (DANRF), which integrates both knowledge transfer and style generalization learning to effectively tackle the domain gap problem. Specifically, an iterative knowledge transfer method with knowledge distillation is used to train the target model with unlabeled target data and a source model pre-trained on paired simulation data. Meanwhile, we introduce the mean teacher mechanism to update the source model, enabling it to adapt to the target domain. Furthermore, an iterative style generalization learning process is designed to enrich the style diversity of the training dataset. We evaluate the performance of our approach through experiments conducted on multi-source datasets. The results demonstrate the feasibility and effectiveness of our proposed DANRF model in multi-source LDCT image processing tasks. Given its hybrid nature, which combines the advantages of supervised and unsupervised learning, and its ability to bridge domain gaps, our approach is well suited for improving practical low-dose CT imaging in clinical settings. Code for our proposed approach is publicly available at https://github.com/tyfeiii/DANRF.
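The mean-teacher mechanism mentioned above is a standard exponential moving average of student weights; a minimal sketch (the momentum value is illustrative):

```python
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, momentum: float = 0.999):
    """Mean teacher: teacher weights track an exponential moving average of the student's."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(momentum).add_(s, alpha=1.0 - momentum)
    # Buffers such as BatchNorm running stats are often copied from the student as well.
```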
Affiliations
- Yufei Tang: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Tianling Lyu: Research Center of Augmented Intelligence, Zhejiang Lab, Hangzhou, 310000, China.
- Haoyang Jin: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Qiang Du: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Jiping Wang: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Yunxiang Li: Nanovision Technology Co., Ltd., Beiqing Road, Haidian District, Beijing, 100094, China.
- Ming Li: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Yang Chen: Laboratory of Image Science and Technology, School of Computer Science and Engineering, Southeast University, Nanjing, 210096, China.
- Jian Zheng: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China; Shandong Laboratory of Advanced Biomaterials and Medical Devices in Weihai, Weihai, 264200, China.
15. Kang B, Nam H, Kang M, Heo KS, Lim M, Oh JH, Kam TE. Target-aware cross-modality unsupervised domain adaptation for vestibular schwannoma and cochlea segmentation. Sci Rep 2024;14:27883. [PMID: 39537681] [PMCID: PMC11561345] [DOI: 10.1038/s41598-024-77633-x]
Abstract
There is growing interest in segmentation of the vestibular schwannoma (VS) and cochlea from high-resolution T2 (hrT2) imaging rather than contrast-enhanced T1 (ceT1) imaging, due to the side effects of contrast agents. However, hrT2 imaging suffers from a shortage of annotated data, which hinders the building of more robust segmentation models. To address the issue, recent studies have adopted unsupervised domain adaptation approaches that translate ceT1 images to hrT2 images. However, previous studies did not consider the size and visual characteristics of the target objects, such as the VS and cochlea, during image translation. Specifically, those works simply performed normalization on the entire image without considering its significant impact on the quality of the translated images. Such approaches tend to erase small target objects, making it difficult to preserve their structure when generating pseudo-target images, and may also struggle to accurately reflect the unique style of the target objects within the images. Therefore, we propose a target-aware unsupervised domain adaptation framework, designed to translate target objects, each tailored to its unique visual characteristics and size, using target-aware normalization. We demonstrate the superiority of the proposed framework on a publicly available challenge dataset. Code is available at https://github.com/Bokyeong-Kang/TANQ.
Affiliations
- Bogyeong Kang: Department of Artificial Intelligence, Korea University, Seoul, South Korea.
- Hyeonyeong Nam: Department of Artificial Intelligence, Korea University, Seoul, South Korea.
- Myeongkyun Kang: Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu, South Korea.
- Keun-Soo Heo: Department of Artificial Intelligence, Korea University, Seoul, South Korea.
- Minjoo Lim: Department of Artificial Intelligence, Korea University, Seoul, South Korea.
- Ji-Hye Oh: Department of Artificial Intelligence, Korea University, Seoul, South Korea.
- Tae-Eui Kam: Department of Artificial Intelligence, Korea University, Seoul, South Korea.
16. Han T, Ai D, Fan J, Song H, Xiao D, Wang Y, Yang J. Cross-Anatomy Transfer Learning via Shape-Aware Adaptive Fine-Tuning for 3D Vessel Segmentation. IEEE J Biomed Health Inform 2024;28:6064-6077. [PMID: 38954568] [DOI: 10.1109/jbhi.2024.3422177]
Abstract
Deep learning methods have recently achieved remarkable performance in vessel segmentation applications, yet they require large amounts of labeled data that are labor-intensive to produce. To alleviate the requirement for manual annotation, transfer learning can potentially be used to acquire knowledge of tubular structures from large public labeled vessel datasets for target vessel segmentation in other anatomic sites of the human body. However, cross-anatomy domain shift is challenging due to the formidable discrepancy among vessel structures in different anatomies, which limits the performance of transfer learning. Therefore, we propose a cross-anatomy transfer learning framework for 3D vessel segmentation, which first generates a pre-trained model on a public hepatic vessel dataset and then adaptively fine-tunes a target segmentation network initialized from that model for the segmentation of other anatomic vessels. In the framework, an adaptive fine-tuning strategy uses a proxy network to dynamically decide, for each input sample, which filters of the target network to freeze or fine-tune. Moreover, we develop a Gaussian-based signed distance map that explicitly encodes vessel-specific shape context; predicting this map is added as an auxiliary task in the segmentation network to capture geometry-aware knowledge during fine-tuning. We demonstrate the effectiveness of our method through extensive experiments on two small-scale datasets of coronary arteries and brain vessels. The results indicate that the proposed method effectively overcomes the cross-anatomy domain shift and achieves accurate vessel segmentation on these two datasets.
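A Gaussian-based signed distance map can be built from the usual Euclidean distance transform. The sketch below shows one plausible form (sign from inside/outside, magnitude squashed by a Gaussian band around the boundary); the paper's exact formula is not given in the abstract, and the `sigma` value is illustrative.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def gaussian_sdm(mask: np.ndarray, sigma: float = 5.0) -> np.ndarray:
    """Signed distance to the vessel surface, attenuated by a Gaussian.

    Positive inside the vessel, negative outside; the Gaussian concentrates the
    map's energy in a band around the boundary, which suits thin structures.
    """
    mask = mask.astype(bool)
    dist_outside = distance_transform_edt(~mask)  # distance to the vessel, measured outside
    dist_inside = distance_transform_edt(mask)    # distance to the background, measured inside
    sdm = dist_inside - dist_outside              # > 0 inside, < 0 outside
    return np.sign(sdm) * np.exp(-(sdm ** 2) / (2.0 * sigma ** 2))
```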
17. Zheng B, Zhang R, Diao S, Zhu J, Yuan Y, Cai J, Shao L, Li S, Qin W. Dual domain distribution disruption with semantics preservation: Unsupervised domain adaptation for medical image segmentation. Med Image Anal 2024;97:103275. [PMID: 39032395] [DOI: 10.1016/j.media.2024.103275]
Abstract
Recent unsupervised domain adaptation (UDA) methods in medical image segmentation commonly utilize Generative Adversarial Networks (GANs) for domain translation. However, the translated images often deviate from the ideal distribution due to the inherent instability of GANs, leading to challenges such as visual inconsistency and incorrect style, and consequently causing the segmentation model to fall into a fixed, erroneous pattern. To address this problem, we propose a novel UDA framework known as Dual Domain Distribution Disruption with Semantics Preservation (DDSP). Departing from the idea of generating images conforming to the target domain distribution in GAN-based UDA methods, we make the model domain-agnostic and focus on anatomical structural information by leveraging semantic information as constraints, guiding the model to adapt to images with disrupted distributions in both the source and target domains. Furthermore, we introduce inter-channel similarity feature alignment based on domain-invariant structural prior information, which helps the shared pixel-wise classifier achieve robust performance on target domain features by aligning the source and target domain features across channels. Our method significantly outperforms existing state-of-the-art UDA methods on three public datasets (a heart dataset, a brain dataset, and a prostate dataset). The code is available at https://github.com/MIXAILAB/DDSPSeg.
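Inter-channel similarity feature alignment can be sketched as matching channel-by-channel similarity (Gram-like) matrices between source and target features. The following shows that generic idea, with the L1 distance as an assumed choice rather than the paper's.

```python
import torch
import torch.nn.functional as F

def channel_similarity(feat: torch.Tensor) -> torch.Tensor:
    """Inter-channel similarity matrix (B, C, C) of a feature map (B, C, H, W)."""
    v = F.normalize(feat.flatten(2), dim=2)  # (B, C, H*W), unit norm per channel
    return v @ v.transpose(1, 2)             # cosine similarity between channels

def channel_alignment_loss(source_feat: torch.Tensor, target_feat: torch.Tensor) -> torch.Tensor:
    """Align source and target features across channels via their similarity matrices."""
    return F.l1_loss(channel_similarity(source_feat), channel_similarity(target_feat))
```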
Affiliations
- Boyun Zheng: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China.
- Ranran Zhang: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
- Songhui Diao: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China.
- Jingke Zhu: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China.
- Yixuan Yuan: Department of Electronic Engineering, The Chinese University of Hong Kong, 999077, Hong Kong, China.
- Jing Cai: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, 999077, Hong Kong, China.
- Liang Shao: Department of Cardiology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang 330013, China.
- Shuo Li: Department of Biomedical Engineering, Department of Computer and Data Science, Case Western Reserve University, Cleveland, United States.
- Wenjian Qin: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
18. Chen L, Bian Y, Zeng J, Meng Q, Zhu W, Shi F, Shao C, Chen X, Xiang D. Style Consistency Unsupervised Domain Adaptation Medical Image Segmentation. IEEE Trans Image Process 2024;33:4882-4895. [PMID: 39236126] [DOI: 10.1109/tip.2024.3451934]
Abstract
Unsupervised domain adaptation medical image segmentation aims to segment unlabeled target domain images with labeled source domain images. However, different medical imaging modalities lead to a large domain shift between their images, so that well-trained models from one imaging modality often fail to segment images from another. In this paper, to mitigate the domain shift between the source and target domains, a style consistency unsupervised domain adaptation image segmentation method is proposed. First, a local phase-enhanced style fusion method is designed to mitigate domain shift and produce locally enhanced organs of interest. Second, a phase consistency discriminator is constructed to distinguish the phase consistency of domain-invariant features between the source and target domains, so as to enhance the disentanglement of the domain-invariant and style encoders and the removal of domain-specific features from the domain-invariant encoder. Third, a style consistency estimation method is proposed to obtain inconsistency maps from intermediate synthesized target domain images with different styles, measuring the difficult regions, mitigating the domain shift between synthesized and real target domain images, and improving the integrity of the organs of interest. Fourth, a style consistency entropy is defined for target domain images to further improve the integrity of the organs of interest by concentrating on the inconsistent regions. Comprehensive experiments have been performed on an in-house dataset and a publicly available dataset. The experimental results demonstrate the superiority of our framework over state-of-the-art methods.
19. Chen Y, Gao Y, Zhu L, Shao W, Lu Y, Han H, Xie Z. PCNet: Prior Category Network for CT Universal Segmentation Model. IEEE Trans Med Imaging 2024;43:3319-3330. [PMID: 38687654] [DOI: 10.1109/tmi.2024.3395349]
Abstract
Accurate segmentation of anatomical structures in computed tomography (CT) images is crucial for clinical diagnosis, treatment planning, and disease monitoring. Current deep learning segmentation methods are hindered by factors such as data scale and model size. Inspired by how doctors identify tissues, we propose a novel approach, the Prior Category Network (PCNet), that boosts segmentation performance by leveraging prior knowledge between different categories of anatomical structures. PCNet comprises three key components: a prior category prompt (PCP), a hierarchy category system (HCS), and a hierarchy category loss (HCL). The PCP utilizes Contrastive Language-Image Pretraining (CLIP), along with attention modules, to systematically define the relationships between anatomical categories as identified by clinicians. The HCS guides the segmentation model in distinguishing between specific organs, anatomical structures, and functional systems through hierarchical relationships. The HCL serves as a consistency constraint, fortifying the directional guidance provided by the HCS to enhance the segmentation model's accuracy and robustness. We conducted extensive experiments to validate the effectiveness of our approach, and the results indicate that PCNet can generate a high-performance, universal model for CT segmentation. The PCNet framework also demonstrates significant transferability on multiple downstream tasks. Ablation experiments show that the methodology employed in constructing the HCS is of critical importance. The prompt and HCS can be accessed at https://github.com/PKU-MIPET/PCNet.
Collapse
|
20
|
Hu J, Yang Y, Guo X, Ma T, Wang J. A Chebyshev Confidence Guided Source-Free Domain Adaptation Framework for Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:5473-5486. [PMID: 38809721 DOI: 10.1109/jbhi.2024.3406906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/31/2024]
Abstract
Source-free domain adaptation (SFDA) aims to adapt models trained on a labeled source domain to an unlabeled target domain without access to source data. In medical imaging scenarios, the practical significance of SFDA methods has been emphasized due to data heterogeneity and privacy concerns. Recent state-of-the-art SFDA methods primarily rely on self-training based on pseudo-labels (PLs). Unfortunately, the accuracy of PLs may deteriorate due to domain shift, thus limiting the effectiveness of the adaptation process. To address this issue, we propose a Chebyshev confidence guided SFDA framework to accurately assess the reliability of PLs and generate self-improving PLs for self-training. The Chebyshev confidence is estimated by calculating the probability lower bound of PL confidence, given the prediction and the corresponding uncertainty. Leveraging the Chebyshev confidence, we introduce two confidence-guided denoising methods: direct denoising and prototypical denoising. Additionally, we propose a novel teacher-student joint training scheme (TJTS) that incorporates a confidence weighting module to iteratively improve PLs' accuracy. The TJTS, in collaboration with the denoising methods, effectively prevents the propagation of noise and enhances the accuracy of PLs. Extensive experiments in diverse domain scenarios validate the effectiveness of our proposed framework and establish its superiority over state-of-the-art SFDA methods. Our paper contributes to the field of SFDA by providing a novel approach for precisely estimating the reliability of PLs and a framework for obtaining high-quality PLs, resulting in improved adaptation performance.
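The probability lower bound underlying the Chebyshev confidence can be illustrated with the one-sided Chebyshev (Cantelli) inequality: given a pseudo-label's mean predicted probability and a variance-style uncertainty, the true confidence exceeds a computable threshold with probability at least 1 - delta. The following is a generic sketch of that idea, not the authors' exact estimator; mean_prob and var are assumed to come from, e.g., Monte Carlo dropout statistics.

```python
import numpy as np

def chebyshev_lower_bound(mean_prob, var, delta=0.05):
    """One-sided Chebyshev (Cantelli) lower bound on pseudo-label confidence.

    With probability at least 1 - delta, the true confidence exceeds
    mean_prob - sqrt(var * (1 - delta) / delta)."""
    lb = mean_prob - np.sqrt(var * (1.0 - delta) / delta)
    return np.clip(lb, 0.0, 1.0)

# Pixels whose bound falls below a threshold can be routed to denoising
# (e.g., prototypical relabeling) instead of direct self-training.
reliable = chebyshev_lower_bound(0.92, 0.002) > 0.6  # True for this example
```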
Collapse
|
21
|
Li H, Liu H, von Busch H, Grimm R, Huisman H, Tong A, Winkel D, Penzkofer T, Shabunin I, Choi MH, Yang Q, Szolar D, Shea S, Coakley F, Harisinghani M, Oguz I, Comaniciu D, Kamen A, Lou B. Deep Learning-based Unsupervised Domain Adaptation via a Unified Model for Prostate Lesion Detection Using Multisite Biparametric MRI Datasets. Radiol Artif Intell 2024; 6:e230521. [PMID: 39166972 PMCID: PMC11449150 DOI: 10.1148/ryai.230521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/23/2024]
Abstract
Purpose To determine whether the unsupervised domain adaptation (UDA) method with generated images improves the performance of a supervised learning (SL) model for prostate cancer (PCa) detection using multisite biparametric (bp) MRI datasets. Materials and Methods This retrospective study included data from 5150 patients (14 191 samples) collected across nine different imaging centers. A novel UDA method using a unified generative model was developed for PCa detection using multisite bpMRI datasets. This method translates diffusion-weighted imaging (DWI) acquisitions, including apparent diffusion coefficient (ADC) and individual diffusion-weighted (DW) images acquired using various b values, to align with the style of images acquired using b values recommended by Prostate Imaging Reporting and Data System (PI-RADS) guidelines. The generated ADC and DW images replace the original images for PCa detection. An independent set of 1692 test cases (2393 samples) was used for evaluation. The area under the receiver operating characteristic curve (AUC) was used as the primary metric, and statistical analysis was performed via bootstrapping. Results For all test cases, the AUC values for baseline SL and UDA methods were 0.73 and 0.79 (P < .001), respectively, for PCa lesions with PI-RADS score of 3 or greater and 0.77 and 0.80 (P < .001) for lesions with PI-RADS scores of 4 or greater. In the 361 test cases under the most unfavorable image acquisition setting, the AUC values for baseline SL and UDA were 0.49 and 0.76 (P < .001) for lesions with PI-RADS scores of 3 or greater and 0.50 and 0.77 (P < .001) for lesions with PI-RADS scores of 4 or greater. Conclusion UDA with generated images improved the performance of SL methods in PCa lesion detection across multisite datasets with various b values, especially for images acquired with significant deviations from the PI-RADS-recommended DWI protocol (eg, with an extremely high b value). Keywords: Prostate Cancer Detection, Multisite, Unsupervised Domain Adaptation, Diffusion-weighted Imaging, b Value Supplemental material is available for this article. © RSNA, 2024.
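For intuition on the b-value translation, diffusion signal follows the standard mono-exponential model S(b) = S(0) exp(-b * ADC), so a DW image at a PI-RADS-recommended high b value can be extrapolated from the ADC map and the b = 0 signal. The sketch below shows this physics-based baseline only; the study itself learns the translation with a unified generative model.

```python
import numpy as np

def synthesize_dwi(adc, s0, b_target=1400.0):
    """Extrapolate a DW image at b_target (s/mm^2) from an ADC map (mm^2/s)
    and the b=0 signal via the mono-exponential model S(b) = S0 * exp(-b * ADC).
    Model-based baseline; the paper instead learns this mapping with a GAN."""
    return s0 * np.exp(-b_target * adc)

# Tumors (low ADC) retain signal at high b while benign tissue decays,
# which is why high-b DWI is useful for lesion conspicuity.
```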
Collapse
Affiliation(s)
- Hao Li, Han Liu, Heinrich von Busch, Robert Grimm, Henkjan Huisman, Angela Tong, David Winkel, Tobias Penzkofer, Ivan Shabunin, Moon Hyung Choi, Qingsong Yang, Dieter Szolar, Steven Shea, Fergus Coakley, Mukesh Harisinghani, Ipek Oguz, Dorin Comaniciu, Ali Kamen, Bin Lou
- From Digital Technology and Innovation, Siemens Healthineers, 755 College Rd E, Princeton, NJ 08540 (H. Li, H. Liu, D.C., A.K., B.L.); Diagnostic Imaging, Siemens Healthineers, Erlangen, Bavaria, Germany (H.v.B., R.G.); Vanderbilt University, Nashville, Tenn (H. Li, H. Liu, I.O.); Radboud University Medical Center, Nijmegen, the Netherlands (H.H.); New York University, New York, NY (A.T.); Universitätsspital Basel, Basel, Switzerland (D.W.); Charité, Universitätsmedizin Berlin, Berlin, Germany (T.P.); Patero Clinic, Moscow, Russia (I.S.); Eunpyeong St. Mary's Hospital, Catholic University of Korea, Seoul, Republic of Korea (M.H.C.); Department of Radiology, Changhai Hospital of Shanghai, Shanghai, China (Q.Y.); Diagnostikum Graz Süd-West, Graz, Austria (D.S.); Department of Radiology, Loyola University Medical Center, Maywood, Ill (S.S.); Department of Diagnostic Radiology, Oregon Health and Science University School of Medicine, Portland, Ore (F.C.); and Massachusetts General Hospital, Boston, Mass (M.H.)
| |
Collapse
|
22
|
Wu J, Guo D, Wang G, Yue Q, Yu H, Li K, Zhang S. FPL+: Filtered Pseudo Label-Based Unsupervised Cross-Modality Adaptation for 3D Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:3098-3109. [PMID: 38602852 DOI: 10.1109/tmi.2024.3387415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/13/2024]
Abstract
Adapting a medical image segmentation model to a new domain is important for improving its cross-domain transferability, and because the annotation process is expensive, Unsupervised Domain Adaptation (UDA), which needs only unlabeled images for the adaptation, is appealing. Existing UDA methods are mainly based on image or feature alignment with adversarial training for regularization, and they are limited by insufficient supervision in the target domain. In this paper, we propose an enhanced Filtered Pseudo Label (FPL+)-based UDA method for 3D medical image segmentation. It first uses cross-domain data augmentation to translate labeled images in the source domain to a dual-domain training set consisting of a pseudo source-domain set and a pseudo target-domain set. To leverage the dual-domain augmented images to train a pseudo label generator, domain-specific batch normalization layers are used to deal with the domain shift while learning the domain-invariant structure features, generating high-quality pseudo labels for target-domain images. We then combine labeled source-domain images and target-domain images with pseudo labels to train a final segmentor, where image-level weighting based on uncertainty estimation and pixel-level weighting based on dual-domain consensus are proposed to mitigate the adverse effect of noisy pseudo labels. Experiments on three public multi-modal datasets for Vestibular Schwannoma, brain tumor and whole heart segmentation show that our method surpassed ten state-of-the-art UDA methods, and it even achieved better results than fully supervised learning in the target domain in some cases.
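The final-segmentor training step combines source labels with weighted target pseudo labels. The sketch below shows one plausible weighting of a pixelwise cross-entropy by an image-level uncertainty score and a pixel-level dual-domain consensus mask; the tensor names and the exact weighting rule are assumptions, not FPL+'s published formulas.

```python
import torch
import torch.nn.functional as F

def weighted_pseudo_label_loss(logits, pseudo_labels, img_uncertainty, consensus):
    """Cross-entropy on pseudo labels with image-level weighting from an
    uncertainty estimate and pixel-level weighting from dual-domain consensus.

    logits:          (B, C, H, W) segmentor output on target images
    pseudo_labels:   (B, H, W)    labels from the pseudo-label generator
    img_uncertainty: (B,)         per-image uncertainty in [0, 1]
    consensus:       (B, H, W)    1 where dual-domain predictions agree, else 0
    """
    ce = F.cross_entropy(logits, pseudo_labels, reduction="none")  # (B, H, W)
    pixel_w = consensus                                            # drop disputed pixels
    image_w = (1.0 - img_uncertainty).view(-1, 1, 1)               # trust confident images
    return (image_w * pixel_w * ce).mean()
```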
Collapse
|
23
|
Wang R, Zheng G. PFMNet: Prototype-based feature mapping network for few-shot domain adaptation in medical image segmentation. Comput Med Imaging Graph 2024; 116:102406. [PMID: 38824715 DOI: 10.1016/j.compmedimag.2024.102406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2024] [Revised: 05/23/2024] [Accepted: 05/24/2024] [Indexed: 06/04/2024]
Abstract
Lack of data is one of the biggest hurdles for rare disease research using deep learning. Due to the lack of rare-disease images and annotations, training a robust network for automatic rare-disease image segmentation is very challenging. To address this challenge, few-shot domain adaptation (FSDA) has emerged as a practical research direction, aiming to leverage a limited number of annotated images from a target domain to facilitate adaptation of models trained on large datasets from a source domain. In this paper, we present a novel prototype-based feature mapping network (PFMNet) designed for FSDA in medical image segmentation. PFMNet adopts an encoder-decoder structure for segmentation, with the prototype-based feature mapping (PFM) module positioned at the bottom of the encoder-decoder structure. The PFM module transforms high-level features from the target domain into source domain-like features that are more easily comprehensible by the decoder. By leveraging these source domain-like features, the decoder can effectively perform few-shot segmentation in the target domain and generate accurate segmentation masks. We evaluate the performance of PFMNet through experiments on three typical yet challenging few-shot medical image segmentation tasks: cross-center optic disc/cup segmentation, cross-center polyp segmentation, and cross-modality cardiac structure segmentation. We consider four different settings: 5-shot, 10-shot, 15-shot, and 20-shot. The experimental results substantiate the efficacy of our proposed approach for few-shot domain adaptation in medical image segmentation.
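A prototype-based mapping of the kind the PFM module performs can be sketched as follows: target bottleneck features are re-expressed as softmax-weighted combinations of prototypes computed from source features, yielding source domain-like features for the decoder. The prototype construction and the temperature below are illustrative assumptions, not PFMNet's exact module.

```python
import torch
import torch.nn.functional as F

def prototype_feature_mapping(target_feats, source_prototypes, tau=0.1):
    """Map target-domain features toward source-like features.

    target_feats:      (N, D) flattened bottleneck features of a target image
    source_prototypes: (K, D) class/cluster prototypes computed on source features
    Returns (N, D) reconstructions lying in the span of the source prototypes.
    """
    t = F.normalize(target_feats, dim=1)
    p = F.normalize(source_prototypes, dim=1)
    sim = (t @ p.t()) / tau               # (N, K) similarity to each prototype
    weights = sim.softmax(dim=1)          # soft assignment to source prototypes
    return weights @ source_prototypes    # (N, D) source domain-like features
```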
Collapse
Affiliation(s)
- Runze Wang
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
| | - Guoyan Zheng
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China.
| |
Collapse
|
24
|
Jiang X, Yang Y, Su T, Xiao K, Lu L, Wang W, Guo C, Shao L, Wang M, Jiang D. Unsupervised domain adaptation based on feature and edge alignment for femur X-ray image segmentation. Comput Med Imaging Graph 2024; 116:102407. [PMID: 38880065 DOI: 10.1016/j.compmedimag.2024.102407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2024] [Revised: 05/24/2024] [Accepted: 05/24/2024] [Indexed: 06/18/2024]
Abstract
The gold standard for diagnosing osteoporosis is bone mineral density (BMD) measurement by dual-energy X-ray absorptiometry (DXA). However, various factors during the imaging process cause domain shifts in DXA images, which lead to incorrect bone segmentation. Research shows that poor bone segmentation is one of the primary reasons for inaccurate BMD measurement, severely affecting the diagnosis and treatment planning for osteoporosis. In this paper, we propose a Multi-feature Joint Discriminative Domain Adaptation (MDDA) framework to improve segmentation performance and the generalization of the network on domain-shifted images. The proposed method learns domain-invariant features between the source and target domains from the perspectives of multi-scale features and edges, and is evaluated on real data from multi-center datasets. Compared to other state-of-the-art methods, the source-domain feature prior and the edge prior enable the proposed MDDA to achieve the optimal domain adaptation performance and generalization. It also demonstrates superior performance in domain adaptation tasks on small datasets, even when using only 5 or 10 images. In this study, MDDA provides an accurate bone segmentation tool for BMD measurement based on DXA imaging.
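Edge-based alignment can be pictured with fixed Sobel filters: predictions (or images) from both domains are reduced to edge maps whose statistics are matched. The sketch below uses a simple L1 penalty as a stand-in; MDDA's actual discriminator-based edge alignment differs.

```python
import torch
import torch.nn.functional as F

def sobel_edges(img):
    """Edge magnitude via fixed Sobel kernels; img is (B, 1, H, W)."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(img, kx.to(img), padding=1)
    gy = F.conv2d(img, ky.to(img), padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def edge_alignment_loss(pred_src, pred_tgt):
    """Encourage similar edge statistics between source and target predictions.
    Simple L1 on mean edge responses; an adversarial edge discriminator, as in
    the paper, would replace this term."""
    return (sobel_edges(pred_src).mean() - sobel_edges(pred_tgt).mean()).abs()
```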
Collapse
Affiliation(s)
- Xiaoming Jiang
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Post and Telecommunications, Chongqing, China
| | - Yongxin Yang
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Post and Telecommunications, Chongqing, China
| | - Tong Su
- Department of Sports Medicine, Peking University Third Hospital, Institute of Sports Medicine of Peking University, Beijing Key Laboratory of Sports Injuries, Engineering Research Center of Sports Trauma Treatment Technology and Devices, Ministry of Education, No. 49 North Garden Road, Beijing, China
| | - Kai Xiao
- Department of Foot and Ankle Surgery, Wuhan Fourth Hospital, Wuhan, Hubei, China
| | - LiDan Lu
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Post and Telecommunications, Chongqing, China
| | - Wei Wang
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Post and Telecommunications, Chongqing, China
| | - Changsong Guo
- National Health Commission Capacity Building and Continuing Education Center, Beijing, China
| | - Lizhi Shao
- Chinese Academy of Sciences Key Laboratory of Molecular Imaging, Institute of Automation, Beijing 100190, China.
| | - Mingjing Wang
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou 325000, China.
| | - Dong Jiang
- Department of Sports Medicine, Peking University Third Hospital, Institute of Sports Medicine of Peking University, Beijing Key Laboratory of Sports Injuries, Engineering Research Center of Sports Trauma Treatment Technology and Devices, Ministry of Education, No. 49 North Garden Road, Beijing, China.
| |
Collapse
|
25
|
Abbasi S, Lan H, Choupan J, Sheikh-Bahaei N, Pandey G, Varghese B. Deep learning for the harmonization of structural MRI scans: a survey. Biomed Eng Online 2024; 23:90. [PMID: 39217355 PMCID: PMC11365220 DOI: 10.1186/s12938-024-01280-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2024] [Accepted: 08/06/2024] [Indexed: 09/04/2024] Open
Abstract
Medical imaging datasets for research are frequently collected from multiple imaging centers using different scanners, protocols, and settings. These variations affect data consistency and compatibility across different sources. Image harmonization is a critical step to mitigate the effects of factors like inherent differences between various vendors, hardware upgrades, protocol changes, and scanner calibration drift, as well as to ensure consistent data for medical image processing techniques. Given the critical importance and widespread relevance of this issue, a vast array of image harmonization methodologies has emerged, with deep learning-based approaches driving substantial advancements in recent times. The goal of this review paper is to examine the latest deep learning techniques employed for image harmonization by analyzing cutting-edge architectural approaches in the field of medical image harmonization, evaluating both their strengths and limitations. This paper begins by providing a comprehensive overview of the fundamentals of image harmonization, covering three critical aspects: established imaging datasets, commonly used evaluation metrics, and characteristics of different scanners. Subsequently, this paper analyzes recent structural MRI (Magnetic Resonance Imaging) harmonization techniques based on network architecture, network learning algorithm, network supervision strategy, and network output. The underlying architectures include U-Net, Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), flow-based generative models, transformer-based approaches, as well as custom-designed network architectures. This paper investigates the effectiveness of Disentangled Representation Learning (DRL) as a pivotal learning algorithm in harmonization. Lastly, the review highlights the primary limitations of harmonization techniques, specifically the lack of comprehensive quantitative comparisons across different methods. The overall aim of this review is to serve as a guide for researchers and practitioners to select appropriate architectures based on their specific conditions and requirements. It also aims to foster discussions around ongoing challenges in the field and shed light on promising future research directions with the potential for significant advancements.
Collapse
Affiliation(s)
- Soolmaz Abbasi
- Department of Computer Engineering, Yazd University, Yazd, Iran
| | - Haoyu Lan
- Department of Neurology, University of Southern California, Los Angeles, CA, USA
| | - Jeiran Choupan
- Department of Neurology, University of Southern California, Los Angeles, CA, USA
| | - Nasim Sheikh-Bahaei
- Department of Radiology, University of Southern California, Los Angeles, CA, USA
| | - Gaurav Pandey
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Bino Varghese
- Department of Radiology, University of Southern California, Los Angeles, CA, USA.
| |
Collapse
|
26
|
Yang M, Wu Z, Zheng H, Huang L, Ding W, Pan L, Yin L. Cross-Modality Medical Image Segmentation via Enhanced Feature Alignment and Cross Pseudo Supervision Learning. Diagnostics (Basel) 2024; 14:1751. [PMID: 39202240 PMCID: PMC11353479 DOI: 10.3390/diagnostics14161751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2024] [Revised: 08/08/2024] [Accepted: 08/10/2024] [Indexed: 09/03/2024] Open
Abstract
Given the diversity of medical images, traditional image segmentation models face the issue of domain shift. Unsupervised domain adaptation (UDA) methods have emerged as a pivotal strategy for cross-modality analysis. These methods typically utilize generative adversarial networks (GANs) for both image-level and feature-level domain adaptation through the transformation and reconstruction of images, assuming the features between domains are well-aligned. However, this assumption falters with significant gaps between different medical image modalities, such as MRI and CT. These gaps hinder the effective training of segmentation networks with cross-modality images and can lead to misleading training guidance and instability. To address these challenges, this paper introduces a novel approach comprising a cross-modality feature alignment sub-network and a cross-pseudo-supervised dual-stream segmentation sub-network. These components work together to bridge domain discrepancies more effectively and ensure a stable training environment. The feature alignment sub-network is designed for the bidirectional alignment of features between the source and target domains, incorporating a self-attention module to aid in learning structurally consistent and relevant information. The segmentation sub-network leverages an enhanced cross-pseudo-supervised loss to harmonize the output of the two segmentation networks, assessing pseudo-distances between domains to improve the pseudo-label quality and thus enhancing the overall learning efficiency of the framework. This method's success is demonstrated by notable advancements in segmentation precision across target domains for abdomen and brain tasks.
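The cross-pseudo-supervision idea in the segmentation sub-network follows the standard CPS pattern: two parallel networks exchange hard pseudo labels as training targets. A minimal sketch of that pattern is below; the paper's pseudo-distance-based weighting of this loss is omitted.

```python
import torch
import torch.nn.functional as F

def cross_pseudo_supervision_loss(logits_a, logits_b):
    """Cross pseudo supervision between two parallel segmentation networks:
    each network is trained on the other's detached hard pseudo labels.

    logits_a, logits_b: (B, C, H, W) outputs of the two networks on the
    same (unlabeled) batch.
    """
    pl_a = logits_a.argmax(dim=1).detach()   # pseudo labels from network A
    pl_b = logits_b.argmax(dim=1).detach()   # pseudo labels from network B
    return F.cross_entropy(logits_a, pl_b) + F.cross_entropy(logits_b, pl_a)
```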
Collapse
Affiliation(s)
- Mingjing Yang
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
| | - Zhicheng Wu
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
| | - Hanyu Zheng
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
| | - Liqin Huang
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
| | - Wangbin Ding
- School of Medical Imaging, Fujian Medical University, Fuzhou 350122, China;
| | - Lin Pan
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
| | - Lei Yin
- The Departments of Radiology, Shengli Clinical Medical College of Fujian Medical University, Fuzhou 350001, China
- Fujian Provincial Hospital, Fuzhou 350001, China
- Fuzhou University Affiliated Provincial Hospital, Fuzhou 350001, China
| |
Collapse
|
27
|
Diao S, Yin Z, Chen X, Li M, Zhu W, Mateen M, Xu X, Shi F, Fan Y. Two-stage adversarial learning based unsupervised domain adaptation for retinal OCT segmentation. Med Phys 2024; 51:5374-5385. [PMID: 38426594 DOI: 10.1002/mp.17012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2023] [Revised: 01/23/2024] [Accepted: 02/20/2024] [Indexed: 03/02/2024] Open
Abstract
BACKGROUND Deep learning based optical coherence tomography (OCT) segmentation methods have achieved excellent results, allowing quantitative analysis of large-scale data. However, OCT images are often acquired by different devices or under different imaging protocols, which leads to a serious domain shift problem. This in turn results in performance degradation of segmentation models. PURPOSE Aiming at the domain shift problem, we propose a two-stage adversarial learning based network (TSANet) that accomplishes unsupervised cross-domain OCT segmentation. METHODS In the first stage, a Fourier transform based approach is adopted to reduce image style differences at the image level. Then, adversarial learning networks, including a segmenter and a discriminator, are designed to achieve inter-domain consistency in the segmentation output. In the second stage, pseudo labels of selected unlabeled target domain training data are used to fine-tune the segmenter, which further improves its generalization capability. The proposed method was tested on cross-domain datasets for choroid or retinoschisis segmentation tasks. For choroid segmentation, the model was trained on 400 images and validated on 100 images from the source domain, then trained on 1320 unlabeled images and tested on 330 images from target domain I, and also trained on 400 unlabeled images and tested on 200 images from target domain II. For retinoschisis segmentation, the model was trained on 1284 images and validated on 312 images from the source domain, then trained on 1024 unlabeled images and tested on 200 images from the target domain. RESULTS The proposed method achieved significantly improved results over training without domain adaptation, with improvements of 8.34%, 55.82%, and 3.53% in intersection over union (IoU), respectively, for the three test sets. The performance is better than that of several state-of-the-art domain adaptation methods. CONCLUSIONS The proposed TSANet, with image-level adaptation, feature-level adaptation, and pseudo-label based fine-tuning, achieved excellent cross-domain generalization. This alleviates the burden of obtaining additional manual labels when adapting the deep learning model to new OCT data.
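The first-stage Fourier-based style reduction can be sketched as a low-frequency amplitude swap (as in Fourier domain adaptation): the source image keeps its phase but inherits the target's low-frequency amplitude spectrum. The band fraction beta below is an assumed hyperparameter, not necessarily TSANet's setting.

```python
import numpy as np

def fourier_style_transfer(src, tgt, beta=0.05):
    """Swap the low-frequency amplitude of a 2D source image with that of a
    target image, keeping the source phase (FDA-style sketch)."""
    fs, ft = np.fft.fft2(src), np.fft.fft2(tgt)
    amp_s, pha_s = np.abs(fs), np.angle(fs)
    amp_t = np.abs(ft)
    amp_s, amp_t = np.fft.fftshift(amp_s), np.fft.fftshift(amp_t)
    h, w = src.shape
    bh, bw = int(h * beta), int(w * beta)
    cy, cx = h // 2, w // 2
    # Replace the central (low-frequency) band of the source amplitude.
    amp_s[cy - bh:cy + bh, cx - bw:cx + bw] = amp_t[cy - bh:cy + bh, cx - bw:cx + bw]
    amp_s = np.fft.ifftshift(amp_s)
    return np.real(np.fft.ifft2(amp_s * np.exp(1j * pha_s)))
```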
Collapse
Affiliation(s)
- Shengyong Diao
- MIPAV Lab, the School of Electronics and Information Engineering, Soochow University, Suzhou, China
| | - Ziting Yin
- MIPAV Lab, the School of Electronics and Information Engineering, Soochow University, Suzhou, China
| | - Xinjian Chen
- MIPAV Lab, the School of Electronics and Information Engineering, Soochow University, Suzhou, China
- The State Key Laboratory of Radiation Medicine and Protection, Soochow University, Suzhou, China
| | - Menghan Li
- Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Weifang Zhu
- MIPAV Lab, the School of Electronics and Information Engineering, Soochow University, Suzhou, China
| | - Muhammad Mateen
- MIPAV Lab, the School of Electronics and Information Engineering, Soochow University, Suzhou, China
| | - Xun Xu
- Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Fei Shi
- MIPAV Lab, the School of Electronics and Information Engineering, Soochow University, Suzhou, China
| | - Ying Fan
- Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| |
Collapse
|
28
|
Stan S, Rostami M. Unsupervised model adaptation for source-free segmentation of medical images. Med Image Anal 2024; 95:103179. [PMID: 38626666 DOI: 10.1016/j.media.2024.103179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2023] [Revised: 04/09/2024] [Accepted: 04/11/2024] [Indexed: 04/18/2024]
Abstract
The recent prevalence of deep neural networks has led semantic segmentation networks to achieve human-level performance in the medical field, provided they are given sufficient training data. However, these networks often fail to generalize when tasked with creating semantic maps for out-of-distribution images, necessitating re-training on the new distributions. This labor-intensive process requires expert knowledge for generating training labels. In the medical field, distribution shifts can occur naturally due to the choice of imaging devices, such as MRI or CT scanners. To mitigate the need for labeling images in a target domain after successful model training in a fully annotated source domain with a different data distribution, unsupervised domain adaptation (UDA) can be employed. Most UDA approaches ensure target generalization by generating a shared source/target latent feature space, allowing a source-trained classifier to maintain performance in the target domain. However, such approaches necessitate joint source and target data access, potentially leading to privacy leaks with respect to patient information. We propose a UDA algorithm for medical image segmentation that does not require access to source data during adaptation, thereby preserving patient data privacy. Our method relies on approximating the source latent features at the time of adaptation and creates a joint source/target embedding space by minimizing a distributional distance metric based on optimal transport. We demonstrate that our approach is competitive with recent UDA medical segmentation works, even with the added requirement of privacy.
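One concrete distributional distance based on optimal transport is the sliced Wasserstein distance, which the sketch below computes between a batch of target features and features sampled from the approximated source distribution. This is a generic illustration of the alignment idea; the paper's exact metric and source-approximation procedure may differ.

```python
import torch

def sliced_wasserstein(feats_a, feats_b, n_proj=64):
    """Sliced Wasserstein distance between two feature sets (N, D) and (M, D):
    average 1D Wasserstein distance over random unit projections.
    Sample counts are aligned by truncation for simplicity."""
    d = feats_a.shape[1]
    proj = torch.randn(d, n_proj, device=feats_a.device)
    proj = proj / proj.norm(dim=0, keepdim=True)
    pa = (feats_a @ proj).sort(dim=0).values   # (N, n_proj) sorted projections
    pb = (feats_b @ proj).sort(dim=0).values   # (M, n_proj)
    n = min(pa.shape[0], pb.shape[0])
    return (pa[:n] - pb[:n]).abs().mean()

# Minimizing this between target features and samples drawn from the stored
# source-feature approximation pulls the two embeddings together without
# touching the source images themselves.
```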
Collapse
Affiliation(s)
- Serban Stan
- University of Southern California, United States of America
| | | |
Collapse
|
29
|
Guo Z, Feng J, Lu W, Yin Y, Yang G, Zhou J. Cross-modality cerebrovascular segmentation based on pseudo-label generation via paired data. Comput Med Imaging Graph 2024; 115:102393. [PMID: 38704993 DOI: 10.1016/j.compmedimag.2024.102393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2024] [Revised: 04/26/2024] [Accepted: 04/26/2024] [Indexed: 05/07/2024]
Abstract
Accurate segmentation of cerebrovascular structures from Computed Tomography Angiography (CTA), Magnetic Resonance Angiography (MRA), and Digital Subtraction Angiography (DSA) is crucial for the clinical diagnosis of cranial vascular diseases. Recent advancements in deep Convolutional Neural Networks (CNNs) have significantly improved the segmentation process. However, training segmentation networks for all modalities requires extensive data labeling for each modality, which is often expensive and time-consuming. To circumvent this limitation, we introduce an approach to train a cross-modality cerebrovascular segmentation network based on paired data from source and target domains. Our approach involves training a universal vessel segmentation network with manually labeled source-domain data, which automatically produces initial labels for target-domain training images. We improve the initial labels of target-domain training images by fusing paired images, and the improved labels are then used to refine the target-domain segmentation network. A series of experimental arrangements is presented to assess the efficacy of our method in various practical application scenarios. The experiments conducted on an MRA-CTA dataset and a DSA-CTA dataset demonstrate that the proposed method is effective for cross-modality cerebrovascular segmentation and achieves state-of-the-art performance.
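The label-refinement step can be pictured as fusing the universal segmenter's probability maps on a co-registered image pair. The elementwise-max rule below is an illustrative choice, assuming the pair is already registered; it is not necessarily the paper's fusion rule.

```python
import torch

def fuse_paired_pseudo_labels(prob_modality_a, prob_modality_b, thr=0.5):
    """Refine target-domain vessel pseudo labels by fusing segmenter
    probabilities on a co-registered pair (e.g., CTA and MRA of one patient).

    prob_modality_a, prob_modality_b: (H, W) or (D, H, W) vessel probabilities.
    Elementwise max keeps vessels visible in either modality."""
    fused = torch.maximum(prob_modality_a, prob_modality_b)
    return (fused > thr).float()
```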
Collapse
Affiliation(s)
- Zhanqiang Guo
- Department of Automation, BNRist, Tsinghua University, Beijing, China
| | - Jianjiang Feng
- Department of Automation, BNRist, Tsinghua University, Beijing, China.
| | - Wangsheng Lu
- UnionStrong (Beijing) Technology Co.Ltd, Beijing, China
| | - Yin Yin
- UnionStrong (Beijing) Technology Co.Ltd, Beijing, China
| | | | - Jie Zhou
- Department of Automation, BNRist, Tsinghua University, Beijing, China
| |
Collapse
|
30
|
Alsaleh AM, Albalawi E, Algosaibi A, Albakheet SS, Khan SB. Few-Shot Learning for Medical Image Segmentation Using 3D U-Net and Model-Agnostic Meta-Learning (MAML). Diagnostics (Basel) 2024; 14:1213. [PMID: 38928629 PMCID: PMC11202447 DOI: 10.3390/diagnostics14121213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2024] [Revised: 05/24/2024] [Accepted: 05/30/2024] [Indexed: 06/28/2024] Open
Abstract
Deep learning has attained state-of-the-art results in general image segmentation problems; however, it requires a substantial number of annotated images to achieve the desired outcomes. In the medical field, the availability of annotated images is often limited. To address this challenge, few-shot learning techniques have been successfully adapted to rapidly generalize to new tasks with only a few samples, leveraging prior knowledge. In this paper, we employ a gradient-based method known as Model-Agnostic Meta-Learning (MAML) for medical image segmentation. MAML is a meta-learning algorithm that quickly adapts to new tasks by updating a model's parameters based on a limited set of training samples. Additionally, we use an enhanced 3D U-Net as the foundational network for our models. The enhanced 3D U-Net is a convolutional neural network specifically designed for medical image segmentation. We evaluate our approach on the TotalSegmentator dataset, considering a few annotated images for four tasks: liver, spleen, right kidney, and left kidney. The results demonstrate that our approach facilitates rapid adaptation to new tasks using only a few annotated images. In 10-shot settings, our approach achieved mean Dice coefficients of 93.70%, 85.98%, 81.20%, and 89.58% for liver, spleen, right kidney, and left kidney segmentation, respectively. In 5-shot settings, the approach attained mean Dice coefficients of 90.27%, 83.89%, 77.53%, and 87.01% for liver, spleen, right kidney, and left kidney segmentation, respectively. Finally, we assess the effectiveness of our proposed approach on a dataset collected from a local hospital. Employing 5-shot settings, we achieve mean Dice coefficients of 90.62%, 79.86%, 79.87%, and 78.21% for liver, spleen, right kidney, and left kidney segmentation, respectively.
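A minimal MAML meta-update looks as follows: adapt a copy of the parameters with one gradient step on a task's support set, then backpropagate the query-set loss through that inner step. The sketch uses PyTorch's torch.func.functional_call (PyTorch 2.x) and a single inner step; the paper's enhanced 3D U-Net and its loss are abstracted behind model and loss_fn.

```python
import torch

def maml_step(model, loss_fn, support, query, inner_lr=0.01):
    """One (second-order) MAML meta-loss for a single segmentation task.

    support, query: (inputs, labels) tuples for the task.
    Returns the query loss evaluated with inner-step-adapted parameters;
    calling .backward() on it updates the meta-parameters."""
    x_s, y_s = support
    x_q, y_q = query
    params = dict(model.named_parameters())
    # Inner step: adapt parameters on the support set.
    inner_loss = loss_fn(torch.func.functional_call(model, params, (x_s,)), y_s)
    grads = torch.autograd.grad(inner_loss, params.values(), create_graph=True)
    adapted = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}
    # Outer step: meta-loss on the query set with adapted parameters.
    return loss_fn(torch.func.functional_call(model, adapted, (x_q,)), y_q)

# Typical usage: meta_loss = maml_step(net, dice_ce_loss, support, query)
#                meta_loss.backward(); meta_optimizer.step()
```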
Collapse
Affiliation(s)
- Aqilah M. Alsaleh
- College of Computer Science and Information Technology, King Faisal University, Al Hofuf 400-31982, AlAhsa, Saudi Arabia; (E.A.); (A.A.)
- Department of Information Technology, AlAhsa Health Cluster, Al Hofuf 3158-36421, AlAhsa, Saudi Arabia
| | - Eid Albalawi
- College of Computer Science and Information Technology, King Faisal University, Al Hofuf 400-31982, AlAhsa, Saudi Arabia; (E.A.); (A.A.)
| | - Abdulelah Algosaibi
- College of Computer Science and Information Technology, King Faisal University, Al Hofuf 400-31982, AlAhsa, Saudi Arabia; (E.A.); (A.A.)
| | - Salman S. Albakheet
- Department of Radiology, King Faisal General Hospital, Al Hofuf 36361, AlAhsa, Saudi Arabia;
| | - Surbhi Bhatia Khan
- Department of Data Science, School of Science Engineering and Environment, University of Salford, Manchester M5 4WT, UK;
- Department of Electrical and Computer Engineering, Lebanese American University, Byblos P.O. Box 13-5053, Lebanon
| |
Collapse
|
31
|
He Y, Kong J, Li J, Zheng C. Entropy and distance-guided super self-ensembling for optic disc and cup segmentation. BIOMEDICAL OPTICS EXPRESS 2024; 15:3975-3992. [PMID: 38867792 PMCID: PMC11166439 DOI: 10.1364/boe.521778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/21/2024] [Revised: 04/14/2024] [Accepted: 05/06/2024] [Indexed: 06/14/2024]
Abstract
Segmenting the optic disc (OD) and optic cup (OC) is crucial to accurately detecting changes in glaucoma progression in the elderly. Recently, various convolutional neural networks have emerged to deal with OD and OC segmentation. Due to the domain shift problem, achieving high-accuracy segmentation of OD and OC across different domain datasets remains highly challenging. Unsupervised domain adaptation has attracted extensive attention as a way to address this problem. In this work, we propose a novel unsupervised domain adaptation method, called entropy and distance-guided super self-ensembling (EDSS), to enhance the segmentation performance of OD and OC. EDSS comprises two self-ensembling models, with Gaussian noise added to the weights of the whole network. Firstly, we design a super self-ensembling (SSE) framework, which combines two self-ensembling models to learn more discriminative information from images. Secondly, we propose a novel exponential moving average with Gaussian noise (G-EMA) to enhance the robustness of the self-ensembling framework. Thirdly, we propose an effective multi-information fusion strategy (MFS) to guide and improve the domain adaptation process. We evaluate the proposed EDSS on two public fundus image datasets, RIGA+ and REFUGE. Extensive experimental results demonstrate that the proposed EDSS outperforms state-of-the-art segmentation methods with unsupervised domain adaptation; e.g., the mean Dice scores on the three test sub-datasets of RIGA+ are 0.8442, 0.8772, and 0.9006, respectively, and the mean Dice score on the REFUGE dataset is 0.9154.
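The G-EMA update can be sketched directly: the teacher is an exponential moving average of the student, with Gaussian noise added to the updated weights. The noise scale sigma below is an assumed value, not the paper's setting.

```python
import torch

@torch.no_grad()
def gaussian_ema_update(teacher, student, alpha=0.999, sigma=1e-4):
    """EMA of student weights into the teacher, perturbed with Gaussian noise
    (G-EMA-style sketch). alpha is the EMA decay, sigma the noise scale."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s, alpha=1.0 - alpha)   # standard EMA step
        t.add_(torch.randn_like(t) * sigma)        # weight-space Gaussian noise
```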
Collapse
Affiliation(s)
- Yanlin He
- College of Information Sciences and Technology, Northeast Normal University, Changchun 130117, China
| | - Jun Kong
- College of Information Sciences and Technology, Northeast Normal University, Changchun 130117, China
| | - Juan Li
- Jilin Engineering Normal University, Changchun 130052, China
- Business School, Northeast Normal University, Changchun 130117, China
| | - Caixia Zheng
- College of Information Sciences and Technology, Northeast Normal University, Changchun 130117, China
- Key Laboratory of Applied Statistics of MOE, Northeast Normal University, Changchun 130024, China
| |
Collapse
|
32
|
Luu HM, Yoo GS, Park W, Park SH. CycleSeg: Simultaneous synthetic CT generation and unsupervised segmentation for MR-only radiotherapy treatment planning of prostate cancer. Med Phys 2024; 51:4365-4379. [PMID: 38323835 DOI: 10.1002/mp.16976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2023] [Revised: 01/22/2024] [Accepted: 01/25/2024] [Indexed: 02/08/2024] Open
Abstract
BACKGROUND MR-only radiotherapy treatment planning is an attractive alternative to the conventional workflow, reducing scan time and ionizing radiation. It is crucial to derive the electron density map or synthetic CT (sCT) from MR data to perform dose calculations to enable MR-only treatment planning. Automatic segmentation of relevant organs in MR images can accelerate the process by preventing the time-consuming manual contouring step. However, the segmentation label is available only for CT data in many cases. PURPOSE We propose CycleSeg, a unified framework that generates sCT and corresponding segmentation from MR images without access to MR segmentation labels. METHODS CycleSeg utilizes the CycleGAN formulation to perform unpaired synthesis of sCT and image alignment. To enable MR (sCT) segmentation, CycleSeg incorporates unsupervised domain adaptation by using a pseudo-labeling approach with feature alignment in semantic segmentation space. In contrast to previous approaches that perform segmentation on MR data, CycleSeg can perform segmentation on both MR and sCT. Experiments were performed with data from prostate cancer patients, with 78/7/10 subjects in the training/validation/test sets, respectively. RESULTS CycleSeg showed the best sCT generation results, with the lowest mean absolute error of 102.2 and the lowest Fréchet inception distance of 13.0. CycleSeg also performed best on MR segmentation, with the highest average Dice scores of 81.0 and 81.1 for MR and sCT segmentation, respectively. Ablation experiments confirmed the contribution of the proposed components of CycleSeg. CONCLUSION CycleSeg effectively synthesized CT and performed segmentation on MR images of prostate cancer patients. Thus, CycleSeg has the potential to expedite MR-only radiotherapy treatment planning, reducing the prescribed scans and manual segmentation effort, and increasing throughput.
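The CycleGAN formulation at the core of CycleSeg rests on a cycle-consistency term for unpaired MR-to-sCT synthesis; a minimal sketch is below, with the adversarial and segmentation-space alignment losses omitted. mr2ct and ct2mr stand for the two generators.

```python
import torch

def cycle_consistency_loss(real_mr, real_ct, mr2ct, ct2mr):
    """CycleGAN cycle-consistency term for unpaired MR <-> sCT synthesis:
    translating to the other modality and back should reconstruct the input.

    real_mr, real_ct: (B, 1, H, W) image batches from the two modalities
    mr2ct, ct2mr:     the two generator networks
    """
    rec_mr = ct2mr(mr2ct(real_mr))   # MR -> sCT -> MR
    rec_ct = mr2ct(ct2mr(real_ct))   # CT -> sMR -> CT
    return (rec_mr - real_mr).abs().mean() + (rec_ct - real_ct).abs().mean()
```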
Collapse
Affiliation(s)
- Huan Minh Luu
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
| | - Gyu Sang Yoo
- Department of Radiation Oncology, Chungbuk National University Hospital, Cheongju, Republic of Korea
| | - Won Park
- Department of Radiation Oncology, Samsung Medical Center, Seoul, Republic of Korea
| | - Sung-Hong Park
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
| |
Collapse
|
33
|
Jin X, Hao Y, Hilliard J, Zhang Z, Thomas MA, Li H, Jha AK, Hugo GD. A quality assurance framework for routine monitoring of deep learning cardiac substructure computed tomography segmentation models in radiotherapy. Med Phys 2024; 51:2741-2758. [PMID: 38015793 DOI: 10.1002/mp.16846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2023] [Revised: 10/24/2023] [Accepted: 10/30/2023] [Indexed: 11/30/2023] Open
Abstract
BACKGROUND For autosegmentation models, the data used to train the model (e.g., public datasets and/or vendor-collected data) and the data on which the model is deployed in the clinic are typically not the same, potentially degrading the performance of these models through a process called domain shift. Tools to routinely monitor and predict segmentation performance are needed for quality assurance. Here, we develop an approach to perform such monitoring and performance prediction for cardiac substructure segmentation. PURPOSE To develop a quality assurance (QA) framework for routine or continuous monitoring of domain shift and the performance of cardiac substructure autosegmentation algorithms. METHODS A benchmark dataset consisting of computed tomography (CT) images along with manual cardiac substructure delineations of 241 breast cancer radiotherapy patients was collected, including one "normal" image domain of clean images and five "abnormal" domains containing images with artifact (metal, contrast), pathology, or quality variations due to scanner protocol differences (field of view, noise, reconstruction kernel, and slice thickness). The QA framework consisted of an image domain shift detector operating on the input CT images, a shape quality detector operating on the output of an autosegmentation model, and a regression model for predicting autosegmentation model performance. The image domain shift detector was composed of a trained denoising autoencoder (DAE) and two hand-engineered image quality features to detect normal versus abnormal domains in the input CT images. The shape quality detector was a variational autoencoder (VAE) trained to estimate the shape quality of the autosegmentation results. The output from the image domain shift and shape quality detectors was used to train a regression model to predict the per-patient segmentation accuracy, measured by the Dice similarity coefficient (DSC) to physician contours. Different regression techniques were investigated, including linear regression, bagging, Gaussian process regression, random forest, and gradient boosting regression. Of the 241 patients, 60 were used to train the autosegmentation models, 120 for training the QA framework, and the remaining 61 for testing the QA framework. A total of 19 autosegmentation models were used to evaluate QA framework performance, including 18 convolutional neural network (CNN)-based models and one transformer-based model. RESULTS When tested on the benchmark dataset, all abnormal domains resulted in a significant DSC decrease relative to the normal domain for CNN models (p < 0.001), but only for some domains for the transformer model. No significant relationship was found between the performance of an autosegmentation model and scanner protocol parameters (p = 0.42) except noise (p = 0.01). CNN-based autosegmentation models demonstrated a decreased DSC ranging from 0.07 to 0.41 with added noise, while the transformer-based model was not significantly affected (ANOVA, p = 0.99). For the QA framework, linear regression models with bootstrap aggregation resulted in the lowest mean absolute error (MAE) of 0.041 ± 0.002 in predicted DSC (relative to the true DSC between autosegmentation and physician contours). MAE was lowest when combining both input (image) detectors and output (shape) detectors, compared to output detectors alone.
CONCLUSIONS A QA framework was able to predict cardiac substructure autosegmentation model performance for clinically anticipated "abnormal" domain shifts.
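The regression stage of such a QA framework can be illustrated with scikit-learn. The sketch below assumes four per-patient QA features (DAE reconstruction error, two image-quality features, and a VAE shape-quality score) and fits bagged linear regression to predict per-patient DSC on synthetic stand-in data; the feature choices and dimensions are assumptions, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor  # scikit-learn >= 1.2 API assumed
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Toy stand-ins: per-patient QA features -> true DSC (random placeholders)
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(120, 4)), rng.uniform(0.5, 0.95, 120)
X_test, y_test = rng.normal(size=(61, 4)), rng.uniform(0.5, 0.95, 61)

# Linear regression with bootstrap aggregation, as named in the abstract
model = BaggingRegressor(estimator=LinearRegression(), n_estimators=50,
                         random_state=0)
model.fit(X_train, y_train)
print("MAE of predicted DSC:", mean_absolute_error(y_test, model.predict(X_test)))
```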
Collapse
Affiliation(s)
- Xiyao Jin
- Department of Radiation Oncology, Washington University in St. Louis School of Medicine, St. Louis, Missouri, USA
| | - Yao Hao
- Department of Radiation Oncology, Washington University in St. Louis School of Medicine, St. Louis, Missouri, USA
| | - Jessica Hilliard
- Department of Radiation Oncology, Washington University in St. Louis School of Medicine, St. Louis, Missouri, USA
| | - Zhehao Zhang
- Department of Radiation Oncology, Washington University in St. Louis School of Medicine, St. Louis, Missouri, USA
| | - Maria A Thomas
- Department of Radiation Oncology, Washington University in St. Louis School of Medicine, St. Louis, Missouri, USA
| | - Hua Li
- Department of Radiation Oncology, Washington University in St. Louis School of Medicine, St. Louis, Missouri, USA
| | - Abhinav K Jha
- Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri, USA
- Mallinckrodt Institute of Radiology, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Geoffrey D Hugo
- Department of Radiation Oncology, Washington University in St. Louis School of Medicine, St. Louis, Missouri, USA
| |
Collapse
|
34
|
Huang S, Zhang X, Cui Z, Zhang H, Chen G, Shen D. Tissue Segmentation of Thick-Slice Fetal Brain MR Scans With Guidance From High-Quality Isotropic Volumes. IEEE Trans Biomed Eng 2024; 71:1404-1415. [PMID: 38048237 DOI: 10.1109/tbme.2023.3337338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/06/2023]
Abstract
Accurate tissue segmentation of thick-slice fetal brain magnetic resonance (MR) scans is crucial for both the reconstruction of isotropic brain MR volumes and the quantification of fetal brain development. However, this task is challenging due to the use of thick-slice scans in clinically-acquired fetal brain data. To address this issue, we propose to leverage high-quality isotropic fetal brain MR volumes (and their corresponding annotations) as guidance for segmentation of thick-slice scans. Due to the existence of a significant domain gap between the high-quality isotropic volumes (i.e., source data) and thick-slice scans (i.e., target data), we employ a domain adaptation technique to achieve the associated knowledge transfer (from high-quality "source" volumes to thick-slice "target" scans). Specifically, we first register the available high-quality isotropic fetal brain MR volumes across different gestational weeks to construct longitudinally-complete source data. To capture domain-invariant information, we then perform Fourier decomposition to extract image content and style codes. Finally, we propose a novel Cycle-Consistent Domain Adaptation Network (C²DA-Net) to efficiently transfer the knowledge learned from high-quality isotropic volumes for accurate tissue segmentation of thick-slice scans. Our C²DA-Net can fully utilize a small set of annotated isotropic volumes to guide tissue segmentation on unannotated thick-slice scans. Extensive experiments on a large-scale dataset of 372 clinically acquired thick-slice MR scans demonstrate that our C²DA-Net achieves much better performance than cutting-edge methods, both quantitatively and qualitatively.
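One common way to realize a Fourier content/style split is to treat the phase spectrum as content and the (low-frequency) amplitude spectrum as style. The NumPy sketch below swaps the central low-frequency amplitude of a source slice with a target slice's; this is a generic illustration of the idea, not necessarily the exact decomposition used by C²DA-Net, and `beta` is an assumed band size.

```python
import numpy as np

def fourier_style_swap(src, trg, beta=0.1):
    """Replace the low-frequency amplitude (style) of `src` with `trg`'s,
    keeping the phase (content) of `src`. Both inputs are 2D arrays."""
    fft_src, fft_trg = np.fft.fft2(src), np.fft.fft2(trg)
    pha_src = np.angle(fft_src)
    amp_src = np.fft.fftshift(np.abs(fft_src))  # low frequencies to center
    amp_trg = np.fft.fftshift(np.abs(fft_trg))
    h, w = src.shape
    bh, bw = int(h * beta), int(w * beta)
    ch, cw = h // 2, w // 2
    amp_src[ch-bh:ch+bh, cw-bw:cw+bw] = amp_trg[ch-bh:ch+bh, cw-bw:cw+bw]
    amp_src = np.fft.ifftshift(amp_src)
    return np.real(np.fft.ifft2(amp_src * np.exp(1j * pha_src)))
```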
Collapse
|
35
|
Kumari S, Singh P. Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives. Comput Biol Med 2024; 170:107912. [PMID: 38219643 DOI: 10.1016/j.compbiomed.2023.107912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2023] [Revised: 11/02/2023] [Accepted: 12/24/2023] [Indexed: 01/16/2024]
Abstract
Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.
Collapse
Affiliation(s)
- Suruchi Kumari
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
| | - Pravendra Singh
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
| |
Collapse
|
36
|
Hognon C, Conze PH, Bourbonne V, Gallinato O, Colin T, Jaouen V, Visvikis D. Contrastive image adaptation for acquisition shift reduction in medical imaging. Artif Intell Med 2024; 148:102747. [PMID: 38325919 DOI: 10.1016/j.artmed.2023.102747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2022] [Revised: 10/21/2023] [Accepted: 12/10/2023] [Indexed: 02/09/2024]
Abstract
The domain shift, or acquisition shift in medical imaging, is responsible for potentially harmful differences between development and deployment conditions of medical image analysis techniques. There is a growing need in the community for advanced methods that could mitigate this issue better than conventional approaches. In this paper, we consider configurations in which we can expose a learning-based pixel-level adaptor to a large variability of unlabeled images during its training, i.e., sufficient to span the acquisition shift expected during the training or testing of a downstream task model. We leverage the ability of convolutional architectures to efficiently learn domain-agnostic features and train a many-to-one unsupervised mapping between a source collection of heterogeneous images from multiple unknown domains subjected to the acquisition shift and a homogeneous subset of this source set of lower cardinality, potentially constituted of a single image. To this end, we propose a new cycle-free image-to-image architecture based on a combination of three loss functions: a contrastive PatchNCE loss, an adversarial loss and an edge-preserving loss, allowing for rich domain adaptation to the target image even under strong domain imbalance and low data regimes. Experiments support the interest of the proposed contrastive image adaptation approach for the regularization of downstream deep supervised segmentation and cross-modality synthesis models.
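The contrastive PatchNCE term can be summarized compactly: each patch feature of the translated image should match the feature of the corresponding input patch against all other patches. Below is a minimal PyTorch sketch, assuming pre-extracted patch features of shape (num_patches, dim); the temperature `tau` is an assumed value.

```python
import torch
import torch.nn.functional as F

def patch_nce_loss(feat_q, feat_k, tau=0.07):
    """InfoNCE over patches: the i-th query's positive is the i-th key."""
    feat_q = F.normalize(feat_q, dim=1)
    feat_k = F.normalize(feat_k, dim=1)
    logits = feat_q @ feat_k.t() / tau               # pairwise similarities
    targets = torch.arange(feat_q.size(0), device=feat_q.device)
    return F.cross_entropy(logits, targets)          # diagonal = positives
```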
Collapse
Affiliation(s)
- Clément Hognon
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France; SOPHiA Genetics, Pessac, France
| | - Pierre-Henri Conze
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France
| | - Vincent Bourbonne
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France
| | | | | | - Vincent Jaouen
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France.
| | - Dimitris Visvikis
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France
| |
Collapse
|
37
|
Yang X, Chin BB, Silosky M, Wehrend J, Litwiller DV, Ghosh D, Xing F. Learning Without Real Data Annotations to Detect Hepatic Lesions in PET Images. IEEE Trans Biomed Eng 2024; 71:679-688. [PMID: 37708016 DOI: 10.1109/tbme.2023.3315268] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/16/2023]
Abstract
OBJECTIVE Deep neural networks have recently been applied to lesion identification in fluorodeoxyglucose (FDG) positron emission tomography (PET) images, but they typically rely on a large amount of well-annotated data for model training. This is extremely difficult to achieve for neuroendocrine tumors (NETs), because of the low incidence of NETs and the expensive lesion annotation in PET images. The objective of this study is to design a novel, adaptable deep learning method, which uses no real lesion annotations but instead low-cost, list-mode simulated data, for hepatic lesion detection in real-world clinical NET PET images. METHODS We first propose a region-guided generative adversarial network (RG-GAN) for lesion-preserved image-to-image translation. Then, we design a specific data augmentation module for our list-mode simulated data and incorporate this module into the RG-GAN to improve model training. Finally, we combine the RG-GAN, the data augmentation module and a lesion detection neural network into a unified framework for joint-task learning to adaptively identify lesions in real-world PET data. RESULTS The proposed method outperforms recent state-of-the-art lesion detection methods in real clinical 68Ga-DOTATATE PET images, and produces very competitive performance with the target model that is trained with real lesion annotations. CONCLUSION With RG-GAN modeling and specific data augmentation, we can obtain good lesion detection performance without using any real data annotations. SIGNIFICANCE This study introduces an adaptable deep learning method for hepatic lesion identification in NETs, which can significantly reduce the human effort for data annotation and improve model generalizability for lesion detection with PET imaging.
Collapse
|
38
|
Zhang Y, Wang Y, Xu L, Yao Y, Qian W, Qi L. ST-GAN: A Swin Transformer-Based Generative Adversarial Network for Unsupervised Domain Adaptation of Cross-Modality Cardiac Segmentation. IEEE J Biomed Health Inform 2024; 28:893-904. [PMID: 38019618 DOI: 10.1109/jbhi.2023.3336965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2023]
Abstract
Unsupervised domain adaptation (UDA) methods have shown great potential in cross-modality medical image segmentation tasks, where target domain labels are unavailable. However, the domain shift among different image modalities remains challenging, because the conventional UDA methods are based on convolutional neural networks (CNNs), which tend to focus on the texture of images and cannot establish the global semantic relevance of features due to the locality of CNNs. This paper proposes a novel end-to-end Swin Transformer-based generative adversarial network (ST-GAN) for cross-modality cardiac segmentation. In the generator of ST-GAN, we utilize the local receptive fields of CNNs to capture spatial information and introduce the Swin Transformer to extract global semantic information, which enables the generator to better extract the domain-invariant features in UDA tasks. In addition, we design a multi-scale feature fuser to sufficiently fuse the features acquired at different stages and improve the robustness of the UDA network. We extensively evaluated our method with two cross-modality cardiac segmentation tasks on the MS-CMR 2019 dataset and the M&Ms dataset. The results of two different tasks show the validity of ST-GAN compared with the state-of-the-art cross-modality cardiac image segmentation methods.
Collapse
|
39
|
Ji W, Chung ACS. Unsupervised Domain Adaptation for Medical Image Segmentation Using Transformer With Meta Attention. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:820-831. [PMID: 37801381 DOI: 10.1109/tmi.2023.3322581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/08/2023]
Abstract
Image segmentation is essential to medical image analysis as it provides the labeled regions of interest for subsequent diagnosis and treatment. However, fully-supervised segmentation methods require high-quality annotations produced by experts, which is laborious and expensive. In addition, when performing segmentation on another unlabeled image modality, the segmentation performance is adversely affected by the domain shift. Unsupervised domain adaptation (UDA) is an effective way to tackle these problems, but the performance of existing methods still leaves room for improvement. Also, despite the effectiveness of recent Transformer-based methods in medical image segmentation, the adaptability of Transformers is rarely investigated. In this paper, we present a novel UDA framework using a Transformer to build a cross-modality segmentation method with the advantages of learning long-range dependencies and transferring attentive information. To fully utilize the attention learned by the Transformer in UDA, we propose Meta Attention (MA) and use it to perform a fully attention-based alignment scheme, which can learn the hierarchical consistencies of attention and transfer more discriminative information between two modalities. We have conducted extensive experiments on cross-modality segmentation using three datasets, including a whole heart segmentation dataset (MMWHS), an abdominal organ segmentation dataset, and a brain tumor segmentation dataset. The promising results show that our method can significantly improve performance compared with state-of-the-art UDA methods.
Collapse
|
40
|
Tiwary P, Bhattacharyya K, A P P. Cycle consistent twin energy-based models for image-to-image translation. Med Image Anal 2024; 91:103031. [PMID: 37988920 DOI: 10.1016/j.media.2023.103031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2022] [Revised: 09/10/2023] [Accepted: 11/13/2023] [Indexed: 11/23/2023]
Abstract
Domain shift refers to a change of distributional characteristics between the training (source) and the testing (target) datasets of a learning task, leading to a performance drop. For tasks involving medical images, domain shift may be caused by several factors, such as changes in the underlying imaging modalities, measuring devices and staining mechanisms. Recent approaches address this issue via generative models based on the principles of adversarial learning, although they suffer from issues such as difficulty in training and lack of diversity. Motivated by the aforementioned observations, we adapt an alternative class of deep generative models, called Energy-Based Models (EBMs), for the task of unpaired image-to-image translation of medical images. Specifically, we propose a novel method called Cycle Consistent Twin EBMs (CCT-EBM), which employs a pair of EBMs in the latent space of an auto-encoder trained on the source data. While one of the EBMs translates the source to the target domain, the other does vice versa, along with a novel consistency loss ensuring translation symmetry and coupling between the domains. We theoretically analyze the proposed method and show that our design leads to better translation between the domains with fewer Langevin mixing steps. We demonstrate the efficacy of our method through detailed quantitative and qualitative experiments on image segmentation tasks on three different datasets vis-à-vis state-of-the-art methods.
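Sampling from an EBM typically relies on Langevin dynamics. As a rough sketch of the update the abstract alludes to, assuming a differentiable `energy_fn` over latent codes, with the step count and step size as illustrative values:

```python
import torch

def langevin_sample(energy_fn, x_init, n_steps=20, step_size=0.01):
    """Langevin dynamics: x <- x - (step/2) * grad E(x) + sqrt(step) * noise."""
    x = x_init.clone().requires_grad_(True)
    for _ in range(n_steps):
        energy = energy_fn(x).sum()
        grad, = torch.autograd.grad(energy, x)
        x = (x - 0.5 * step_size * grad
             + step_size ** 0.5 * torch.randn_like(x)).detach().requires_grad_(True)
    return x.detach()
```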
Collapse
Affiliation(s)
- Piyush Tiwary
- Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, Karnataka 560012, India.
| | - Kinjawl Bhattacharyya
- Department of Electrical Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Prathosh A P
- Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, Karnataka 560012, India
| |
Collapse
|
41
|
Xie Q, Li Y, He N, Ning M, Ma K, Wang G, Lian Y, Zheng Y. Unsupervised Domain Adaptation for Medical Image Segmentation by Disentanglement Learning and Self-Training. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:4-14. [PMID: 35853072 DOI: 10.1109/tmi.2022.3192303] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Unsupervised domain adaptation (UDA), which aims to enhance the segmentation performance of deep models on unlabeled data, has recently drawn much attention. In this paper, we propose a novel UDA method (namely DLaST) for medical image segmentation via disentanglement learning and self-training. Disentanglement learning factorizes an image into domain-invariant anatomy and domain-specific modality components. To make the best of disentanglement learning, we propose a novel shape constraint to boost the adaptation performance. The self-training strategy further adaptively improves the segmentation performance of the model for the target domain through adversarial learning and pseudo labels, which implicitly facilitates feature alignment in the anatomy space. Experimental results demonstrate that the proposed method outperforms the state-of-the-art UDA methods for medical image segmentation on three public datasets, i.e., a cardiac dataset, an abdominal dataset and a brain dataset. The code will be released soon.
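Adversarial feature alignment of the kind used in such self-training pipelines is often implemented with a gradient reversal layer: identity in the forward pass, negated gradient in the backward pass, so the feature extractor learns anatomy features that fool a domain discriminator. The sketch below is a generic UDA building block, not DLaST's released code.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer for adversarial domain alignment."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)          # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None  # reversed gradient flows back

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)
```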
Collapse
|
42
|
Baldeon-Calisto M, Lai-Yuen SK, Puente-Mejia B. StAC-DA: Structure aware cross-modality domain adaptation framework with image and feature-level adaptation for medical image segmentation. Digit Health 2024; 10:20552076241277440. [PMID: 39229464 PMCID: PMC11369866 DOI: 10.1177/20552076241277440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2024] [Accepted: 08/06/2024] [Indexed: 09/05/2024] Open
Abstract
Objective Convolutional neural networks (CNNs) have achieved state-of-the-art results in various medical image segmentation tasks. However, CNNs often assume that the source and target datasets follow the same probability distribution, and when this assumption is not satisfied, their performance degrades significantly. This poses a limitation in medical image analysis, where including information from different imaging modalities can bring large clinical benefits. In this work, we present an unsupervised Structure Aware Cross-modality Domain Adaptation (StAC-DA) framework for medical image segmentation. Methods StAC-DA implements image- and feature-level adaptation in a sequential two-step approach. The first step performs an image-level alignment, where images from the source domain are translated to the target domain in pixel space by a CycleGAN-based model. The latter model includes a structure-aware network that preserves the shape of the anatomical structure during translation. The second step consists of a feature-level alignment. A U-Net network with deep supervision is trained in an adversarial manner with the transformed source domain images and target domain images to produce probable segmentations for the target domain. Results The framework is evaluated on bidirectional cardiac substructure segmentation. StAC-DA outperforms leading unsupervised domain adaptation approaches, ranking first in the segmentation of the ascending aorta when adapting from the Magnetic Resonance Imaging (MRI) to the Computed Tomography (CT) domain and from the CT to the MRI domain. Conclusions The presented framework overcomes the limitations posed by differing distributions in the training and testing datasets. Moreover, the experimental results highlight its potential to improve the accuracy of medical image segmentation across diverse imaging modalities.
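The structure-aware constraint can be illustrated as a segmentation-consistency penalty: a segmenter should recover the same anatomy from a source image and its translation, discouraging the generator from deforming structures. A hedged PyTorch sketch, where `seg_net` is an assumed helper network rather than StAC-DA's exact module:

```python
import torch
import torch.nn.functional as F

def structure_aware_loss(seg_net, real_src, fake_trg):
    """Penalize anatomy changes introduced by the translation generator."""
    with torch.no_grad():
        ref = seg_net(real_src).argmax(dim=1)  # anatomy labels in source image
    pred = seg_net(fake_trg)                   # logits on the translated image
    return F.cross_entropy(pred, ref)          # translation should preserve shape
```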
Collapse
Affiliation(s)
- Maria Baldeon-Calisto
- Departamento de Ingeniería Industrial, Colegio de Ciencias e Ingeniería, Instituto de Innovación en Productividad y Logística CATENA-USFQ, Universidad San Francisco de Quito, Quito, Ecuador
| | - Susana K. Lai-Yuen
- Department of Industrial and Management Systems, University of South Florida, Tampa, FL, USA
| | - Bernardo Puente-Mejia
- Departamento de Ingeniería Industrial, Colegio de Ciencias e Ingeniería, Instituto de Innovación en Productividad y Logística CATENA-USFQ, Universidad San Francisco de Quito, Quito, Ecuador
| |
Collapse
|
43
|
Muffoletto M, Xu H, Kunze KP, Neji R, Botnar R, Prieto C, Rückert D, Young AA. Combining generative modelling and semi-supervised domain adaptation for whole heart cardiovascular magnetic resonance angiography segmentation. J Cardiovasc Magn Reson 2023; 25:80. [PMID: 38124106 PMCID: PMC10734115 DOI: 10.1186/s12968-023-00981-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2023] [Accepted: 11/12/2023] [Indexed: 12/23/2023] Open
Abstract
BACKGROUND Quantification of three-dimensional (3D) cardiac anatomy is important for the evaluation of cardiovascular diseases. Changes in anatomy are indicative of remodeling processes as the heart tissue adapts to disease. Although robust segmentation methods exist for computed tomography angiography (CTA), few methods exist for whole-heart cardiovascular magnetic resonance angiograms (CMRA), which are more challenging due to variable contrast, lower signal-to-noise ratio and a limited amount of labeled data. METHODS Two state-of-the-art unsupervised generative deep learning domain adaptation architectures, generative adversarial networks and variational auto-encoders, were applied to 3D whole heart segmentation of both conventional (n = 20) and high-resolution (n = 45) CMRA (target) images, given segmented CTA (source) images for training. An additional supervised loss function was implemented to improve performance given 10%, 20% and 30% segmented CMRA cases. A fully supervised nn-UNet trained on the given CMRA segmentations was used as the benchmark. RESULTS The addition of a small number of segmented CMRA training cases substantially improved performance in both generative architectures on both the standard and high-resolution datasets. Compared with the nn-UNet benchmark, the generative methods showed substantially better performance in the case of limited labelled cases. On the standard CMRA dataset, an average 12% (adversarial method) and 10% (variational method) improvement in Dice score was obtained. CONCLUSIONS Unsupervised domain-adaptation methods for CMRA segmentation can be boosted by the addition of a small number of supervised target training cases. When only a few labelled cases are available, semi-supervised generative modelling is superior to supervised methods.
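The additional supervised loss amounts to mixing a supervised Dice term on the few labeled CMRA cases into the unsupervised adaptation objective. A minimal sketch, with the weighting `lam` as an assumption rather than the paper's value:

```python
import torch

def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice over (N, C, H, W) probabilities and one-hot targets."""
    inter = (pred * target).sum(dim=(2, 3))
    denom = pred.sum(dim=(2, 3)) + target.sum(dim=(2, 3))
    return 1 - ((2 * inter + eps) / (denom + eps)).mean()

def semi_supervised_loss(unsup_loss, pred_labeled, target_labeled, lam=1.0):
    """Unsupervised adaptation loss plus supervised Dice on labeled target cases."""
    return unsup_loss + lam * soft_dice_loss(pred_labeled, target_labeled)
```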
Collapse
Affiliation(s)
- Marica Muffoletto
- School of Biomedical Engineering and Imaging Sciences, King's College, St Thomas' Hospital, 4th Floor Lambeth Wing, Westminster Bridge, London, SW1 7EH, UK.
| | - Hao Xu
- School of Biomedical Engineering and Imaging Sciences, King's College, St Thomas' Hospital, 4th Floor Lambeth Wing, Westminster Bridge, London, SW1 7EH, UK
| | - Karl P Kunze
- School of Biomedical Engineering and Imaging Sciences, King's College, St Thomas' Hospital, 4th Floor Lambeth Wing, Westminster Bridge, London, SW1 7EH, UK
- MR Research Collaborations, Siemens Healthcare Limited, Frimley, UK
| | - Radhouene Neji
- School of Biomedical Engineering and Imaging Sciences, King's College, St Thomas' Hospital, 4th Floor Lambeth Wing, Westminster Bridge, London, SW1 7EH, UK
- MR Research Collaborations, Siemens Healthcare Limited, Frimley, UK
| | - René Botnar
- School of Biomedical Engineering and Imaging Sciences, King's College, St Thomas' Hospital, 4th Floor Lambeth Wing, Westminster Bridge, London, SW1 7EH, UK
| | - Claudia Prieto
- School of Biomedical Engineering and Imaging Sciences, King's College, St Thomas' Hospital, 4th Floor Lambeth Wing, Westminster Bridge, London, SW1 7EH, UK
| | - Daniel Rückert
- Biomedical Image Analysis Group, Department of Computing, Imperial College London, London, UK
- Institute for Artificial Intelligence and Informatics in Medicine, Klinikum Rechts der Isar, Technical University of Munich, Munich, Germany
| | - Alistair A Young
- School of Biomedical Engineering and Imaging Sciences, King's College, St Thomas' Hospital, 4th Floor Lambeth Wing, Westminster Bridge, London, SW1 7EH, UK
| |
Collapse
|
44
|
Chen C, Teng Y, Tan S, Wang Z, Zhang L, Xu J. Performance Test of a Well-Trained Model for Meningioma Segmentation in Health Care Centers: Secondary Analysis Based on Four Retrospective Multicenter Data Sets. J Med Internet Res 2023; 25:e44119. [PMID: 38100181 PMCID: PMC10757229 DOI: 10.2196/44119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2022] [Revised: 06/21/2023] [Accepted: 11/22/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Convolutional neural networks (CNNs) have produced state-of-the-art results in meningioma segmentation on magnetic resonance imaging (MRI). However, images obtained from different institutions, protocols, or scanners may show significant domain shift, leading to performance degradation and challenging model deployment in real clinical scenarios. OBJECTIVE This research aims to investigate the realistic performance of a well-trained meningioma segmentation model when deployed across different health care centers and to verify methods to enhance its generalization. METHODS This study was performed in four centers. A total of 606 patients with 606 MRIs were enrolled between January 2015 and December 2021. Manual segmentations, determined through consensus readings by neuroradiologists, were used as the ground truth mask. The model was previously trained using a standard supervised CNN called Deeplab V3+ and was deployed and tested separately in the four health care centers. To determine the appropriate approach to mitigating the observed performance degradation, two methods were used: unsupervised domain adaptation and supervised retraining. RESULTS The trained model showed a state-of-the-art performance in tumor segmentation in two health care institutions, with a Dice ratio of 0.887 (SD 0.108, 95% CI 0.903-0.925) in center A and a Dice ratio of 0.874 (SD 0.800, 95% CI 0.854-0.894) in center B. In the other two health care institutions, however, the performance declined, with Dice ratios of 0.631 (SD 0.157, 95% CI 0.556-0.707) in center C and 0.649 (SD 0.187, 95% CI 0.566-0.732) in center D, as the MRIs were obtained using different scanning protocols. Unsupervised domain adaptation showed a significant improvement in performance scores, with Dice ratios of 0.842 (SD 0.073, 95% CI 0.820-0.864) in center C and 0.855 (SD 0.097, 95% CI 0.826-0.886) in center D. Nonetheless, it did not outperform supervised retraining, which achieved Dice ratios of 0.899 (SD 0.026, 95% CI 0.889-0.906) in center C and 0.886 (SD 0.046, 95% CI 0.870-0.903) in center D. CONCLUSIONS Deploying a trained CNN model in different health care institutions may show significant performance degradation due to the domain shift of MRIs. Under such circumstances, the use of unsupervised domain adaptation or supervised retraining should be considered, taking into account the balance between clinical requirements, model performance, and the size of the available data.
Collapse
Affiliation(s)
- Chaoyue Chen
- Neurosurgery Department, West China Hospital, Sichuan University, Chengdu, China
| | - Yuen Teng
- Neurosurgery Department, West China Hospital, Sichuan University, Chengdu, China
| | - Shuo Tan
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
| | - Zizhou Wang
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
- Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore, Singapore
| | - Lei Zhang
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
| | - Jianguo Xu
- Neurosurgery Department, West China Hospital, Sichuan University, Chengdu, China
| |
Collapse
|
45
|
Liu S, Yin S, Qu L, Wang M, Song Z. A Structure-Aware Framework of Unsupervised Cross-Modality Domain Adaptation via Frequency and Spatial Knowledge Distillation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3919-3931. [PMID: 37738201 DOI: 10.1109/tmi.2023.3318006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/24/2023]
Abstract
Unsupervised domain adaptation (UDA) aims to train a model on a labeled source domain and adapt it to an unlabeled target domain. In the medical image segmentation field, most existing UDA methods rely on adversarial learning to address the domain gap between different image modalities, but this process is complicated and inefficient. In this paper, we propose a simple yet effective UDA method based on both frequency and spatial domain transfer under a multi-teacher distillation framework. In the frequency domain, we introduce the non-subsampled contourlet transform for identifying domain-invariant and domain-variant frequency components (DIFs and DVFs), and replace the DVFs of the source domain images with those of the target domain images while keeping the DIFs unchanged to narrow the domain gap. In the spatial domain, we propose a batch momentum update-based histogram matching strategy to minimize the domain-variant image style bias. Additionally, we propose a dual contrastive learning module at both the image and pixel levels to learn structure-related information. Our proposed method outperforms state-of-the-art methods on two cross-modality medical image segmentation datasets (cardiac and abdominal). Code is available at https://github.com/slliuEric/FSUDA.
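The batch momentum update-based histogram matching can be approximated as keeping a running intensity template of the target domain and matching source slices to it. A sketch using scikit-image, where the momentum form of the template update is an assumption about the strategy, not the paper's exact code:

```python
import numpy as np
from skimage.exposure import match_histograms

class MomentumHistogramMatcher:
    """Match source slices to a momentum-updated target intensity template."""
    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.reference = None

    def update(self, target_batch):
        # target_batch: (N, H, W); average over the batch as a rough template
        batch_mean = target_batch.mean(axis=0)
        if self.reference is None:
            self.reference = batch_mean
        else:
            self.reference = (self.momentum * self.reference
                              + (1 - self.momentum) * batch_mean)

    def __call__(self, src_image):
        return match_histograms(src_image, self.reference)
```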
Collapse
|
46
|
Huang Y, Xie W, Li M, Xiao E, You J, Liu X. Source-free domain adaptive segmentation with class-balanced complementary self-training. Artif Intell Med 2023; 146:102694. [PMID: 38042612 DOI: 10.1016/j.artmed.2023.102694] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2022] [Revised: 10/20/2023] [Accepted: 10/29/2023] [Indexed: 12/04/2023]
Abstract
Unsupervised domain adaptation (UDA) plays a crucial role in transferring knowledge gained from a labeled source domain so it can be applied effectively in an unlabeled and diverse target domain. While UDA commonly involves training on data from both domains, access to labeled data from the source domain is frequently constrained, owing to concerns about patient data privacy or intellectual property. Source-free UDA (SFUDA) is promising for sidestepping this difficulty. However, without source domain supervision, SFUDA methods can easily fall into the dilemma of "winner takes all", in which the majority category dominates the deep segmentor and the minority categories are largely ignored. In addition, the over-confident pseudo-label noise in self-training-based UDA is a long-standing problem. To sidestep these difficulties, we propose a novel class-balanced complementary self-training (CBCOST) framework for SFUDA segmentation. Specifically, we jointly optimize pseudo-label-based self-training with two mutually reinforced components. The first, class-wise balanced pseudo-label training (CBT), explicitly exploits fine-grained class-wise confidence to select class-wise balanced pseudo-labeled pixels with adaptive within-class thresholds. Second, to alleviate the pseudo-label noise, we propose a complementary self-training (COST) scheme that excludes the classes a pixel does not belong to, via a heuristic complementary label selection scheme. We evaluated our CBCOST framework on both 2D and 3D cross-modality cardiac anatomical segmentation tasks and brain tumor segmentation tasks. Our experimental results showed that CBCOST performs better than existing SFUDA methods and yields performance similar to UDA methods that use the source data.
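The adaptive within-class thresholds of CBT can be illustrated with a per-class confidence quantile, which keeps the majority class from monopolizing the pseudo-label budget. The quantile rule below is an assumed stand-in for the paper's exact threshold scheme:

```python
import torch

def class_balanced_pseudo_labels(probs, quantile=0.5):
    """Select pseudo-labeled pixels with a separate confidence threshold
    per predicted class. probs: (N, C, H, W) softmax outputs."""
    conf, labels = probs.max(dim=1)                 # (N, H, W)
    mask = torch.zeros_like(conf, dtype=torch.bool)
    for c in range(probs.size(1)):
        cls = labels == c
        if cls.any():
            thr = torch.quantile(conf[cls], quantile)  # within-class threshold
            mask |= cls & (conf >= thr)
    return labels, mask                             # supervise only masked pixels
```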
Collapse
Affiliation(s)
- Yongsong Huang
- Harvard Medical School, Harvard University, Boston, MA, USA; Department of Communications Engineering, Graduate School of Engineering, Tohoku University, Sendai, Miyagi, Japan; Gordon Center for Medical Imaging, Massachusetts General Hospital, Boston, MA, USA
| | - Wanqing Xie
- Harvard Medical School, Harvard University, Boston, MA, USA; Department of Intelligent Medical Engineering, School of Biomedical Engineering, Anhui Medical University, Hefei, Anhui, China
| | - Mingzhen Li
- Harvard Medical School, Harvard University, Boston, MA, USA; Department of Mathematics, Washington University in St. Louis, St. Louis, MO, USA
| | - Ethan Xiao
- Harvard Medical School, Harvard University, Boston, MA, USA
| | - Jane You
- Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
| | - Xiaofeng Liu
- Harvard Medical School, Harvard University, Boston, MA, USA; Gordon Center for Medical Imaging, Massachusetts General Hospital, Boston, MA, USA.
| |
Collapse
|
47
|
Xing F, Yang X, Cornish TC, Ghosh D. Learning with limited target data to detect cells in cross-modality images. Med Image Anal 2023; 90:102969. [PMID: 37802010 DOI: 10.1016/j.media.2023.102969] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Revised: 08/16/2023] [Accepted: 09/11/2023] [Indexed: 10/08/2023]
Abstract
Deep neural networks have achieved excellent cell or nucleus quantification performance in microscopy images, but they often suffer from performance degradation when applied to cross-modality imaging data. Unsupervised domain adaptation (UDA) based on generative adversarial networks (GANs) has recently improved the performance of cross-modality medical image quantification. However, current GAN-based UDA methods typically require abundant target data for model training, which is often very expensive or even impossible to obtain for real applications. In this paper, we study a more realistic yet challenging UDA situation, in which (unlabeled) target training data is limited, a setting that previous work seldom explores for cell identification. We first enhance a dual GAN with task-specific modeling, which provides additional supervision signals to assist with generator learning. We explore both single-directional and bidirectional task-augmented GANs for domain adaptation. Then, we further improve the GAN by introducing a differentiable, stochastic data augmentation module to explicitly reduce discriminator overfitting. We examine source-, target-, and dual-domain data augmentation for GAN enhancement, as well as joint task and data augmentation in a unified GAN-based UDA framework. We evaluate the framework for cell detection on multiple public and in-house microscopy image datasets, which were acquired with different imaging modalities, staining protocols and/or tissue preparations. The experiments demonstrate that our method significantly boosts performance when compared with the reference baseline, and it is superior to or on par with fully supervised models that are trained with real target annotations. In addition, our method outperforms recent state-of-the-art UDA approaches by a large margin on different datasets.
Collapse
Affiliation(s)
- Fuyong Xing
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, 13001 E 17th Pl, Aurora, CO 80045, USA.
| | - Xinyi Yang
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, 13001 E 17th Pl, Aurora, CO 80045, USA
| | - Toby C Cornish
- Department of Pathology, University of Colorado Anschutz Medical Campus, 13001 E 17th Pl, Aurora, CO 80045, USA
| | - Debashis Ghosh
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, 13001 E 17th Pl, Aurora, CO 80045, USA
| |
Collapse
|
48
|
Wu J, Wang G, Gu R, Lu T, Chen Y, Zhu W, Vercauteren T, Ourselin S, Zhang S. UPL-SFDA: Uncertainty-Aware Pseudo Label Guided Source-Free Domain Adaptation for Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3932-3943. [PMID: 37738202 DOI: 10.1109/tmi.2023.3318364] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/24/2023]
Abstract
Domain Adaptation (DA) is important for deep learning-based medical image segmentation models to deal with testing images from a new target domain. As the source-domain data are usually unavailable when a trained model is deployed at a new center, Source-Free Domain Adaptation (SFDA) is appealing for data and annotation-efficient adaptation to the target domain. However, existing SFDA methods have a limited performance due to lack of sufficient supervision with source-domain images unavailable and target-domain images unlabeled. We propose a novel Uncertainty-aware Pseudo Label guided (UPL) SFDA method for medical image segmentation. Specifically, we propose Target Domain Growing (TDG) to enhance the diversity of predictions in the target domain by duplicating the pre-trained model's prediction head multiple times with perturbations. The different predictions in these duplicated heads are used to obtain pseudo labels for unlabeled target-domain images and their uncertainty to identify reliable pseudo labels. We also propose a Twice Forward pass Supervision (TFS) strategy that uses reliable pseudo labels obtained in one forward pass to supervise predictions in the next forward pass. The adaptation is further regularized by a mean prediction-based entropy minimization term that encourages confident and consistent results in different prediction heads. UPL-SFDA was validated with a multi-site heart MRI segmentation dataset, a cross-modality fetal brain segmentation dataset, and a 3D fetal tissue segmentation dataset. It improved the average Dice by 5.54, 5.01 and 6.89 percentage points for the three tasks compared with the baseline, respectively, and outperformed several state-of-the-art SFDA methods.
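The duplicated prediction heads double as an uncertainty estimator: pixels on which the perturbed heads agree can be treated as reliable pseudo labels. A hedged PyTorch sketch of that idea, with the variance threshold as an assumption rather than the paper's value:

```python
import torch

def reliable_pseudo_labels(head_probs, var_thresh=0.05):
    """head_probs: list of (N, C, H, W) softmax outputs from duplicated heads.
    Returns pseudo labels and a mask of low-uncertainty (reliable) pixels."""
    stacked = torch.stack(head_probs)               # (heads, N, C, H, W)
    mean_prob = stacked.mean(dim=0)
    pseudo = mean_prob.argmax(dim=1)                # (N, H, W) pseudo labels
    # Uncertainty: variance of the winning-class probability across heads
    idx = pseudo.unsqueeze(0).unsqueeze(2).expand(stacked.size(0), -1, 1, -1, -1)
    win = stacked.gather(2, idx).squeeze(2)         # (heads, N, H, W)
    reliable = win.var(dim=0) < var_thresh
    return pseudo, reliable
```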
Collapse
|
49
|
Wang Y, Cheng J, Chen Y, Shao S, Zhu L, Wu Z, Liu T, Zhu H. FVP: Fourier Visual Prompting for Source-Free Unsupervised Domain Adaptation of Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3738-3751. [PMID: 37590107 DOI: 10.1109/tmi.2023.3306105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/19/2023]
Abstract
Medical image segmentation methods normally perform poorly when there is a domain shift between the training and testing data. Unsupervised Domain Adaptation (UDA) addresses the domain shift problem by training the model using both labeled data from the source domain and unlabeled data from the target domain. Source-Free UDA (SFUDA) was recently proposed to perform UDA without requiring the source data during adaptation, owing to data privacy or data transmission issues; it normally adapts the pre-trained deep model in the testing stage. However, in real clinical scenarios of medical image segmentation, the trained model is normally frozen in the testing stage. In this paper, we propose Fourier Visual Prompting (FVP) for SFUDA of medical image segmentation. Inspired by prompt learning in natural language processing, FVP steers the frozen pre-trained model to perform well in the target domain by adding a visual prompt to the input target data. In FVP, the visual prompt is parameterized using only a small number of low-frequency learnable parameters in the input frequency space, and is learned by minimizing the segmentation loss between the predicted segmentation of the prompted target image and the reliable pseudo segmentation label of the target image under the frozen model. To our knowledge, FVP is the first work to apply visual prompts to SFUDA for medical image segmentation. The proposed FVP is validated using three public datasets, and experiments demonstrate that FVP yields better segmentation results compared with various existing methods.
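A Fourier visual prompt can be parameterized as a small learnable grid added to the low-frequency region of the input spectrum while the segmentation model itself stays frozen. The module below is a minimal sketch under that assumption; the prompt size and real/imaginary parameterization are illustrative choices, not FVP's released code.

```python
import torch

class FourierPrompt(torch.nn.Module):
    """Learnable low-frequency additive prompt in the input frequency space."""
    def __init__(self, prompt_size=16):
        super().__init__()
        # Two channels: real and imaginary parts of the frequency-space prompt
        self.prompt = torch.nn.Parameter(torch.zeros(2, prompt_size, prompt_size))

    def forward(self, x):                            # x: (N, C, H, W), real-valued
        freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
        h, w = x.shape[-2:]
        ph, pw = self.prompt.shape[-2:]
        top, left = h // 2 - ph // 2, w // 2 - pw // 2
        delta = torch.complex(self.prompt[0], self.prompt[1])
        # Add the prompt only to the central (low-frequency) band
        freq[..., top:top+ph, left:left+pw] = freq[..., top:top+ph, left:left+pw] + delta
        return torch.fft.ifft2(torch.fft.ifftshift(freq, dim=(-2, -1))).real
```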
Collapse
|
50
|
Liu Z, Zheng L, Gu L, Yang S, Zhong Z, Zhang G. InstrumentNet: An integrated model for real-time segmentation of intracranial surgical instruments. Comput Biol Med 2023; 166:107565. [PMID: 37839219 DOI: 10.1016/j.compbiomed.2023.107565] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2023] [Revised: 09/13/2023] [Accepted: 10/10/2023] [Indexed: 10/17/2023]
Abstract
In robot-assisted surgery, precise surgical instrument segmentation technology can provide accurate location and pose data for surgeons, helping them perform a series of surgical operations efficiently and safely. However, there are still some interfering factors, such as surgical instruments being covered by tissue, multiple surgical instruments interlacing with each other, and instrument shaking during surgery. To better address these issues, an effective surgical instrument segmentation network called InstrumentNet is proposed, which adopts YOLOv7 as the object detection framework to achieve a real-time detection solution. Specifically, a multiscale feature fusion network is constructed, which aims to avoid problems such as feature redundancy and feature loss and to enhance the generalization ability. Furthermore, an adaptive feature-weighted fusion mechanism is introduced to regulate network learning and convergence. Finally, a semantic segmentation head is introduced to integrate the detection and segmentation functions, and a multitask learning loss function is specifically designed to optimize the surgical instrument segmentation performance. The proposed segmentation model is validated on a dataset of intracranial surgical instruments provided by seven experts from Beijing Tiantan Hospital and achieves an mAP score of 93.5%, a Dice score of 82.49%, and an mIoU score of 85.48%, demonstrating its universality and superiority. The experimental results demonstrate that the proposed model achieves good segmentation performance on surgical instruments compared with other advanced models and can provide a reference for developing intelligent medical robots.
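An adaptive feature-weighted fusion step is commonly realized with learnable non-negative weights normalized to sum to one (fast normalized fusion). The sketch below is a generic version of that idea, assuming feature maps share a channel count and are resized to a common resolution; it is not InstrumentNet's exact module.

```python
import torch
import torch.nn.functional as F

class AdaptiveWeightedFusion(torch.nn.Module):
    """Blend multi-scale feature maps with learnable normalized weights."""
    def __init__(self, n_inputs=3, eps=1e-4):
        super().__init__()
        self.weights = torch.nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, feats):                        # list of (N, C, Hi, Wi)
        size = feats[0].shape[-2:]
        feats = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
                 for f in feats]
        w = F.relu(self.weights)                     # keep weights non-negative
        w = w / (w.sum() + self.eps)                 # fast normalized fusion
        return sum(wi * fi for wi, fi in zip(w, feats))
```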
Collapse
Affiliation(s)
- Zhenzhong Liu
- Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, 300384, China; National Demonstration Center for Experimental Mechanical and Electrical Engineering Education (Tianjin University of Technology), China
| | - Laiwang Zheng
- Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, 300384, China; National Demonstration Center for Experimental Mechanical and Electrical Engineering Education (Tianjin University of Technology), China
| | - Lin Gu
- RIkagaku KENkyusho, Tokyo, Japan; Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
| | - Shubin Yang
- Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, 300384, China; National Demonstration Center for Experimental Mechanical and Electrical Engineering Education (Tianjin University of Technology), China
| | - Zichen Zhong
- Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, 300384, China; National Demonstration Center for Experimental Mechanical and Electrical Engineering Education (Tianjin University of Technology), China
| | - Guobin Zhang
- Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, 300384, China; National Demonstration Center for Experimental Mechanical and Electrical Engineering Education (Tianjin University of Technology), China.
| |
Collapse
|