1
Zhang C, Zheng H, You X, Zheng Y, Gu Y. PASS: Test-Time Prompting to Adapt Styles and Semantic Shapes in Medical Image Segmentation. IEEE Trans Med Imaging 2025; 44:1853-1865. [PMID: 40030683 DOI: 10.1109/tmi.2024.3521463]
Abstract
Test-time adaptation (TTA) has emerged as a promising paradigm to handle the domain shifts at test time for medical images from different institutions without using extra training data. However, existing TTA solutions for segmentation tasks suffer from 1) dependency on modifying the source training stage and access to source priors or 2) lack of emphasis on shape-related semantic knowledge that is crucial for segmentation tasks. Recent research on visual prompt learning achieves source-relaxed adaptation by extended parameter space but still neglects the full utilization of semantic features, thus motivating our work on knowledge-enriched deep prompt learning. Beyond the general concern of image style shifts, we reveal that shape variability is another crucial factor causing the performance drop. To address this issue, we propose a TTA framework called PASS (Prompting to Adapt Styles and Semantic shapes), which jointly learns two types of prompts: the input-space prompt to reformulate the style of the test image to fit into the pretrained model and the semantic-aware prompts to bridge high-level shape discrepancy across domains. Instead of naively imposing a fixed prompt, we introduce an input decorator to generate the self-regulating visual prompt conditioned on the input data. To retrieve the knowledge representations and customize target-specific shape prompts for each test sample, we propose a cross-attention prompt modulator, which performs interaction between target representations and an enriched shape prompt bank. Extensive experiments demonstrate the superior performance of PASS over state-of-the-art methods on multiple medical image segmentation datasets. The code is available at https://github.com/EndoluminalSurgicalVision-IMR/PASS.
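The cross-attention prompt modulator described above can be pictured as querying a learned shape-prompt bank with the test sample's features. The PyTorch sketch below is a minimal illustration of that idea, not the authors' implementation; the module name, dimensions, and the use of nn.MultiheadAttention are assumptions.

```python
import torch
import torch.nn as nn

class PromptModulator(nn.Module):
    """Illustrative cross-attention between target features and a learnable prompt bank."""

    def __init__(self, dim=256, bank_size=16, num_heads=4):
        super().__init__()
        # Learnable "shape prompt bank": bank_size prompt vectors of width dim.
        self.prompt_bank = nn.Parameter(torch.randn(bank_size, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, feats):
        # feats: (B, N, dim) flattened spatial features of the test image.
        bank = self.prompt_bank.unsqueeze(0).expand(feats.size(0), -1, -1)
        # Target features act as queries; the prompt bank supplies keys/values,
        # producing a sample-specific prompt that is added back to the features.
        prompt, _ = self.attn(query=feats, key=bank, value=bank)
        return feats + self.proj(prompt)

if __name__ == "__main__":
    x = torch.randn(2, 1024, 256)        # e.g. a 32x32 feature map, flattened
    print(PromptModulator()(x).shape)    # torch.Size([2, 1024, 256])
```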
2
Cheng Z, Liu M, Yan C, Wang S. Dynamic domain generalization for medical image segmentation. Neural Netw 2025; 184:107073. [PMID: 39733701 DOI: 10.1016/j.neunet.2024.107073]
Abstract
Domain Generalization-based Medical Image Segmentation (DGMIS) aims to enhance the robustness of segmentation models on unseen target domains by learning from fully annotated data across multiple source domains. Despite the progress made by traditional DGMIS methods, they still face several challenges. First, most DGMIS approaches rely on static models to perform inference on unseen target domains, lacking the ability to dynamically adapt to samples from different target domains. Second, current DGMIS methods often use Fourier transforms to simulate target domain styles from a global perspective, but relying solely on global transformations for data augmentation fails to fully capture the complexity and local details of the target domains. To address these issues, we propose a Dynamic Domain Generalization (DDG) method for medical image segmentation, which improves the generalization capability of models on unseen target domains by dynamically adjusting model parameters and effectively simulating target domain styles. Specifically, we design a Dynamic Position Transfer (DPT) module that decouples model parameters into static and dynamic components while incorporating positional encoding information to enable efficient feature representation and dynamic adaptation to target domain characteristics. Additionally, we introduce a Global-Local Fourier Random Transformation (GLFRT) module, which jointly considers both global and local style information of the samples. By using a random style selection strategy, this module enhances sample diversity while controlling training costs. Experimental results demonstrate that our method outperforms state-of-the-art approaches on several public medical image datasets, achieving average Dice score improvements of 0.58%, 0.76%, and 0.76% on the Fundus dataset (1060 retinal images), Prostate dataset (1744 T2-weighted MRI scans), and SCGM dataset (551 MRI image slices), respectively. The code is available online (https://github.com/ZMC-IIIM/DDG-Med).
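The global-local Fourier transformation described here builds on amplitude-spectrum mixing, a standard way to simulate target-domain styles while keeping the phase (content) of the source image. The NumPy sketch below is illustrative only; the function name, mixing weight, and the centered low-frequency window used for the "local" variant are assumptions, not the paper's code.

```python
import numpy as np

def fourier_style_mix(src, ref, alpha=0.5, local_frac=None):
    """Swap part of the amplitude spectrum of `src` with that of `ref`.

    src, ref:   2D grayscale images of identical shape.
    alpha:      interpolation weight toward the reference amplitude.
    local_frac: if given, only mix a centered low-frequency window covering
                this fraction of each axis (a "local" style perturbation).
    """
    fs, fr = np.fft.fft2(src), np.fft.fft2(ref)
    amp_s, pha_s, amp_r = np.abs(fs), np.angle(fs), np.abs(fr)

    if local_frac is None:                       # global amplitude mixing
        amp_mix = (1 - alpha) * amp_s + alpha * amp_r
    else:                                        # local (low-frequency) mixing
        amp_s_sh, amp_r_sh = np.fft.fftshift(amp_s), np.fft.fftshift(amp_r)
        h, w = src.shape
        bh, bw = max(1, int(h * local_frac) // 2), max(1, int(w * local_frac) // 2)
        cy, cx = h // 2, w // 2
        win = (slice(cy - bh, cy + bh), slice(cx - bw, cx + bw))
        amp_s_sh[win] = (1 - alpha) * amp_s_sh[win] + alpha * amp_r_sh[win]
        amp_mix = np.fft.ifftshift(amp_s_sh)

    # Recombine the mixed amplitude with the original phase and invert.
    return np.real(np.fft.ifft2(amp_mix * np.exp(1j * pha_s)))

if __name__ == "__main__":
    a, b = np.random.rand(64, 64), np.random.rand(64, 64)
    print(fourier_style_mix(a, b).shape, fourier_style_mix(a, b, local_frac=0.1).shape)
```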
Affiliation(s)
- Zhiming Cheng
- School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China.
- Mingxia Liu
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States.
- Chenggang Yan
- School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China.
- Shuai Wang
- School of Cyberspace, Hangzhou Dianzi University, Hangzhou, 310018, China; Suzhou Research Institute of Shandong University, Suzhou, 215123, China.
3
Zhu J, Bolsterlee B, Song Y, Meijering E. Improving cross-domain generalizability of medical image segmentation using uncertainty and shape-aware continual test-time domain adaptation. Med Image Anal 2025; 101:103422. [PMID: 39700846 DOI: 10.1016/j.media.2024.103422]
Abstract
Continual test-time adaptation (CTTA) aims to continuously adapt a source-trained model to a target domain with minimal performance loss while assuming no access to the source data. Typically, source models are trained with empirical risk minimization (ERM) and assumed to perform reasonably on the target domain to allow for further adaptation. However, ERM-trained models often fail to perform adequately on a severely drifted target domain, resulting in unsatisfactory adaptation results. To tackle this issue, we propose a generalizable CTTA framework. First, we incorporate domain-invariant shape modeling into the model and train it using domain-generalization (DG) techniques, promoting target-domain adaptability regardless of the severity of the domain shift. Then, an uncertainty and shape-aware mean teacher network performs adaptation with uncertainty-weighted pseudo-labels and shape information. As part of this process, a novel uncertainty-ranked cross-task regularization scheme is proposed to impose consistency between segmentation maps and their corresponding shape representations, both produced by the student model, at the patch and global levels to enhance performance further. Lastly, small portions of the model's weights are stochastically reset to the initial domain-generalized state at each adaptation step, preventing the model from 'diving too deep' into any specific test samples. The proposed method demonstrates strong continual adaptability and outperforms its peers on five cross-domain segmentation tasks, showcasing its effectiveness and generalizability.
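The "stochastic reset" step, in which small portions of the weights are returned to the initial domain-generalized state at each adaptation step, can be sketched in a few lines. The snippet below is a generic, hedged illustration of that mechanism (the reset probability and per-tensor masking are assumptions), not the authors' code.

```python
import torch

def stochastic_restore(model, source_state, p=0.01):
    """Randomly reset a fraction `p` of each parameter tensor to its source value.

    Anchoring the adapted model to its domain-generalized initialization keeps it
    from drifting too far toward any particular test batch.
    """
    with torch.no_grad():
        for name, param in model.named_parameters():
            mask = (torch.rand_like(param) < p).float()
            param.mul_(1 - mask).add_(source_state[name].to(param.device) * mask)

if __name__ == "__main__":
    net = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU(), torch.nn.Linear(8, 2))
    source = {k: v.clone() for k, v in net.state_dict().items()}  # frozen source copy
    # ... one test-time adaptation step would update `net` here ...
    stochastic_restore(net, source, p=0.05)
```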
Affiliation(s)
- Jiayi Zhu
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia; Neuroscience Research Australia (NeuRA), Randwick, NSW 2031, Australia.
- Bart Bolsterlee
- Neuroscience Research Australia (NeuRA), Randwick, NSW 2031, Australia; Graduate School of Biomedical Engineering, University of New South Wales, Sydney, NSW 2052, Australia
- Yang Song
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
- Erik Meijering
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
4
Zhang Z, Li Y, Shin BS. Enhancing generalization of medical image segmentation via game theory-based domain selection. J Biomed Inform 2025; 164:104802. [PMID: 40049504 DOI: 10.1016/j.jbi.2025.104802]
Abstract
Medical image segmentation models often fail to generalize well to new datasets due to substantial variability in imaging conditions, anatomical differences, and patient demographics. Conventional domain generalization (DG) methods focus on learning domain-agnostic features but often overlook the importance of maintaining performance balance across different domains, leading to suboptimal results. To address these issues, we propose a novel approach using game theory to model the training process as a zero-sum game, aiming for a Nash equilibrium to enhance adaptability and robustness against domain shifts. Specifically, our adaptive domain selection method, guided by the Beta distribution and optimized via reinforcement learning, dynamically adjusts to the variability across different domains, thus improving model generalization. We conducted extensive experiments on benchmark datasets for polyp segmentation, optic cup/optic disc (OC/OD) segmentation, and prostate segmentation. Our method achieved an average Dice score improvement of 1.75% compared with other methods, demonstrating the effectiveness of our approach in enhancing the generalization performance of medical image segmentation models.
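One concrete way to read the "Beta-distribution-guided, reinforcement-learning-optimized domain selection" is as a bandit-style sampler over source domains. The sketch below is a loose, hypothetical interpretation (Thompson sampling with a binary reward), not the algorithm from the paper.

```python
import numpy as np

class BetaDomainSelector:
    """Keeps one Beta(a, b) belief per source domain and samples which domain to train on."""

    def __init__(self, num_domains):
        self.a = np.ones(num_domains)
        self.b = np.ones(num_domains)

    def pick(self):
        # Sample a plausible "usefulness" per domain and train on the best draw.
        return int(np.argmax(np.random.beta(self.a, self.b)))

    def update(self, domain, reward):
        # reward in [0, 1], e.g. 1 if the update improved held-out Dice, else 0.
        self.a[domain] += reward
        self.b[domain] += 1.0 - reward

if __name__ == "__main__":
    sel = BetaDomainSelector(num_domains=3)
    for _ in range(20):
        d = sel.pick()
        reward = float(np.random.rand() > 0.5)   # placeholder for a real validation signal
        sel.update(d, reward)
    print(sel.a, sel.b)
```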
Affiliation(s)
- Zuyu Zhang
- Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
- Yan Li
- Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, Republic of Korea.
- Byeong-Seok Shin
- Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, Republic of Korea.
5
Song Y, Dornisch AM, Dess RT, Margolis DJA, Weinberg EP, Barrett T, Cornell M, Fan RE, Harisinghani M, Kamran SC, Lee JH, Li CX, Liss MA, Rusu M, Santos J, Sonn GA, Vidic I, Woolen SA, Dale AM, Seibert TM. Multidisciplinary Consensus Prostate Contours on Magnetic Resonance Imaging: Educational Atlas and Reference Standard for Artificial Intelligence Benchmarking. Int J Radiat Oncol Biol Phys 2025:S0360-3016(25)00253-6. [PMID: 40154847 DOI: 10.1016/j.ijrobp.2025.03.024]
Abstract
PURPOSE Evaluation of artificial intelligence (AI) algorithms for prostate segmentation is challenging because ground truth is lacking. We aimed to: (1) create a reference standard data set with precise prostate contours by expert consensus, and (2) evaluate various AI tools against this standard. METHODS AND MATERIALS We obtained prostate magnetic resonance imaging cases from six institutions from the Qualitative Prostate Imaging Consortium. A panel of 4 experts (2 genitourinary radiologists and 2 prostate radiation oncologists) meticulously developed consensus prostate segmentations on axial T2-weighted series. We evaluated the performance of 6 AI tools (3 commercially available and 3 academic) using Dice scores, distance from reference contour, and volume error. RESULTS The panel achieved consensus prostate segmentation on each slice of all 68 patient cases included in the reference data set. We present 2 patient examples to serve as contouring guides. Depending on the AI tool, median Dice scores (across patients) ranged from 0.80 to 0.94 for whole prostate segmentation. For a typical (median) patient, AI tools had a mean error over the prostate surface ranging from 1.3 to 2.4 mm. They maximally deviated 3.0 to 9.4 mm outside the prostate and 3.0 to 8.5 mm inside the prostate for a typical patient. Error in prostate volume measurement for a typical patient ranged from 4.3% to 31.4%. CONCLUSIONS We established an expert consensus benchmark for prostate segmentation. The best-performing AI tools have typical accuracy greater than that reported for radiation oncologists using computed tomography scans (the most common clinical approach for radiation therapy planning). Physician review remains essential to detect occasional major errors.
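The metrics reported above (Dice, surface distance, and volume error) have standard definitions; the NumPy/SciPy sketch below shows one common way to compute them for binary masks. It is a generic illustration, and the exact surface-distance variants in the study (mean surface error, maximal inside/outside deviation) may be defined slightly differently.

```python
import numpy as np
from scipy import ndimage

def dice(pred, ref):
    pred, ref = pred.astype(bool), ref.astype(bool)
    return 2.0 * np.logical_and(pred, ref).sum() / (pred.sum() + ref.sum())

def volume_error_pct(pred, ref, voxel_volume=1.0):
    vp, vr = pred.sum() * voxel_volume, ref.sum() * voxel_volume
    return 100.0 * abs(vp - vr) / vr

def mean_surface_distance(pred, ref, spacing=(1.0, 1.0, 1.0)):
    """Average distance (in mm, given the voxel spacing) from the predicted
    surface to the reference surface."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    surf_pred = pred ^ ndimage.binary_erosion(pred)
    surf_ref = ref ^ ndimage.binary_erosion(ref)
    dist_to_ref = ndimage.distance_transform_edt(~surf_ref, sampling=spacing)
    return float(dist_to_ref[surf_pred].mean())

if __name__ == "__main__":
    ref = np.zeros((32, 32, 32), bool); ref[8:24, 8:24, 8:24] = True
    pred = np.zeros_like(ref);          pred[9:25, 8:24, 8:24] = True   # shifted by one voxel
    print(dice(pred, ref), volume_error_pct(pred, ref), mean_surface_distance(pred, ref))
```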
Affiliation(s)
- Yuze Song
- Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, California; Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, California
- Anna M Dornisch
- Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, California
- Robert T Dess
- Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan
- Eric P Weinberg
- Department of Clinical Imaging Sciences, University of Rochester Medical Center, Rochester, New York
- Tristan Barrett
- Department of Radiology, University of Cambridge, Cambridge, United Kingdom
- Richard E Fan
- Department of Urology, Stanford School of Medicine, Palo Alto, California
- Mukesh Harisinghani
- Department of Radiology, Massachusetts General Hospital, Boston, Massachusetts
- Sophia C Kamran
- Department of Radiation Oncology, Massachusetts General Hospital, Boston, Massachusetts
- Jeong Hoon Lee
- Department of Radiology, Stanford School of Medicine, Palo Alto, California
- Cynthia Xinran Li
- Institute for Computational and Mathematical Engineering, Stanford University, Palo Alto, California
- Michael A Liss
- Department of Urology, University of Texas Health Sciences Center San Antonio, San Antonio, Texas
- Mirabela Rusu
- Department of Urology, Stanford School of Medicine, Palo Alto, California; Department of Radiology, Stanford School of Medicine, Palo Alto, California; Department of Biomedical Data Science, Stanford University, Palo Alto, California
- Geoffrey A Sonn
- Department of Urology, Stanford School of Medicine, Palo Alto, California; Department of Radiology, Stanford School of Medicine, Palo Alto, California
- Sean A Woolen
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, California
- Anders M Dale
- Department of Radiology, University of California San Diego, La Jolla, California; Department of Neurosciences, University of California San Diego, La Jolla, California; Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, California
- Tyler M Seibert
- Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, California; Department of Radiology, University of California San Diego, La Jolla, California; Department of Bioengineering, University of California San Diego, La Jolla, California; Department of Urology, University of California San Diego, La Jolla, California.
6
Kächele J, Zenk M, Rokuss M, Ulrich C, Wald T, Maier-Hein KH. Enhanced nnU-Net Architectures for Automated MRI Segmentation of Head and Neck Tumors in Adaptive Radiation Therapy. In: Head and Neck Tumor Segmentation for MR-Guided Applications: First MICCAI Challenge, HNTS-MRG 2024, Marrakesh, Morocco, October 17, 2024, Proceedings. 2025; 15273:50-64. [PMID: 40291013 PMCID: PMC12023904 DOI: 10.1007/978-3-031-83274-1_3]
Abstract
The increasing utilization of MRI in radiation therapy planning for head and neck cancer (HNC) highlights the need for precise tumor segmentation to enhance treatment efficacy and reduce side effects. This work presents segmentation models developed for the HNTS-MRG 2024 challenge by the team mic-dkfz, focusing on automated segmentation of HNC tumors from MRI images at two radiotherapy (RT) stages: before (pre-RT) and 2-4 weeks into RT (mid-RT). For Task 1 (pre-RT segmentation), we built upon the nnU-Net framework, enhancing it with the larger Residual Encoder architecture. We incorporated extensive data augmentation and applied transfer learning by pre-training the model on a diverse set of public 3D medical imaging datasets. For Task 2 (mid-RT segmentation), we adopted a longitudinal approach by integrating registered pre-RT images and their segmentations as additional inputs into the nnU-Net framework. On the test set, our models achieved mean aggregated Dice Similarity Coefficient (aggDSC) scores of 81.2 for Task 1 and 72.7 for Task 2. Especially the primary tumor (GTVp) segmentation is challenging and presents potential for further optimization. These results demonstrate the effectiveness of combining advanced architectures, transfer learning, and longitudinal data integration for automated tumor segmentation in MRI-guided adaptive radiation therapy.
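The aggregated Dice (aggDSC) used by the HNTS-MRG challenge family pools intersections and volumes over all cases before forming the ratio, so small or empty tumors do not dominate a per-case average; treat the exact definition in the sketch below as an assumption based on that challenge convention rather than the paper's evaluation code.

```python
import numpy as np

def aggregated_dice(preds, refs):
    """Pool voxel counts over all cases, then form a single Dice-style ratio.

    preds, refs: iterables of binary masks, one pair per patient.
    """
    inter = sum(np.logical_and(p, r).sum() for p, r in zip(preds, refs))
    total = sum(p.sum() + r.sum() for p, r in zip(preds, refs))
    return 2.0 * inter / total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    preds = [rng.random((16, 16, 16)) > 0.5 for _ in range(3)]
    refs = [rng.random((16, 16, 16)) > 0.5 for _ in range(3)]
    print(round(aggregated_dice(preds, refs), 3))
```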
Affiliation(s)
- Jessica Kächele
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
- German Cancer Consortium (DKTK), DKFZ, core center Heidelberg, Heidelberg, Germany
- Maximilian Zenk
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
- Maximilian Rokuss
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
- Constantin Ulrich
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
- National Center for Tumor Diseases (NCT), NCT Heidelberg, A partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany
- Tassilo Wald
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Helmholtz Imaging, DKFZ, Heidelberg, Germany
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
- Klaus H Maier-Hein
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Helmholtz Imaging, DKFZ, Heidelberg, Germany
- Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany
7
Lin L, Liu Y, Wu J, Cheng P, Cai Z, Wong KKY, Tang X. FedLPPA: Learning Personalized Prompt and Aggregation for Federated Weakly-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2025; 44:1127-1139. [PMID: 39423080 DOI: 10.1109/tmi.2024.3483221]
Abstract
Federated learning (FL) effectively mitigates the data silo challenge brought about by policies and privacy concerns, implicitly harnessing more data for deep model training. However, traditional centralized FL models grapple with diverse multi-center data, especially in the face of significant data heterogeneity, notably in medical contexts. In medical image segmentation, the growing imperative to curtail annotation costs has amplified the importance of weakly-supervised techniques that utilize sparse annotations such as points and scribbles. A pragmatic FL paradigm should accommodate diverse annotation formats across sites, a research topic that remains under-investigated. In this context, we propose a novel personalized FL framework with learnable prompt and aggregation (FedLPPA) to uniformly leverage heterogeneous weak supervision for medical image segmentation. In FedLPPA, a learnable universal knowledge prompt is maintained, complemented by multiple learnable personalized data distribution prompts and prompts representing the supervision sparsity. Integrated with sample features through a dual-attention mechanism, these prompts empower each local task decoder to adapt to both the local distribution and the supervision form. Concurrently, a dual-decoder strategy, predicated on prompt similarity, is introduced to enhance the generation of pseudo-labels in weakly-supervised learning, alleviating the overfitting and noise accumulation inherent to local data, while an adaptable aggregation method customizes the task decoder on a parameter-wise basis. Extensive experiments on four distinct medical image segmentation tasks involving different modalities underscore the superiority of FedLPPA, with efficacy closely paralleling that of fully supervised centralized training. Our code and data will be available at https://github.com/llmir/FedLPPA.
8
Lu S, Chen Y, Chen Y, Li P, Sun J, Zheng C, Zou Y, Liang B, Li M, Jin Q, Cui E, Long W, Feng B. General lightweight framework for vision foundation model supporting multi-task and multi-center medical image analysis. Nat Commun 2025; 16:2097. [PMID: 40025028 PMCID: PMC11873151 DOI: 10.1038/s41467-025-57427-z]
Abstract
The foundation model, trained on extensive and diverse datasets, has shown strong performance across numerous downstream tasks. Nevertheless, its application in the medical domain is significantly hindered by issues such as data volume, heterogeneity, and privacy concerns. Therefore, we propose the Vision Foundation Model General Lightweight (VFMGL) framework, which facilitates the decentralized construction of expert clinical models for various medical tasks. The VFMGL framework transfers general knowledge from large-parameter vision foundation models to construct lightweight, robust expert clinical models tailored to specific medical tasks. Through extensive experiments and analyses across a range of medical tasks and scenarios, we demonstrate that VFMGL achieves superior performance in both medical image classification and segmentation tasks, effectively managing the challenges posed by data heterogeneity. These results underscore the potential of VFMGL in advancing the efficacy and reliability of AI-driven medical diagnostics.
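Transferring "general knowledge" from a large vision foundation model into a lightweight expert model is typically realized with some form of distillation. The snippet below is a generic knowledge-distillation objective for illustration only; it is not the VFMGL training recipe, and the temperature and loss weighting are assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Supervised cross-entropy plus temperature-softened KL toward the frozen teacher."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    return alpha * ce + (1 - alpha) * kl

if __name__ == "__main__":
    student = torch.randn(4, 3, requires_grad=True)   # lightweight expert outputs
    teacher = torch.randn(4, 3)                       # frozen foundation-model outputs
    labels = torch.tensor([0, 1, 2, 1])
    distillation_loss(student, teacher, labels).backward()
```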
Affiliation(s)
- Senliang Lu
- Laboratory of Intelligent Detection and Information Processing, Guilin University of Aerospace Technology, Guilin, Guangxi, China
- Jiangmen Key Laboratory of Artificial Intelligence in Medical Image Computation and Application, Jiangmen Central Hospital, Jiangmen, Guangdong, China
- School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, Guangxi, China
- Yehang Chen
- Laboratory of Intelligent Detection and Information Processing, Guilin University of Aerospace Technology, Guilin, Guangxi, China
- Yuan Chen
- Department of Gynecology, Jiangmen Central Hospital, Jiangmen, Guangdong, China
- Peijun Li
- Department of Radiology, Jiangmen Central Hospital, Jiangmen, Guangdong, China
- Junqi Sun
- Department of Radiology, Yuebei People's Hospital, Shaoguan, Guangdong, China
- Changye Zheng
- Department of Radiology, Affiliated Dongguan Hospital, Southern Medical University, Dongguan, Guangdong, China
- Yujian Zou
- Department of Radiology, Affiliated Dongguan Hospital, Southern Medical University, Dongguan, Guangdong, China
- Bo Liang
- Department of MRI, Maoming People's Hospital, Maoming, Guangdong, China
- Mingwei Li
- Department of Gynecology, Kaiping Central Hospital, Kaiping, Guangdong, China
- Qinggeng Jin
- School of Electrical Engineering, Guangxi University, Nanning, Guangxi, China
- Enming Cui
- Jiangmen Key Laboratory of Artificial Intelligence in Medical Image Computation and Application, Jiangmen Central Hospital, Jiangmen, Guangdong, China
- Department of Radiology, Jiangmen Central Hospital, Jiangmen, Guangdong, China
- Wansheng Long
- Jiangmen Key Laboratory of Artificial Intelligence in Medical Image Computation and Application, Jiangmen Central Hospital, Jiangmen, Guangdong, China.
- Department of Radiology, Jiangmen Central Hospital, Jiangmen, Guangdong, China.
- Bao Feng
- Laboratory of Intelligent Detection and Information Processing, Guilin University of Aerospace Technology, Guilin, Guangxi, China.
- Jiangmen Key Laboratory of Artificial Intelligence in Medical Image Computation and Application, Jiangmen Central Hospital, Jiangmen, Guangdong, China.
9
Yuan Y, Wang X, Yang X, Heng PA. Effective Semi-Supervised Medical Image Segmentation With Probabilistic Representations and Prototype Learning. IEEE Trans Med Imaging 2025; 44:1181-1193. [PMID: 39437272 DOI: 10.1109/tmi.2024.3484166]
Abstract
Label scarcity, class imbalance and data uncertainty are three primary challenges commonly encountered in semi-supervised medical image segmentation. In this work, we focus on the data uncertainty issue, which has been overlooked in previous literature. To address this issue, we propose a probabilistic prototype-based classifier that introduces uncertainty estimation into the entire pixel classification process, including probabilistic representation formulation, probabilistic pixel-prototype proximity matching, and distribution prototype update, leveraging principles from probability theory. By explicitly modeling data uncertainty at the pixel level, the robustness of our framework to difficult pixels, such as ambiguous boundaries and noise, is greatly enhanced compared with its deterministic counterpart and other uncertainty-aware strategies. Empirical evaluations on three publicly available datasets that exhibit severe boundary ambiguity show the superiority of our method over several competitors. Moreover, our method also demonstrates stronger robustness to simulated noisy data. Code is available at https://github.com/IsYuchenYuan/PPC.
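One way to make "probabilistic pixel-prototype proximity matching" concrete is to represent both pixels and prototypes as diagonal Gaussians and score them with a mutual-likelihood-style distance, so that high-variance (uncertain) dimensions count less. The sketch below is an illustrative formulation of that idea, not necessarily the one used in the paper.

```python
import torch

def probabilistic_proximity(mu_x, var_x, mu_p, var_p):
    """Proximity between Gaussian pixel embeddings and Gaussian class prototypes.

    mu_x, var_x: (N, D) means and variances of N pixel embeddings.
    mu_p, var_p: (K, D) means and variances of K prototypes.
    Returns an (N, K) score matrix; larger means closer, and dimensions with
    high combined variance are automatically down-weighted.
    """
    var_sum = var_x.unsqueeze(1) + var_p.unsqueeze(0)               # (N, K, D)
    diff2 = (mu_x.unsqueeze(1) - mu_p.unsqueeze(0)) ** 2            # (N, K, D)
    return -0.5 * (diff2 / var_sum + torch.log(var_sum)).sum(-1)    # (N, K)

if __name__ == "__main__":
    N, K, D = 5, 3, 16
    scores = probabilistic_proximity(torch.randn(N, D), torch.rand(N, D) + 0.1,
                                     torch.randn(K, D), torch.rand(K, D) + 0.1)
    print(scores.shape, scores.softmax(dim=1).sum(1))   # per-pixel class posteriors
```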
10
Li J, Zhou Z, Yang J, Pepe A, Gsaxner C, Luijten G, Qu C, Zhang T, Chen X, Li W, Wodzinski M, Friedrich P, Xie K, Jin Y, Ambigapathy N, Nasca E, Solak N, Melito GM, Vu VD, Memon AR, Schlachta C, De Ribaupierre S, Patel R, Eagleson R, Chen X, Mächler H, Kirschke JS, de la Rosa E, Christ PF, Li HB, Ellis DG, Aizenberg MR, Gatidis S, Küstner T, Shusharina N, Heller N, Andrearczyk V, Depeursinge A, Hatt M, Sekuboyina A, Löffler MT, Liebl H, Dorent R, Vercauteren T, Shapey J, Kujawa A, Cornelissen S, Langenhuizen P, Ben-Hamadou A, Rekik A, Pujades S, Boyer E, Bolelli F, Grana C, Lumetti L, Salehi H, Ma J, Zhang Y, Gharleghi R, Beier S, Sowmya A, Garza-Villarreal EA, Balducci T, Angeles-Valdez D, Souza R, Rittner L, Frayne R, Ji Y, Ferrari V, Chatterjee S, Dubost F, Schreiber S, Mattern H, Speck O, Haehn D, John C, Nürnberger A, Pedrosa J, Ferreira C, Aresta G, Cunha A, Campilho A, Suter Y, Garcia J, Lalande A, Vandenbossche V, Van Oevelen A, Duquesne K, Mekhzoum H, Vandemeulebroucke J, Audenaert E, Krebs C, van Leeuwen T, Vereecke E, Heidemeyer H, Röhrig R, Hölzle F, Badeli V, Krieger K, Gunzer M, Chen J, van Meegdenburg T, Dada A, Balzer M, Fragemann J, Jonske F, Rempe M, Malorodov S, Bahnsen FH, Seibold C, Jaus A, Marinov Z, Jaeger PF, Stiefelhagen R, Santos AS, Lindo M, Ferreira A, Alves V, Kamp M, Abourayya A, Nensa F, Hörst F, Brehmer A, Heine L, Hanusrichter Y, Weßling M, Dudda M, Podleska LE, Fink MA, Keyl J, Tserpes K, Kim MS, Elhabian S, Lamecker H, Zukić D, Paniagua B, Wachinger C, Urschler M, Duong L, Wasserthal J, Hoyer PF, Basu O, Maal T, Witjes MJH, Schiele G, Chang TC, Ahmadi SA, Luo P, Menze B, Reyes M, Deserno TM, Davatzikos C, Puladi B, Fua P, Yuille AL, Kleesiek J, Egger J. MedShapeNet - a large-scale dataset of 3D medical shapes for computer vision. Biomedical Engineering / Biomedizinische Technik 2025; 70:71-90. [PMID: 39733351 DOI: 10.1515/bmt-2024-0396]
Abstract
OBJECTIVES Shape is commonly used to describe objects. State-of-the-art algorithms in medical imaging predominantly diverge from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used, as reflected in the growing popularity of ShapeNet (51,300 models) and Princeton ModelNet (127,915 models). However, a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments is missing. METHODS We present MedShapeNet to translate data-driven vision algorithms to medical applications and to adapt state-of-the-art vision algorithms to medical problems. As a unique feature, we model the majority of shapes directly on the imaging data of real patients. We present use cases in brain tumor classification, skull reconstruction, multi-class anatomy completion, education, and 3D printing. RESULTS To date, MedShapeNet includes 23 datasets with more than 100,000 shapes that are paired with annotations (ground truth). Our data are freely accessible via a web interface and a Python application programming interface and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality, and 3D printing. CONCLUSIONS MedShapeNet contains medical shapes from anatomy and surgical instruments and will continue to collect data for benchmarks and applications. The project page is: https://medshapenet.ikim.nrw/.
Affiliation(s)
- Jianning Li
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Institute of Computer Graphics and Vision (ICG), Graz University of Technology, Graz, Austria
- Computer Algorithms for Medicine Laboratory (Cafe), Graz, Austria
| | - Zongwei Zhou
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Jiancheng Yang
- Computer Vision Laboratory, Swiss Federal Institute of Technology Lausanne (EPFL), Lausanne, Switzerland
| | - Antonio Pepe
- Institute of Computer Graphics and Vision (ICG), Graz University of Technology, Graz, Austria
- Computer Algorithms for Medicine Laboratory (Cafe), Graz, Austria
| | - Christina Gsaxner
- Institute of Computer Graphics and Vision (ICG), Graz University of Technology, Graz, Austria
- Computer Algorithms for Medicine Laboratory (Cafe), Graz, Austria
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany
| | - Gijs Luijten
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Institute of Computer Graphics and Vision (ICG), Graz University of Technology, Graz, Austria
- Computer Algorithms for Medicine Laboratory (Cafe), Graz, Austria
- Center for Virtual and Extended Reality in Medicine (ZvRM), University Hospital Essen, University Medicine Essen, Essen, Germany
| | - Chongyu Qu
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Tiezheng Zhang
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Xiaoxi Chen
- Department of Radiology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
| | - Wenxuan Li
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Marek Wodzinski
- Department of Measurement and Electronics, AGH University of Science and Technology, Krakow, Poland
- Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Sierre, Switzerland
| | - Paul Friedrich
- Center for Medical Image Analysis & Navigation (CIAN), Department of Biomedical Engineering, University of Basel, Allschwil, Switzerland
| | - Kangxian Xie
- Department of Computer Science and Engineering, University at Buffalo, SUNY, NY, 14260, USA
| | - Yuan Jin
- Institute of Computer Graphics and Vision (ICG), Graz University of Technology, Graz, Austria
- Computer Algorithms for Medicine Laboratory (Cafe), Graz, Austria
- Research Center for Connected Healthcare Big Data, ZhejiangLab, Hangzhou, Zhejiang, China
| | - Narmada Ambigapathy
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Enrico Nasca
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Naida Solak
- Institute of Computer Graphics and Vision (ICG), Graz University of Technology, Graz, Austria
- Computer Algorithms for Medicine Laboratory (Cafe), Graz, Austria
| | - Gian Marco Melito
- Institute of Mechanics, Graz University of Technology, Graz, Austria
| | - Viet Duc Vu
- Department of Diagnostic and Interventional Radiology, University Hospital Giessen, Justus-Liebig-University Giessen, Giessen, Germany
| | - Afaque R Memon
- Department of Mechanical Engineering, Mehran University of Engineering and Technology, Jamshoro, Sindh, Pakistan
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, People's Republic of China
| | - Christopher Schlachta
- Canadian Surgical Technologies & Advanced Robotics (CSTAR), University Hospital, London, Canada
| | - Sandrine De Ribaupierre
- Canadian Surgical Technologies & Advanced Robotics (CSTAR), University Hospital, London, Canada
| | - Rajnikant Patel
- Canadian Surgical Technologies & Advanced Robotics (CSTAR), University Hospital, London, Canada
| | - Roy Eagleson
- Canadian Surgical Technologies & Advanced Robotics (CSTAR), University Hospital, London, Canada
| | - Xiaojun Chen
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Institute of Biomedical Manufacturing and Life Quality Engineering, Shanghai Jiao Tong University, Shanghai, People's Republic of China
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, People's Republic of China
| | - Heinrich Mächler
- Department of Cardiac Surgery, Medical University Graz, Graz, Austria
| | - Jan Stefan Kirschke
- Managing Senior Physician, Department of Interventional and Diagnostic Neuroradiology, University Hospital of the Technical University of Munich, Munich, Germany
| | - Ezequiel de la Rosa
- icometrix, Leuven, Belgium
- Department of Informatics, Technical University of Munich, Garching bei München, Germany
| | | | - Hongwei Bran Li
- Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
| | - David G Ellis
- Department of Neurosurgery, University of Nebraska Medical Center, Omaha, NE, USA
| | - Michele R Aizenberg
- Department of Neurosurgery, University of Nebraska Medical Center, Omaha, NE, USA
| | - Sergios Gatidis
- University Hospital of Tuebingen Diagnostic and Interventional Radiology Medical Image and Data Analysis (MIDAS.lab), Tuebingen, Germany
| | - Thomas Küstner
- University Hospital of Tuebingen Diagnostic and Interventional Radiology Medical Image and Data Analysis (MIDAS.lab), Tuebingen, Germany
| | - Nadya Shusharina
- Division of Radiation Biophysics, Department of Radiation Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | | | - Vincent Andrearczyk
- Institute of Informatics, HES-SO Valais-Wallis University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland
| | - Adrien Depeursinge
- Institute of Informatics, HES-SO Valais-Wallis University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland
- Department of Nuclear Medicine and Molecular Imaging, Lausanne University Hospital (CHUV), Lausanne, Switzerland
| | - Mathieu Hatt
- LaTIM, INSERM UMR 1101, Univ Brest, Brest, France
| | - Anjany Sekuboyina
- Department of Informatics, Technical University of Munich, Garching bei München, Germany
| | | | - Hans Liebl
- Department of Neuroradiology, Klinikum Rechts der Isar, Munich, Germany
| | - Reuben Dorent
- King's College London, Strand, London, UK
- Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | | | | | | | - Stefan Cornelissen
- Elisabeth-TweeSteden Hospital, Tilburg, Netherlands
- Video Coding & Architectures Research Group, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, Netherlands
| | - Patrick Langenhuizen
- Elisabeth-TweeSteden Hospital, Tilburg, Netherlands
- Video Coding & Architectures Research Group, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, Netherlands
| | - Achraf Ben-Hamadou
- Centre de Recherche en Numérique de Sfax, Laboratory of Signals, Systems, Artificial Intelligence and Networks, Sfax, Tunisia
- Udini, Aix-en-Provence, France
| | - Ahmed Rekik
- Centre de Recherche en Numérique de Sfax, Laboratory of Signals, Systems, Artificial Intelligence and Networks, Sfax, Tunisia
- Udini, Aix-en-Provence, France
| | - Sergi Pujades
- Inria, Université Grenoble Alpes, CNRS, Grenoble, France
| | - Edmond Boyer
- Inria, Université Grenoble Alpes, CNRS, Grenoble, France
| | - Federico Bolelli
- "Enzo Ferrari" Department of Engineering, University of Modena and Reggio Emilia, Modena, Italy
| | - Costantino Grana
- "Enzo Ferrari" Department of Engineering, University of Modena and Reggio Emilia, Modena, Italy
| | - Luca Lumetti
- "Enzo Ferrari" Department of Engineering, University of Modena and Reggio Emilia, Modena, Italy
| | - Hamidreza Salehi
- Department of Artificial Intelligence in Medical Sciences, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran, Iran
| | - Jun Ma
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
- Peter Munk Cardiac Centre, University Health Network, Toronto, ON, Canada
- Vector Institute, Toronto, ON, Canada
| | - Yao Zhang
- Shanghai AI Laboratory, Shanghai, People's Republic of China
| | - Ramtin Gharleghi
- School of Mechanical and Manufacturing Engineering, UNSW, Sydney, NSW, Australia
| | - Susann Beier
- School of Mechanical and Manufacturing Engineering, UNSW, Sydney, NSW, Australia
| | - Arcot Sowmya
- School of Computer Science and Engineering, UNSW, Sydney, NSW, Australia
| | | | - Thania Balducci
- Institute of Neurobiology, Universidad Nacional Autónoma de México Campus Juriquilla, Querétaro, Mexico
| | - Diego Angeles-Valdez
- Institute of Neurobiology, Universidad Nacional Autónoma de México Campus Juriquilla, Querétaro, Mexico
- Department of Biomedical Sciences of Cells and Systems, Cognitive Neuroscience Center, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
| | - Roberto Souza
- Advanced Imaging and Artificial Intelligence Lab, Electrical and Software Engineering Department, The Hotchkiss Brain Institute, University of Calgary, Calgary, Canada
| | - Leticia Rittner
- Medical Image Computing Lab, School of Electrical and Computer Engineering (FEEC), University of Campinas, Campinas, Brazil
| | - Richard Frayne
- Radiology and Clinical Neurosciences Departments, The Hotchkiss Brain Institute, University of Calgary, Calgary, Canada
- Seaman Family MR Research Centre, Foothills Medical Center, Calgary, Canada
| | - Yuanfeng Ji
- University of Hong Kong, Pok Fu Lam, Hong Kong, People's Republic of China
| | - Vincenzo Ferrari
- Dipartimento di Ingegneria dell'Informazione, University of Pisa, Pisa, Italy
- EndoCAS Center, Department of Translational Research and of New Surgical and Medical Technologies, University of Pisa, Pisa, Italy
| | - Soumick Chatterjee
- Data and Knowledge Engineering Group, Faculty of Computer Science, Otto von Guericke University Magdeburg, Magdeburg, Germany
- Genomics Research Centre, Human Technopole, Milan, Italy
| | | | - Stefanie Schreiber
- German Centre for Neurodegenerative Disease, Magdeburg, Germany
- Centre for Behavioural Brain Sciences, Magdeburg, Germany
- Department of Neurology, Medical Faculty, University Hospital of Magdeburg, Magdeburg, Germany
| | - Hendrik Mattern
- German Centre for Neurodegenerative Disease, Magdeburg, Germany
- Centre for Behavioural Brain Sciences, Magdeburg, Germany
- Department of Biomedical Magnetic Resonance, Otto von Guericke University Magdeburg, Magdeburg, Germany
| | - Oliver Speck
- German Centre for Neurodegenerative Disease, Magdeburg, Germany
- Centre for Behavioural Brain Sciences, Magdeburg, Germany
- Department of Biomedical Magnetic Resonance, Otto von Guericke University Magdeburg, Magdeburg, Germany
| | - Daniel Haehn
- University of Massachusetts Boston, Boston, MA, USA
| | | | - Andreas Nürnberger
- Centre for Behavioural Brain Sciences, Magdeburg, Germany
- Data and Knowledge Engineering Group, Faculty of Computer Science, Otto von Guericke University Magdeburg, Magdeburg, Germany
| | - João Pedrosa
- Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), Porto, Portugal
- Faculty of Engineering, University of Porto (FEUP), Porto, Portugal
| | - Carlos Ferreira
- Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), Porto, Portugal
- Faculty of Engineering, University of Porto (FEUP), Porto, Portugal
| | - Guilherme Aresta
- Christian Doppler Lab for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria
| | - António Cunha
- Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), Porto, Portugal
- Universidade of Trás-os-Montes and Alto Douro (UTAD), Vila Real, Portugal
| | - Aurélio Campilho
- Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), Porto, Portugal
- Faculty of Engineering, University of Porto (FEUP), Porto, Portugal
| | - Yannick Suter
- ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
| | - Jose Garcia
- Center for Biomedical Image Computing and Analytics (CBICA), Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
| | - Alain Lalande
- ICMUB Laboratory, Faculty of Medicine, CNRS UMR 6302, University of Burgundy, Dijon, France
- Medical Imaging Department, University Hospital of Dijon, Dijon, France
| | | | - Aline Van Oevelen
- Department of Human Structure and Repair, Ghent University, Ghent, Belgium
| | - Kate Duquesne
- Department of Human Structure and Repair, Ghent University, Ghent, Belgium
| | - Hamza Mekhzoum
- Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel, Brussels, Belgium
| | - Jef Vandemeulebroucke
- Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel, Brussels, Belgium
| | - Emmanuel Audenaert
- Department of Human Structure and Repair, Ghent University, Ghent, Belgium
| | - Claudia Krebs
- Department of Cellular and Physiological Sciences, Life Sciences Centre, University of British Columbia, Vancouver, BC, Canada
| | - Timo van Leeuwen
- Department of Development & Regeneration, KU Leuven Campus Kulak, Kortrijk, Belgium
| | - Evie Vereecke
- Department of Development & Regeneration, KU Leuven Campus Kulak, Kortrijk, Belgium
| | - Hauke Heidemeyer
- Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany
| | - Rainer Röhrig
- Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany
| | - Frank Hölzle
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany
| | - Vahid Badeli
- Institute of Fundamentals and Theory in Electrical Engineering, Graz University of Technology, Graz, Austria
| | - Kathrin Krieger
- Leibniz-Institut für Analytische Wissenschaften-ISAS-e.V., Dortmund, Germany
| | - Matthias Gunzer
- Leibniz-Institut für Analytische Wissenschaften-ISAS-e.V., Dortmund, Germany
- Institute for Experimental Immunology and Imaging, University Hospital, University Duisburg-Essen, Essen, Germany
| | - Jianxu Chen
- Leibniz-Institut für Analytische Wissenschaften-ISAS-e.V., Dortmund, Germany
| | - Timo van Meegdenburg
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Faculty of Statistics, Technical University Dortmund, Dortmund, Germany
| | - Amin Dada
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Miriam Balzer
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Jana Fragemann
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Frederic Jonske
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Moritz Rempe
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Stanislav Malorodov
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Fin H Bahnsen
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Constantin Seibold
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Alexander Jaus
- Computer Vision for Human-Computer Interaction Lab, Department of Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | - Zdravko Marinov
- Computer Vision for Human-Computer Interaction Lab, Department of Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | - Paul F Jaeger
- German Cancer Research Center (DKFZ) Heidelberg, Interactive Machine Learning Group, Heidelberg, Germany
- Helmholtz Imaging, DKFZ Heidelberg, Heidelberg, Germany
| | - Rainer Stiefelhagen
- Computer Vision for Human-Computer Interaction Lab, Department of Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | - Ana Sofia Santos
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Center Algoritmi, LASI, University of Minho, Braga, Portugal
| | - Mariana Lindo
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Center Algoritmi, LASI, University of Minho, Braga, Portugal
| | - André Ferreira
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Center Algoritmi, LASI, University of Minho, Braga, Portugal
| | - Victor Alves
- Center Algoritmi, LASI, University of Minho, Braga, Portugal
| | - Michael Kamp
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Cancer Research Center Cologne Essen (CCCE), University Medicine Essen (AöR), Essen, Germany
- Institute for Neuroinformatics, Ruhr University Bochum, Bochum, Germany
- Department of Data Science & AI, Monash University, Clayton, VIC, Australia
| | - Amr Abourayya
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Institute for Neuroinformatics, Ruhr University Bochum, Bochum, Germany
| | - Felix Nensa
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Hospital Essen (AöR), Essen, Germany
| | - Fabian Hörst
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Cancer Research Center Cologne Essen (CCCE), University Medicine Essen (AöR), Essen, Germany
| | - Alexander Brehmer
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Lukas Heine
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Cancer Research Center Cologne Essen (CCCE), University Medicine Essen (AöR), Essen, Germany
| | - Yannik Hanusrichter
- Department of Tumour Orthopaedics and Revision Arthroplasty, Orthopaedic Hospital Volmarstein, Wetter, Germany
- Center for Musculoskeletal Surgery, University Hospital of Essen, Essen, Germany
| | - Martin Weßling
- Department of Tumour Orthopaedics and Revision Arthroplasty, Orthopaedic Hospital Volmarstein, Wetter, Germany
- Center for Musculoskeletal Surgery, University Hospital of Essen, Essen, Germany
| | - Marcel Dudda
- Department of Trauma, Hand and Reconstructive Surgery, University Hospital Essen, Essen, Germany
- Department of Orthopaedics and Trauma Surgery, BG-Klinikum Duisburg, University of Duisburg-Essen, Essen, Germany
| | - Lars E Podleska
- Department of Tumor Orthopedics and Sarcoma Surgery, University Hospital Essen (AöR), Essen, Germany
| | - Matthias A Fink
- Clinic for Diagnostic and Interventional Radiology, University Hospital Heidelberg, Heidelberg, Germany
| | - Julius Keyl
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
| | - Konstantinos Tserpes
- Department of Informatics and Telematics, Harokopio University of Athens, Tavros, Greece
| | - Moon-Sung Kim
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Hospital Essen (AöR), Essen, Germany
- Cancer Research Center Cologne Essen (CCCE), University Medicine Essen (AöR), Essen, Germany
| | - Shireen Elhabian
- Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, USA
| | | | - Dženan Zukić
- Medical Computing, Kitware Inc., Carrboro, NC, USA
| | | | - Christian Wachinger
- Lab for Artificial Intelligence in Medical Imaging, Department of Radiology, Technical University Munich, Munich, Germany
| | - Martin Urschler
- Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Graz, Austria
| | - Luc Duong
- Department of Software and IT Engineering, Ecole de Technologie Superieure, Montreal, Quebec, Canada
| | - Jakob Wasserthal
- Clinic of Radiology & Nuclear Medicine, University Hospital Basel, Basel, Switzerland
| | - Peter F Hoyer
- Pediatric Clinic II, University Children's Hospital Essen, University Duisburg-Essen, Essen, Germany
| | - Oliver Basu
- Pediatric Clinic III, University Children's Hospital Essen, University Duisburg-Essen, Essen, Germany
- Center for Virtual and Extended Reality in Medicine (ZvRM), University Hospital Essen, University Medicine Essen, Essen, Germany
| | - Thomas Maal
- Radboudumc 3D-Lab, Department of Oral and Maxillofacial Surgery, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
| | - Max J H Witjes
- 3D Lab, Department of Oral and Maxillofacial Surgery, University Medical Center Groningen, Groningen, the Netherlands
| | - Gregor Schiele
- Intelligent Embedded Systems Lab, University of Duisburg-Essen, Bismarckstraße 90, 47057 Duisburg, Germany
| | | | | | - Ping Luo
- University of Hong Kong, Pok Fu Lam, Hong Kong, People's Republic of China
| | - Bjoern Menze
- Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
| | - Mauricio Reyes
- ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- Department of Radiation Oncology, University Hospital Bern, University of Bern, Bern, Switzerland
| | - Thomas M Deserno
- Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Braunschweig, Germany
| | - Christos Davatzikos
- Center for Biomedical Image Computing and Analytics, Penn Neurodegeneration Genomics Center, University of Pennsylvania, Philadelphia, PA, USA; and Center for AI and Data Science for Integrated Diagnostics, University of Pennsylvania, Philadelphia, PA, USA
| | - Behrus Puladi
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany
- Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany
| | - Pascal Fua
- Computer Vision Laboratory, Swiss Federal Institute of Technology Lausanne (EPFL), Lausanne, Switzerland
| | - Alan L Yuille
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Jens Kleesiek
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- German Cancer Consortium (DKTK), Partner Site Essen, Essen, Germany
- Department of Physics, TU Dortmund University, Dortmund, Germany
- Cancer Research Center Cologne Essen (CCCE), University Medicine Essen (AöR), Essen, Germany
| | - Jan Egger
- Institute for Artificial Intelligence in Medicine (IKIM), University Hospital Essen (AöR), Essen, Germany
- Institute of Computer Graphics and Vision (ICG), Graz University of Technology, Graz, Austria
- Computer Algorithms for Medicine Laboratory (Cafe), Graz, Austria
- Cancer Research Center Cologne Essen (CCCE), University Medicine Essen (AöR), Essen, Germany
- Center for Virtual and Extended Reality in Medicine (ZvRM), University Hospital Essen, University Medicine Essen, Essen, Germany
| |
11
Zhu J, Zhang X, Luo X, Zheng Z, Zhou K, Kang Y, Li H, Geng D. Accurate Prostate Segmentation in Large-Scale Magnetic Resonance Imaging Datasets via First-in-First-Out Feature Memory and Multi-Scale Context Modeling. J Imaging 2025; 11:61. [PMID: 39997563 PMCID: PMC11856738 DOI: 10.3390/jimaging11020061]
Abstract
Prostate cancer, a prevalent malignancy affecting males globally, underscores the critical need for precise prostate segmentation in diagnostic imaging. However, accurate delineation via MRI still faces several challenges: (1) The distinction of the prostate from surrounding soft tissues is impeded by subtle boundaries in MRI images. (2) Regions such as the apex and base of the prostate exhibit inherent blurriness, which complicates edge extraction and precise segmentation. The objective of this study was to precisely delineate the borders of the prostate including the apex and base regions. This study introduces a multi-scale context modeling module to enhance boundary pixel representation, thus reducing the impact of irrelevant features on segmentation outcomes. Utilizing a first-in-first-out dynamic adjustment mechanism, the proposed methodology optimizes feature vector selection, thereby enhancing segmentation outcomes for challenging apex and base regions of the prostate. Segmentation of the prostate on 2175 clinically annotated MRI datasets demonstrated that our proposed MCM-UNet outperforms existing methods. The Average Symmetric Surface Distance (ASSD) and Dice similarity coefficient (DSC) for prostate segmentation were 0.58 voxels and 91.71%, respectively. The prostate segmentation results closely matched those manually delineated by experienced radiologists. Consequently, our method significantly enhances the accuracy of prostate segmentation and holds substantial significance in the diagnosis and treatment of prostate cancer.
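A first-in-first-out feature memory of the kind described here can be kept with a bounded deque: new feature vectors push the oldest ones out, and the current buffer is read back whenever boundary features need extra context. The sketch below is a minimal illustration under assumed shapes, not the MCM-UNet module.

```python
from collections import deque
import torch

class FifoFeatureMemory:
    """Fixed-capacity first-in-first-out store of feature vectors (oldest evicted first)."""

    def __init__(self, capacity=256):
        self.buffer = deque(maxlen=capacity)

    def push(self, feats):
        # feats: (N, D) batch of feature vectors, stored one by one.
        for f in feats.detach().cpu():
            self.buffer.append(f)

    def read(self):
        # Stack the current memory so it can be compared against new features.
        return torch.stack(list(self.buffer)) if self.buffer else torch.empty(0)

if __name__ == "__main__":
    mem = FifoFeatureMemory(capacity=4)
    mem.push(torch.randn(3, 8))
    mem.push(torch.randn(3, 8))     # the two oldest vectors are evicted
    print(mem.read().shape)         # torch.Size([4, 8])
```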
Collapse
Affiliation(s)
- Jingyi Zhu
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; (J.Z.); (X.Z.); (X.L.); (Z.Z.); (K.Z.); (Y.K.)
| | - Xukun Zhang
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; (J.Z.); (X.Z.); (X.L.); (Z.Z.); (K.Z.); (Y.K.)
| | - Xiao Luo
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; (J.Z.); (X.Z.); (X.L.); (Z.Z.); (K.Z.); (Y.K.)
| | - Zhiji Zheng
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; (J.Z.); (X.Z.); (X.L.); (Z.Z.); (K.Z.); (Y.K.)
| | - Kun Zhou
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; (J.Z.); (X.Z.); (X.L.); (Z.Z.); (K.Z.); (Y.K.)
| | - Yanlan Kang
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; (J.Z.); (X.Z.); (X.L.); (Z.Z.); (K.Z.); (Y.K.)
| | - Haiqing Li
- Department of Radiology, Huashan Hospital, Fudan University, Shanghai 200400, China;
| | - Daoying Geng
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; (J.Z.); (X.Z.); (X.L.); (Z.Z.); (K.Z.); (Y.K.)
- Department of Radiology, Huashan Hospital, Fudan University, Shanghai 200400, China;
| |
Collapse
|
12
|
Huang S, Ge Y, Liu D, Hong M, Zhao J, Loui AC. Rethinking Copy-Paste for Consistency Learning in Medical Image Segmentation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2025; 34:1060-1074. [PMID: 40031728 DOI: 10.1109/tip.2025.3536208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/05/2025]
Abstract
Semi-supervised learning based on consistency learning offers significant promise for enhancing medical image segmentation. Current approaches use copy-paste as an effective data perturbation technique to facilitate weak-to-strong consistency learning. However, these techniques often lead to a decrease in the accuracy of synthetic labels corresponding to the synthetic data and introduce excessive perturbations to the distribution of the training data. Such over-perturbation causes the data distribution to stray from its true distribution, thereby impairing the model's generalization capabilities as it learns the decision boundaries. We propose a weak-to-strong consistency learning framework that integrally addresses these issues with two primary designs: 1) it emphasizes the use of highly reliable data to enhance the quality of labels in synthetic datasets through cross-copy-pasting between labeled and unlabeled datasets; 2) it employs uncertainty estimation and foreground region constraints to meticulously filter the regions for copy-pasting, thus the copy-paste technique implemented introduces a beneficial perturbation to the training data distribution. Our framework expands the copy-paste method by addressing its inherent limitations, and amplifying the potential of data perturbations for consistency learning. We extensively validated our model using six publicly available medical image segmentation datasets across different diagnostic tasks, including the segmentation of cardiac structures, prostate structures, brain structures, skin lesions, and gastrointestinal polyps. The results demonstrate that our method significantly outperforms state-of-the-art models. For instance, on the PROMISE12 dataset for the prostate structure segmentation task, using only 10% labeled data, our method achieves a 15.31% higher Dice score compared to the baseline models. Our experimental code will be made publicly available at https://github.com/slhuang24/RCP4CL.
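The core cross copy-paste operation can be illustrated with a small sketch: a region is swapped between a labeled and an unlabeled image, and the corresponding label and pseudo-label are composed the same way. The rectangular region mask and all names and shapes below are illustrative stand-ins; the paper's uncertainty- and foreground-constrained region selection is not reproduced.

```python
# Toy sketch of cross copy-paste between a labeled and an unlabeled sample for
# weak-to-strong consistency learning. The rectangular region below stands in for
# the uncertainty/foreground-constrained region selection described in the abstract.
import torch

def cross_copy_paste(img_l, lbl_l, img_u, plbl_u, region):
    """Swap a region between a labeled and an unlabeled sample.

    img_l, img_u : (C, H, W) images; lbl_l : (H, W) label; plbl_u : (H, W) pseudo-label.
    region       : (H, W) boolean mask marking the pasted area.
    Returns two mixed images and their composed (pseudo-)labels.
    """
    m = region.to(img_l.dtype)
    mixed_lu = img_l * (1 - m) + img_u * m          # unlabeled patch onto labeled image
    mixed_ul = img_u * (1 - m) + img_l * m          # labeled patch onto unlabeled image
    lbl_lu = torch.where(region, plbl_u, lbl_l)     # labels follow the pasted pixels
    lbl_ul = torch.where(region, lbl_l, plbl_u)
    return mixed_lu, lbl_lu, mixed_ul, lbl_ul

if __name__ == "__main__":
    H = W = 64
    img_l, img_u = torch.rand(1, H, W), torch.rand(1, H, W)
    lbl_l = (torch.rand(H, W) > 0.5).long()
    plbl_u = (torch.rand(H, W) > 0.5).long()        # pseudo-label from a teacher model
    region = torch.zeros(H, W, dtype=torch.bool); region[16:48, 16:48] = True
    out = cross_copy_paste(img_l, lbl_l, img_u, plbl_u, region)
    print([t.shape for t in out])
```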
Collapse
|
13
|
Li Y, Jing B, Li Z, Wang J, Zhang Y. Plug-and-play segment anything model improves nnUNet performance. Med Phys 2025; 52:899-912. [PMID: 39466578 PMCID: PMC11788268 DOI: 10.1002/mp.17481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2024] [Revised: 09/02/2024] [Accepted: 10/05/2024] [Indexed: 10/30/2024] Open
Abstract
BACKGROUND The automatic segmentation of medical images has widespread applications in modern clinical workflows. The Segment Anything Model (SAM), a recent development of foundational models in computer vision, has become a universal tool for image segmentation without the need for specific domain training. However, SAM's reliance on prompts necessitates human-computer interaction during the inference process. Its performance on specific domains can also be limited without additional adaptation. In contrast, traditional models like nnUNet are designed to perform segmentation tasks automatically during inference and can work well for each specific domain, but they require extensive training on domain-specific datasets. PURPOSE To leverage the advantages of both foundational and domain-specific models and achieve fully automated segmentation with limited training samples, we propose nnSAM, which combines the robust feature extraction capabilities of SAM with the automatic configuration abilities of nnUNet to enhance the accuracy and robustness of medical image segmentation on small datasets. METHODS We propose the nnSAM model for small sample medical image segmentation. We made optimizations for this goal via two main approaches: First, we integrated the feature extraction capabilities of SAM with the automatic configuration advantages of nnUNet, which enables robust feature extraction and domain-specific adaptation on small datasets. Second, during the training process, we designed a boundary shape supervision loss based on level set functions and curvature calculations, enabling the model to learn anatomical shape priors from limited annotation data. RESULTS We conducted quantitative and qualitative assessments on the performance of our proposed method on four segmentation tasks: brain white matter, liver, lung, and heart segmentation. Our method achieved the best performance across all tasks. Specifically, in brain white matter segmentation using 20 training samples, nnSAM achieved the highest Dice score of 82.77 (±10.12)% and the lowest average surface distance (ASD) of 1.14 (±1.03) mm, compared to nnUNet, which had a Dice score of 79.25 (±17.24)% and an ASD of 1.36 (±1.63) mm. A sample size study shows that the advantage of nnSAM becomes more prominent under fewer training samples. CONCLUSIONS A comprehensive evaluation of multiple small-sample segmentation tasks demonstrates significant improvements in segmentation performance by nnSAM, highlighting the vast potential of small-sample learning.
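One plausible reading of the level-set and curvature ingredients mentioned in the METHODS section is sketched below: the ground-truth mask is converted to a signed distance (level set) map for regression-style shape supervision, and the curvature of the predicted soft mask is computed by finite differences as a boundary smoothness prior. The exact loss used in nnSAM may be formulated differently; the function names and toy loss terms here are assumptions.

```python
# Hedged sketch of level-set-style shape supervision: a signed distance map is built
# from the ground-truth mask, and the curvature of the predicted soft mask is computed
# by finite differences. Illustrative only; not the paper's exact loss formulation.
import numpy as np
import torch
import torch.nn.functional as F
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed distance (level set) map: negative inside the object, positive outside."""
    mask = mask.astype(bool)
    if not mask.any() or mask.all():
        return np.zeros(mask.shape, dtype=np.float32)
    dist_out = distance_transform_edt(~mask)   # distance to the object for background pixels
    dist_in = distance_transform_edt(mask)     # distance to the background for object pixels
    return (dist_out - dist_in).astype(np.float32)

def curvature(phi: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Curvature div(grad(phi)/|grad(phi)|) of a (H, W) scalar field via finite differences."""
    gy, gx = torch.gradient(phi)               # gradients along H and W
    norm = torch.sqrt(gx ** 2 + gy ** 2 + eps)
    nyy, _ = torch.gradient(gy / norm)
    _, nxx = torch.gradient(gx / norm)
    return nxx + nyy

if __name__ == "__main__":
    gt = np.zeros((96, 96)); gt[24:72, 24:72] = 1
    sdm_target = torch.from_numpy(signed_distance_map(gt))
    pred_sdm = torch.randn(96, 96)                      # stand-in for a network's level-set head
    pred_prob = torch.sigmoid(torch.randn(96, 96))      # stand-in for the network's soft mask
    sdm_loss = F.l1_loss(torch.tanh(pred_sdm), torch.tanh(sdm_target))  # toy level-set regression
    curv_loss = curvature(pred_prob).abs().mean()       # penalize wiggly predicted boundaries
    print(float(sdm_loss), float(curv_loss))
```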
Collapse
Affiliation(s)
- Yunxiang Li
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
| | - Bowen Jing
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
| | - Zihan Li
- Department of Bioengineering, University of Washington, Seattle, Washington, USA
| | - Jing Wang
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
| | - You Zhang
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
| |
Collapse
|
14
|
Gu Y, Sun Z, Chen T, Xiao X, Liu Y, Xu Y, Najman L. Dual structure-aware image filterings for semi-supervised medical image segmentation. Med Image Anal 2025; 99:103364. [PMID: 39418830 DOI: 10.1016/j.media.2024.103364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2024] [Revised: 07/25/2024] [Accepted: 10/01/2024] [Indexed: 10/19/2024]
Abstract
Semi-supervised image segmentation has attracted great attention recently. The key is how to leverage unlabeled images in the training process. Most methods maintain consistent predictions of the unlabeled images under variations (e.g., adding noise/perturbations, or creating alternative versions) at the image and/or model level. However, most image-level variations do not exploit the prior structural information that medical images often carry. In this paper, we propose novel dual structure-aware image filterings (DSAIF) as the image-level variations for semi-supervised medical image segmentation. Motivated by connected filtering, which simplifies an image via filtering in a structure-aware tree-based image representation, we resort to the dual contrast-invariant Max-tree and Min-tree representations. Specifically, we propose a novel connected filtering that removes topologically equivalent nodes (i.e., connected components) having no siblings in the Max/Min-tree. This results in two filtered images preserving topologically critical structure. Applying the proposed DSAIF to mutually supervised networks decreases the consensus of their erroneous predictions on unlabeled images. This helps to alleviate the confirmation bias issue of overfitting to noisy pseudo labels of unlabeled images, and thus effectively improves the segmentation performance. Extensive experimental results on three benchmark datasets demonstrate that the proposed method significantly and consistently outperforms state-of-the-art methods. The source codes will be publicly available.
Collapse
Affiliation(s)
- Yuliang Gu
- National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China; Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China; Medical Artificial Intelligence Research Institute of Renmin Hospital, Wuhan University, Wuhan, China.
| | - Zhichao Sun
- National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China; Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China; Medical Artificial Intelligence Research Institute of Renmin Hospital, Wuhan University, Wuhan, China.
| | - Tian Chen
- National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China; Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China; Medical Artificial Intelligence Research Institute of Renmin Hospital, Wuhan University, Wuhan, China.
| | - Xin Xiao
- National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China; Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China; Medical Artificial Intelligence Research Institute of Renmin Hospital, Wuhan University, Wuhan, China.
| | - Yepeng Liu
- National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China; Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China; Medical Artificial Intelligence Research Institute of Renmin Hospital, Wuhan University, Wuhan, China.
| | - Yongchao Xu
- National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China; Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China; Medical Artificial Intelligence Research Institute of Renmin Hospital, Wuhan University, Wuhan, China.
| | - Laurent Najman
- Univ Gustave Eiffel, CNRS, LIGM, Marne-la-Vallée, France.
| |
Collapse
|
15
|
Wang C, Xu R, Xu S, Meng W, Xiao J, Zhang X. Accurate Lung Nodule Segmentation With Detailed Representation Transfer and Soft Mask Supervision. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:18381-18393. [PMID: 37824321 DOI: 10.1109/tnnls.2023.3315271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/14/2023]
Abstract
Accurate lung lesion segmentation from computed tomography (CT) images is crucial to the analysis and diagnosis of lung diseases, such as COVID-19 and lung cancer. However, the smallness and variety of lung nodules and the lack of high-quality labeling make accurate lung nodule segmentation difficult. To address these issues, we first introduce a novel segmentation mask named "soft mask," which offers a richer and more accurate description of edge details and better visualization, and develop a universal automatic soft mask annotation pipeline to deal with different datasets correspondingly. Then, a novel network with detailed representation transfer and soft mask supervision (DSNet) is proposed to process the input low-resolution images of lung nodules into high-quality segmentation results. Our DSNet contains a special detailed representation transfer module (DRTM) for reconstructing the detailed representation to alleviate the small size of lung nodule images, and an adversarial training framework with soft masks for further improving the accuracy of segmentation. Extensive experiments validate that our DSNet outperforms other state-of-the-art methods for accurate lung nodule segmentation, and has strong generalization ability to other medical segmentation tasks with competitive results. In addition, we provide a new challenging lung nodule segmentation dataset for further studies (https://drive.google.com/file/d/15NNkvDTb_0Ku0IoPsNMHezJRTH1Oi1wm/view?usp=sharing).
Collapse
|
16
|
Cheng Z, Wang S, Gao Y, Zhu Z, Yan C. Invariant Content Representation for Generalizable Medical Image Segmentation. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024; 37:3193-3207. [PMID: 38758420 PMCID: PMC11612095 DOI: 10.1007/s10278-024-01088-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/17/2023] [Revised: 01/20/2024] [Accepted: 02/09/2024] [Indexed: 05/18/2024]
Abstract
Due to privacy preservation, domain generalization (DG) for medical image segmentation prefers learning from a single source domain and expects good robustness on unseen target domains. To achieve this goal, previous methods mainly use data augmentation to expand the distribution of samples and learn invariant content from them. However, most of these methods commonly perform global augmentation, leading to limited augmented sample diversity. In addition, the style of the augmented image is more scattered than the source domain, which may cause the model to overfit the style of the source domain. To address the above issues, we propose an invariant content representation network (ICRN) to enhance the learning of invariant content and suppress the learning of variability styles. Specifically, we first design a gamma correction-based local style augmentation (LSA) to expand the distribution of samples by augmenting foreground and background styles, respectively. Then, based on the augmented samples, we introduce invariant content learning (ICL) to learn generalizable invariant content from both augmented and source-domain samples. Finally, we design domain-specific batch normalization (DSBN) based style adversarial learning (SAL) to suppress the learning of preferences for source-domain styles. Experimental results show that our proposed method improves the overall Dice coefficient (Dice) by 8.74% and 11.33% and reduces the overall average surface distance (ASD) by 15.88 mm and 3.87 mm on two publicly available cross-domain datasets, Fundus and Prostate, compared to state-of-the-art DG methods. The code is available at https://github.com/ZMC-IIIM/ICRN-DG.
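A minimal sketch of gamma-correction-based local style augmentation follows, assuming a foreground mask is available for the training image and that foreground and background receive independently sampled gamma values; the sampling range is illustrative rather than the value used for LSA.

```python
# Rough sketch of gamma-correction-based local style augmentation: foreground and
# background of a [0, 1] image are re-mapped with independently sampled gamma values,
# localized by the segmentation mask. Parameter ranges are illustrative assumptions.
import numpy as np

def local_gamma_augment(image: np.ndarray, fg_mask: np.ndarray,
                        gamma_range=(0.5, 2.0), rng=None) -> np.ndarray:
    """Apply separate random gamma curves to foreground and background pixels."""
    rng = np.random.default_rng() if rng is None else rng
    g_fg, g_bg = rng.uniform(*gamma_range, size=2)
    out = np.where(fg_mask.astype(bool), image ** g_fg, image ** g_bg)
    return out.astype(image.dtype)

if __name__ == "__main__":
    img = np.random.rand(128, 128).astype(np.float32)
    mask = np.zeros((128, 128), dtype=np.uint8); mask[32:96, 32:96] = 1
    aug = local_gamma_augment(img, mask)
    print(aug.shape, float(aug.min()), float(aug.max()))
```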
Collapse
Affiliation(s)
- Zhiming Cheng
- School of Automation, Hangzhou Dianzi University, Hangzhou, 310018, China
| | - Shuai Wang
- School of Cyberspace, Hangzhou Dianzi University, Hangzhou, 310018, China.
- Suzhou Research Institute of Shandong University, Suzhou, 215123, China.
| | - Yuhan Gao
- School of Automation, Hangzhou Dianzi University, Hangzhou, 310018, China
- Lishui Institute of Hangzhou Dianzi University, Lishui, 323010, China
| | - Zunjie Zhu
- Lishui Institute of Hangzhou Dianzi University, Lishui, 323010, China
- School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China
| | - Chenggang Yan
- School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China
| |
Collapse
|
17
|
Ahmad I, Alqurashi F. Early cancer detection using deep learning and medical imaging: A survey. Crit Rev Oncol Hematol 2024; 204:104528. [PMID: 39413940 DOI: 10.1016/j.critrevonc.2024.104528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2024] [Accepted: 10/02/2024] [Indexed: 10/18/2024] Open
Abstract
Cancer, characterized by the uncontrolled division of abnormal cells that harm body tissues, necessitates early detection for effective treatment. Medical imaging is crucial for identifying various cancers, yet its manual interpretation by radiologists is often subjective, labour-intensive, and time-consuming. Consequently, there is a critical need for an automated decision-making process to enhance cancer detection and diagnosis. Previous surveys of cancer detection methods have mostly focused on specific cancers and a limited range of techniques. This study presents a comprehensive survey of cancer detection methods. It entails a review of 99 research articles collected from the Web of Science, IEEE, and Scopus databases, published between 2020 and 2024. The scope of the study encompasses 12 types of cancer, including breast, cervical, ovarian, prostate, esophageal, liver, pancreatic, colon, lung, oral, brain, and skin cancers. This study discusses different cancer detection techniques, including medical imaging data, image preprocessing, segmentation, feature extraction, deep learning and transfer learning methods, and evaluation metrics. We also summarise the datasets and techniques, along with research challenges and limitations. Finally, we provide future directions for enhancing cancer detection techniques.
Collapse
Affiliation(s)
- Istiak Ahmad
- Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia; School of Information and Communication Technology, Griffith University, Queensland 4111, Australia.
| | - Fahad Alqurashi
- Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| |
Collapse
|
18
|
Chen C, Miao J, Wu D, Zhong A, Yan Z, Kim S, Hu J, Liu Z, Sun L, Li X, Liu T, Heng PA, Li Q. MA-SAM: Modality-agnostic SAM adaptation for 3D medical image segmentation. Med Image Anal 2024; 98:103310. [PMID: 39182302 PMCID: PMC11381141 DOI: 10.1016/j.media.2024.103310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2023] [Revised: 08/13/2024] [Accepted: 08/16/2024] [Indexed: 08/27/2024]
Abstract
The Segment Anything Model (SAM), a foundation model for general image segmentation, has demonstrated impressive zero-shot performance across numerous natural image segmentation tasks. However, SAM's performance significantly declines when applied to medical images, primarily due to the substantial disparity between natural and medical image domains. To effectively adapt SAM to medical images, it is important to incorporate critical third-dimensional information, i.e., volumetric or temporal knowledge, during fine-tuning. Simultaneously, we aim to harness SAM's pre-trained weights within its original 2D backbone to the fullest extent. In this paper, we introduce a modality-agnostic SAM adaptation framework, named as MA-SAM, that is applicable to various volumetric and video medical data. Our method roots in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments while preserving the majority of SAM's pre-trained weights. By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data. We comprehensively evaluate our method on five medical image segmentation tasks, by using 11 public datasets across CT, MRI, and surgical video data. Remarkably, without using any prompt, our method consistently outperforms various state-of-the-art 3D approaches, surpassing nnU-Net by 0.9%, 2.6%, and 9.9% in Dice for CT multi-organ segmentation, MRI prostate segmentation, and surgical scene segmentation respectively. Our model also demonstrates strong generalization, and excels in challenging tumor segmentation when prompts are used. Our code is available at: https://github.com/cchen-cc/MA-SAM.
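The idea of injecting 3D adapters into a pretrained 2D transformer backbone can be sketched as a small bottleneck module: down-project the token features, apply a 3D convolution across the slice dimension, up-project, and add a residual connection. Layer sizes, tensor shapes, and names below are illustrative assumptions, not the released MA-SAM code.

```python
# Minimal sketch of a 3D adapter for a 2D ViT backbone applied slice-wise: a bottleneck
# that mixes information across the depth (slice) dimension with a Conv3d, then returns
# to the token space via a residual connection. Illustrative shapes and sizes only.
import torch
import torch.nn as nn

class Adapter3D(nn.Module):
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.conv3d = nn.Conv3d(bottleneck, bottleneck, kernel_size=3, padding=1)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """x: (B, D, H, W, C) tokens from a 2D ViT block applied to D neighbouring slices."""
        h = self.act(self.down(x))                      # (B, D, H, W, b)
        h = h.permute(0, 4, 1, 2, 3)                    # (B, b, D, H, W) for Conv3d
        h = self.act(self.conv3d(h))                    # mix information across slices
        h = h.permute(0, 2, 3, 4, 1)                    # back to (B, D, H, W, b)
        return x + self.up(h)                           # residual keeps the pretrained path intact

if __name__ == "__main__":
    tokens = torch.randn(2, 8, 14, 14, 768)             # 8 neighbouring slices of ViT tokens
    out = Adapter3D(768)(tokens)
    print(out.shape)                                     # torch.Size([2, 8, 14, 14, 768])
```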
Collapse
Affiliation(s)
- Cheng Chen
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Juzheng Miao
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Dufan Wu
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Aoxiao Zhong
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
| | - Zhiling Yan
- Department of Computer Science and Engineering, Lehigh University, Bethlehem, PA 18015, USA
| | - Sekeun Kim
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Jiang Hu
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Zhengliang Liu
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; School of Computing, The University of Georgia, Athens, GA 30602, USA
| | - Lichao Sun
- Department of Computer Science and Engineering, Lehigh University, Bethlehem, PA 18015, USA
| | - Xiang Li
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA.
| | - Tianming Liu
- School of Computing, The University of Georgia, Athens, GA 30602, USA
| | - Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Quanzheng Li
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| |
Collapse
|
19
|
Kuş Z, Aydin M. MedSegBench: A comprehensive benchmark for medical image segmentation in diverse data modalities. Sci Data 2024; 11:1283. [PMID: 39587124 PMCID: PMC11589128 DOI: 10.1038/s41597-024-04159-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2024] [Accepted: 11/19/2024] [Indexed: 11/27/2024] Open
Abstract
MedSegBench is a comprehensive benchmark designed to evaluate deep learning models for medical image segmentation across a wide range of modalities. It comprises 35 datasets with over 60,000 images from ultrasound, MRI, and X-ray. The benchmark addresses challenges in medical imaging by providing standardized datasets with train/validation/test splits, considering variability in image quality and dataset imbalances. The benchmark supports binary and multi-class segmentation tasks with up to 19 classes and uses the U-Net architecture with various encoder/decoder networks such as ResNets, EfficientNet, and DenseNet for evaluations. MedSegBench is a valuable resource for developing robust and flexible segmentation algorithms and allows for fair comparisons across different models, promoting the development of universal models for medical tasks. It is the most comprehensive study of medical segmentation datasets to date. The datasets and source code are publicly available, encouraging further research and development in medical image analysis.
Collapse
Affiliation(s)
- Zeki Kuş
- Fatih Sultan Mehmet Vakif University, Computer Engineering, İstanbul, 34445, Türkiye.
| | - Musa Aydin
- Fatih Sultan Mehmet Vakif University, Computer Engineering, İstanbul, 34445, Türkiye
| |
Collapse
|
20
|
Wu X, Xu Z, Tong RKY. Continual learning in medical image analysis: A survey. Comput Biol Med 2024; 182:109206. [PMID: 39332115 DOI: 10.1016/j.compbiomed.2024.109206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2024] [Revised: 06/24/2024] [Accepted: 09/22/2024] [Indexed: 09/29/2024]
Abstract
In the dynamic realm of practical clinical scenarios, Continual Learning (CL) has gained increasing interest in medical image analysis due to its potential to address major challenges associated with data privacy, model adaptability, memory inefficiency, prediction robustness and detection accuracy. In general, the primary challenge in adapting and advancing CL remains catastrophic forgetting. Beyond this challenge, recent years have witnessed a growing body of work that expands our comprehension and application of continual learning in the medical domain, highlighting its practical significance and intricacy. In this paper, we present an in-depth and up-to-date review of the application of CL in medical image analysis. Our discussion delves into the strategies employed to address specific tasks within the medical domain, categorizing existing CL methods into three settings: Task-Incremental Learning, Class-Incremental Learning, and Domain-Incremental Learning. These settings are further subdivided based on representative learning strategies, allowing us to assess their strengths and weaknesses in the context of various medical scenarios. By establishing a correlation between each medical challenge and the corresponding insights provided by CL, we provide a comprehensive understanding of the potential impact of these techniques. To enhance the utility of our review, we provide an overview of the commonly used benchmark medical datasets and evaluation metrics in the field. Through a comprehensive comparison, we discuss promising future directions for the application of CL in medical image analysis. A comprehensive list of studies is being continuously updated at https://github.com/xw1519/Continual-Learning-Medical-Adaptation.
Collapse
Affiliation(s)
- Xinyao Wu
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.
| | - Zhe Xu
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China; Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
| | - Raymond Kai-Yu Tong
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.
| |
Collapse
|
21
|
Kumar A, Jiang H, Imran M, Valdes C, Leon G, Kang D, Nataraj P, Zhou Y, Weiss MD, Shao W. A flexible 2.5D medical image segmentation approach with in-slice and cross-slice attention. Comput Biol Med 2024; 182:109173. [PMID: 39317055 DOI: 10.1016/j.compbiomed.2024.109173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2024] [Revised: 08/18/2024] [Accepted: 09/17/2024] [Indexed: 09/26/2024]
Abstract
Deep learning has become the de facto method for medical image segmentation, with 3D segmentation models excelling in capturing complex 3D structures and 2D models offering high computational efficiency. However, segmenting 2.5D images, characterized by high in-plane resolution but lower through-plane resolution, presents significant challenges. While applying 2D models to individual slices of a 2.5D image is feasible, it fails to capture the spatial relationships between slices. On the other hand, 3D models face challenges such as resolution inconsistencies in 2.5D images, along with computational complexity and susceptibility to overfitting when trained with limited data. In this context, 2.5D models, which capture inter-slice correlations using only 2D neural networks, emerge as a promising solution due to their reduced computational demand and simplicity in implementation. In this paper, we introduce CSA-Net, a flexible 2.5D segmentation model capable of processing 2.5D images with an arbitrary number of slices. CSA-Net features an innovative Cross-Slice Attention (CSA) module that effectively captures 3D spatial information by learning long-range dependencies between the center slice (for segmentation) and its neighboring slices. Moreover, CSA-Net utilizes the self-attention mechanism to learn correlations among pixels within the center slice. We evaluated CSA-Net on three 2.5D segmentation tasks: (1) multi-class brain MR image segmentation, (2) binary prostate MR image segmentation, and (3) multi-class prostate MR image segmentation. CSA-Net outperformed leading 2D, 2.5D, and 3D segmentation methods across all three tasks, achieving average Dice coefficients and HD95 values of 0.897 and 1.40 mm for the brain dataset, 0.921 and 1.06 mm for the prostate dataset, and 0.659 and 2.70 mm for the ProstateX dataset, demonstrating its efficacy and superiority. Our code is publicly available at: https://github.com/mirthAI/CSA-Net.
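A simplified sketch of the cross-slice attention idea follows: queries come from the center slice and keys/values from its neighbours, so the center slice can aggregate 3D context while the backbone remains 2D. The use of nn.MultiheadAttention and all dimensions are illustrative choices, not the released CSA-Net module.

```python
# Simplified cross-slice attention: the center slice's features query the neighbouring
# slices' features, capturing inter-slice context with a 2D backbone. Illustrative only.
import torch
import torch.nn as nn

class CrossSliceAttention(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, center: torch.Tensor, neighbors: torch.Tensor) -> torch.Tensor:
        """center: (B, C, H, W); neighbors: (B, S, C, H, W) with S neighbouring slices."""
        B, C, H, W = center.shape
        q = center.flatten(2).transpose(1, 2)                     # (B, H*W, C) queries
        kv = neighbors.permute(0, 1, 3, 4, 2).reshape(B, -1, C)   # (B, S*H*W, C) keys/values
        ctx, _ = self.attn(q, kv, kv)                             # center attends to neighbours
        return (q + ctx).transpose(1, 2).reshape(B, C, H, W)      # residual, back to a feature map

if __name__ == "__main__":
    center = torch.randn(2, 64, 32, 32)
    neighbors = torch.randn(2, 4, 64, 32, 32)                     # e.g. 2 slices above + 2 below
    print(CrossSliceAttention(64)(center, neighbors).shape)        # torch.Size([2, 64, 32, 32])
```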
Collapse
Affiliation(s)
- Amarjeet Kumar
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, 32610, United States
| | - Hongxu Jiang
- Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, 32610, United States
| | - Muhammad Imran
- Department of Medicine, University of Florida, Gainesville, FL, 32610, United States
| | - Cyndi Valdes
- Department of Pediatrics, University of Florida, Gainesville, FL, 32610, United States
| | - Gabriela Leon
- College of Medicine, University of Florida, Gainesville, FL, 32610, United States
| | - Dahyun Kang
- College of Medicine, University of Florida, Gainesville, FL, 32610, United States
| | - Parvathi Nataraj
- Department of Pediatrics, University of Florida, Gainesville, FL, 32610, United States
| | - Yuyin Zhou
- Department of Computer Science and Engineering, University of California, Santa Cruz, CA, 95064, United States
| | - Michael D Weiss
- Department of Pediatrics, University of Florida, Gainesville, FL, 32610, United States
| | - Wei Shao
- Department of Medicine, University of Florida, Gainesville, FL, 32610, United States; Intelligent Clinical Care Center, University of Florida, Gainesville, FL, 32610, United States.
| |
Collapse
|
22
|
Osman YBM, Li C, Huang W, Wang S. Collaborative Learning for Annotation-Efficient Volumetric MR Image Segmentation. J Magn Reson Imaging 2024; 60:1604-1614. [PMID: 38156427 DOI: 10.1002/jmri.29194] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2023] [Revised: 12/05/2023] [Accepted: 12/05/2023] [Indexed: 12/30/2023] Open
Abstract
BACKGROUND Deep learning has presented great potential in accurate MR image segmentation when enough labeled data are provided for network optimization. However, manually annotating three-dimensional (3D) MR images is tedious and time-consuming, requiring experts with rich domain knowledge and experience. PURPOSE To build a deep learning method exploring sparse annotations, namely only a single two-dimensional slice label for each 3D training MR image. STUDY TYPE Retrospective. POPULATION Three-dimensional MR images of 150 subjects from two publicly available datasets were included. Among them, 50 (1377 image slices) are for prostate segmentation. The other 100 (8800 image slices) are for left atrium segmentation. Five-fold cross-validation experiments were carried out utilizing the first dataset. For the second dataset, 80 subjects were used for training and 20 were used for testing. FIELD STRENGTH/SEQUENCE 1.5 T and 3.0 T; axial T2-weighted and late gadolinium-enhanced, 3D respiratory navigated, inversion recovery prepared gradient echo pulse sequence. ASSESSMENT A collaborative learning method by integrating the strengths of semi-supervised and self-supervised learning schemes was developed. The method was trained using labeled central slices and unlabeled noncentral slices. Segmentation performance on testing set was reported quantitatively and qualitatively. STATISTICAL TESTS Quantitative evaluation metrics including boundary intersection-over-union (B-IoU), Dice similarity coefficient, average symmetric surface distance, and relative absolute volume difference were calculated. Paired t test was performed, and P < 0.05 was considered statistically significant. RESULTS Compared to fully supervised training with only the labeled central slice, mean teacher, uncertainty-aware mean teacher, deep co-training, interpolation consistency training (ICT), and ambiguity-consensus mean teacher, the proposed method achieved a substantial improvement in segmentation accuracy, increasing the mean B-IoU significantly by more than 10.0% for prostate segmentation (proposed method B-IoU: 70.3% ± 7.6% vs. ICT B-IoU: 60.3% ± 11.2%) and by more than 6.0% for left atrium segmentation (proposed method B-IoU: 66.1% ± 6.8% vs. ICT B-IoU: 60.1% ± 7.1%). DATA CONCLUSIONS A collaborative learning method trained using sparse annotations can segment prostate and left atrium with high accuracy. LEVEL OF EVIDENCE 0 TECHNICAL EFFICACY: Stage 1.
Collapse
Affiliation(s)
- Yousuf Babiker M Osman
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Cheng Li
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Weijian Huang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Peng Cheng Laboratory, Shenzhen, China
| | - Shanshan Wang
- Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Peng Cheng Laboratory, Shenzhen, China
| |
Collapse
|
23
|
Sabati M, Yang M, Chauhan A. Editorial for "Collaborative Learning for Annotation-Efficient Volumetric MR Image Segmentation". J Magn Reson Imaging 2024; 60:1615-1616. [PMID: 38258419 DOI: 10.1002/jmri.29212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2023] [Accepted: 11/20/2023] [Indexed: 01/24/2024] Open
Affiliation(s)
- Mohammad Sabati
- Hoglund Biomedical Imaging Center, University of Kansas Medical Center, Kansas City, Kansas, USA
- Bioengineering Program, School of Engineering, University of Kansas, Lawrence, Kansas, USA
| | - Mingrui Yang
- Department of Biomedical Engineering, Program of Advanced Musculoskeletal Imaging, Cleveland Clinic, Cleveland, Ohio, USA
| | - Anil Chauhan
- Department of Radiology, University of Kansas Medical Center, Kansas City, Kansas, USA
| |
Collapse
|
24
|
Hu J, Yang Y, Guo X, Ma T, Wang J. A Chebyshev Confidence Guided Source-Free Domain Adaptation Framework for Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:5473-5486. [PMID: 38809721 DOI: 10.1109/jbhi.2024.3406906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/31/2024]
Abstract
Source-free domain adaptation (SFDA) aims to adapt models trained on a labeled source domain to an unlabeled target domain without access to source data. In medical imaging scenarios, the practical significance of SFDA methods has been emphasized due to data heterogeneity and privacy concerns. Recent state-of-the-art SFDA methods primarily rely on self-training based on pseudo-labels (PLs). Unfortunately, the accuracy of PLs may deteriorate due to domain shift, thus limiting the effectiveness of the adaptation process. To address this issue, we propose a Chebyshev confidence guided SFDA framework to accurately assess the reliability of PLs and generate self-improving PLs for self-training. The Chebyshev confidence is estimated by calculating the probability lower bound of PL confidence, given the prediction and the corresponding uncertainty. Leveraging the Chebyshev confidence, we introduce two confidence-guided denoising methods: direct denoising and prototypical denoising. Additionally, we propose a novel teacher-student joint training scheme (TJTS) that incorporates a confidence weighting module to iteratively improve PLs' accuracy. The TJTS, in collaboration with the denoising methods, effectively prevents the propagation of noise and enhances the accuracy of PLs. Extensive experiments in diverse domain scenarios validate the effectiveness of our proposed framework and establish its superiority over state-of-the-art SFDA methods. Our paper contributes to the field of SFDA by providing a novel approach for precisely estimating the reliability of PLs and a framework for obtaining high-quality PLs, resulting in improved adaptation performance.
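One way to obtain such a probability lower bound is via the one-sided (Cantelli) form of Chebyshev's inequality, applied to the mean and variance of repeated stochastic predictions; the sketch below uses this reading, but the exact bound, thresholds, and uncertainty estimator used in the paper may differ.

```python
# Illustrative Chebyshev-style confidence for pseudo-labels: given a mean foreground
# probability mu (e.g. averaged over stochastic forward passes) and its variance, the
# Cantelli inequality lower-bounds the probability that the prediction stays on the
# pseudo-label's side of the 0.5 threshold. Toy interpretation, not the paper's exact bound.
import torch

def chebyshev_confidence(mu: torch.Tensor, var: torch.Tensor, thr: float = 0.5,
                         eps: float = 1e-8) -> torch.Tensor:
    """Lower bound on P(prediction agrees with the hard pseudo-label at threshold thr)."""
    margin = (mu - thr).abs()                             # distance of the mean from the threshold
    # Cantelli bound: P(same side as pseudo-label) >= margin^2 / (var + margin^2)
    lower = margin.pow(2) / (var + margin.pow(2) + eps)
    return lower.clamp(0.0, 1.0)

if __name__ == "__main__":
    # Simulate several stochastic forward passes over a probability map (stand-in model).
    probs = torch.rand(8, 1, 64, 64)
    mu, var = probs.mean(0), probs.var(0)
    conf = chebyshev_confidence(mu, var)
    reliable = conf > 0.9                                  # keep only highly reliable pseudo-labels
    print(conf.shape, reliable.float().mean().item())
```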
Collapse
|
25
|
Du Q, Wang L, Chen H. A mixed Mamba U-net for prostate segmentation in MR images. Sci Rep 2024; 14:19976. [PMID: 39198553 PMCID: PMC11358272 DOI: 10.1038/s41598-024-71045-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2024] [Accepted: 08/23/2024] [Indexed: 09/01/2024] Open
Abstract
The diagnosis of early prostate cancer depends on the accurate segmentation of prostate regions in magnetic resonance imaging (MRI). However, this segmentation task is challenging due to the particularities of prostate MR images themselves and the limitations of existing methods. To address these issues, we propose a U-shaped encoder-decoder network MM-UNet based on Mamba and CNN for prostate segmentation in MR images. Specifically, we first proposed an adaptive feature fusion module based on channel attention guidance to achieve effective fusion between adjacent hierarchical features and suppress the interference of background noise. Secondly, we propose a global context-aware module based on Mamba, which has strong long-range modeling capabilities and linear complexity, to capture global context information in images. Finally, we propose a multi-scale anisotropic convolution module based on the principle of parallel multi-scale anisotropic convolution blocks and 3D convolution decomposition. Experimental results on two public prostate MR image segmentation datasets demonstrate that the proposed method outperforms competing models in terms of prostate segmentation performance and achieves state-of-the-art performance. In future work, we intend to enhance the model's robustness and extend its applicability to additional medical image segmentation tasks.
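The channel-attention-guided fusion of adjacent hierarchical features can be sketched as a squeeze-and-excitation-style gate computed from the concatenated shallow and upsampled deep features; the layer sizes and the complementary (a / 1-a) weighting below are illustrative assumptions rather than the authors' released module.

```python
# Rough sketch of channel-attention-guided fusion of adjacent encoder features: the
# deeper feature is upsampled, a channel gate is computed from the concatenation, and
# the gate re-weights the two branches before fusion. Illustrative sizes and weighting.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelGuidedFusion(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                         # squeeze: global channel statistics
            nn.Conv2d(2 * channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        """shallow: (B, C, H, W); deep: (B, C, H/2, W/2) from the next encoder level."""
        deep_up = F.interpolate(deep, size=shallow.shape[-2:], mode="bilinear", align_corners=False)
        a = self.gate(torch.cat([shallow, deep_up], dim=1))   # (B, C, 1, 1) channel attention
        return self.fuse(torch.cat([shallow * a, deep_up * (1 - a)], dim=1))

if __name__ == "__main__":
    shallow, deep = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 32, 32)
    print(ChannelGuidedFusion(32)(shallow, deep).shape)       # torch.Size([1, 32, 64, 64])
```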
Collapse
Affiliation(s)
- Qiu Du
- Department of Urology, Hunan Provincial People's Hospital, The First Affiliated Hospital of Hunan Normal University, Changsha, 410005, People's Republic of China
| | - Luowu Wang
- Department of Urology, Hunan Provincial People's Hospital, The First Affiliated Hospital of Hunan Normal University, Changsha, 410005, People's Republic of China
| | - Hao Chen
- Department of Urology, Hunan Provincial People's Hospital, The First Affiliated Hospital of Hunan Normal University, Changsha, 410005, People's Republic of China.
| |
Collapse
|
26
|
Guan H, Yap PT, Bozoki A, Liu M. Federated learning for medical image analysis: A survey. PATTERN RECOGNITION 2024; 151:110424. [PMID: 38559674 PMCID: PMC10976951 DOI: 10.1016/j.patcog.2024.110424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]
Abstract
Machine learning in medical imaging often faces a fundamental dilemma, namely, the small sample size problem. Many recent studies suggest using multi-domain data pooled from different acquisition sites/centers to improve statistical power. However, medical images from different sites cannot be easily shared to build large datasets for model training due to privacy protection reasons. As a promising solution, federated learning, which enables collaborative training of machine learning models based on data from different sites without cross-site data sharing, has attracted considerable attention recently. In this paper, we conduct a comprehensive survey of the recent development of federated learning methods in medical image analysis. We have systematically gathered research papers on federated learning and its applications in medical image analysis published between 2017 and 2023. Our search and compilation were conducted using databases from IEEE Xplore, ACM Digital Library, Science Direct, Springer Link, Web of Science, Google Scholar, and PubMed. In this survey, we first introduce the background of federated learning for dealing with privacy protection and collaborative learning issues. We then present a comprehensive review of recent advances in federated learning methods for medical image analysis. Specifically, existing methods are categorized based on three critical aspects of a federated learning system, including client end, server end, and communication techniques. In each category, we summarize the existing federated learning methods according to specific research problems in medical image analysis and also provide insights into the motivations of different approaches. In addition, we provide a review of existing benchmark medical imaging datasets and software platforms for current federated learning research. We also conduct an experimental study to empirically evaluate typical federated learning methods for medical image analysis. This survey can help to better understand the current research status, challenges, and potential research opportunities in this promising research field.
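For readers new to the paradigm, the server-side aggregation at the heart of federated learning can be sketched with plain FedAvg, where client model weights are averaged in proportion to local sample counts and no images ever leave the clients. FedAvg is used here as the canonical example; the surveyed methods modify the client, server, and communication components in many different ways.

```python
# Minimal FedAvg sketch: clients train local copies of a model and the server aggregates
# their weights (weighted by sample counts) without ever seeing the underlying images.
import copy
from typing import List
import torch
import torch.nn as nn

def fedavg(client_states: List[dict], client_sizes: List[int]) -> dict:
    """Average client state_dicts, weighting each client by its number of samples."""
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return global_state

if __name__ == "__main__":
    model = nn.Sequential(nn.Conv2d(1, 8, 3), nn.ReLU(), nn.Conv2d(8, 1, 3))
    # Pretend three hospitals trained local copies (here simulated by small perturbations).
    clients = []
    for _ in range(3):
        local = copy.deepcopy(model)
        with torch.no_grad():
            for p in local.parameters():
                p.add_(0.01 * torch.randn_like(p))
        clients.append(local.state_dict())
    model.load_state_dict(fedavg(clients, client_sizes=[120, 80, 200]))
    print("aggregated", sum(p.numel() for p in model.parameters()), "parameters")
```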
Collapse
Affiliation(s)
- Hao Guan
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Pew-Thian Yap
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Andrea Bozoki
- Department of Neurology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Mingxia Liu
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| |
Collapse
|
27
|
Langkilde F, Masaba P, Edenbrandt L, Gren M, Halil A, Hellström M, Larsson M, Naeem AA, Wallström J, Maier SE, Jäderling F. Manual prostate MRI segmentation by readers with different experience: a study of the learning progress. Eur Radiol 2024; 34:4801-4809. [PMID: 38165432 PMCID: PMC11213744 DOI: 10.1007/s00330-023-10515-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2023] [Revised: 11/06/2023] [Accepted: 11/10/2023] [Indexed: 01/03/2024]
Abstract
OBJECTIVE To evaluate the learning progress of less experienced readers in prostate MRI segmentation. MATERIALS AND METHODS One hundred bi-parametric prostate MRI scans were retrospectively selected from the Göteborg Prostate Cancer Screening 2 Trial (single center). Nine readers with varying degrees of segmentation experience were involved: one expert radiologist, two experienced radiology residents, two inexperienced radiology residents, and four novices. The task was to segment the whole prostate gland. The expert's segmentations were used as reference. For all other readers except three novices, the 100 MRI scans were divided into five rounds (cases 1-10, 11-25, 26-50, 51-75, 76-100). Three novices segmented only 50 cases (three rounds). After each round, a one-on-one feedback session between the expert and the reader was held, with feedback on systematic errors and potential improvements for the next round. Dice similarity coefficient (DSC) > 0.8 was considered accurate. RESULTS Using DSC > 0.8 as the threshold, the novices had a total of 194 accurate segmentations out of 250 (77.6%). The residents had a total of 397/400 (99.2%) accurate segmentations. In round 1, the novices had 19/40 (47.5%) accurate segmentations, in round 2 41/60 (68.3%), and in round 3 84/100 (84.0%), indicating learning progress. CONCLUSIONS Radiology residents, regardless of prior experience, showed high segmentation accuracy. Novices showed larger interindividual variation and lower segmentation accuracy than radiology residents. To prepare datasets for artificial intelligence (AI) development, employing radiology residents seems safe and provides a good balance between cost-effectiveness and segmentation accuracy. Employing novices should only be considered on an individual basis. CLINICAL RELEVANCE STATEMENT Employing radiology residents for prostate MRI segmentation seems safe and can potentially reduce the workload of expert radiologists. Employing novices should only be considered on an individual basis. KEY POINTS • Using less experienced readers for prostate MRI segmentation is cost-effective but may reduce quality. • Radiology residents provided high accuracy segmentations while novices showed large inter-reader variability. • To prepare datasets for AI development, employing radiology residents seems safe and might provide a good balance between cost-effectiveness and segmentation accuracy while novices should only be employed on an individual basis.
Collapse
Affiliation(s)
- Fredrik Langkilde
- Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.
- Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden.
| | - Patrick Masaba
- Department of Molecular Medicine and Surgery (MMK), Karolinska Institutet, Stockholm, Sweden
| | - Lars Edenbrandt
- Department of Molecular and Clinical Medicine, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Department of Clinical Physiology, Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Magnus Gren
- Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Airin Halil
- Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Mikael Hellström
- Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden
| | | | - Ameer Ali Naeem
- Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Jonas Wallström
- Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Stephan E Maier
- Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Fredrik Jäderling
- Department of Molecular Medicine and Surgery (MMK), Karolinska Institutet, Stockholm, Sweden
- Department of Diagnostic Radiology, Capio S:T Göran's Hospital, Stockholm, Sweden
| |
Collapse
|
28
|
Mao K, Li R, Cheng J, Huang D, Song Z, Liu Z. PL-Net: progressive learning network for medical image segmentation. Front Bioeng Biotechnol 2024; 12:1414605. [PMID: 38994123 PMCID: PMC11236745 DOI: 10.3389/fbioe.2024.1414605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2024] [Accepted: 05/30/2024] [Indexed: 07/13/2024] Open
Abstract
In recent years, deep convolutional neural network-based segmentation methods have achieved state-of-the-art performance for many medical analysis tasks. However, most of these approaches rely on optimizing the U-Net structure or adding new functional modules, which overlooks the complementation and fusion of coarse-grained and fine-grained semantic information. To address these issues, we propose a 2D medical image segmentation framework called Progressive Learning Network (PL-Net), which comprises Internal Progressive Learning (IPL) and External Progressive Learning (EPL). PL-Net offers the following advantages: 1) IPL divides feature extraction into two steps, allowing for the mixing of different size receptive fields and capturing semantic information from coarse to fine granularity without introducing additional parameters; 2) EPL divides the training process into two stages to optimize parameters and facilitate the fusion of coarse-grained information in the first stage and fine-grained information in the second stage. We conducted comprehensive evaluations of our proposed method on five medical image segmentation datasets, and the experimental results demonstrate that PL-Net achieves competitive segmentation performance. It is worth noting that PL-Net does not introduce any additional learnable parameters compared to other U-Net variants.
Collapse
Affiliation(s)
- Kunpeng Mao
- Chongqing City Management College, Chongqing, China
| | - Ruoyu Li
- College of Computer Science, Sichuan University, Chengdu, China
| | - Junlong Cheng
- College of Computer Science, Sichuan University, Chengdu, China
| | - Danmei Huang
- Chongqing City Management College, Chongqing, China
| | - Zhiping Song
- Chongqing University of Engineering, Chongqing, China
| | - ZeKui Liu
- Chongqing University of Engineering, Chongqing, China
| |
Collapse
|
29
|
Mu J, Kadoch M, Yuan T, Lv W, Liu Q, Li B. Explainable Federated Medical Image Analysis Through Causal Learning and Blockchain. IEEE J Biomed Health Inform 2024; 28:3206-3218. [PMID: 38470597 DOI: 10.1109/jbhi.2024.3375894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/14/2024]
Abstract
Federated learning (FL) enables collaborative training of machine learning models across distributed medical data sources without compromising privacy. However, applying FL to medical image analysis presents challenges like high communication overhead and data heterogeneity. This paper proposes novel FL techniques using explainable artificial intelligence (XAI) for efficient, accurate, and trustworthy analysis. A heterogeneity-aware causal learning approach selectively sparsifies model weights based on their causal contributions, significantly reducing communication requirements while retaining performance and improving interpretability. Furthermore, blockchain provides decentralized quality assessment of client datasets. The assessment scores adjust aggregation weights so higher-quality data has more influence during training, improving model generalization. Comprehensive experiments show our XAI-integrated FL framework enhances efficiency, accuracy and interpretability. The causal learning method decreases communication overhead while maintaining segmentation accuracy. The blockchain-based data valuation mitigates issues from low-quality local datasets. Our framework provides essential model explanations and trust mechanisms, making FL viable for clinical adoption in medical image analysis.
Collapse
|
30
|
Showrav TT, Hasan MK. Hi-gMISnet: generalized medical image segmentation using DWT based multilayer fusion and dual mode attention into high resolution pGAN. Phys Med Biol 2024; 69:115019. [PMID: 38593830 DOI: 10.1088/1361-6560/ad3cb3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2024] [Accepted: 04/09/2024] [Indexed: 04/11/2024]
Abstract
Objective. Automatic medical image segmentation is crucial for accurately isolating target tissue areas in the image from background tissues, facilitating precise diagnoses and procedures. While the proliferation of publicly available clinical datasets led to the development of deep learning-based medical image segmentation methods, a generalized, accurate, robust, and reliable approach across diverse imaging modalities remains elusive. Approach. This paper proposes a novel high-resolution parallel generative adversarial network (pGAN)-based generalized deep learning method for automatic segmentation of medical images from diverse imaging modalities. The proposed method showcases better performance and generalizability by incorporating novel components such as partial hybrid transfer learning, discrete wavelet transform (DWT)-based multilayer and multiresolution feature fusion in the encoder, and a dual mode attention gate in the decoder of the multi-resolution U-Net-based GAN. With multi-objective adversarial training loss functions, including a unique reciprocal loss for enforcing cooperative learning in pGANs, it further enhances the robustness and accuracy of the segmentation map. Main results. Experimental evaluations conducted on nine diverse publicly available medical image segmentation datasets, including PhysioNet ICH, BUSI, CVC-ClinicDB, MoNuSeg, GLAS, ISIC-2018, DRIVE, Montgomery, and PROMISE12, demonstrate the proposed method's superior performance. The proposed method achieves mean F1 scores of 79.53%, 88.68%, 82.50%, 93.25%, 90.40%, 94.19%, 81.65%, 98.48%, and 90.79%, respectively, on the above datasets, surpassing state-of-the-art segmentation methods. Furthermore, our proposed method demonstrates robust multi-domain segmentation capabilities, exhibiting consistent and reliable performance. The assessment of the model's proficiency in accurately identifying small details indicates that the high-resolution generalized medical image segmentation network (Hi-gMISnet) is more precise in segmenting even when the target area is very small. Significance. The proposed method provides robust and reliable segmentation performance on medical images, and thus it has the potential to be used in a clinical setting for the diagnosis of patients.
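A toy version of DWT-based feature fusion is sketched below: two feature maps are decomposed with a single-level 2D Haar DWT and recombined so that low-frequency content comes from the deeper map and high-frequency detail from the shallower one. The wavelet choice and band-mixing rule are illustrative assumptions, not the fusion used in Hi-gMISnet.

```python
# Toy DWT-based fusion of two single-channel feature maps: decompose both with a Haar
# DWT, keep the approximation band of the deeper map and the detail bands of the
# shallower map, then reconstruct. Illustrative assumptions throughout.
import numpy as np
import pywt
from scipy.ndimage import zoom

def dwt_fuse(shallow: np.ndarray, deep: np.ndarray) -> np.ndarray:
    """Fuse two equal-size 2D feature maps via Haar DWT sub-bands."""
    _, (cH_s, cV_s, cD_s) = pywt.dwt2(shallow, "haar")   # detail bands from the shallow map
    cA_d, _ = pywt.dwt2(deep, "haar")                    # approximation band from the deep map
    return pywt.idwt2((cA_d, (cH_s, cV_s, cD_s)), "haar")

if __name__ == "__main__":
    shallow = np.random.rand(64, 64).astype(np.float32)
    deep = zoom(np.random.rand(32, 32), 2).astype(np.float32)   # upsampled deeper feature map
    print(dwt_fuse(shallow, deep).shape)                          # (64, 64)
```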
Collapse
Affiliation(s)
- Tushar Talukder Showrav
- Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology (BUET), Dhaka, 1205, Bangladesh
| | - Md Kamrul Hasan
- Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology (BUET), Dhaka, 1205, Bangladesh
| |
Collapse
|
31
|
Zhao X, Wang W. Semi-Supervised Medical Image Segmentation Based on Deep Consistent Collaborative Learning. J Imaging 2024; 10:118. [PMID: 38786572 PMCID: PMC11122630 DOI: 10.3390/jimaging10050118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2024] [Revised: 05/02/2024] [Accepted: 05/10/2024] [Indexed: 05/25/2024] Open
Abstract
In the realm of medical image analysis, the cost associated with acquiring accurately labeled data is prohibitively high. To address the issue of label scarcity, semi-supervised learning methods are employed, utilizing unlabeled data alongside a limited set of labeled data. This paper presents a novel semi-supervised medical segmentation framework, DCCLNet (deep consistency collaborative learning UNet), grounded in deep consistent co-learning. The framework synergistically integrates consistency learning from feature and input perturbations, coupled with collaborative training between CNN (convolutional neural networks) and ViT (vision transformer), to capitalize on the learning advantages offered by these two distinct paradigms. Feature perturbation involves the application of auxiliary decoders with varied feature disturbances to the main CNN backbone, enhancing the robustness of the CNN backbone through consistency constraints generated by the auxiliary and main decoders. Input perturbation employs an MT (mean teacher) architecture wherein the main network serves as the student model guided by a teacher model subjected to input perturbations. Collaborative training aims to improve the accuracy of the main networks by encouraging mutual learning between the CNN and ViT. Experiments conducted on the publicly available ACDC (automated cardiac diagnosis challenge) and Prostate datasets yielded Dice coefficients of 0.890 and 0.812, respectively. Additionally, comprehensive ablation studies were performed to demonstrate the effectiveness of each methodological contribution in this study.
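The input-perturbation branch builds on the standard mean-teacher recipe, which the sketch below illustrates: the teacher is an exponential moving average (EMA) of the student, and a consistency loss ties the two predictions on an unlabeled batch. The decay value, noise model, and which branch receives the perturbed input are illustrative choices and may differ from DCCLNet.

```python
# Small mean-teacher sketch: the teacher is an EMA of the student, and a consistency
# loss on unlabeled data ties the student's prediction on a perturbed input to the
# teacher's prediction on the clean input. Illustrative decay and noise settings.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, decay: float = 0.99) -> None:
    """teacher <- decay * teacher + (1 - decay) * student, parameter by parameter."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(decay).add_(s_p, alpha=1.0 - decay)

if __name__ == "__main__":
    student = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 2, 1))
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)

    unlabeled = torch.rand(4, 1, 64, 64)
    noisy = unlabeled + 0.05 * torch.randn_like(unlabeled)       # input perturbation
    student_logits = student(noisy)
    with torch.no_grad():
        teacher_logits = teacher(unlabeled)
    consistency = F.mse_loss(student_logits.softmax(1), teacher_logits.softmax(1))
    consistency.backward()                                        # update the student as usual...
    ema_update(teacher, student)                                  # ...then refresh the teacher by EMA
    print(float(consistency))
```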
Collapse
Affiliation(s)
- Xin Zhao
- College of Information Engineering, Dalian University, Dalian 116622, China;
| | | |
Collapse
|
32
|
He A, Li T, Yan J, Wang K, Fu H. Bilateral Supervision Network for Semi-Supervised Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:1715-1726. [PMID: 38153819 DOI: 10.1109/tmi.2023.3347689] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/30/2023]
Abstract
Massive high-quality annotated data is required by fully-supervised learning, which is difficult to obtain for image segmentation since the pixel-level annotation is expensive, especially for medical image segmentation tasks that need domain knowledge. As an alternative solution, semi-supervised learning (SSL) can effectively alleviate the dependence on the annotated samples by leveraging abundant unlabeled samples. Among the SSL methods, mean-teacher (MT) is the most popular one. However, in MT, the teacher model's weights are completely determined by the student model's weights, which leads to a training bottleneck at the late training stages. Besides, only pixel-wise consistency is applied for unlabeled data, which ignores the category information and is susceptible to noise. In this paper, we propose a bilateral supervision network with bilateral exponential moving average (bilateral-EMA), named BSNet, to overcome these issues. On the one hand, both the student and teacher models are trained on labeled data, and then their weights are updated with the bilateral-EMA, and thus the two models can learn from each other. On the other hand, pseudo labels are used to perform bilateral supervision for unlabeled data. Moreover, for enhancing the supervision, we adopt adversarial learning to enforce the network to generate more reliable pseudo labels for unlabeled data. We conduct extensive experiments on three datasets to evaluate the proposed BSNet, and results show that BSNet can improve the semi-supervised segmentation performance by a large margin and surpass other state-of-the-art SSL methods.
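A rough sketch of what a bilateral EMA weight update could look like (assuming PyTorch; this is one interpretation for illustration, not the authors' exact update rule):

```python
import torch

@torch.no_grad()
def bilateral_ema_update(model_a: torch.nn.Module, model_b: torch.nn.Module, alpha: float = 0.99) -> None:
    """Nudge each model's weights toward the other's, so both networks
    learn from each other instead of one being a frozen EMA copy."""
    for pa, pb in zip(model_a.parameters(), model_b.parameters()):
        new_a = alpha * pa.data + (1.0 - alpha) * pb.data
        new_b = alpha * pb.data + (1.0 - alpha) * pa.data
        pa.data.copy_(new_a)
        pb.data.copy_(new_b)
```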
Collapse
|
33
|
Peng J, Wang P, Pedersoli M, Desrosiers C. Boundary-aware information maximization for self-supervised medical image segmentation. Med Image Anal 2024; 94:103150. [PMID: 38574545 DOI: 10.1016/j.media.2024.103150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2023] [Revised: 02/24/2024] [Accepted: 03/20/2024] [Indexed: 04/06/2024]
Abstract
Self-supervised representation learning can boost the performance of a pre-trained network on downstream tasks for which labeled data is limited. A popular method based on this paradigm, known as contrastive learning, works by constructing sets of positive and negative pairs from the data, and then pulling closer the representations of positive pairs while pushing apart those of negative pairs. Although contrastive learning has been shown to improve performance in various classification tasks, its application to image segmentation has been more limited. This stems in part from the difficulty of defining positive and negative pairs for dense feature maps without having access to pixel-wise annotations. In this work, we propose a novel self-supervised pre-training method that overcomes the challenges of contrastive learning in image segmentation. Our method leverages Information Invariant Clustering (IIC) as an unsupervised task to learn a local representation of images in the decoder of a segmentation network, but addresses three important drawbacks of this approach: (i) the difficulty of optimizing the loss based on mutual information maximization; (ii) the lack of clustering consistency for different random transformations of the same image; (iii) the poor correspondence of clusters obtained by IIC with region boundaries in the image. Toward this goal, we first introduce a regularized mutual information maximization objective that encourages the learned clusters to be balanced and consistent across different image transformations. We also propose a boundary-aware loss based on cross-correlation, which helps the learned clusters to be more representative of important regions in the image. Compared to contrastive learning applied in dense features, our method does not require computing positive and negative pairs and also enhances interpretability through the visualization of learned clusters. Comprehensive experiments involving four different medical image segmentation tasks reveal the high effectiveness of our self-supervised representation learning method. Our results show the proposed method to outperform by a large margin several state-of-the-art self-supervised and semi-supervised approaches for segmentation, reaching a performance close to full supervision with only a few labeled examples.
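The Information Invariant Clustering objective that this method builds on can be written compactly; the sketch below shows the standard IIC loss for two soft cluster assignments of the same image under different transformations (illustrative, assuming PyTorch; it omits the paper's balancing regularizer and boundary-aware cross-correlation term):

```python
import torch

def iic_loss(p1: torch.Tensor, p2: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """p1, p2: (N, K) soft cluster assignments for two transformed views.
    Returns the negative mutual information of their joint distribution."""
    joint = (p1.unsqueeze(2) * p2.unsqueeze(1)).mean(dim=0)  # (K, K) joint distribution
    joint = ((joint + joint.t()) / 2).clamp(min=eps)          # symmetrize and avoid log(0)
    pi = joint.sum(dim=1, keepdim=True)                       # marginal of view 1
    pj = joint.sum(dim=0, keepdim=True)                       # marginal of view 2
    return -(joint * (joint.log() - pi.log() - pj.log())).sum()
```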
Collapse
Affiliation(s)
- Jizong Peng
- ETS Montréal, 1100 Notre-Dame St W, Montreal H3C 1K3, QC, Canada.
| | - Ping Wang
- ETS Montréal, 1100 Notre-Dame St W, Montreal H3C 1K3, QC, Canada
| | - Marco Pedersoli
- ETS Montréal, 1100 Notre-Dame St W, Montreal H3C 1K3, QC, Canada
| | | |
Collapse
|
34
|
Johnson LA, Harmon SA, Yilmaz EC, Lin Y, Belue MJ, Merriman KM, Lay NS, Sanford TH, Sarma KV, Arnold CW, Xu Z, Roth HR, Yang D, Tetreault J, Xu D, Patel KR, Gurram S, Wood BJ, Citrin DE, Pinto PA, Choyke PL, Turkbey B. Automated prostate gland segmentation in challenging clinical cases: comparison of three artificial intelligence methods. Abdom Radiol (NY) 2024; 49:1545-1556. [PMID: 38512516 DOI: 10.1007/s00261-024-04242-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2023] [Revised: 02/05/2024] [Accepted: 02/06/2024] [Indexed: 03/23/2024]
Abstract
OBJECTIVE Automated methods for prostate segmentation on MRI are typically developed under ideal scanning and anatomical conditions. This study evaluates three different prostate segmentation AI algorithms in a challenging population of patients with prior treatments, variable anatomic characteristics, complex clinical history, or atypical MRI acquisition parameters. MATERIALS AND METHODS A single institution retrospective database was queried for the following conditions at prostate MRI: prior prostate-specific oncologic treatment, transurethral resection of the prostate (TURP), abdominal perineal resection (APR), hip prosthesis (HP), diversity of prostate volumes (large ≥ 150 cc, small ≤ 25 cc), whole gland tumor burden, magnet strength, noted poor quality, and various scanners (outside/vendors). Final inclusion criteria required availability of axial T2-weighted (T2W) sequence and corresponding prostate organ segmentation from an expert radiologist. Three previously developed algorithms were evaluated: (1) deep learning (DL)-based model, (2) commercially available shape-based model, and (3) federated DL-based model. Dice Similarity Coefficient (DSC) was calculated against the expert segmentation. DSC by model and scan factors were evaluated with the Wilcoxon signed-rank test and a linear mixed effects (LMER) model. RESULTS 683 scans (651 patients) met inclusion criteria (mean prostate volume 60.1 cc [9.05-329 cc]). Overall DSC scores for models 1, 2, and 3 were 0.916 (0.707-0.971), 0.873 (0-0.997), and 0.894 (0.025-0.961), respectively, with DL-based models demonstrating significantly higher performance (p < 0.01). In sub-group analysis by factors, Model 1 outperformed Model 2 (all p < 0.05) and Model 3 (all p < 0.001). Performance of all models was negatively impacted by prostate volume and poor signal quality (p < 0.01). Shape-based factors influenced DL models (p < 0.001) while signal factors influenced all (p < 0.001). CONCLUSION Factors affecting anatomical and signal conditions of the prostate gland can adversely impact both DL and non-deep learning-based segmentation models.
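The evaluation pipeline described here (per-case DSC against the expert mask, then a paired Wilcoxon signed-rank test between models) can be sketched as follows (illustrative only, assuming NumPy and SciPy):

```python
import numpy as np
from scipy.stats import wilcoxon

def dice_coefficient(pred: np.ndarray, reference: np.ndarray) -> float:
    """DSC between a predicted and an expert binary mask."""
    pred, reference = pred.astype(bool), reference.astype(bool)
    intersection = np.logical_and(pred, reference).sum()
    return 2.0 * intersection / (pred.sum() + reference.sum() + 1e-8)

# Paired comparison of two models over the same cases, where dsc_a and dsc_b
# are per-scan DSC arrays of equal length:
# stat, p_value = wilcoxon(dsc_a, dsc_b)
```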
Collapse
Affiliation(s)
- Latrice A Johnson
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Stephanie A Harmon
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Enis C Yilmaz
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Yue Lin
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Mason J Belue
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Katie M Merriman
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Nathan S Lay
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Karthik V Sarma
- Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, CA, USA
| | - Corey W Arnold
- Department of Radiology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Ziyue Xu
- NVIDIA Corporation, Santa Clara, CA, USA
| | | | - Dong Yang
- NVIDIA Corporation, Santa Clara, CA, USA
| | | | - Daguang Xu
- NVIDIA Corporation, Santa Clara, CA, USA
| | - Krishnan R Patel
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sandeep Gurram
- Urologic Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Bradford J Wood
- Center for Interventional Oncology, National Cancer Institute, NIH, Bethesda, MD, USA
- Department of Radiology, Clinical Center, NIH, Bethesda, MD, USA
| | - Deborah E Citrin
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Peter A Pinto
- Urologic Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Peter L Choyke
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Baris Turkbey
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Molecular Imaging Branch (B.T.), National Cancer Institute, National Institutes of Health, 10 Center Dr., MSC 1182, Building 10, Room B3B85, Bethesda, MD, 20892, USA.
| |
Collapse
|
35
|
Zhu Z, Ma X, Wang W, Dong S, Wang K, Wu L, Luo G, Wang G, Li S. Boosting knowledge diversity, accuracy, and stability via tri-enhanced distillation for domain continual medical image segmentation. Med Image Anal 2024; 94:103112. [PMID: 38401270 DOI: 10.1016/j.media.2024.103112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2023] [Revised: 01/10/2024] [Accepted: 02/20/2024] [Indexed: 02/26/2024]
Abstract
Domain continual medical image segmentation plays a crucial role in clinical settings. This approach enables segmentation models to continually learn from a sequential data stream across multiple domains. However, it faces the challenge of catastrophic forgetting. Existing methods based on knowledge distillation show potential to address this challenge via a three-stage process: distillation, transfer, and fusion. Yet, each stage presents its unique issues that, collectively, amplify the problem of catastrophic forgetting. To address these issues at each stage, we propose a tri-enhanced distillation framework. (1) Stochastic Knowledge Augmentation reduces redundancy in knowledge, thereby increasing both the diversity and volume of knowledge derived from the old network. (2) Adaptive Knowledge Transfer selectively captures critical information from the old knowledge, facilitating a more accurate knowledge transfer. (3) Global Uncertainty-Guided Fusion introduces a global uncertainty view of the dataset to fuse the old and new knowledge with reduced bias, promoting a more stable knowledge fusion. Our experimental results not only validate the feasibility of our approach, but also demonstrate its superior performance compared to state-of-the-art methods. We suggest that our innovative tri-enhanced distillation framework may establish a robust benchmark for domain continual medical image segmentation.
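At the core of such distillation-based continual segmentation is a soft-target loss that transfers knowledge from the old (previous-domain) network to the new one; a generic temperature-scaled version is sketched below (illustrative, assuming PyTorch; the paper's stochastic augmentation, adaptive transfer, and uncertainty-guided fusion are not reproduced here):

```python
import torch.nn.functional as F

def soft_distillation_loss(new_logits, old_logits, temperature: float = 2.0):
    """KL divergence between temperature-softened predictions of the old
    network (teacher) and the new network (student)."""
    p_old = F.softmax(old_logits / temperature, dim=1)
    log_p_new = F.log_softmax(new_logits / temperature, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * temperature ** 2
```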
Collapse
Affiliation(s)
- Zhanshi Zhu
- Faculty of Computing, Harbin Institute of Technology, Harbin, China
| | - Xinghua Ma
- Faculty of Computing, Harbin Institute of Technology, Harbin, China
| | - Wei Wang
- Faculty of Computing, Harbin Institute of Technology, Shenzhen, China.
| | - Suyu Dong
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
| | - Kuanquan Wang
- Faculty of Computing, Harbin Institute of Technology, Harbin, China.
| | - Lianming Wu
- Department of Radiology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
| | - Gongning Luo
- Faculty of Computing, Harbin Institute of Technology, Harbin, China.
| | - Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
| | - Shuo Li
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH 44106, USA
| |
Collapse
|
36
|
Fechter T, Sachpazidis I, Baltas D. The use of deep learning in interventional radiotherapy (brachytherapy): A review with a focus on open source and open data. Z Med Phys 2024; 34:180-196. [PMID: 36376203 PMCID: PMC11156786 DOI: 10.1016/j.zemedi.2022.10.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2022] [Revised: 10/07/2022] [Accepted: 10/10/2022] [Indexed: 11/13/2022]
Abstract
Deep learning has advanced to become one of the most important technologies in almost all medical fields, and it plays an especially large role in areas related to medical imaging. However, in interventional radiotherapy (brachytherapy) deep learning is still in an early phase. In this review, first, we investigated and scrutinised the role of deep learning in all processes of interventional radiotherapy and directly related fields. Additionally, we summarised the most recent developments. For better understanding, we provide explanations of key terms and approaches to solving common deep learning problems. To reproduce the results of deep learning algorithms, both source code and training data must be available. Therefore, a second focus of this work is on the analysis of the availability of open source, open data and open models. In our analysis, we were able to show that deep learning already plays a major role in some areas of interventional radiotherapy, but is still hardly present in others. Nevertheless, its impact is increasing with the years, partly self-propelled but also influenced by closely related fields. Open source, data and models are growing in number but are still scarce and unevenly distributed among different research groups. The reluctance to publish code, data and models limits reproducibility and restricts evaluation to mono-institutional datasets. The conclusion of our analysis is that deep learning can positively change the workflow of interventional radiotherapy, but there is still room for improvement when it comes to reproducible results and standardised evaluation methods.
Collapse
Affiliation(s)
- Tobias Fechter
- Division of Medical Physics, Department of Radiation Oncology, Medical Center University of Freiburg, Germany; Faculty of Medicine, University of Freiburg, Germany; German Cancer Consortium (DKTK), Partner Site Freiburg, Germany.
| | - Ilias Sachpazidis
- Division of Medical Physics, Department of Radiation Oncology, Medical Center University of Freiburg, Germany; Faculty of Medicine, University of Freiburg, Germany; German Cancer Consortium (DKTK), Partner Site Freiburg, Germany
| | - Dimos Baltas
- Division of Medical Physics, Department of Radiation Oncology, Medical Center University of Freiburg, Germany; Faculty of Medicine, University of Freiburg, Germany; German Cancer Consortium (DKTK), Partner Site Freiburg, Germany
| |
Collapse
|
37
|
Zhao J, Jiang T, Lin Y, Chan LC, Chan PK, Wen C, Chen H. Adaptive Fusion of Deep Learning With Statistical Anatomical Knowledge for Robust Patella Segmentation From CT Images. IEEE J Biomed Health Inform 2024; 28:2842-2853. [PMID: 38446653 DOI: 10.1109/jbhi.2024.3372576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/08/2024]
Abstract
Knee osteoarthritis (KOA), as a leading joint disease, can be assessed by examining the shape of the patella to spot potential abnormal variations. To assist doctors in the diagnosis of KOA, a robust automatic patella segmentation method is highly demanded in clinical practice. Deep learning methods, especially convolutional neural networks (CNNs), have been widely applied to medical image segmentation in recent years. Nevertheless, poor image quality and limited data still impose challenges to segmentation via CNNs. On the other hand, statistical shape models (SSMs) can generate shape priors which give anatomically reliable segmentation to varying instances. Thus, in this work, we propose an adaptive fusion framework, explicitly combining deep neural networks and anatomical knowledge from SSM for robust patella segmentation. Our adaptive fusion framework adjusts the weight of each segmentation candidate in the fusion according to its segmentation performance. We also propose a voxel-wise refinement strategy to make the segmentation of CNNs more anatomically correct. Extensive experiments and thorough assessment have been conducted on various mainstream CNN backbones for patella segmentation in low-data regimes, which demonstrate that our framework can be flexibly attached to a CNN model, significantly improving its performance when labeled training data are limited and input image data are of poor quality.
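A toy version of the adaptive fusion idea, weighting CNN and SSM-derived candidates by an estimated reliability score before thresholding (purely illustrative; the weighting scheme and names are assumptions, not the paper's algorithm):

```python
import numpy as np

def adaptive_fusion(prob_maps, reliability):
    """prob_maps: list of (H, W) probability maps (e.g., CNN and SSM candidates).
    reliability: per-candidate scores (e.g., estimated segmentation quality)."""
    w = np.asarray(reliability, dtype=np.float64)
    w = w / w.sum()                               # normalize the candidate weights
    fused = np.zeros_like(prob_maps[0], dtype=np.float64)
    for wi, p in zip(w, prob_maps):
        fused += wi * p                           # weighted average of candidates
    return (fused > 0.5).astype(np.uint8)         # final binary mask
```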
Collapse
|
38
|
Xu Z, Lu D, Luo J, Zheng Y, Tong RKY. Separated collaborative learning for semi-supervised prostate segmentation with multi-site heterogeneous unlabeled MRI data. Med Image Anal 2024; 93:103095. [PMID: 38310678 DOI: 10.1016/j.media.2024.103095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2023] [Revised: 09/11/2023] [Accepted: 01/24/2024] [Indexed: 02/06/2024]
Abstract
Segmenting prostate from magnetic resonance imaging (MRI) is a critical procedure in prostate cancer staging and treatment planning. Considering the nature of labeled data scarcity for medical images, semi-supervised learning (SSL) becomes an appealing solution since it can simultaneously exploit limited labeled data and a large amount of unlabeled data. However, SSL relies on the assumption that the unlabeled images are abundant, which may not be satisfied when the local institute has limited image collection capabilities. An intuitive solution is to seek support from other centers to enrich the unlabeled image pool. However, this further introduces data heterogeneity, which can impede SSL that works under identical data distribution with certain model assumptions. Aiming at this under-explored yet valuable scenario, in this work, we propose a separated collaborative learning (SCL) framework for semi-supervised prostate segmentation with multi-site unlabeled MRI data. Specifically, on top of the teacher-student framework, SCL exploits multi-site unlabeled data by: (i) Local learning, which advocates local distribution fitting, including the pseudo label learning that reinforces confirmation of low-entropy easy regions and the cyclic propagated real label learning that leverages class prototypes to regularize the distribution of intra-class features; (ii) External multi-site learning, which aims to robustly mine informative clues from external data, mainly including the local-support category mutual dependence learning, which takes the spirit that mutual information can effectively measure the amount of information shared by two variables even from different domains, and the stability learning under strong adversarial perturbations to enhance robustness to heterogeneity. Extensive experiments on prostate MRI data from six different clinical centers show that our method can effectively generalize SSL on multi-site unlabeled data and significantly outperform other semi-supervised segmentation methods. Besides, we validate the extensibility of our method on the multi-class cardiac MRI segmentation task with data from four different clinical centers.
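One ingredient named above, the pseudo-label learning that reinforces confirmation of low-entropy "easy" regions, can be illustrated by masking pseudo labels with a pixel-wise entropy threshold (a sketch under assumed names, not the SCL implementation; assumes PyTorch):

```python
import torch

def low_entropy_pseudo_labels(probs: torch.Tensor, threshold: float = 0.5):
    """probs: (N, C, H, W) teacher softmax outputs on unlabeled images.
    Returns hard pseudo labels and a mask selecting low-entropy (easy) pixels."""
    entropy = -(probs * probs.clamp(min=1e-8).log()).sum(dim=1)  # (N, H, W)
    pseudo = probs.argmax(dim=1)                                  # (N, H, W)
    easy_mask = entropy < threshold
    return pseudo, easy_mask
```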
Collapse
Affiliation(s)
- Zhe Xu
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.
| | - Donghuan Lu
- Tencent Jarvis Research Center, Youtu Lab, Shenzhen, China.
| | - Jie Luo
- Massachusetts General Hospital, Harvard Medical School, Boston, USA
| | - Yefeng Zheng
- Tencent Jarvis Research Center, Youtu Lab, Shenzhen, China
| | - Raymond Kai-Yu Tong
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.
| |
Collapse
|
39
|
Liu J, Desrosiers C, Yu D, Zhou Y. Semi-Supervised Medical Image Segmentation Using Cross-Style Consistency With Shape-Aware and Local Context Constraints. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:1449-1461. [PMID: 38032771 DOI: 10.1109/tmi.2023.3338269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/02/2023]
Abstract
Despite the remarkable progress in semi-supervised medical image segmentation methods based on deep learning, their application to real-life clinical scenarios still faces considerable challenges. For example, insufficient labeled data often makes it difficult for networks to capture the complexity and variability of the anatomical regions to be segmented. To address these problems, we design a new semi-supervised segmentation framework that aspires to produce anatomically plausible predictions. Our framework comprises two parallel networks: shape-agnostic and shape-aware networks. These networks learn from each other, enabling effective utilization of unlabeled data. Our shape-aware network implicitly introduces shape guidance to capture fine-grained shape information. Meanwhile, the shape-agnostic network employs uncertainty estimation to further obtain reliable pseudo-labels for its counterpart. We also employ a cross-style consistency strategy to enhance the network's utilization of unlabeled data. It enriches the dataset to prevent overfitting and further eases the coupling of the two networks that learn from each other. Our proposed architecture also incorporates a novel loss term that facilitates the learning of the local context of segmentation by the network, thereby enhancing the overall accuracy of prediction. Experiments on three different datasets of medical images show that our method outperforms many strong semi-supervised segmentation methods, particularly in its ability to perceive shape. The code is available at https://github.com/igip-liu/SLC-Net.
Collapse
|
40
|
Kumari S, Singh P. Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives. Comput Biol Med 2024; 170:107912. [PMID: 38219643 DOI: 10.1016/j.compbiomed.2023.107912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2023] [Revised: 11/02/2023] [Accepted: 12/24/2023] [Indexed: 01/16/2024]
Abstract
Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.
Collapse
Affiliation(s)
- Suruchi Kumari
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
| | - Pravendra Singh
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
| |
Collapse
|
41
|
Rodrigues NM, Almeida JGD, Verde ASC, Gaivão AM, Bilreiro C, Santiago I, Ip J, Belião S, Moreno R, Matos C, Vanneschi L, Tsiknakis M, Marias K, Regge D, Silva S, Papanikolaou N. Analysis of domain shift in whole prostate gland, zonal and lesions segmentation and detection, using multicentric retrospective data. Comput Biol Med 2024; 171:108216. [PMID: 38442555 DOI: 10.1016/j.compbiomed.2024.108216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2023] [Revised: 02/09/2024] [Accepted: 02/25/2024] [Indexed: 03/07/2024]
Abstract
Despite being one of the most prevalent forms of cancer, prostate cancer (PCa) shows a significantly high survival rate, provided there is timely detection and treatment. Computational methods can help make this detection process considerably faster and more robust. However, some modern machine-learning approaches require accurate segmentation of the prostate gland and the index lesion. Since performing manual segmentations is a very time-consuming task, and highly prone to inter-observer variability, there is a need to develop robust semi-automatic segmentation models. In this work, we leverage the large and highly diverse ProstateNet dataset, which includes 638 whole gland and 461 lesion segmentation masks, from 3 different scanner manufacturers provided by 14 institutions, in addition to other 3 independent public datasets, to train accurate and robust segmentation models for the whole prostate gland, zones and lesions. We show that models trained on large amounts of diverse data are better at generalizing to data from other institutions and obtained with other manufacturers, outperforming models trained on single-institution single-manufacturer datasets in all segmentation tasks. Furthermore, we show that lesion segmentation models trained on ProstateNet can be reliably used as lesion detection models.
Collapse
Affiliation(s)
- Nuno Miguel Rodrigues
- Computational Clinical Imaging Group, Champalimaud Foundation, Portugal; LASIGE, Faculty of Sciences, University of Lisbon, Portugal.
| | | | | | - Ana Mascarenhas Gaivão
- Radiology Department, Champalimaud Clinical Center, Champalimaud Foundation, Lisbon, Portugal
| | - Carlos Bilreiro
- Radiology Department, Champalimaud Clinical Center, Champalimaud Foundation, Lisbon, Portugal
| | - Inês Santiago
- Radiology Department, Champalimaud Clinical Center, Champalimaud Foundation, Lisbon, Portugal
| | - Joana Ip
- Radiology Department, Champalimaud Clinical Center, Champalimaud Foundation, Lisbon, Portugal
| | - Sara Belião
- Radiology Department, Champalimaud Clinical Center, Champalimaud Foundation, Lisbon, Portugal
| | - Raquel Moreno
- Computational Clinical Imaging Group, Champalimaud Foundation, Portugal
| | - Celso Matos
- Computational Clinical Imaging Group, Champalimaud Foundation, Portugal
| | - Leonardo Vanneschi
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Campus de Campolide, 1070-312 Lisboa, Portugal
| | - Manolis Tsiknakis
- Institute of Computer Science, Foundation for Research and Technology Hellas (FORTH), GR 700 13, Heraklion, Greece; Department of Electrical and Computer Engineering, Hellenic Mediterranean University, GR 710 04, Heraklion, Greece
| | - Kostas Marias
- Department of Electrical and Computer Engineering, Hellenic Mediterranean University, GR 710 04, Heraklion, Greece; Computational BioMedicine Laboratory (CBML), Institute of Computer Science, Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece
| | - Daniele Regge
- Department of Radiology, Candiolo Cancer Institute, FPO-IRCCS, Strada Provinciale 142 Km 3.95, Candiolo, Turin 10060, Italy; Department of Surgical Sciences, University of Turin, Turin 10124, Italy
| | - Sara Silva
- LASIGE, Faculty of Sciences, University of Lisbon, Portugal
| | - Nickolas Papanikolaou
- Computational Clinical Imaging Group, Champalimaud Foundation, Portugal; Department of Radiology, Royal Marsden Hospital, Sutton, UK
| |
Collapse
|
42
|
Li X, Jia L, Lin F, Chai F, Liu T, Zhang W, Wei Z, Xiong W, Li H, Zhang M, Wang Y. Semi-supervised auto-segmentation method for pelvic organ-at-risk in magnetic resonance images based on deep-learning. J Appl Clin Med Phys 2024; 25:e14296. [PMID: 38386963 DOI: 10.1002/acm2.14296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2023] [Revised: 01/06/2024] [Accepted: 01/23/2024] [Indexed: 02/24/2024] Open
Abstract
BACKGROUND AND PURPOSE In radiotherapy, magnetic resonance (MR) imaging has higher contrast for soft tissues compared to computed tomography (CT) scanning and does not emit radiation. However, manual annotation for deep learning-based automatic organ-at-risk (OAR) delineation algorithms is expensive, making the collection of large, high-quality annotated datasets a challenge. Therefore, we propose a low-cost semi-supervised OAR segmentation method using a small set of pelvic MR image annotations. METHODS We trained a deep learning-based segmentation model using 116 sets of MR images from 116 patients. The bladder, femoral heads, rectum, and small intestine were selected as OAR regions. To generate the training set, we utilized a semi-supervised method and ensemble learning techniques. Additionally, we employed a post-processing algorithm to correct the self-annotation data. Both 2D and 3D auto-segmentation networks were evaluated for their performance. Furthermore, we evaluated the performance of the semi-supervised method with 50 labeled cases and with only 10 labeled cases. RESULTS The Dice similarity coefficient (DSC) of the bladder, femoral heads, rectum and small intestine between segmentation results and reference masks is 0.954, 0.984, 0.908, 0.852 using only the self-annotation and post-processing methods with the 2D segmentation model. The DSC of the corresponding OARs is 0.871, 0.975, 0.975, 0.783, 0.724 using the 3D segmentation network, and 0.896, 0.984, 0.890, 0.828 using the 2D segmentation network with the common supervised method. CONCLUSION The outcomes of our study demonstrate that it is possible to train a multi-OAR segmentation model using small annotation samples and additional unlabeled data. To effectively annotate the dataset, ensemble learning and post-processing methods were employed. Additionally, when dealing with anisotropy and limited sample sizes, the 2D model outperformed the 3D model in terms of performance.
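The self-annotation step (ensemble learning over several trained models to label the unlabeled scans, followed by post-processing) can be illustrated with a simple majority vote (illustrative only; the actual ensemble and post-processing steps are not specified here):

```python
import numpy as np

def ensemble_pseudo_label(binary_masks):
    """binary_masks: list of (H, W) predictions for the same slice from
    independently trained models. Returns a majority-vote pseudo label."""
    votes = np.mean(np.stack(binary_masks, axis=0), axis=0)
    return (votes >= 0.5).astype(np.uint8)
```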
Collapse
Affiliation(s)
- Xianan Li
- Department of Radiation Oncology, Peking University People's Hospital, Beijing, China
| | - Lecheng Jia
- Radiotherapy laboratory, Shenzhen United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, China
- Zhejiang Engineering Research Center for Innovation and Application of Intelligent Radiotherapy Technology, Wenzhou, China
| | - Fengyu Lin
- Radiotherapy laboratory, Shenzhen United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, China
| | - Fan Chai
- Department of Radiology, Peking University People's Hospital, Beijing, China
| | - Tao Liu
- Department of Radiology, Peking University People's Hospital, Beijing, China
| | - Wei Zhang
- Radiotherapy Business Unit, Shanghai United Imaging Healthcare Co., Ltd., Shanghai, China
| | - Ziquan Wei
- Radiotherapy laboratory, Shenzhen United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, China
| | - Weiqi Xiong
- Radiotherapy Business Unit, Shanghai United Imaging Healthcare Co., Ltd., Shanghai, China
| | - Hua Li
- Radiotherapy laboratory, Shenzhen United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, China
| | - Min Zhang
- Department of Radiation Oncology, Peking University People's Hospital, Beijing, China
| | - Yi Wang
- Department of Radiology, Peking University People's Hospital, Beijing, China
| |
Collapse
|
43
|
Pu Q, Xi Z, Yin S, Zhao Z, Zhao L. Advantages of transformer and its application for medical image segmentation: a survey. Biomed Eng Online 2024; 23:14. [PMID: 38310297 PMCID: PMC10838005 DOI: 10.1186/s12938-024-01212-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2023] [Accepted: 01/22/2024] [Indexed: 02/05/2024] Open
Abstract
PURPOSE Convolution operator-based neural networks have shown great success in medical image segmentation over the past decade. The U-shaped network with a codec structure is one of the most widely used models. Transformer, a technology originating in natural language processing, can capture long-distance dependencies and has been applied in Vision Transformer to achieve state-of-the-art performance on image classification tasks. Recently, researchers have extended transformers to medical image segmentation tasks, resulting in good models. METHODS This review comprises publications selected through a Web of Science search. We focused on papers published since 2018 that applied the transformer architecture to medical image segmentation. We conducted a systematic analysis of these studies and summarized the results. RESULTS To better comprehend the benefits of convolutional neural networks and transformers, the construction of the codec and transformer modules is first explained. Second, transformer-based medical image segmentation models are summarized. The commonly used assessment metrics for medical image segmentation tasks are then listed. Finally, a large number of medical segmentation datasets are described. CONCLUSION Even though pure transformer models without any convolution operator exist, the limited sample sizes in medical image segmentation still restrict the growth of transformers, although this can be alleviated by pretrained models. More often than not, researchers still design models that combine transformer and convolution operators.
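The long-distance dependency modeling that distinguishes transformers from convolutions comes from scaled dot-product self-attention over all token pairs; a minimal sketch follows (illustrative only, assuming PyTorch):

```python
import torch

def scaled_dot_product_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """q, k, v: (batch, tokens, dim). Every token attends to every other token,
    which is how a transformer captures long-range dependencies."""
    d = q.shape[-1]
    attn = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn @ v
```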
Collapse
Affiliation(s)
- Qiumei Pu
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
| | - Zuoxin Xi
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
- CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, 100049, China
| | - Shuai Yin
- School of Information Engineering, Minzu University of China, Beijing, 100081, China
| | - Zhe Zhao
- The Fourth Medical Center of PLA General Hospital, Beijing, 100039, China
| | - Lina Zhao
- CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, 100049, China.
| |
Collapse
|
44
|
Jiao R, Zhang Y, Ding L, Xue B, Zhang J, Cai R, Jin C. Learning with limited annotations: A survey on deep semi-supervised learning for medical image segmentation. Comput Biol Med 2024; 169:107840. [PMID: 38157773 DOI: 10.1016/j.compbiomed.2023.107840] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2023] [Revised: 10/30/2023] [Accepted: 12/07/2023] [Indexed: 01/03/2024]
Abstract
Medical image segmentation is a fundamental and critical step in many image-guided clinical approaches. Recent success of deep learning-based segmentation methods usually relies on a large amount of labeled data, which is particularly difficult and costly to obtain, especially in the medical imaging domain where only experts can provide reliable and accurate annotations. Semi-supervised learning has emerged as an appealing strategy and been widely applied to medical image segmentation tasks to train deep models with limited annotations. In this paper, we present a comprehensive review of recently proposed semi-supervised learning methods for medical image segmentation and summarize both the technical novelties and empirical results. Furthermore, we analyze and discuss the limitations and several unsolved problems of existing approaches. We hope this review can inspire the research community to explore solutions to this challenge and further advance the field of medical image segmentation.
Collapse
Affiliation(s)
- Rushi Jiao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Engineering Medicine, Beihang University, Beijing, 100191, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.
| | - Yichi Zhang
- School of Data Science, Fudan University, Shanghai, 200433, China; Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai, 200433, China.
| | - Le Ding
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China.
| | - Bingsen Xue
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.
| | - Jicong Zhang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China; Hefei Innovation Research Institute, Beihang University, Hefei, 230012, China.
| | - Rong Cai
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beihang University, Beijing, 100191, China.
| | - Cheng Jin
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China; Beijing Anding Hospital, Capital Medical University, Beijing, 100088, China.
| |
Collapse
|
45
|
Wang J, Xia B. Weakly supervised image segmentation beyond tight bounding box annotations. Comput Biol Med 2024; 169:107913. [PMID: 38176213 DOI: 10.1016/j.compbiomed.2023.107913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2023] [Revised: 11/21/2023] [Accepted: 12/24/2023] [Indexed: 01/06/2024]
Abstract
Weakly supervised image segmentation approaches in the literature usually achieve high segmentation performance using tight bounding box supervision and decrease the performance greatly when supervised by loose bounding boxes. However, compared with loose bounding box, it is much more difficult to acquire tight bounding box due to its strict requirements on the precise locations of the four sides of the box. To resolve this issue, this study investigates whether it is possible to maintain good segmentation performance when loose bounding boxes are used as supervision. For this purpose, this work extends our previous parallel transformation based multiple instance learning (MIL) for tight bounding box supervision by integrating an MIL strategy based on polar transformation to assist image segmentation. The proposed polar transformation based MIL formulation works for both tight and loose bounding boxes, in which a positive bag is defined as pixels in a polar line of a bounding box with one endpoint located inside the object enclosed by the box and the other endpoint located at one of the four sides of the box. Moreover, a weighted smooth maximum approximation is introduced to incorporate the observation that pixels closer to the origin of the polar transformation are more likely to belong to the object in the box. The proposed approach was evaluated on two public datasets using dice coefficient when bounding boxes at different precision levels were considered in the experiments. The results demonstrate that the proposed approach achieves state-of-the-art performance for bounding boxes at all precision levels and is robust to mild and moderate errors in the loose bounding box annotations. The codes are available at https://github.com/wangjuan313/wsis-beyond-tightBB.
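The "weighted smooth maximum approximation" over a polar line can take the form of a weighted alpha-softmax, where pixels closer to the polar origin receive larger weights; the sketch below is one common way to write such an operator (an assumption for illustration, not the authors' exact formulation; assumes PyTorch):

```python
import torch

def weighted_smooth_max(scores: torch.Tensor, weights: torch.Tensor, alpha: float = 4.0) -> torch.Tensor:
    """scores: foreground predictions for the pixels of one polar line (a positive bag).
    weights: per-pixel weights, larger near the polar origin inside the object.
    As alpha grows, the result approaches the (weighted) maximum of the scores."""
    w = weights * torch.exp(alpha * scores)
    return (w * scores).sum() / w.sum()
```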
Collapse
Affiliation(s)
- Juan Wang
- Horizon Med Innovation Inc., 23421 South Pointe Dr., Laguna Hills, CA 92653, USA.
| | - Bin Xia
- Shenzhen SiBright Co. Ltd., Tinwe Industrial Park, No. 6 Liufang Rd., Shenzhen, Guangdong 518052, China.
| |
Collapse
|
46
|
Huang Y, Yang X, Liu L, Zhou H, Chang A, Zhou X, Chen R, Yu J, Chen J, Chen C, Liu S, Chi H, Hu X, Yue K, Li L, Grau V, Fan DP, Dong F, Ni D. Segment anything model for medical images? Med Image Anal 2024; 92:103061. [PMID: 38086235 DOI: 10.1016/j.media.2023.103061] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2023] [Revised: 09/28/2023] [Accepted: 12/05/2023] [Indexed: 01/12/2024]
Abstract
The Segment Anything Model (SAM) is the first foundation model for general image segmentation. It has achieved impressive results on various natural image segmentation tasks. However, medical image segmentation (MIS) is more challenging because of the complex modalities, fine anatomical structures, uncertain and complex object boundaries, and wide-range object scales. To fully validate SAM's performance on medical data, we collected and sorted 53 open-source datasets and built a large medical segmentation dataset with 18 modalities, 84 objects, 125 object-modality paired targets, 1050K 2D images, and 6033K masks. We comprehensively analyzed different models and strategies on the so-called COSMOS 1050K dataset. Our findings mainly include the following: (1) SAM showed remarkable performance in some specific objects but was unstable, imperfect, or even totally failed in other situations. (2) SAM with the large ViT-H showed better overall performance than that with the small ViT-B. (3) SAM performed better with manual hints, especially box, than the Everything mode. (4) SAM could help human annotation with high labeling quality and less time. (5) SAM was sensitive to the randomness in the center point and tight box prompts, and may suffer from a serious performance drop. (6) SAM performed better than interactive methods with one or a few points, but will be outpaced as the number of points increases. (7) SAM's performance correlated to different factors, including boundary complexity, intensity differences, etc. (8) Finetuning the SAM on specific medical tasks could improve its average DICE performance by 4.39% and 6.68% for ViT-B and ViT-H, respectively. Codes and models are available at: https://github.com/yuhoo0302/Segment-Anything-Model-for-Medical-Images. We hope that this comprehensive report can help researchers explore the potential of SAM applications in MIS, and guide how to appropriately use and develop SAM.
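For readers who want to try the prompt modes compared above (point and box prompts versus the Everything mode), the official segment-anything package can be driven roughly as follows; the checkpoint path, image, and coordinates are placeholders, not values from the study:

```python
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a SAM backbone (ViT-B here) from a local checkpoint (placeholder path).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)  # replace with an RGB medical image slice
predictor.set_image(image)

# One positive point prompt combined with a tight bounding-box prompt.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),
    point_labels=np.array([1]),          # 1 = foreground point
    box=np.array([200, 200, 320, 320]),  # x0, y0, x1, y1
    multimask_output=False,
)
```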
Collapse
Affiliation(s)
- Yuhao Huang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Xin Yang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Lian Liu
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Han Zhou
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Ao Chang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Xinrui Zhou
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Rusi Chen
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Junxuan Yu
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Jiongquan Chen
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Chaoyu Chen
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Sijing Liu
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | | | - Xindi Hu
- Shenzhen RayShape Medical Technology Co., Ltd, Shenzhen, China
| | - Kejuan Yue
- Hunan First Normal University, Changsha, China
| | - Lei Li
- Department of Engineering Science, University of Oxford, Oxford, UK
| | - Vicente Grau
- Department of Engineering Science, University of Oxford, Oxford, UK
| | - Deng-Ping Fan
- Computer Vision Lab (CVL), ETH Zurich, Zurich, Switzerland
| | - Fajin Dong
- Ultrasound Department, the Second Clinical Medical College, Jinan University, China; First Affiliated Hospital, Southern University of Science and Technology, Shenzhen People's Hospital, Shenzhen, China.
| | - Dong Ni
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China.
| |
Collapse
|
47
|
Kaneko M, Magoulianitis V, Ramacciotti LS, Raman A, Paralkar D, Chen A, Chu TN, Yang Y, Xue J, Yang J, Liu J, Jadvar DS, Gill K, Cacciamani GE, Nikias CL, Duddalwar V, Jay Kuo CC, Gill IS, Abreu AL. The Novel Green Learning Artificial Intelligence for Prostate Cancer Imaging: A Balanced Alternative to Deep Learning and Radiomics. Urol Clin North Am 2024; 51:1-13. [PMID: 37945095 DOI: 10.1016/j.ucl.2023.08.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2023]
Abstract
The application of artificial intelligence (AI) to prostate magnetic resonance imaging (MRI) has shown promising results. Several AI systems have been developed to automatically analyze prostate MRI for segmentation, cancer detection, and region of interest characterization, thereby assisting clinicians in their decision-making process. Deep learning, the current trend in imaging AI, has limitations including the lack of transparency (the "black box" problem), the need for large-scale data processing, and excessive energy consumption. In this narrative review, the authors provide an overview of the recent advances in AI for prostate cancer diagnosis and introduce their next-generation AI model, Green Learning, as a promising solution.
Collapse
Affiliation(s)
- Masatomo Kaneko
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer; Department of Urology, Graduate School of Medical Science, Kyoto Prefectural University of Medicine, Kyoto, Japan
| | - Vasileios Magoulianitis
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - Lorenzo Storino Ramacciotti
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
| | - Alex Raman
- Western University of Health Sciences. Pomona, CA, USA
| | - Divyangi Paralkar
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
| | - Andrew Chen
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
| | - Timothy N Chu
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
| | - Yijing Yang
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - Jintang Xue
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - Jiaxin Yang
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - Jinyuan Liu
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - Donya S Jadvar
- Dornsife School of Letters and Science, University of Southern California, Los Angeles, CA, USA
| | - Karanvir Gill
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
| | - Giovanni E Cacciamani
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer; Department of Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Chrysostomos L Nikias
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - Vinay Duddalwar
- Department of Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - C-C Jay Kuo
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - Inderbir S Gill
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Andre Luis Abreu
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer; Department of Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
| |
Collapse
|
48
|
Mahendrakar P, Kumar D, Patil U. A Comprehensive Review on MRI-based Knee Joint Segmentation and Analysis Techniques. Curr Med Imaging 2024; 20:e150523216894. [PMID: 37189281 DOI: 10.2174/1573405620666230515090557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2022] [Revised: 11/29/2022] [Accepted: 12/28/2022] [Indexed: 05/17/2023]
Abstract
Using magnetic resonance imaging (MRI) in osteoarthritis pathogenesis research has proven extremely beneficial. However, it is always challenging for both clinicians and researchers to detect morphological changes in knee joints from magnetic resonance (MR) imaging since the surrounding tissues produce identical signals in MR studies, making it difficult to distinguish between them. Segmenting the knee bone, articular cartilage and menisci from the MR images allows one to examine the complete volume of the bone, articular cartilage, and menisci. It can also be used to assess certain characteristics quantitatively. However, segmentation is a laborious and time-consuming operation that requires sufficient training to complete correctly. With the advancement of MRI technology and computational methods, researchers have developed several algorithms to automate the task of individual knee bone, articular cartilage and meniscus segmentation during the last two decades. This systematic review aims to present the available fully and semi-automatic segmentation methods for knee bone, cartilage, and meniscus published in different scientific articles. This review provides a vivid description of the scientific advancements to clinicians and researchers in this field of image analysis and segmentation, which helps the development of novel automated methods for clinical applications. The review also covers recently developed fully automated deep learning-based segmentation methods, which not only provide better results than conventional techniques but also open a new field of research in medical imaging.
Collapse
Affiliation(s)
- Pavan Mahendrakar
- BLDEA’s V. P. Dr. P. G. Halakatti College of Engineering and Technology, Vijayapur, Karnataka, India
| | | | - Uttam Patil
- Jain College of Engineering, T.S Nagar, Hunchanhatti Road, Machhe, Belagavi, Karnataka, India
| |
Collapse
|
49
|
Dutta A, Chan J, Haworth A, Dubowitz DJ, Kneebone A, Reynolds HM. Robustness of magnetic resonance imaging and positron emission tomography radiomic features in prostate cancer: Impact on recurrence prediction after radiation therapy. Phys Imaging Radiat Oncol 2024; 29:100530. [PMID: 38275002 PMCID: PMC10809082 DOI: 10.1016/j.phro.2023.100530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2023] [Revised: 12/21/2023] [Accepted: 12/29/2023] [Indexed: 01/27/2024] Open
Abstract
Background and purpose Radiomic features from MRI and PET are an emerging tool with potential to improve prostate cancer outcomes. However, the robustness of these features to variations in image segmentation is currently unknown. This study therefore evaluated the robustness of radiomic features to segmentation variations and their impact on predicting biochemical recurrence (BCR). Materials and methods Multi-scanner, pre-radiation therapy imaging from 142 patients with localised prostate cancer was used, including T2-weighted (T2) and apparent diffusion coefficient (ADC) MRI and prostate-specific membrane antigen (PSMA)-PET. The prostate gland and intraprostatic tumours were segmented both manually and automatically, and differences between segmentations were quantified using the Dice coefficient (DC). Radiomic features, including shape, first-order, and texture features, were extracted for each segmentation from the original and filtered images. The intraclass correlation coefficient (ICC) and mean absolute percentage difference (MAPD) were used to assess feature robustness. Random forest (RF) models were developed for each segmentation using the robust features to predict BCR. Results Prostate gland segmentations were more consistent (mean DC = 0.78) than tumour segmentations (mean DC = 0.46). 112 radiomic features (3.6%) demonstrated 'excellent' robustness (ICC > 0.9 and MAPD < 1%), and 480 features (15.4%) demonstrated 'good' robustness (ICC > 0.75 and MAPD < 5%). PET imaging provided more features with excellent robustness than T2 and ADC MRI. RF models showed strong predictive power for BCR, with a mean area under the receiver operating characteristic curve (AUC) of 0.89 (range 0.85-0.93). Conclusion Segmentation variability should be considered when using radiomic features for predictive modelling. For BCR prediction models, radiomic features from the entire prostate gland are preferable to features derived from tumour segmentations.
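The robustness screening described in this abstract lends itself to a short illustration. The following is a minimal Python sketch, not the authors' code: the Dice overlap, the MAPD computation, and the 'excellent'/'good' thresholds follow the abstract, while the exact MAPD normalisation, the function names, and the toy data are assumptions made for illustration; ICC values are assumed to be precomputed elsewhere (e.g., from a two-way mixed-effects model).

# Minimal sketch of the feature-robustness screening (assumptions noted above).
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary segmentation masks."""
    a, b = np.asarray(mask_a, bool), np.asarray(mask_b, bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def mapd(values_manual, values_auto):
    """Mean absolute percentage difference of one feature across patients
    (normalised by the manual value; the paper's exact definition may differ)."""
    m, a = np.asarray(values_manual, float), np.asarray(values_auto, float)
    eps = 1e-12  # guard against division by zero
    return 100.0 * np.mean(np.abs(m - a) / (np.abs(m) + eps))

def robustness_label(icc, mapd_pct):
    """Thresholds reported in the abstract: 'excellent', 'good', or neither."""
    if icc > 0.9 and mapd_pct < 1.0:
        return "excellent"
    if icc > 0.75 and mapd_pct < 5.0:
        return "good"
    return "not robust"

# Toy example: one feature measured on manual vs. automatic segmentations.
manual = np.array([10.2, 11.5, 9.8, 10.9])
auto = np.array([10.1, 11.7, 9.9, 10.8])
print(robustness_label(icc=0.93, mapd_pct=mapd(manual, auto)))

Only features passing such a screen would then be fed to the random forest models; in practice the ICC would be estimated from the paired manual and automatic feature measurements rather than supplied as a constant.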
Collapse
Affiliation(s)
- Arpita Dutta
- Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
| | - Joseph Chan
- Department of Radiation Oncology, Royal North Shore Hospital, Sydney, New South Wales, Australia
| | - Annette Haworth
- Institute of Medical Physics, School of Physics, University of Sydney, Sydney, New South Wales, Australia
| | - David J. Dubowitz
- Department of Anatomy and Medical Imaging, Faculty of Medical and Health Sciences, The University of Auckland, Auckland, New Zealand
- Centre for Advanced MRI, The University of Auckland, Auckland, New Zealand
| | - Andrew Kneebone
- Department of Radiation Oncology, Royal North Shore Hospital, Sydney, New South Wales, Australia
| | - Hayley M. Reynolds
- Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
| |
Collapse
|
50
|
Yang H, Tan T, Tegzes P, Dong X, Tamada R, Ferenczi L, Avinash G. Light mixed-supervised segmentation for 3D medical image data. Med Phys 2024; 51:167-178. [PMID: 37909833 DOI: 10.1002/mp.16816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2023] [Revised: 10/03/2023] [Accepted: 10/16/2023] [Indexed: 11/03/2023] Open
Abstract
BACKGROUND Accurate 3D semantic segmentation models are essential for many clinical applications. Training a model for 3D segmentation requires voxel-level annotation, which is expensive to obtain because of the labour involved and privacy constraints. To annotate 3D medical data such as MRI accurately, volumes are commonly contoured slice by slice along the principal axes. PURPOSE To reduce the per-slice annotation effort, weakly supervised learning with bounding boxes (Bboxes) has been proposed, leveraging discriminative information via a tightness prior assumption. However, this approach requires accurate, tight Bboxes, and performance drops significantly when the tightness assumption does not hold, that is, when a relaxed Bbox is used. A stable model that can be trained from relaxed Bbox annotations is therefore needed. METHODS This paper presents a mixed-supervised training strategy to reduce the annotation effort for 3D segmentation tasks. In the proposed approach, a fully annotated contour is required for only a single slice of each volume, while the remaining slices containing targets are annotated with relaxed Bboxes. The mixed-supervised method combines fully supervised learning, a relaxed Bbox prior, and contrastive learning during training, ensuring that the network properly exploits the discriminative information in the training volumes. The method was evaluated on two public 3D medical imaging datasets (an MRI prostate dataset and a Vestibular Schwannoma [VS] dataset). RESULTS The proposed method achieved high segmentation Dice scores of 85.3% on the MRI prostate dataset and 83.3% on the VS dataset with relaxed Bbox annotations, close to those of a fully supervised model. Moreover, with the same relaxed Bbox annotations, the proposed method outperforms state-of-the-art methods, and its performance remains stable as the accuracy of the Bbox annotation varies. CONCLUSIONS The presented study proposes a mixed-supervised learning method for 3D medical imaging that yields stable segmentation of targets in 3D images with low annotation-accuracy requirements, making it easier to train models on large-scale datasets.
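A rough picture of how such mixed supervision can be combined into one loss is sketched below. This is a minimal PyTorch sketch under stated assumptions, not the paper's implementation: it combines cross-entropy on the single fully annotated slice with a simplified relaxed-box term that only suppresses foreground predictions outside the box, and it omits the tightness prior and the contrastive term; all function names, tensor shapes, and weights are illustrative.

# Minimal mixed-supervision sketch (illustrative only; see assumptions above).
import torch
import torch.nn.functional as F

def mixed_supervised_loss(logits, full_mask, full_slice_idx, box_masks,
                          box_weight=0.5):
    """
    logits:         (D, C, H, W) per-slice class logits for one volume (C = 2).
    full_mask:      (H, W) integer ground-truth mask for the annotated slice.
    full_slice_idx: index of the single fully annotated slice.
    box_masks:      (D, H, W) binary masks, 1 inside the relaxed Bbox per slice.
    """
    # 1) Standard cross-entropy on the single fully annotated slice.
    ce = F.cross_entropy(logits[full_slice_idx:full_slice_idx + 1],
                         full_mask.unsqueeze(0).long())

    # 2) Relaxed-box prior: foreground probability outside the box should be ~0.
    fg_prob = torch.softmax(logits, dim=1)[:, 1]           # (D, H, W)
    outside = 1.0 - box_masks.float()
    box_term = (fg_prob * outside).sum() / outside.sum().clamp(min=1.0)

    return ce + box_weight * box_term

# Toy usage with random tensors (2 classes, 4 slices of 8x8).
logits = torch.randn(4, 2, 8, 8, requires_grad=True)
full_mask = (torch.rand(8, 8) > 0.7).long()
box_masks = torch.zeros(4, 8, 8)
box_masks[:, 2:6, 2:6] = 1
loss = mixed_supervised_loss(logits, full_mask, full_slice_idx=0,
                             box_masks=box_masks)
loss.backward()

Penalising only the region outside the box is one way to keep the constraint valid when boxes are relaxed, since even a loose box still guarantees that voxels outside it are background; the paper's full formulation additionally constrains the interior and adds contrastive learning.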
Collapse
Affiliation(s)
| | - Tao Tan
- GE Healthcare, Eindhoven, The Netherlands
| |
Collapse
|