1. Rahman H, Ben Aoun N, Bukht TFN, Ahmad S, Tadeusiewicz R, Pławiak P, Hammad M. Automatic liver tumor segmentation of CT and MRI volumes using ensemble ResUNet-InceptionV4 model. Inf Sci (N Y) 2025;704:121966. [DOI: 10.1016/j.ins.2025.121966]
2. Liu X, Liang J, Zhang J, Qian Z, Xing P, Chen T, Yang S, Chukwudi C, Qiu L, Liu D, Zhao J. Advancing hierarchical neural networks with scale-aware pyramidal feature learning for medical image dense prediction. Comput Methods Programs Biomed 2025;265:108705. [PMID: 40184852] [DOI: 10.1016/j.cmpb.2025.108705]
Abstract
BACKGROUND AND OBJECTIVE Hierarchical neural networks are pivotal in medical imaging for multi-scale representation, aiding in tasks such as object detection and segmentation. However, their effectiveness is often limited by the loss of intra-scale information and misalignment of inter-scale features. Our study introduces the Integrated-Scale Pyramidal Interactive Reconfiguration to Enhance feature learning (INSPIRE). METHODS INSPIRE focuses on intra-scale semantic enhancement and precise inter-scale spatial alignment, integrated with a novel spatial-semantic back augmentation technique. We evaluated INSPIRE's efficacy using standard hierarchical neural networks, such as UNet and FPN, across multiple medical segmentation challenges including brain tumors and polyps. Additionally, we extended our evaluation to object detection and semantic segmentation in natural images to assess generalizability. RESULTS INSPIRE demonstrated superior performance over standard baselines in medical segmentation tasks, showing significant improvements in feature learning and alignment. In identifying brain tumors and polyps, INSPIRE achieved enhanced precision, sensitivity, and specificity compared to traditional models. Further testing in natural images confirmed the adaptability and robustness of our approach. CONCLUSIONS INSPIRE effectively enriches semantic clarity and aligns multi-scale features, achieving integrated spatial-semantic coherence. This method seamlessly integrates with existing frameworks used in medical image analysis, thereby promising to significantly enhance the efficacy of computer-aided diagnostics and clinical interventions. Its application could lead to more accurate and efficient imaging processes, essential for improved patient outcomes.
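For orientation, the inter-scale alignment problem INSPIRE targets can be made concrete with the standard FPN-style lateral fusion that such hierarchical networks build on. The sketch below is a generic PyTorch illustration of that baseline pattern, not the authors' implementation; all module and variable names are ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LateralFusion(nn.Module):
    """Generic FPN-style fusion: a coarse map is upsampled and added to a
    channel-aligned finer map, then smoothed with a 3x3 convolution."""
    def __init__(self, c_fine, c_coarse, c_out=256):
        super().__init__()
        self.lat = nn.Conv2d(c_fine, c_out, kernel_size=1)    # align fine channels
        self.top = nn.Conv2d(c_coarse, c_out, kernel_size=1)  # align coarse channels
        self.smooth = nn.Conv2d(c_out, c_out, kernel_size=3, padding=1)

    def forward(self, fine, coarse):
        up = F.interpolate(self.top(coarse), size=fine.shape[-2:],
                           mode="bilinear", align_corners=False)
        return self.smooth(self.lat(fine) + up)

fuse = LateralFusion(c_fine=128, c_coarse=256)
out = fuse(torch.randn(1, 128, 64, 64), torch.randn(1, 256, 32, 32))
print(out.shape)  # torch.Size([1, 256, 64, 64])
```

The naive bilinear upsampling in this baseline is exactly where inter-scale misalignment enters, which is the step INSPIRE's spatial alignment is designed to improve.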
Affiliation(s)
- Xiang Liu: Alvus Health Inc., Harvard Pagliuca Life Lab, USA; Department of Biostatistics & Health Data Science, Indiana University, USA
- James Liang: Department of Computer Engineering, Rochester Institute of Technology, USA
- Jianwei Zhang: Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, USA
- Zihan Qian: Department of Biostatistics, Harvard T.H. Chan School of Public Health, USA
- Phoebe Xing: Alvus Health Inc., Harvard Pagliuca Life Lab, USA; United World College of South East Asia, Singapore
- Taige Chen: Department of Physics, University of Illinois Urbana-Champaign, USA
- Shanchieh Yang: Department of Computer Engineering, Rochester Institute of Technology, USA
- Liang Qiu: Department of Radiation Oncology, Stanford University, USA
- Dongfang Liu: Department of Computer Engineering, Rochester Institute of Technology, USA
- Junhan Zhao: Department of Biostatistics, Harvard T.H. Chan School of Public Health, USA; Department of Biomedical Informatics, Harvard Medical School, USA
3. Zhang C, Zheng Y, McAviney J, Ling SH. SSAT-Swin: Deep learning-based spinal ultrasound feature segmentation for scoliosis using self-supervised Swin transformer. Ultrasound Med Biol 2025;51:999-1007. [PMID: 40082183] [DOI: 10.1016/j.ultrasmedbio.2025.02.013]
Abstract
OBJECTIVE Scoliosis, a 3-D spinal deformity, requires early detection and intervention. Ultrasound curve angle (UCA) measurement using ultrasound images has emerged as a promising diagnostic tool. However, calculating the UCA directly from ultrasound images remains challenging due to low contrast, high noise, and irregular target shapes. Accurate segmentation results are therefore crucial to enhance image clarity and precision prior to UCA calculation. METHODS We propose the SSAT-Swin model, a transformer-based multi-class segmentation framework designed for ultrasound image analysis in scoliosis diagnosis. The model integrates a boundary-enhancement module in the decoder and a channel attention module in the skip connections. Additionally, self-supervised proxy tasks are used during pre-training on 1,170 images, followed by fine-tuning on 109 image-label pairs. RESULTS The SSAT-Swin achieved Dice scores of 85.6% and Jaccard scores of 74.5%, with a 92.8% scoliosis bone feature detection rate, outperforming state-of-the-art models. CONCLUSION Self-supervised learning enhances the model's ability to capture global context information, making it well-suited for addressing the unique challenges of ultrasound images, ultimately advancing scoliosis assessment through more accurate segmentation.
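The Dice and Jaccard scores reported above are standard overlap metrics; a minimal NumPy implementation (generic, not the authors' code) is:

```python
import numpy as np

def dice_jaccard(pred, target, eps=1e-7):
    """Dice and Jaccard (IoU) overlap between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    jaccard = (inter + eps) / (union + eps)
    return dice, jaccard

pred = np.array([[0, 1, 1], [0, 1, 0]])
gt = np.array([[0, 1, 0], [0, 1, 0]])
print(dice_jaccard(pred, gt))  # (0.8, ~0.667)
```

Note the identity J = D / (2 - D), which is why the two scores always move together.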
Affiliation(s)
- Chen Zhang: School of Electrical and Data Engineering, University of Technology Sydney, NSW, Australia
- Yongping Zheng: Department of Electronic and Information Engineering, Hong Kong Polytechnic University, Hong Kong, China
- Jeb McAviney: ScoliCare Clinic Sydney (South), Kogarah, NSW, Australia
- Sai Ho Ling: School of Electrical and Data Engineering, University of Technology Sydney, NSW, Australia
4. Feng H, Qiu J, Wen L, Zhang J, Yang J, Lyu Z, Liu T, Fang K. U3UNet: An accurate and reliable segmentation model for forest fire monitoring based on UAV vision. Neural Netw 2025;185:107207. [PMID: 39892353] [DOI: 10.1016/j.neunet.2025.107207]
Abstract
Forest fires pose a serious threat to the global ecological environment, and fire warning and real-time monitoring are the critical steps in reducing their impact. Traditional monitoring methods, such as ground observation and satellite sensing, are limited by coverage or by low spatio-temporal resolution, making it difficult to capture the precise shape of fire sources. We therefore propose U3UNet, an accurate and reliable forest fire monitoring segmentation model based on UAV vision, which uses a nested U-shaped structure for feature fusion at different scales so that important feature information is retained. Full-scale connections balance detailed and global information to ensure thorough feature fusion. We conducted a series of comparative experiments with U-Net, UNet 3+, U2-Net, Yolov9, FPS-U2Net, PSPNet, DeeplabV3+ and TransFuse on the Unreal Engine platform and in several real forest fire scenes. On the designed composite metric S, U3UNet reaches 71.44% in static scenarios, 0.3% below the best competing method, and 80.53% in dynamic scenarios, 8.94% above the best competing method. We also tested the real-time performance of U3UNet on an edge computing device mounted on the UAV.
Affiliation(s)
- Hailin Feng: College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou, 311300, China
- Jiefan Qiu: ZJUTDeus Robot Team, College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310014, China
- Long Wen: College of Innovation Engineering, Macau University of Science and Technology, 999078, Macao Special Administrative Region of China
- Jinhong Zhang: ZJUTDeus Robot Team, College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310014, China
- Jiening Yang: College of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100083, China
- Zhihan Lyu: Department of Game Design, Uppsala University, Sweden
- Tongcun Liu: College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou, 311300, China
- Kai Fang: College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou, 311300, China
5. Anwar AS, Amin K, Hadhoud MM, Ibrahim M. ResTransUNet: A hybrid CNN-transformer approach for liver and tumor segmentation in CT images. Comput Biol Med 2025;190:110048. [PMID: 40157314] [DOI: 10.1016/j.compbiomed.2025.110048]
Abstract
BACKGROUND AND OBJECTIVE Accurate medical tumor segmentation is critical for early diagnosis and treatment planning, significantly improving patient outcomes. This study aims to enhance liver and tumor segmentation from CT and liver images by developing a novel model, ResTransUNet, which combines convolutional and transformer blocks to improve segmentation accuracy. METHODS The proposed ResTransUNet model is a custom implementation inspired by the TransUNet architecture, featuring a Standalone Transformer Block and ResNet50 as the backbone for the encoder. The hybrid architecture leverages the strengths of Convolutional Neural Networks (CNNs) and Transformer blocks to capture both local features and global context effectively. The encoder utilizes a pre-trained ResNet50 to extract rich hierarchical features, with key feature maps preserved as skip connections. The Standalone Transformer Block, integrated into the model, employs multi-head attention mechanisms to capture long-range dependencies across the image, enhancing segmentation performance in complex cases. The decoder reconstructs the segmentation mask by progressively upsampling encoded features while integrating skip connections, ensuring that both semantic information and spatial details are retained. This process culminates in a precise binary segmentation mask that effectively distinguishes liver and tumor regions. RESULTS The ResTransUNet model achieved superior Dice Similarity Coefficients (DSC) for liver segmentation (98.3% on LiTS and 98.4% on 3D-IRCADb-01) and for tumor segmentation from CT images (94.7% on LiTS and 89.8% on 3D-IRCADb-01) as well as from liver images (94.6% on LiTS and 91.1% on 3D-IRCADb-01). The model also demonstrated high precision, sensitivity, and specificity, outperforming current state-of-the-art methods in these tasks. CONCLUSIONS The ResTransUNet model demonstrates robust and accurate performance in complex medical image segmentation tasks, particularly liver and tumor segmentation. These findings suggest that ResTransUNet has significant potential for improving the precision of surgical interventions and therapy planning in clinical settings.
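A minimal sketch of the hybrid pattern described here, a ResNet50 encoder whose deepest feature map is refined by multi-head self-attention while earlier stages are kept as skip connections, is shown below. This is an illustrative reconstruction under stated assumptions (torchvision's ResNet50, one attention block), not the published ResTransUNet code.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class HybridEncoder(nn.Module):
    """ResNet50 encoder; the deepest map passes through self-attention,
    and each stage's output is preserved as a decoder skip connection."""
    def __init__(self, n_heads=8):
        super().__init__()
        r = resnet50(weights=None)  # or weights="IMAGENET1K_V1" for pre-training
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)
        self.stages = nn.ModuleList([r.layer1, r.layer2, r.layer3, r.layer4])
        self.attn = nn.MultiheadAttention(2048, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(2048)

    def forward(self, x):
        x = self.stem(x)
        skips = []
        for stage in self.stages:
            x = stage(x)
            skips.append(x)                          # kept as skip connections
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)        # (B, HW, C) token sequence
        attn_out, _ = self.attn(tokens, tokens, tokens)  # long-range context
        tokens = self.norm(tokens + attn_out)        # residual + layer norm
        return tokens.transpose(1, 2).reshape(b, c, h, w), skips

feats, skips = HybridEncoder()(torch.randn(1, 3, 224, 224))
```

A U-Net-style decoder would then upsample `feats` while concatenating the stored `skips`, ending in a single-channel sigmoid head for the binary mask.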
Affiliation(s)
- Asmaa Sabet Anwar: Department of Computer Engineering, Faculty of Engineering, May University, Cairo, Egypt
- Khaled Amin: Department of Information Technology, Faculty of Computers and Information, Menoufia University, Shebin Elkom, Egypt
- Mohiy M Hadhoud: Department of Information Technology, Faculty of Computers and Information, Menoufia University, Shebin Elkom, Egypt
- Mina Ibrahim: Department of Machine Intelligence, Faculty of Artificial Intelligence, Menoufia University, Shebin Elkom, Egypt
6. Fakhfakh M, Sarry L, Clarysse P. HALSR-Net: Improving CNN segmentation of cardiac left ventricle MRI with hybrid attention and latent space reconstruction. Comput Med Imaging Graph 2025;123:102546. [PMID: 40245744] [DOI: 10.1016/j.compmedimag.2025.102546]
Abstract
Accurate cardiac MRI segmentation is vital for detailed cardiac analysis, yet the manual process is labor-intensive and prone to variability. Despite advancements in MRI technology, there remains a significant need for automated methods that can reliably and efficiently segment cardiac structures. This paper introduces HALSR-Net, a novel multi-level segmentation architecture designed to improve the accuracy and reproducibility of cardiac segmentation from Cine-MRI acquisitions, focusing on the left ventricle (LV). The methodology consists of two main phases: first, the extraction of the region of interest (ROI) using a regression model that accurately predicts the location of a bounding box around the LV; second, the semantic segmentation step based on the HALSR-Net architecture. This architecture incorporates a Hybrid Attention Pooling Module (HAPM) that merges attention and pooling mechanisms to enhance feature extraction and capture contextual information. Additionally, a reconstruction module leverages latent space features to further improve segmentation accuracy. Experiments conducted on an in-house clinical dataset and two public datasets (ACDC and LVQuan19) demonstrate that HALSR-Net outperforms state-of-the-art architectures, achieving up to 98% accuracy and F1-score for the segmentation of the LV cavity and myocardium. The proposed approach effectively addresses the limitations of existing methods, offering a more accurate and robust solution for cardiac MRI segmentation that is likely to improve cardiac function analysis and patient care.
Affiliation(s)
- Mohamed Fakhfakh: Université Clermont Auvergne, CHU Clermont-Ferrand, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000, Clermont-Ferrand, France
- Laurent Sarry: Université Clermont Auvergne, CHU Clermont-Ferrand, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000, Clermont-Ferrand, France
- Patrick Clarysse: INSA-Lyon, Université Claude Bernard Lyon 1, CNRS, Inserm, CREATIS UMR 5220, U1294, F-69621, Lyon, France
7. Du Y, Chen X, Fu Y. Multiscale transformers and multi-attention mechanism networks for pathological nuclei segmentation. Sci Rep 2025;15:12549. [PMID: 40221423] [PMCID: PMC11993704] [DOI: 10.1038/s41598-025-90397-2]
Abstract
Nuclei segmentation is crucial for computer-aided diagnosis in pathology. However, high cell density, complex backgrounds, and blurred cell boundaries make pathology cell segmentation a challenging problem. In this paper, we propose a network model for pathology image segmentation based on a multi-scale Transformer multi-attention mechanism. To address the difficulty of extracting features from densely packed nuclei against complex backgrounds, a dense attention module is embedded in the encoder, which improves the learning of target cell information and minimizes target information loss. To address the poor segmentation accuracy caused by blurred cell boundaries, a multi-scale Transformer attention module is embedded between the encoder and decoder, improving the transfer of boundary feature information and making segmented cell boundaries more accurate. Experimental results on the MoNuSeg, GlaS and CoNSeP datasets demonstrate the network's superior accuracy.
Affiliation(s)
- Yongzhao Du: College of Engineering, Huaqiao University, Fujian, 362021, China; College of Internet of Things Industry, Huaqiao University, Fujian, 362021, China
- Xin Chen: College of Engineering, Huaqiao University, Fujian, 362021, China
- Yuqing Fu: College of Engineering, Huaqiao University, Fujian, 362021, China; College of Internet of Things Industry, Huaqiao University, Fujian, 362021, China
8. Ahmad I, Anwar SJ, Hussain B, Ur Rehman A, Bermak A. Anatomy guided modality fusion for cancer segmentation in PET CT volumes and images. Sci Rep 2025;15:12153. [PMID: 40204866] [PMCID: PMC11982402] [DOI: 10.1038/s41598-025-95757-6]
Abstract
Segmentation in computed tomography (CT) provides detailed anatomical information, while positron emission tomography (PET) provides the metabolic activity of cancer. Existing segmentation models for CT and PET rely either on early fusion, which struggles to effectively capture independent features from each modality, or on late fusion, which is computationally expensive and fails to leverage the complementary nature of the two modalities. This research addresses the gap by proposing an intermediate fusion approach that optimally balances the strengths of both modalities. Our method leverages anatomical features to guide the fusion process while preserving spatial representation quality. We achieve this through the separate encoding of anatomical and metabolic features followed by an attentive fusion decoder. Unlike traditional fixed normalization techniques, we introduce novel "zero layers" with learnable normalization. The proposed intermediate fusion reduces the number of filters, resulting in a lightweight model. Our approach demonstrates superior performance, achieving a Dice score of 0.8184 and an [Formula: see text] score of 2.31. The implications of this study include more precise tumor delineation, leading to enhanced cancer diagnosis and more effective treatment planning.
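The intermediate-fusion idea, separately encoded anatomical (CT) and metabolic (PET) features combined by a learned attentive gate, can be sketched as below. This is a hedged illustration of one plausible gating scheme, not the paper's module; the softmax-weighted sum and all names are our assumptions.

```python
import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    """Fuse same-resolution CT and PET feature maps with learned,
    globally pooled per-modality weights (a simple attention gate)."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                 # global context
            nn.Conv2d(2 * channels, channels, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2, 1),               # one logit per modality
            nn.Softmax(dim=1),
        )

    def forward(self, f_ct, f_pet):
        w = self.gate(torch.cat([f_ct, f_pet], dim=1))  # (B, 2, 1, 1)
        return w[:, :1] * f_ct + w[:, 1:] * f_pet       # weighted blend

fused = AttentiveFusion(64)(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
```

Fusing at an intermediate level keeps a single shared decoder, which is consistent with the paper's reported reduction in filter count and the resulting lightweight model.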
Affiliation(s)
- Ibtihaj Ahmad: Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, People's Republic of China; School of Public Health, Shandong University, Jinan, Shandong, People's Republic of China
- Sadia Jabbar Anwar: Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, People's Republic of China
- Bagh Hussain: Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, People's Republic of China
- Atiq Ur Rehman: Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Amine Bermak: Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
9. Pan P, Zhang C, Sun J, Guo L. Multi-scale conv-attention U-Net for medical image segmentation. Sci Rep 2025;15:12041. [PMID: 40199917] [PMCID: PMC11978844] [DOI: 10.1038/s41598-025-96101-8]
Abstract
U-Net-based network structures are widely used in medical image segmentation. However, effectively capturing multi-scale features and spatial context information of complex organizational structures remains a challenge. To address this, we propose a novel network structure based on the U-Net backbone. This model integrates the Adaptive Convolution (AC) module, Multi-Scale Learning (MSL) module, and Conv-Attention module to enhance feature expression ability and segmentation performance. The AC module dynamically adjusts the convolutional kernel through an adaptive convolutional layer. This enables the model to extract features of different shapes and scales adaptively, further improving its performance in complex scenarios. The MSL module is designed for multi-scale information fusion. It effectively aggregates fine-grained and high-level semantic features from different resolutions, creating rich multi-scale connections between the encoding and decoding processes. On the other hand, the Conv-Attention module incorporates an efficient attention mechanism into the skip connections. It captures global context information using a low-dimensional proxy for high-dimensional data. This approach reduces computational complexity while maintaining effective spatial and channel information extraction. Experimental validation on the CVC-ClinicDB, MICCAI 2023 Tooth, and ISIC2017 datasets demonstrates that our proposed MSCA-UNet significantly improves segmentation accuracy and model robustness. At the same time, it remains lightweight and outperforms existing segmentation methods.
Affiliation(s)
- Peng Pan: College of Technology and Data, Yantai Nanshan University, Yantai, 265713, China
- Chengxue Zhang: College of Technology and Data, Yantai Nanshan University, Yantai, 265713, China
- Jingbo Sun: College of Technology and Data, Yantai Nanshan University, Yantai, 265713, China
- Lina Guo: College of Technology and Data, Yantai Nanshan University, Yantai, 265713, China
10. Rajendran P, Yang Y, Niedermayr TR, Gensheimer M, Beadle B, Le QT, Xing L, Dai X. Large language model-augmented learning for auto-delineation of treatment targets in head-and-neck cancer radiotherapy. Radiother Oncol 2025;205:110740. [PMID: 39855601] [PMCID: PMC11956750] [DOI: 10.1016/j.radonc.2025.110740]
Abstract
BACKGROUND AND PURPOSE Radiation therapy (RT) is highly effective, but its success depends on accurate, manual target delineation, which is time-consuming, labor-intensive, and prone to variability. Despite AI advancements in auto-contouring normal tissues, accurate RT target volume delineation remains challenging. This study presents Radformer, a novel visual language model that integrates text-rich clinical data with medical imaging for accurate automated RT target volume delineation. MATERIALS AND METHODS We developed Radformer, an innovative network that utilizes a hierarchical vision transformer as its backbone and integrates large language models (LLMs) to extract and embed clinical data in text-rich form. The model features a novel visual language attention module (VLAM) to combine visual and linguistic features, enabling language-aware visual encoding (LAVE). The Radformer was evaluated on a dataset of 2985 patients with head-and-neck cancer who underwent RT. Quantitative evaluations were performed utilizing metrics such as the Dice similarity coefficient (DSC), intersection over union (IOU), and 95th percentile Hausdorff distance (HD95). RESULTS The Radformer demonstrated superior performance in segmenting RT target volumes compared to state-of-the-art models. On the head-and-neck cancer dataset, Radformer achieved a mean DSC of 0.76 ± 0.09 versus 0.66 ± 0.09, a mean IOU of 0.69 ± 0.08 versus 0.59 ± 0.07, and a mean HD95 of 7.82 ± 6.87 mm versus 14.28 ± 6.85 mm for gross tumor volume delineation, compared to the baseline 3D-UNETR. CONCLUSIONS The Radformer model offers a clinically optimal means of RT target auto-delineation by integrating both imaging and clinical data through a visual language model. This approach improves the accuracy of RT target volume delineation, facilitating broader AI-assisted automation in RT treatment planning.
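The HD95 metric used above is the 95th percentile of surface-to-surface distances, which suppresses the outlier sensitivity of the plain Hausdorff distance. A brute-force NumPy/SciPy sketch (generic; assumes non-empty masks and surfaces small enough for a dense distance matrix) is:

```python
import numpy as np
from scipy.ndimage import binary_erosion

def hd95(mask_a, mask_b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric 95th-percentile Hausdorff distance between binary volumes."""
    def surface_points(m):
        m = m.astype(bool)
        # Surface voxels are those removed by one erosion step.
        return np.argwhere(m & ~binary_erosion(m)) * np.asarray(spacing)
    sa, sb = surface_points(mask_a), surface_points(mask_b)
    d = np.linalg.norm(sa[:, None, :] - sb[None, :, :], axis=-1)  # all pairs
    return max(np.percentile(d.min(axis=1), 95),   # A -> B distances
               np.percentile(d.min(axis=0), 95))   # B -> A distances
```

Production evaluations typically use KD-trees or distance transforms instead of the dense pairwise matrix, but the definition is the same.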
Affiliation(s)
- Yong Yang: Department of Radiation Oncology, Stanford University, Stanford, CA, United States
- Thomas R Niedermayr: Department of Radiation Oncology, Stanford University, Stanford, CA, United States
- Michael Gensheimer: Department of Radiation Oncology, Stanford University, Stanford, CA, United States
- Beth Beadle: Department of Radiation Oncology, Stanford University, Stanford, CA, United States
- Quynh-Thu Le: Department of Radiation Oncology, Stanford University, Stanford, CA, United States
- Lei Xing: Department of Radiation Oncology, Stanford University, Stanford, CA, United States
- Xianjin Dai: Department of Radiation Oncology, Stanford University, Stanford, CA, United States
11. Huang Z, Deng Z, Ye J, Wang H, Su Y, Li T, Sun H, Cheng J, Chen J, He J, Gu Y, Zhang S, Gu L, Qiao Y. A-Eval: A benchmark for cross-dataset and cross-modality evaluation of abdominal multi-organ segmentation. Med Image Anal 2025;101:103499. [PMID: 39970528] [DOI: 10.1016/j.media.2025.103499]
Abstract
Although deep learning has revolutionized abdominal multi-organ segmentation, its models often struggle with generalization due to training on small-scale, specific datasets and modalities. The recent emergence of large-scale datasets may mitigate this issue, but some important questions remain unsolved: Can models trained on these large datasets generalize well across different datasets and imaging modalities? If yes/no, how can we further improve their generalizability? To address these questions, we introduce A-Eval, a benchmark for the cross-dataset and cross-modality Evaluation ('Eval') of Abdominal ('A') multi-organ segmentation, integrating seven datasets across CT and MRI modalities. Our evaluations indicate that significant domain gaps persist despite larger data scales. While increased datasets improve generalization, model performance on unseen data remains inconsistent. Joint training across multiple datasets and modalities enhances generalization, though annotation inconsistencies pose challenges. These findings highlight the need for diverse and well-curated training data across various clinical scenarios and modalities to develop robust medical imaging models. The code and pre-trained models are available at https://github.com/uni-medical/A-Eval.
Affiliation(s)
- Ziyan Huang: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Zhongying Deng: Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China; Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, CB2 1TN, United Kingdom
- Jin Ye: Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Haoyu Wang: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Yanzhou Su: Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Tianbin Li: Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Hui Sun: Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Junlong Cheng: Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Jianpin Chen: School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Junjun He: Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Yun Gu: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Shaoting Zhang: Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
- Lixu Gu: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yu Qiao: Shanghai Artificial Intelligence Laboratory, Shanghai, 200000, China
12. Samak ZA. Multi-type stroke lesion segmentation: comparison of single-stage and hierarchical approach. Med Biol Eng Comput 2025;63:975-986. [PMID: 39549224] [DOI: 10.1007/s11517-024-03243-4]
Abstract
Stroke, a major cause of death and disability worldwide, can be haemorrhagic or ischaemic depending on whether it is caused by bleeding or by blocked blood supply in the brain. Rapid and accurate identification of stroke type and lesion segmentation is critical for timely and effective treatment. However, existing research primarily focuses on segmenting a single stroke type, potentially limiting its clinical applicability. This study addresses this gap by exploring multi-type stroke lesion segmentation using deep learning methods. Specifically, we investigate two distinct approaches: a single-stage approach that directly segments all tissue types in one model, and a hierarchical approach that first classifies stroke types and then utilises specialised segmentation models for each subtype. Recognising the importance of accurate stroke classification for the hierarchical approach, we evaluate ResNet, ResNeXt and ViT networks, incorporating focal loss and oversampling techniques to mitigate the impact of class imbalance. We further explore the performance of U-Net, U-Net++ and DeepLabV3 models for segmentation within each approach. We use a comprehensive dataset of 6650 images provided by the Ministry of Health of the Republic of Türkiye, comprising 1130 ischaemic strokes, 1093 haemorrhagic strokes and 4427 non-stroke cases. In our comparative experiments, we achieve an AUC score of 0.996 when classifying stroke and non-stroke slices. For the lesion segmentation task, while the performance of different architectures is comparable, the hierarchical training approach outperforms the single-stage approach in terms of intersection over union (IoU): the IoU of the U-Net model increased significantly from 0.788 to 0.875 when the hierarchical approach was used. This comparative analysis aims to identify the most effective approach and deep learning model for multi-type stroke lesion segmentation in brain CT scans, potentially leading to improved clinical decision-making, treatment efficiency and outcomes.
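The focal loss mentioned above down-weights well-classified examples so the abundant non-stroke slices do not dominate training. A compact PyTorch version (the standard formulation, not the author's exact code) is:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Multi-class focal loss: (1 - p_t)^gamma scales the cross-entropy,
    shrinking the contribution of easy (high-confidence) examples."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    pt = torch.exp(-ce)                    # probability of the true class
    return ((1.0 - pt) ** gamma * ce).mean()

logits = torch.randn(8, 3)                 # ischaemic / haemorrhagic / non-stroke
targets = torch.randint(0, 3, (8,))
print(focal_loss(logits, targets))
```

With gamma = 0 this reduces to ordinary cross-entropy; larger gamma focuses training harder on the rare stroke classes.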
Affiliation(s)
- Zeynel A Samak: Department of Computer Engineering, Adiyaman University, Adiyaman, 02040, Türkiye
13. Lim H, Gi Y, Ko Y, Jo Y, Hong J, Kim J, Ahn SH, Park HC, Kim H, Chung K, Yoon M. A device-dependent auto-segmentation method based on combined generalized and single-device datasets. Med Phys 2025;52:2375-2383. [PMID: 39699056] [DOI: 10.1002/mp.17570]
Abstract
BACKGROUND Although generalized-dataset-based auto-segmentation models that consider various computed tomography (CT) scanners have shown great clinical potential, their application to medical images from unseen scanners remains challenging because of device-dependent image features. PURPOSE This study investigates the performance of a device-dependent auto-segmentation model based on a dataset combining a generalized dataset with a single-CT-scanner dataset. METHODS We constructed two training datasets for 21 chest and abdominal organs. The generalized dataset comprised 1203 publicly available multi-scanner scans. The device-dependent dataset comprised 1253 scans: the same 1203 multi-scanner scans plus 50 scans from a single CT scanner. Using these datasets, the generalized-dataset-based model (GDSM) and the device-dependent-dataset-based model (DDSM) were trained with nnU-Net and tested on ten scans from the single CT scanner. The evaluation metrics included the Dice similarity coefficient (DSC), the Hausdorff distance (HD), and the average symmetric surface distance (ASSD), which were used to assess the overall performance of the models. In addition, DSCdiff, HDratio, and ASSDratio, variations of the three metrics, were used to compare model performance across organs. RESULTS For the average DSC, the GDSM and DDSM had values of 0.9251 and 0.9323, respectively; for the average HD, 10.66 and 9.139 mm, respectively; and for the average ASSD, 0.8318 and 0.6656 mm, respectively. Compared with the GDSM, the DDSM showed consistent performance improvements of 0.78%, 14%, and 20% on the DSC, HD, and ASSD metrics, respectively. In addition, compared with the GDSM, the DDSM had better DSCdiff values in 14 of 21 tested organs, better HDratio values in 13 of 21, and better ASSDratio values in 14 of 21. The averages of all three variant metrics were better for the DDSM than for the GDSM. CONCLUSION The results suggest that combining the generalized dataset with a single-scanner dataset improves overall model performance on images from that device.
Affiliation(s)
- Hyeongjin Lim: Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea
- Yongha Gi: Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea
- Yousun Ko: Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea
- Yunhui Jo: Institute of Global Health Technology (IGHT), Korea University, Seoul, Republic of Korea
- Jinyoung Hong: Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea
- Sung Hwan Ahn: Department of Radiation Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Hee-Chul Park: Department of Radiation Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Haeyoung Kim: Department of Radiation Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Kwangzoo Chung: Department of Radiation Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Myonggeun Yoon: Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea; FieldCure Ltd, Seoul, Republic of Korea
14. Deng XW, Zhao HM, Jia LC, Li JN, Wei ZQ, Yang H, Qu A, Jiang WJ, Lei RH, Sun HT, Wang JJ, Jiang P. Prior Knowledge-Guided U-Net for Automatic Clinical Target Volume Segmentation in Postmastectomy Radiation Therapy of Breast Cancer. Int J Radiat Oncol Biol Phys 2025;121:1361-1371. [PMID: 39667584] [DOI: 10.1016/j.ijrobp.2024.11.104]
Abstract
PURPOSE This study aimed to design and evaluate a prior-knowledge-guided U-Net (PK-UNet) for automatic clinical target volume (CTV) segmentation in postmastectomy radiation therapy for breast cancer. METHODS AND MATERIALS A total of 102 computed tomography (CT) scans from breast cancer patients who underwent postmastectomy were retrospectively collected. Of these, 80 scans were used for training with 5-fold cross-validation, and 22 scans for independent testing. The CTV included the chest wall, supraclavicular region, and axillary group III. The proposed PK-UNet method employs a 2-stage auto-segmentation process. Initially, the localization network categorizes CT slices based on the anatomic information of the CTV and generates prior knowledge labels. These outputs, along with the CT images, were fed into the final segmentation network. Quantitative evaluation was conducted using the mean Dice similarity coefficient (DSC), 95% Hausdorff distance, average surface distance, and surface DSC. A four-level objective scale evaluation was performed by 2 experienced radiation oncologists in a randomized double-blind manner. RESULTS Quantitative evaluations revealed that PK-UNet significantly outperformed state-of-the-art segmentation methods (P < .01), with a mean DSC of 0.90 ± 0.02 and a 95% Hausdorff distance of 2.82 ± 1.29 mm. The mean average surface distance of PK-UNet was 0.91 ± 0.22 mm and the surface DSC was 0.84 ± 0.07, significantly surpassing the performance of AdwU-Net (P < .01) and showing comparable results to other models. Clinical evaluation confirmed the efficacy of PK-UNet, with 81.8% of the predicted contours being acceptable for clinical application. The advantages of the auto-segmentation capability of PK-UNet were most evident in the superior and inferior slices and in slices with discontinuities at the junctions of different subregions. The average manual correction time was reduced to 1.02 min, compared with 18.20 min for manual contouring, leading to a 94.4% reduction in working time. CONCLUSIONS This study introduced the pioneering integration of prior medical knowledge into a deep learning framework for postmastectomy radiation therapy. This strategy addresses the challenges of CTV segmentation in postmastectomy radiation therapy and improves clinical workflow efficiency.
Affiliation(s)
- Xiu-Wen Deng: Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
- Hong-Mei Zhao: Department of General Surgery, Peking University Third Hospital, Beijing, China
- Le-Cheng Jia: Department of Radiotherapy Research Collaboration, Shenzhen United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, China
- Jin-Na Li: Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
- Zi-Quan Wei: Department of Radiotherapy Research Collaboration, Shenzhen United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, China
- Hang Yang: Department of Radiotherapy Research Collaboration, United Imaging Research Institute of Intelligent Imaging, Beijing, China
- Ang Qu: Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
- Wei-Juan Jiang: Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
- Run-Hong Lei: Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
- Hai-Tao Sun: Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
- Jun-Jie Wang: Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
- Ping Jiang: Department of Radiation Oncology, Peking University Third Hospital, Beijing, China
15. Li P, Hu Y. Deep graph embedding based on Laplacian eigenmaps for MR fingerprinting reconstruction. Med Image Anal 2025;101:103481. [PMID: 39923317] [DOI: 10.1016/j.media.2025.103481]
Abstract
Magnetic resonance fingerprinting (MRF) is a promising technique for fast quantitative imaging of multiple tissue parameters. However, the highly undersampled schemes utilized in MRF typically lead to noticeable aliasing artifacts in reconstructed images. Existing model-based methods can mitigate aliasing artifacts and enhance reconstruction quality but suffer from long reconstruction times. In addition, data priors used in these methods, such as low-rank and total variation, make it challenging to incorporate non-local and non-linear redundancies in MRF data. Furthermore, existing deep learning-based methods for MRF often lack interpretability and struggle with the high computational overhead caused by the high dimensionality of MRF data. To address these issues, we introduce a novel deep graph embedding framework based on the Laplacian eigenmaps for improved MRF reconstruction. Our work first models the acquired high-dimensional MRF data and the corresponding parameter maps as graph data nodes. Then, we propose an MRF reconstruction framework based on the graph embedding framework, retaining intrinsic graph structures between parameter maps and MRF data. To improve the accuracy of the estimated graph structure and the computational efficiency of the proposed framework, we unroll the iterative optimization process into a deep neural network, incorporating a learned graph embedding module to adaptively learn the Laplacian eigenmaps. By introducing the graph embedding framework into the MRF reconstruction, the proposed method can effectively exploit non-local and non-linear correlations in MRF data. Numerical experiments demonstrate that our approach can reconstruct high-quality MRF data and multiple parameter maps within a significantly reduced computational cost.
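For readers unfamiliar with the classical construction the paper builds on, Laplacian eigenmaps embed data by taking the smallest non-trivial eigenvectors of a graph Laplacian built from pairwise affinities. A small dense NumPy/SciPy sketch of the classical (non-deep, non-unrolled) version follows; parameter choices are illustrative.

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from scipy.linalg import eigh

def laplacian_eigenmaps(X, k=10, n_components=2, sigma=1.0):
    """Classical Laplacian eigenmaps: heat-kernel kNN graph, normalized
    Laplacian, then the eigenvectors with the smallest non-zero eigenvalues."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. distances
    W = np.exp(-d2 / (2.0 * sigma ** 2))                  # heat-kernel affinities
    far = np.argsort(d2, axis=1)[:, k + 1:]               # beyond the k nearest
    np.put_along_axis(W, far, 0.0, axis=1)                # sparsify to kNN
    W = np.maximum(W, W.T)                                # symmetrize
    np.fill_diagonal(W, 0.0)                              # no self-loops
    L = laplacian(W, normed=True)
    vals, vecs = eigh(L)                                  # ascending eigenvalues
    return vecs[:, 1:n_components + 1]                    # skip the trivial one

emb = laplacian_eigenmaps(np.random.default_rng(0).normal(size=(200, 16)))
```

The paper's contribution is to learn these eigenmaps adaptively inside an unrolled reconstruction network rather than computing them once from fixed affinities, but the embedding target is this same object.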
Affiliation(s)
- Peng Li: The School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China
- Yue Hu: The School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China
16. Wang Y, Luo L, Wu M, Wang Q, Chen H. Learning robust medical image segmentation from multi-source annotations. Med Image Anal 2025;101:103489. [PMID: 39933334] [DOI: 10.1016/j.media.2025.103489]
Abstract
Collecting annotations from multiple independent sources could mitigate the impact of potential noises and biases from a single source, which is a common practice in medical image segmentation. However, learning segmentation networks from multi-source annotations remains a challenge due to the uncertainties brought by the variance of the annotations. In this paper, we proposed an Uncertainty-guided Multi-source Annotation Network (UMA-Net), which guided the training process by uncertainty estimation at both the pixel and the image levels. First, we developed an annotation uncertainty estimation module (AUEM) to estimate the pixel-wise uncertainty of each annotation, which then guided the network to learn from reliable pixels by a weighted segmentation loss. Second, a quality assessment module (QAM) was proposed to assess the image-level quality of the input samples based on the former estimated annotation uncertainties. Furthermore, instead of discarding the low-quality samples, we introduced an auxiliary predictor to learn from them and thus ensured the preservation of their representation knowledge in the backbone without directly accumulating errors within the primary predictor. Extensive experiments demonstrated the effectiveness and feasibility of our proposed UMA-Net on various datasets, including 2D chest X-ray segmentation dataset, 2D fundus image segmentation dataset, 3D breast DCE-MRI segmentation dataset, and the QUBIQ multi-task segmentation dataset. Code will be released at https://github.com/wangjin2945/UMA-Net.
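The pixel-level guidance in UMA-Net amounts to weighting the segmentation loss by estimated annotation reliability. A minimal sketch of such an uncertainty-weighted loss (our illustration of the general idea, not the released UMA-Net code) is:

```python
import torch
import torch.nn.functional as F

def uncertainty_weighted_bce(logits, annotation, uncertainty):
    """Per-pixel BCE down-weighted by annotation uncertainty in [0, 1],
    where 1 means the label at that pixel is considered unreliable."""
    loss = F.binary_cross_entropy_with_logits(logits, annotation, reduction="none")
    weights = 1.0 - uncertainty                 # trust reliable pixels more
    return (weights * loss).sum() / weights.sum().clamp_min(1e-7)

logits = torch.randn(2, 1, 64, 64)
annotation = torch.randint(0, 2, (2, 1, 64, 64)).float()
uncertainty = torch.rand(2, 1, 64, 64)          # e.g., a per-pixel AUEM estimate
print(uncertainty_weighted_bce(logits, annotation, uncertainty))
```

Averaging such per-pixel weights over an image also yields a natural image-level quality score, which mirrors how the QAM stage builds on the AUEM estimates.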
Affiliation(s)
- Yifeng Wang: Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Luyang Luo: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
- Qiong Wang: Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Hao Chen: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China; State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Hong Kong, China
17. Huang X, Qin M, Fang M, Wang Z, Hu C, Zhao T, Qin Z, Zhu H, Wu L, Yu G, De Cobelli F, Xie X, Palumbo D, Tian J, Dong D. The application of artificial intelligence in upper gastrointestinal cancers. J Natl Cancer Cent 2025;5:113-131. [PMID: 40265096] [PMCID: PMC12010392] [DOI: 10.1016/j.jncc.2024.12.006]
Abstract
Upper gastrointestinal cancers, mainly comprising esophageal and gastric cancers, are among the most prevalent cancers worldwide. There are many new cases of upper gastrointestinal cancers annually, and the survival rate tends to be low. Therefore, timely screening, precise diagnosis, appropriate treatment strategies, and effective prognosis are crucial for patients with upper gastrointestinal cancers. In recent years, an increasing number of studies suggest that artificial intelligence (AI) technology can effectively address clinical tasks related to upper gastrointestinal cancers. These studies mainly focus on four aspects: screening, diagnosis, treatment, and prognosis. In this review, we focus on the application of AI technology in clinical tasks related to upper gastrointestinal cancers. Firstly, the basic application pipelines of radiomics and deep learning in medical image analysis were introduced. Furthermore, we separately reviewed the application of AI technology in the aforementioned aspects for both esophageal and gastric cancers. Finally, the current limitations and challenges faced in the field of upper gastrointestinal cancers were summarized, and explorations were conducted on the selection of AI algorithms in various scenarios, the popularization of early screening, the clinical applications of AI, and large multimodal models.
Affiliation(s)
- Xiaoying Huang: CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Minghao Qin: CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; University of Science and Technology Beijing, Beijing, China
- Mengjie Fang: Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, Beijing, China; Key Laboratory of Big Data-Based Precision Medicine, Beihang University, Ministry of Industry and Information Technology, Beijing, China
- Zipei Wang: CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Chaoen Hu: CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Tongyu Zhao: CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; University of Science and Technology of China, Hefei, China
- Zhuyuan Qin: Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; Beijing University of Chinese Medicine, Beijing, China
- Ling Wu: KiangWu Hospital, Macau, China
- Diego Palumbo: Department of Radiology, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Jie Tian: CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, Beijing, China; Key Laboratory of Big Data-Based Precision Medicine, Beihang University, Ministry of Industry and Information Technology, Beijing, China
- Di Dong: CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
18. Liu S, Zhang R, Fang M, Li H, Xun T, Wang Z, Shang W, Tian J, Dong D. PCRFed: personalized federated learning with contrastive representation for non-independently and identically distributed medical image segmentation. Vis Comput Ind Biomed Art 2025;8:6. [PMID: 40153099] [PMCID: PMC11953490] [DOI: 10.1186/s42492-025-00191-0]
Abstract
Federated learning (FL) has shown great potential in addressing data privacy issues in medical image analysis. However, varying data distributions across different sites can create challenges in aggregating client models and achieving good global model performance. In this study, we propose a novel personalized contrastive representation FL framework, named PCRFed, which leverages contrastive representation learning to address the non-independent and identically distributed (non-IID) challenge and dynamically adjusts the distance between local clients and the global model to improve each client's performance without incurring additional communication costs. The proposed weighted model-contrastive loss provides additional regularization for local models, optimizing their respective distributions while effectively utilizing information from all clients to mitigate performance challenges caused by insufficient local data. The PCRFed approach was evaluated on two non-IID medical image segmentation datasets, and the results show that it outperforms several state-of-the-art FL frameworks, achieving higher single-client performance while ensuring privacy preservation and minimal communication costs. Our PCRFed framework can be adapted to various encoder-decoder segmentation network architectures and holds significant potential for advancing the use of FL in real-world medical applications. Based on a multi-center dataset, our framework demonstrates superior overall performance and higher single-client performance, achieving a 2.63% increase in the average Dice score for prostate segmentation.
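The weighted model-contrastive regularizer at the heart of PCRFed follows the general MOON-style pattern: pull the local representation toward the global model's and push it away from the previous local model's. The sketch below shows that generic pattern with a static client weight; PCRFed's dynamic distance adjustment is more elaborate, so treat this as an illustration only.

```python
import torch
import torch.nn.functional as F

def model_contrastive_loss(z_local, z_global, z_prev, tau=0.5, weight=1.0):
    """MOON-style contrastive term on (B, D) representation batches:
    the global model's representation is the positive, the previous
    local model's is the negative; `weight` scales the regularization."""
    pos = F.cosine_similarity(z_local, z_global, dim=-1) / tau
    neg = F.cosine_similarity(z_local, z_prev, dim=-1) / tau
    logits = torch.stack([pos, neg], dim=-1)            # positive at index 0
    labels = torch.zeros(z_local.size(0), dtype=torch.long)
    return weight * F.cross_entropy(logits, labels)

z = [torch.randn(4, 128) for _ in range(3)]
print(model_contrastive_loss(*z))
```

Because the term is computed locally from models each client already holds, it adds regularization without any extra communication, matching the cost claim above.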
Affiliation(s)
- Shengyuan Liu: Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, 999077, China
- Ruofan Zhang: School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China; Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Mengjie Fang: Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, 100191, China
- Hailin Li: Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, 100191, China; The Artificial Intelligence and Intelligent Operation Center, China Mobile Research Institute, Beijing, 100053, China
- Tianwang Xun: School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China; Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Zipei Wang: School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China; Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Wenting Shang: School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China; Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Jie Tian: School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China; Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, 100191, China
- Di Dong: School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China; Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
19. Jin R, Tang HY, Yang Q, Chen W. LA-ResUNet: Attention-based network for longitudinal liver tumor segmentation from CT images. Comput Med Imaging Graph 2025;123:102536. [PMID: 40168844] [DOI: 10.1016/j.compmedimag.2025.102536]
Abstract
Longitudinal liver tumor segmentation plays a fundamental role in studying and monitoring the progression of associated diseases. The correlations and differences between longitudinal data can further improve segmentation performance but are inevitably omitted in single-time-point segmentation. However, there has been no research in this field due to the lack of relevant data. To address this issue, we collect and annotate the first longitudinal liver tumor segmentation benchmark dataset. We present a novel strategy that utilizes images from one time point to facilitate the image segmentation from another time point of the same patient. On this basis, we propose a longitudinal attention-based residual U-shaped network. Within it, a channel & spatial attention module quantifies both channel-wise and spatial-wise dependencies of each feature to refine feature representations, and a longitudinal co-segmentation module captures cross-temporal correlation to recalibrate the feature at one time point according to another for enhanced segmentation. Longitudinal segmentation is achieved by plugging these two multi-scale modules into each layer of the backbone network. Extensive experiments on our CT liver tumor dataset and an MRI brain tumor dataset have validated the effectiveness of the established strategy and the longitudinal segmentation ability of our network. Ablation studies have verified the functions of the proposed modules and their respective components.
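The channel & spatial attention module described here follows the familiar two-gate pattern (channel gate, then spatial gate) popularized by CBAM. A compact PyTorch sketch of that generic pattern, not the LA-ResUNet code itself, is:

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """CBAM-style refinement: a channel gate from pooled descriptors,
    followed by a spatial gate from channel-pooled maps."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))             # avg-pooled descriptor
        mx = self.mlp(x.amax(dim=(2, 3)))              # max-pooled descriptor
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)   # channel gate
        s = torch.cat([x.mean(1, keepdim=True),
                       x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))      # spatial gate

y = ChannelSpatialAttention(64)(torch.randn(1, 64, 32, 32))
```

The longitudinal co-segmentation module is the genuinely novel piece (cross-temporal recalibration between scans of the same patient) and is not reproduced here.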
Affiliation(s)
- Ri Jin: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Hu-Ying Tang: Department of Radiology, The First Affiliated Hospital of the Army Medical University (Southwest Hospital), Chongqing 400038, China
- Qian Yang: Department of Paediatrics, Sichuan Provincial People's Hospital, Chengdu 610072, China
- Wei Chen: Department of Radiology, The First Affiliated Hospital of the Army Medical University (Southwest Hospital), Chongqing 400038, China
20
|
Hossain MS, Basak N, Mollah MA, Nahiduzzaman M, Ahsan M, Haider J. Ensemble-based multiclass lung cancer classification using hybrid CNN-SVD feature extraction and selection method. PLoS One 2025; 20:e0318219. [PMID: 40106514 PMCID: PMC11922248 DOI: 10.1371/journal.pone.0318219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2024] [Accepted: 01/10/2025] [Indexed: 03/22/2025] Open
Abstract
Lung cancer (LC) is a leading cause of cancer-related fatalities worldwide, underscoring the urgency of early detection for improved patient outcomes. The main objective of this research is to harness novel artificial intelligence strategies for identifying and classifying lung cancers more precisely from CT scan images at an early stage. This study introduces a novel lung cancer detection method, which is mainly based on Convolutional Neural Networks (CNNs) and is customized for binary and multiclass classification using a publicly available dataset of chest CT scan images of lung cancer. The main contribution of this research lies in its use of a hybrid CNN-SVD (Singular Value Decomposition) method and a robust voting ensemble approach, which results in superior accuracy and effectiveness by mitigating potential errors. By employing contrast-limited adaptive histogram equalization (CLAHE), contrast-enhanced images were generated with minimal noise and prominent distinctive features. Subsequently, a CNN-SVD-Ensemble model was implemented to extract important features and reduce dimensionality. The extracted features were then processed by a set of machine learning (ML) algorithms along with a voting ensemble approach. Additionally, Gradient-weighted Class Activation Mapping (Grad-CAM) was integrated as an explainable AI (XAI) technique for enhancing model transparency by highlighting key influencing regions in the CT scans, which improved interpretability and ensured reliable and trustworthy results for clinical applications. This research achieved state-of-the-art results, with an accuracy, AUC, precision, recall, F1 score, Cohen's Kappa and Matthews Correlation Coefficient (MCC) of 99.49%, 99.73%, 100%, 99%, 99%, 99.15% and 99.16%, respectively, addressing prior research gaps and setting a new benchmark in the field. Furthermore, in binary classification, all the performance indicators attained a perfect score of 100%. The robustness of the suggested approach offers more reliable and impactful insights in the medical field, thus improving existing knowledge and setting the stage for future innovations.
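As a rough illustration of two of the named ingredients, CLAHE contrast enhancement and SVD-based feature reduction, the sketch below uses OpenCV and scikit-learn; the clip limit, tile size, feature dimensions, and the `enhance_ct_slice` helper are illustrative assumptions rather than the paper's exact pipeline.

```python
# Illustrative sketch of CLAHE enhancement and SVD feature reduction;
# parameters (clip limit, tile size, n_components) are assumptions.
import cv2
import numpy as np
from sklearn.decomposition import TruncatedSVD

def enhance_ct_slice(img_u8: np.ndarray) -> np.ndarray:
    """Apply contrast-limited adaptive histogram equalization to a grayscale slice."""
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(img_u8)

slice_u8 = (np.random.rand(512, 512) * 255).astype(np.uint8)
enhanced = enhance_ct_slice(slice_u8)

# Suppose `cnn_features` are deep features extracted per image (n_samples, n_dims).
cnn_features = np.random.rand(100, 2048).astype(np.float32)
svd = TruncatedSVD(n_components=128, random_state=0)
reduced = svd.fit_transform(cnn_features)   # compact features fed to the ML ensemble
print(reduced.shape)                        # (100, 128)
```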
Collapse
Affiliation(s)
- Md Sabbir Hossain
- Department of Electronics & Telecommunication Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
| | - Niloy Basak
- Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
| | - Md Aslam Mollah
- Department of Electronics & Telecommunication Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
| | - Md Nahiduzzaman
- Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
| | - Mominul Ahsan
- Department of Computer Science, University of York, York, United Kingdom
| | - Julfikar Haider
- Department of Engineering, Manchester Metropolitan University, Manchester, United Kingdom
| |
Collapse
|
21
|
Chang A, Tao X, Huang Y, Yang X, Zeng J, Zhou X, Huang R, Ni D. P2ED: A four-quadrant framework for progressive prompt enhancement in 3D interactive medical imaging segmentation. Neural Netw 2025; 183:106973. [PMID: 39647317 DOI: 10.1016/j.neunet.2024.106973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2024] [Revised: 11/04/2024] [Accepted: 11/26/2024] [Indexed: 12/10/2024]
Abstract
Interactive segmentation allows active user participation to enhance output quality and resolve ambiguities. This may be especially indispensable for medical image segmentation, which must address complex anatomy and be customized to varying user requirements. Existing approaches often encounter issues such as information dilution, limited adaptability to diverse user interactions, and insufficient responsiveness. To address these challenges, we present a novel 3D interactive framework, P2ED, that divides the task into four quadrants. It is equipped with a multi-granular prompt encrypter to extract prompt features from various hierarchical levels, along with a progressive hierarchical prompt decrypter to adaptively heighten attention to the scarce prompt features along three spatial axes. Finally, a calibration module is appended to further align the prediction with user intentions. Extensive experiments demonstrate that the proposed P2ED achieves accurate results with fewer user interactions compared to state-of-the-art methods and is effective in raising the upper limit of segmentation performance. The code will be released at https://github.com/chuyhu/P2ED.
Collapse
Affiliation(s)
- Ao Chang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Xing Tao
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Yuhao Huang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Xin Yang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Jiajun Zeng
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Xinrui Zhou
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Ruobing Huang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China.
| | - Dong Ni
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China.
| |
Collapse
|
22
|
Wang C, Jiang M, Li Y, Wei B, Li Y, Wang P, Yang G. MP-FocalUNet: Multiscale parallel focal self-attention U-Net for medical image segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2025; 260:108562. [PMID: 39675195 DOI: 10.1016/j.cmpb.2024.108562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/17/2023] [Revised: 12/05/2024] [Accepted: 12/08/2024] [Indexed: 12/17/2024]
Abstract
BACKGROUND AND OBJECTIVE Medical image segmentation has improved significantly in recent years with the progress of Convolutional Neural Networks (CNNs). Due to the inherent limitations of convolutional operations, CNNs perform poorly in learning the correlation information between global and long-range features. To solve this problem, some existing solutions rely on building deep encoders and down-sampling operations, but such methods are prone to produce redundant network structures and lose local details. Therefore, medical image segmentation tasks require better solutions that improve the modeling of the global context while maintaining a strong grasp of low-level details. METHODS We propose a novel multiscale parallel branch architecture (MP-FocalUNet). On the encoder side of MP-FocalUNet, dual-scale sub-networks are used to extract information at different scales. A cross-scale "Feature Fusion" (FF) module is proposed to explore the potential of dual-branch networks and fully utilize feature representations at different scales. On the decoder side, focal self-attention, combined in parallel with a traditional CNN, is used for long-distance modeling, which can effectively capture global dependencies and underlying spatial details in a shallower way. RESULTS Our proposed method is evaluated on both abdominal organ segmentation datasets and automatic cardiac diagnosis challenge datasets. Our method consistently outperforms several state-of-the-art segmentation methods with an average Dice score of 82.45% (2.68% higher than HC-Net) and 91.44% (0.35% higher than HC-Net) on the abdominal organ datasets and the automatic cardiac diagnosis challenge datasets, respectively. CONCLUSIONS Our MP-FocalUNet is a novel encoder-decoder based multiscale parallel branch Transformer network, which solves the problem of insufficient long-distance modeling in CNNs and fuses image information at different scales. Extensive experiments on abdominal and cardiac medical image segmentation tasks show that our MP-FocalUNet outperforms other state-of-the-art methods. In the future, our work will focus on designing more lightweight Transformer-based models and better learning the pixel-level intrinsic structural features generated by patch division in visual Transformers.
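A minimal sketch of cross-scale feature fusion between a fine and a coarse branch, in the spirit of the FF module, might look as follows in PyTorch; the `FeatureFusion` class, channel sizes, and fusion recipe are assumptions, not the published architecture.

```python
# Minimal sketch of cross-scale feature fusion between dual-branch encoders;
# the actual MP-FocalUNet FF module is more elaborate.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusion(nn.Module):
    def __init__(self, fine_ch: int, coarse_ch: int, out_ch: int):
        super().__init__()
        self.proj = nn.Conv2d(fine_ch + coarse_ch, out_ch, kernel_size=1)
        self.refine = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, fine, coarse):
        # Upsample the low-resolution branch to the fine branch's spatial size.
        coarse_up = F.interpolate(coarse, size=fine.shape[-2:], mode="bilinear",
                                  align_corners=False)
        fused = self.proj(torch.cat([fine, coarse_up], dim=1))
        return F.relu(self.refine(fused))

fine = torch.randn(1, 32, 64, 64)    # high-resolution branch features
coarse = torch.randn(1, 64, 32, 32)  # low-resolution branch features
out = FeatureFusion(32, 64, 48)(fine, coarse)  # (1, 48, 64, 64)
```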
Collapse
Affiliation(s)
- Chuan Wang
- School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
| | - Mingfeng Jiang
- School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
| | - Yang Li
- School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China.
| | - Bo Wei
- School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
| | - Yongming Li
- College of Communication Engineering, Chongqing University, Chongqing, China
| | - Pin Wang
- College of Communication Engineering, Chongqing University, Chongqing, China
| | - Guang Yang
- Cardiovascular Research Centre, Royal Brompton Hospital, London SW3 6NP, United Kingdom; National Heart and Lung Institute, Imperial College London, London SW7 2AZ, United Kingdom
| |
Collapse
|
23
|
Guo X, Sun L. Evaluation of stroke sequelae and rehabilitation effect on brain tumor by neuroimaging technique: A comparative study. PLoS One 2025; 20:e0317193. [PMID: 39992898 PMCID: PMC11849865 DOI: 10.1371/journal.pone.0317193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2024] [Accepted: 12/22/2024] [Indexed: 02/26/2025] Open
Abstract
This study addresses the limitations of traditional methods in evaluating stroke sequelae and monitoring rehabilitation, especially the accurate identification and tracking of brain injury areas. To overcome these challenges, we introduce an advanced deep learning-based neuroimaging technique, the SWI-BITR-UNet model. This novel machine learning (ML) model combines the Swin Transformer's local receptive field and shift mechanism with the effective feature fusion strategy of the U-Net architecture, aiming to improve the accuracy of brain lesion region segmentation in multimodal MRI scans. Through the application of a 3D CNN encoder and decoder, as well as the integration of the CBAM attention module and skip connections, the model can finely capture and refine features, achieving a level of segmentation accuracy comparable to that of manual segmentation by experts. This study introduces a 3D CNN encoder-decoder architecture specifically designed to enhance the processing of 3D medical imaging data. The 3D CNN model is trained using the Adam optimization algorithm. The BraTS2020 dataset is utilized to assess the accuracy of the proposed deep learning neural network. By employing skip connections, the model effectively integrates the high-resolution features from the encoder with the up-sampled features from the decoder, thereby increasing the model's sensitivity to 3D spatial characteristics. To assess both the training and testing phases, the SWI-BITR-UNet model is trained on reliable datasets and evaluated through a comprehensive array of statistical metrics, including Recall (Rec), Precision (Pre), F1 score, Kappa Coefficient (KC), mean Intersection over Union (mIoU), and Receiver Operating Characteristic-Area Under Curve (ROC-AUC). Furthermore, various machine learning models, such as Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), Adaptive Boosting (AdaBoost), and K-Nearest Neighbor (KNN), have been employed to analyze tumor progression in the brain, with performance characterized by the Hausdorff distance. Among these ML models, the SWI-BITR-UNet model was the most accurate. Regarding Dice coefficient values, the segmentation maps (annotation maps of brain tumor distributions) generated by the ML models indicated the models' capability to autonomously delineate areas such as the tumor core (TC) and the enhancing tumor (ET). Moreover, the efficacy of the proposed models demonstrated superiority over existing research in the field. The computational efficiency and the ability to handle long-distance dependencies make the model particularly suitable for applications in clinical settings. The results showed that the SWI-BITR-UNet model can not only effectively identify and monitor subtle changes in the stroke injury area, but also serve as a new and efficient tool in the rehabilitation process, providing a scientific basis for developing personalized rehabilitation plans.
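Two of the cited overlap metrics are straightforward to compute from binary masks; the sketch below is a generic implementation, not code from the study.

```python
# Generic Dice coefficient and IoU for binary segmentation masks.
import numpy as np

def dice_coefficient(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

def iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return (inter + eps) / (union + eps)

pred = np.random.rand(128, 128) > 0.5
gt = np.random.rand(128, 128) > 0.5
print(dice_coefficient(pred, gt), iou(pred, gt))
```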
Collapse
Affiliation(s)
- Xueliang Guo
- Medical Department of Neurology, Shengzhou People’s Hospital, Shengzhou, Zhejiang, China
| | - Lin Sun
- Laboratory Department, Shengzhou People’s Hospital, Shengzhou, Zhejiang, China
| |
Collapse
|
24
|
An Q, Oda H, Hayashi Y, Kitasaka T, Uchida H, Hinoki A, Suzuki K, Takimoto A, Oda M, Mori K. Multi-dimensional consistency learning between 2D Swin U-Net and 3D U-Net for intestine segmentation from CT volume. Int J Comput Assist Radiol Surg 2025:10.1007/s11548-024-03252-6. [PMID: 39985731 DOI: 10.1007/s11548-024-03252-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2024] [Accepted: 08/07/2024] [Indexed: 02/24/2025]
Abstract
PURPOSE This paper introduces a novel two-step network based on semi-supervised learning for intestine segmentation from CT volumes. The intestine folds within the abdomen into complex spatial structures and contacts neighboring organs, which makes accurate segmentation and pixel-level labeling difficult. We propose a multi-dimensional consistency learning method to reduce the insufficient intestine segmentation results caused by complex structures and the limited labeled dataset. METHODS We designed a two-stage model to segment the intestine. In stage 1, a 2D Swin U-Net is trained using labeled data to generate pseudo-labels for unlabeled data. In stage 2, a 3D U-Net is trained using labeled and unlabeled data to create the final segmentation model. The model comprises two networks operating in different dimensions, capturing more comprehensive representations of the intestine and potentially enhancing the model's performance in intestine segmentation. RESULTS We used 59 CT volumes to validate the effectiveness of our method. The experiment was repeated three times, and the average was taken as the final result. Compared to the baseline method, our method improved the Dice score by 3.25% and the recall rate by 6.84%. CONCLUSION The proposed method is based on semi-supervised learning and involves training both a 2D Swin U-Net and a 3D U-Net. The method mitigates the impact of limited labeled data and maintains consistency between the multi-dimensional outputs of the two networks to improve segmentation accuracy. Compared to previous methods, our method demonstrates superior segmentation performance.
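Stage 1's pseudo-label generation can be sketched as follows, assuming a trained 2D network applied slice by slice; `swin_unet_2d` is a placeholder name and the slicing convention is an assumption.

```python
# Sketch of stage-1 pseudo-label generation for the semi-supervised pipeline;
# `swin_unet_2d` is a placeholder for the trained 2D Swin U-Net.
import torch

@torch.no_grad()
def generate_pseudo_labels(swin_unet_2d, volume: torch.Tensor) -> torch.Tensor:
    """Run the 2D model slice by slice over a CT volume (D, H, W) and
    stack the per-slice predictions into a 3D pseudo-label volume."""
    swin_unet_2d.eval()
    labels = []
    for z in range(volume.shape[0]):
        logits = swin_unet_2d(volume[z].unsqueeze(0).unsqueeze(0))  # (1, C, H, W)
        labels.append(logits.argmax(dim=1).squeeze(0))              # (H, W)
    return torch.stack(labels, dim=0)  # (D, H, W), used to train the 3D U-Net
```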
Collapse
Affiliation(s)
- Qin An
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, 4648601, Japan
| | - Hirohisa Oda
- School of Management and Informatics, University of Shizuoka, Suruga-ku, Shizuoka, 4228526, Japan
| | - Yuichiro Hayashi
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, 4648601, Japan
| | - Takayuki Kitasaka
- School of Information Science, Aichi Institute of Technology, Toyota, Aichi, 4700392, Japan
| | - Hiroo Uchida
- Graduate School of Medicine, Nagoya University, Nagoya, Aichi, 4668550, Japan
| | - Akinari Hinoki
- Graduate School of Medicine, Nagoya University, Nagoya, Aichi, 4668550, Japan
| | - Kojiro Suzuki
- Department of Radiology, Aichi Medical University, Nagakute, Aichi, 4801195, Japan
| | - Aitaro Takimoto
- Graduate School of Medicine, Nagoya University, Nagoya, Aichi, 4668550, Japan
| | - Masahiro Oda
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, 4648601, Japan
- Information Technology Center, Nagoya University, Nagoya, Aichi, 4648601, Japan
| | - Kensaku Mori
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, 4648601, Japan.
- Information Technology Center, Nagoya University, Nagoya, Aichi, 4648601, Japan.
- Research Center for Medical Bigdata, National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 1018430, Japan.
| |
Collapse
|
25
|
Raith S, Deitermann M, Pankert T, Li J, Modabber A, Hölzle F, Hildebrand F, Eschweiler J. Multi-label segmentation of carpal bones in MRI using expansion transfer learning. Phys Med Biol 2025; 70:055004. [PMID: 39823747 DOI: 10.1088/1361-6560/adabae] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2024] [Accepted: 01/17/2025] [Indexed: 01/20/2025]
Abstract
Objective. The purpose of this study was to develop a robust deep learning approach, trained with a small in-vivo MRI dataset, for multi-label segmentation of all eight carpal bones for therapy planning and wrist dynamic analysis. Approach. A small dataset of fifteen 3.0-T MRI scans from five healthy subjects was employed in this study. The MRI data were variable with respect to the field of view (FOV), a wide range of image intensities, and joint pose. A two-stage segmentation pipeline using a modified 3D U-Net was proposed. In the first stage, a novel architecture, introduced as expansion transfer learning (ETL), cascades the use of a focused region of interest (ROI) cropped around the ground truth for pretraining and a subsequent transfer, by expansion to the original FOV, for a primary prediction. The bounding box around the generated ROI was utilized in the second stage for high-accuracy, labeled segmentations of the eight carpal bones. Different metrics, including the Dice similarity coefficient (DSC), average surface distance (ASD), and Hausdorff distance (HD), were used to evaluate performance between the proposed and four state-of-the-art approaches. Main results. With an average DSC of 87.8%, an ASD of 0.46 mm, and an average HD of 2.42 mm on all datasets (96.1%, 0.16 mm, and 1.38 mm on 12 datasets after applying exclusion criteria, respectively), the proposed approach showed the strongest overall performance among the compared methods. Significance. To the best of our knowledge, this is the first CNN-based multi-label segmentation approach for human carpal bones in MRI. The ETL introduced in this work improved the ability to localize a small ROI in a large FOV. Overall, the interplay of a two-stage approach and ETL culminated in convincingly accurate segmentation scores despite a very small amount of image data.
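The ETL idea of focusing on a cropped ROI before expanding back to the full FOV relies on a bounding box around a mask; a generic sketch of that cropping step is shown below (the margin size is an assumption).

```python
# Sketch of deriving a stage-2 crop from a coarse mask, as in a two-stage
# pipeline: find the bounding box of the ROI and expand it by a safety margin.
import numpy as np

def bounding_box_with_margin(mask: np.ndarray, margin: int = 8):
    """Return slices cropping a 3D binary mask's bounding box, padded by `margin`."""
    coords = np.argwhere(mask)
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    return tuple(slice(l, h) for l, h in zip(lo, hi))

mask = np.zeros((64, 128, 128), dtype=bool)
mask[20:30, 40:70, 50:90] = True
crop = bounding_box_with_margin(mask)
roi = mask[crop]  # focused ROI passed to the high-accuracy multi-label stage
```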
Collapse
Affiliation(s)
- Stefan Raith
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany
- Inzipio GmbH, Aachen, Germany
| | | | - Tobias Pankert
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany
- Inzipio GmbH, Aachen, Germany
| | - Jianzhang Li
- State IJR Center of Aerospace Design and Additive Manufacturing, School of Mechanical Engineering, Northwestern Polytechnical University, Xi'an, People's Republic of China
- Department of Orthopaedics, Trauma and Reconstructive Surgery, University Hospital RWTH Aachen, Aachen, Germany
| | - Ali Modabber
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany
| | - Frank Hölzle
- Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany
| | - Frank Hildebrand
- Department of Orthopaedics, Trauma and Reconstructive Surgery, University Hospital RWTH Aachen, Aachen, Germany
| | - Jörg Eschweiler
- Department of Trauma and Reconstructive Surgery, BG Hospital Bergmannstrost, Halle (Saale), Germany
- Department of Trauma and Reconstructive Surgery, University Hospital, Halle (Saale), Germany
| |
Collapse
|
26
|
Xie J, Zhou J, Yang M, Xu L, Li T, Jia H, Gong Y, Li X, Song B, Wei Y, Liu M. Lesion segmentation method for multiple types of liver cancer based on balanced dice loss. Med Phys 2025. [PMID: 39945728 DOI: 10.1002/mp.17624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2024] [Revised: 11/14/2024] [Accepted: 12/13/2024] [Indexed: 03/17/2025] Open
Abstract
BACKGROUND Obtaining accurate segmentation regions for liver cancer is of paramount importance for the clinical diagnosis and treatment of the disease. In recent years, a large number of deep learning-based liver cancer segmentation methods have been proposed to assist radiologists. Due to the differences in characteristics between different types of liver tumors and data imbalance, it is difficult to train a deep model that can achieve accurate segmentation for multiple types of liver cancer. PURPOSE In this paper, we propose a balanced Dice loss (BD Loss) function for balanced learning of segmentation features across multiple categories. We also introduce a comprehensive method based on BD Loss to achieve accurate segmentation of multiple categories of liver cancer. MATERIALS AND METHODS We retrospectively collected computed tomography (CT) screening images and tumor segmentations of 591 patients with malignant liver tumors from West China Hospital of Sichuan University. We use the proposed BD Loss to train a deep model that can segment multiple types of liver tumors and, through a greedy parameter averaging (GPA) algorithm, obtain a more generalized segmentation model. Finally, we employ model integration and our proposed post-processing method, which leverages inter-slice information, to achieve more accurate segmentation of liver cancer lesions. RESULTS We evaluated the performance of our proposed automatic liver cancer segmentation method on the dataset we collected. The proposed BD Loss can effectively mitigate the adverse effects of data imbalance on the segmentation model. Our proposed method achieves a Dice per case (DPC) of 0.819 (95% CI 0.798-0.841), significantly higher than the baseline, which achieves a DPC of 0.768 (95% CI 0.740-0.796). CONCLUSIONS The differences in CT images between different types of liver cancer require deep learning models to learn distinct features. Our method addresses this challenge, enabling balanced and accurate segmentation performance across multiple types of liver cancer.
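A plausible minimal form of a class-balanced Dice loss, with per-class terms weighted inversely to class frequency, is sketched below in PyTorch; the exact weighting scheme of the paper's BD Loss is not reproduced here, so treat this as an assumption-laden illustration.

```python
# Minimal sketch of a class-balanced Dice loss: per-class Dice terms are
# weighted inversely to class frequency so rare tumor types are not swamped.
# The weighting scheme is an assumption; the paper's BD Loss may differ.
import torch
import torch.nn.functional as F

def balanced_dice_loss(logits, target, class_weights, eps=1e-6):
    """logits: (B, C, ...); target: (B, ...) integer labels; class_weights: (C,)."""
    num_classes = logits.shape[1]
    probs = F.softmax(logits, dim=1)
    onehot = F.one_hot(target, num_classes).movedim(-1, 1).float()
    dims = tuple(range(2, probs.ndim))
    inter = (probs * onehot).sum(dim=dims)              # (B, C)
    denom = probs.sum(dim=dims) + onehot.sum(dim=dims)  # (B, C)
    dice = (2 * inter + eps) / (denom + eps)
    w = class_weights / class_weights.sum()
    return (w * (1 - dice)).sum(dim=1).mean()

logits = torch.randn(2, 4, 32, 32, 32)                   # e.g., 4 lesion classes
target = torch.randint(0, 4, (2, 32, 32, 32))
weights = 1.0 / torch.tensor([0.85, 0.05, 0.06, 0.04])   # inverse class frequency
loss = balanced_dice_loss(logits, target, weights)
```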
Collapse
Affiliation(s)
- Jun Xie
- Information Technology Center, West China Hospital of Sichuan University, Chengdu, China
- Information Technology Center, People's Hospital of Sanya, Sanya, Hainan, China
| | - Jiajun Zhou
- School of Computer Science and Engineering, University of Electronic Science and technology of China, Chengdu, Sichuan, China
| | - Meiyi Yang
- School of Computer Science and Engineering, University of Electronic Science and technology of China, Chengdu, Sichuan, China
| | - Lifeng Xu
- Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, Zhejiang, China
| | - Tongtong Li
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, Sichuan, China
| | - Haoyang Jia
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, Sichuan, China
| | - Yu Gong
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, Sichuan, China
| | - Xiansong Li
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, Sichuan, China
| | - Bin Song
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, Sichuan, China
| | - Yi Wei
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, Sichuan, China
| | - Ming Liu
- Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou, Zhejiang, China
- Yangtze Delta Region Institute(Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
| |
Collapse
|
27
|
Tang R, Zhao H, Tong Y, Mu R, Wang Y, Zhang S, Zhao Y, Wang W, Zhang M, Liu Y, Gao J. A frequency attention-embedded network for polyp segmentation. Sci Rep 2025; 15:4961. [PMID: 39929863 PMCID: PMC11811025 DOI: 10.1038/s41598-025-88475-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2024] [Accepted: 01/28/2025] [Indexed: 02/13/2025] Open
Abstract
Gastrointestinal polyps are observed and treated under endoscopy, which makes advancing the segmentation of polyps in endoscopic imaging both important and challenging. Current methodologies often falter in distinguishing complex polyp structures within diverse (mucosal) tissue environments. In this paper, we propose the Frequency Attention-Embedded Network (FAENet), a novel approach leveraging frequency-based attention mechanisms to significantly enhance polyp segmentation accuracy. FAENet segregates and processes image data into high- and low-frequency components, enabling precise delineation of polyp boundaries and internal structures by integrating intra-component and cross-component attention mechanisms. This method not only preserves essential edge details but also attentively refines the learned representations, ensuring robust segmentation across varied imaging conditions. Comprehensive evaluations on two public datasets, Kvasir-SEG and CVC-ClinicDB, demonstrate FAENet's superiority over several state-of-the-art models in terms of Dice coefficient, Intersection over Union (IoU), sensitivity, and specificity. The results affirm that FAENet's advanced attention mechanisms significantly improve segmentation quality, outperforming traditional and contemporary techniques. FAENet's success indicates its potential to revolutionize polyp segmentation in clinical practice, fostering accurate diagnosis and efficient treatment of gastrointestinal polyps.
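The high/low-frequency decomposition that frequency attention operates on can be illustrated with a Gaussian mask in the Fourier domain; the cutoff `sigma` and the residual definition of the high-frequency part are assumptions, not FAENet's exact decomposition.

```python
# Sketch of separating an image into low- and high-frequency components with
# a Gaussian low-pass mask in the Fourier domain.
import numpy as np

def frequency_split(img: np.ndarray, sigma: float = 10.0):
    """Return (low_freq, high_freq) components of a 2D grayscale image."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    y, x = np.mgrid[:h, :w]
    d2 = (y - h / 2) ** 2 + (x - w / 2) ** 2
    lowpass = np.exp(-d2 / (2 * sigma ** 2))          # Gaussian low-pass mask
    low = np.real(np.fft.ifft2(np.fft.ifftshift(f * lowpass)))
    return low, img - low                              # high = residual detail

img = np.random.rand(256, 256)
low, high = frequency_split(img)  # smooth content vs. edges and texture
```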
Collapse
Affiliation(s)
- Rui Tang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
| | - Hejing Zhao
- Research Center on Flood and Drought Disaster Reduction of Ministry of Water Resource, China Institute of Water Resources and Hydropower Research, Beijing, 100038, China
- Water History Department, China Institute of Water Resources and Hydropower Research, Beijing, 100038, China
| | - Yao Tong
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing, 210023, China
- Jiangsu Province Engineering Research Center of TCM Intelligence Health Service, Nanjing University of Chinese Medicine, Nanjing, 210023, China
| | - Ruihui Mu
- College of Computer and Information, Xinxiang University, Xinxiang, 453000, China
| | - Yuqiang Wang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
| | - Shuhao Zhang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
| | - Yao Zhao
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
| | - Weidong Wang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
| | - Min Zhang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
| | - Yilin Liu
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China.
| | - Jianbo Gao
- Department of Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China.
| |
Collapse
|
28
|
Pande SD, Kalyani P, Nagendram S, Alluhaidan AS, Babu GH, Ahammad SH, Pandey VK, Sridevi G, Kumar A, Bonyah E. Comparative analysis of the DCNN and HFCNN Based Computerized detection of liver cancer. BMC Med Imaging 2025; 25:37. [PMID: 39901085 PMCID: PMC11792691 DOI: 10.1186/s12880-025-01578-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2024] [Accepted: 01/28/2025] [Indexed: 02/05/2025] Open
Abstract
Liver cancer detection is critically important in biomedical image analysis and diagnosis. Researchers have explored numerous machine learning (ML) techniques and deep learning (DL) approaches aimed at the automated recognition of liver disease by analysing computed tomography (CT) images. This study compares two frameworks, the Deep Convolutional Neural Network (DCNN) and Hierarchical Fusion Convolutional Neural Networks (HFCNN), to assess their effectiveness in liver cancer segmentation. The contribution includes enhancing the edges and textures of CT images through filtering to achieve precise liver segmentation. Additionally, an existing DL framework was employed for liver cancer detection and segmentation. The strengths of this paper include a clear emphasis on the criticality of liver cancer detection in biomedical imaging and diagnostics. It also highlights the challenges associated with CT image detection and segmentation and provides a comprehensive summary of recent literature. However, certain difficulties arise during detection in CT images due to overlapping structures such as bile ducts and blood vessels, image noise, textural changes, size and location variations, and inherent heterogeneity. These factors may lead to segmentation errors and, subsequently, divergent analyses. The evaluation of DCNN and HFCNN in liver cancer detection is conducted using multiple performance metrics, including precision, F1-score, recall, and accuracy. This comprehensive assessment provides a detailed evaluation of these models' effectiveness compared with other state-of-the-art methods in identifying liver cancer.
Collapse
Affiliation(s)
| | - Pala Kalyani
- Department of ECE, Vardhaman college of Engineering, Hyderabad, India
| | - S Nagendram
- Department of AI, KKR & KSR Institute of Technology & Sciences, Guntur, India
| | - Ala Saleh Alluhaidan
- Department of Information Systems, College of Computer and Information Science, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia.
| | - G Harish Babu
- Department of ECE, CVR College of Engineering, Hyderabad, India
| | - Sk Hasane Ahammad
- Department of ECE, Koneru Lakshmaiah Education Foundation, Vaddeswaram, 522302, India
| | - Vivek Kumar Pandey
- Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, 140401, Punjab, India
| | - G Sridevi
- Department of Computer Science and Engineering, Raghu Engineering College, Visakhapatnam, Andhra Pradesh, 531162, India
| | - Abhinav Kumar
- Department of Nuclear and Renewable Energy, Ural Federal University Named after the First President of Russia Boris Yeltsin, Ekaterinburg, 620002, Russia
- Refrigeration &Air-condition Department, Technical Engineering College, The Islamic University, Najaf, Iraq
- Department of Mechanical Engineering, Karpagam Academy of Higher Education, Coimbatore, 641021, India
| | - Ebenezer Bonyah
- Department of Mathematics Education, Akenten Appiah Menka University of Skills Training and Entrepreneurial Development, Kumasi, Ghana.
| |
Collapse
|
29
|
Qu M, Yang J, Li H, Qi Y, Yu Q. Contour-constrained branch U-Net for accurate left ventricular segmentation in echocardiography. Med Biol Eng Comput 2025; 63:561-573. [PMID: 39417962 DOI: 10.1007/s11517-024-03201-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2024] [Accepted: 09/16/2024] [Indexed: 10/19/2024]
Abstract
Assessing left ventricular (LV) function with echocardiography is one of the most crucial cardiac examinations in clinical diagnosis, and LV segmentation plays a particularly vital role in medical image processing because many important clinical diagnostic parameters, such as the ejection fraction, are derived from the segmentation results. However, echocardiography typically has low resolution and contains a significant amount of noise and motion artifacts, making accurate segmentation challenging, especially in the region of the cardiac chamber boundary, which significantly restricts the accurate calculation of subsequent clinical parameters. In this paper, our goal is to achieve accurate LV segmentation through a simplified approach by introducing a branch sub-network into the decoder of the traditional U-Net. Specifically, we employ LV contour features to supervise the branch decoding process and use a cross-attention module to facilitate interaction between the branch and the original decoding process, thereby improving segmentation performance near the LV boundaries. In the experiments, the proposed branch U-Net (BU-Net) demonstrated superior performance on the CAMUS and EchoNet-dynamic public echocardiography segmentation datasets in comparison to state-of-the-art segmentation models, without the need for complex residual connections or transformer-based architectures. Our code is publicly available at Anonymous GitHub: https://anonymous.4open.science/r/Anoymous_two-BFF2/ .
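Contour supervision targets for a branch decoder can be derived from LV masks by a standard mask-minus-erosion trick; the sketch below is generic and not the paper's exact contour-feature pipeline.

```python
# Sketch of deriving contour supervision targets from a binary LV mask:
# the boundary band is the mask minus its morphological erosion.
import numpy as np
from scipy.ndimage import binary_erosion

def mask_to_contour(mask: np.ndarray, thickness: int = 1) -> np.ndarray:
    """Return a binary contour band of the given thickness from a 2D mask."""
    eroded = binary_erosion(mask, iterations=thickness)
    return mask & ~eroded

lv_mask = np.zeros((128, 128), dtype=bool)
lv_mask[40:90, 50:100] = True
contour = mask_to_contour(lv_mask)  # could supervise a boundary-aware branch
```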
Collapse
Affiliation(s)
- Mingjun Qu
- Computer Science and Engineering, Northeastern University, Shenyang, China
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| | - Jinzhu Yang
- Computer Science and Engineering, Northeastern University, Shenyang, China.
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China.
- National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China.
| | - Honghe Li
- Computer Science and Engineering, Northeastern University, Shenyang, China
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| | - Yiqiu Qi
- Computer Science and Engineering, Northeastern University, Shenyang, China
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| | - Qi Yu
- Computer Science and Engineering, Northeastern University, Shenyang, China
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| |
Collapse
|
30
|
Ghobadi V, Ismail LI, Wan Hasan WZ, Ahmad H, Ramli HR, Norsahperi NMH, Tharek A, Hanapiah FA. Challenges and solutions of deep learning-based automated liver segmentation: A systematic review. Comput Biol Med 2025; 185:109459. [PMID: 39642700 DOI: 10.1016/j.compbiomed.2024.109459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2024] [Revised: 11/12/2024] [Accepted: 11/19/2024] [Indexed: 12/09/2024]
Abstract
The liver is one of the vital organs in the body, and precise liver segmentation in medical images is essential for liver disease treatment. Deep learning-based liver segmentation faces several challenges. This research analyzes the challenges of liver segmentation reported in prior studies and identifies the modifications made to network models and other enhancements implemented by researchers to tackle each challenge. In total, 88 articles from the Scopus and ScienceDirect databases published between January 2016 and January 2022 were studied. The liver segmentation challenges are classified into five main categories, each containing several subcategories. For each challenge, the techniques proposed to overcome it are investigated. The report details the authors, publication years, dataset types, imaging technologies, and evaluation metrics of all references for comparison. Additionally, a summary table outlines the challenges and solutions.
Collapse
Affiliation(s)
- Vahideh Ghobadi
- Faculty of Engineering, Universiti Putra Malaysia, Serdang, 43400, Selangor, Malaysia.
| | - Luthffi Idzhar Ismail
- Faculty of Engineering, Universiti Putra Malaysia, Serdang, 43400, Selangor, Malaysia.
| | - Wan Zuha Wan Hasan
- Faculty of Engineering, Universiti Putra Malaysia, Serdang, 43400, Selangor, Malaysia.
| | - Haron Ahmad
- KPJ Specialist Hospital, Damansara Utama, Petaling Jaya, 47400, Selangor, Malaysia.
| | - Hafiz Rashidi Ramli
- Faculty of Engineering, Universiti Putra Malaysia, Serdang, 43400, Selangor, Malaysia.
| | | | - Anas Tharek
- Hospital Sultan Abdul Aziz Shah, University Putra Malaysia, Serdang, 43400, Selangor, Malaysia.
| | - Fazah Akhtar Hanapiah
- Faculty of Medicine, Universiti Teknologi MARA, Damansara Utama, Sungai Buloh, 47000, Selangor, Malaysia.
| |
Collapse
|
31
|
Yue Z, Jiang J, Hou W, Zhou Q, David Spence J, Fenster A, Qiu W, Ding M. Prior-Knowledge Embedded U-Net-Based Fully Automatic Vessel Wall Volume Measurement of the Carotid Artery in 3D Ultrasound Image. IEEE TRANSACTIONS ON MEDICAL IMAGING 2025; 44:711-727. [PMID: 39255086 DOI: 10.1109/tmi.2024.3457245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/12/2024]
Abstract
The vessel-wall-volume (VWV), measured from three-dimensional (3D) carotid artery (CA) ultrasound (US) images, can help to assess carotid atherosclerosis and manage patients at risk of stroke. Manual measurement is subjective and requires well-trained operators, and fully automatic measurement tools are not yet available. We therefore propose a fully automatic VWV measurement framework (Auto-VWV) that uses a CA prior-knowledge embedded U-Net (CAP-UNet) to measure the VWV from 3D CA US images without manual intervention. The Auto-VWV framework is designed to improve the consistency of repeated VWV measurements, resulting in the first fully automatic framework for VWV measurement. CAP-UNet is developed to improve segmentation accuracy on the whole CA; it is composed of a U-Net-type backbone and three additional prior-knowledge learning modules. Specifically, a continuity learning module learns the spatial continuity of the arteries in a sequence of image slices, a voxel evolution learning module learns the evolution of the artery across adjacent slices, and a topology learning module learns the unique topology of the carotid artery. On two 3D CA US datasets, the CAP-UNet architecture achieved state-of-the-art performance compared to eight competing models. Furthermore, CAP-UNet-based Auto-VWV achieved better accuracy and consistency than Auto-VWV based on the competing models in simulated repeated measurements. Finally, using 10 pairs of real repeatedly scanned samples, Auto-VWV achieved better VWV measurement reproducibility than intra- and inter-operator manual measurements. The code is available at https://github.com/Yue9603/Auto-VWV.
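Once lumen and outer-wall masks are segmented, the VWV itself reduces to a voxel count difference scaled by the voxel volume; the sketch below shows this final step under the assumption of binary masks and voxel spacing given as (dz, dy, dx).

```python
# Sketch of the VWV computation once lumen and outer-wall masks are available:
# wall volume = (outer-wall voxels - lumen voxels) x voxel volume.
import numpy as np

def vessel_wall_volume(outer_mask: np.ndarray, lumen_mask: np.ndarray,
                       spacing_mm: tuple) -> float:
    """Return VWV in mm^3 from 3D binary masks and voxel spacing (dz, dy, dx)."""
    voxel_mm3 = float(np.prod(spacing_mm))
    wall_voxels = outer_mask.sum() - lumen_mask.sum()
    return wall_voxels * voxel_mm3

outer = np.zeros((100, 64, 64), dtype=bool)
outer[:, 20:44, 20:44] = True
lumen = np.zeros_like(outer)
lumen[:, 26:38, 26:38] = True
print(vessel_wall_volume(outer, lumen, (1.0, 0.2, 0.2)))  # mm^3
```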
Collapse
|
32
|
Ma J, Yang H, Chou Y, Yoon J, Allison T, Komandur R, McDunn J, Tasneem A, Do RK, Schwartz LH, Zhao B. Generalizability of lesion detection and segmentation when ScaleNAS is trained on a large multi-organ dataset and validated in the liver. Med Phys 2025; 52:1005-1018. [PMID: 39576046 DOI: 10.1002/mp.17504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2024] [Revised: 09/25/2024] [Accepted: 10/05/2024] [Indexed: 02/04/2025] Open
Abstract
BACKGROUND Tumor assessment through imaging is crucial for diagnosing and treating cancer. Lesions in the liver, a common site for metastatic disease, are particularly challenging to accurately detect and segment. This labor-intensive task is subject to individual variation, which drives interest in automation using artificial intelligence (AI). PURPOSE Evaluate AI for lesion detection and lesion segmentation using CT in the context of human performance on the same task. Use internal testing to determine how an AI-developed model (ScaleNAS) trained on lesions in multiple organs performs when tested specifically on liver lesions in a dataset integrating real-world and clinical trial data. Use external testing to evaluate whether ScaleNAS's performance generalizes to publicly available colorectal liver metastases (CRLM) from The Cancer Imaging Archive (TCIA). METHODS The CUPA study dataset included patients whose CT scan of chest, abdomen, or pelvis at Columbia University between 2010-2020 indicated solid tumors (CUIMC, n = 5011) and from two clinical trials in metastatic colorectal cancer, PRIME (n = 1183) and Amgen (n = 463). Inclusion required ≥1 measurable lesion; exclusion criteria eliminated 1566 patients. Data were divided at the patient level into training (n = 3996), validation (n = 570), and testing (n = 1529) sets. To create the reference standard for training and validation, each case was annotated by one of six radiologists, randomly assigned, who marked the CUPA lesions without access to any previous annotations. For internal testing we refined the CUPA test set to contain only patients who had liver lesions (n = 525) and formed an enhanced reference standard through expert consensus reviewing prior annotations. For external testing, TCIA-CRLM (n = 197) formed the test set. The reference standard for TCIA-CRLM was formed by consensus review of the original annotation and contours by two new radiologists. Metrics for lesion detection were sensitivity and false positives. Lesion segmentation was assessed with median Dice coefficient, under-segmentation ratio (USR), and over-segmentation ratio (OSR). Subgroup analysis examined the influence of lesion size ≥ 10 mm (measurable by RECIST1.1) versus all lesions (important for early identification of disease progression). RESULTS ScaleNAS trained on all lesions achieved sensitivity of 71.4% and Dice of 70.2% for liver lesions in the CUPA internal test set (3,495 lesions) and sensitivity of 68.2% and Dice 64.2% in the TCIA-CRLM external test set (638 lesions). Human radiologists had mean sensitivity of 53.5% and Dice of 73.9% in CUPA and sensitivity of 84.1% and Dice of 88.4% in TCIA-CRLM. Performance improved for ScaleNAS and radiologists in the subgroup of lesions that excluded sub-centimeter lesions. CONCLUSIONS Our study presents the first evaluation of ScaleNAS in medical imaging, demonstrating its liver lesion detection and segmentation performance across diverse datasets. Using consensus reference standards from multiple radiologists, we addressed inter-observer variability and contributed to consistency in lesion annotation. While ScaleNAS does not surpass radiologists in performance, it offers fast and reliable results with potential utility in providing initial contours for radiologists. Future work will extend this model to lung and lymph node lesions, ultimately aiming to enhance clinical applications by generalizing detection and segmentation across tissue types.
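Lesion-level detection sensitivity of the kind reported here is typically computed by matching predicted connected components to reference lesions by overlap; the sketch below is generic, and the overlap criterion is an assumption rather than the study's protocol.

```python
# Generic sketch of lesion-level detection scoring: label connected components
# in the predicted mask and count reference lesions hit by any prediction.
import numpy as np
from scipy.ndimage import label

def detection_stats(pred_mask: np.ndarray, gt_mask: np.ndarray):
    """Return (sensitivity, false_positive_count) at the lesion level."""
    gt_lbl, n_gt = label(gt_mask)
    pred_lbl, n_pred = label(pred_mask)
    hit_gt = {g for g in np.unique(gt_lbl[pred_lbl > 0]) if g > 0}
    fp = sum(1 for p in range(1, n_pred + 1)
             if not gt_mask[pred_lbl == p].any())  # predictions touching no lesion
    sens = len(hit_gt) / n_gt if n_gt else 0.0
    return sens, fp

pred = np.zeros((64, 64), dtype=bool)
pred[10:15, 10:15] = True
gt = np.zeros_like(pred)
gt[12:18, 12:18] = True
print(detection_stats(pred, gt))  # (1.0, 0)
```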
Collapse
Affiliation(s)
- Jingchen Ma
- Department of Radiology, Columbia University Irving Medical Center, New York, New York, USA
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, New York, USA
| | - Hao Yang
- Department of Radiology, Columbia University Irving Medical Center, New York, New York, USA
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, New York, USA
| | - Yen Chou
- Department of Radiology, Columbia University Irving Medical Center, New York, New York, USA
- Fu Jen Catholic University Hospital, Department of Medical Imaging and Fu Jen Catholic University, School of Medicine, New Taipei City, Taiwan
| | - Jin Yoon
- Department of Radiology, Columbia University Irving Medical Center, New York, New York, USA
| | - Tavis Allison
- Department of Radiology, Columbia University Irving Medical Center, New York, New York, USA
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, New York, USA
| | | | - Jon McDunn
- Project Data Sphere, Cary, North Carolina, USA
| | | | - Richard K Do
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, New York, USA
| | - Lawrence H Schwartz
- Department of Radiology, Columbia University Irving Medical Center, New York, New York, USA
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, New York, USA
| | - Binsheng Zhao
- Department of Radiology, Columbia University Irving Medical Center, New York, New York, USA
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, New York, USA
| |
Collapse
|
33
|
Chu J, Liu W, Tian Q, Lu W. PFPRNet: A Phase-Wise Feature Pyramid With Retention Network for Polyp Segmentation. IEEE J Biomed Health Inform 2025; 29:1137-1150. [PMID: 40030242 DOI: 10.1109/jbhi.2024.3500026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2025]
Abstract
Early detection of colonic polyps is crucial for the prevention and diagnosis of colorectal cancer. Currently, deep learning-based polyp segmentation methods have become mainstream and achieved remarkable results. However, acquiring large amounts of labeled data is time-consuming and labor-intensive, and the presence of numerous similar wrinkles in polyp images also hampers model prediction performance. In this paper, we propose a novel approach called the Phase-wise Feature Pyramid with Retention Network (PFPRNet), which leverages a pre-trained Transformer-based encoder to obtain multi-scale feature maps. A phase-wise feature pyramid with retention decoder is designed to gradually integrate global features into local features and guide the model's attention towards key regions. Additionally, our custom Enhance Perception module enables capturing image information from a broader perspective. Finally, we introduce an innovative Low-layer Retention module as an alternative to the Transformer for more efficient global attention modeling. Evaluation results on several widely used polyp segmentation datasets demonstrate that our proposed method has strong learning ability and generalization capability, and outperforms state-of-the-art approaches.
Collapse
|
34
|
Kande GB, Nalluri MR, Manikandan R, Cho J, Veerappampalayam Easwaramoorthy S. Multi scale multi attention network for blood vessel segmentation in fundus images. Sci Rep 2025; 15:3438. [PMID: 39870673 PMCID: PMC11772654 DOI: 10.1038/s41598-024-84255-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2024] [Accepted: 12/20/2024] [Indexed: 01/29/2025] Open
Abstract
Precise segmentation of the retinal vasculature is crucial for the early detection, diagnosis, and treatment of vision-threatening ailments. However, this task is challenging due to limited contextual information, variations in vessel thickness, the complexity of vessel structures, and the potential for confusion with lesions. In this paper, we introduce a novel approach, the MSMA Net model, which overcomes these challenges by replacing traditional convolution blocks and skip connections with an improved multi-scale squeeze and excitation block (MSSE Block) and bottleneck residual paths (B-Res paths) with spatial attention blocks (SAB). Our experimental findings on publicly available fundus image datasets, specifically DRIVE, STARE, CHASE_DB1, HRF, and DR HAGIS, consistently demonstrate that our approach outperforms other segmentation techniques, achieving higher accuracy, sensitivity, Dice score, and area under the receiver operating characteristic curve (AUC) in the segmentation of blood vessels of different thicknesses, even in situations involving diverse contextual information, the presence of coexisting lesions, and intricate vessel morphologies.
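The plain squeeze-and-excitation block that the MSSE Block extends is easy to sketch in PyTorch; the reduction ratio below is an assumption, and the multi-scale variant itself is not reproduced.

```python
# Sketch of a plain squeeze-and-excitation (SE) block, the building block that
# multi-scale SE variants extend.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        scale = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)  # squeeze + excite
        return x * scale                                       # recalibrated maps

x = torch.randn(1, 64, 48, 48)
y = SEBlock(64)(x)  # same shape, channel-recalibrated
```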
Collapse
Affiliation(s)
- Giri Babu Kande
- Vasireddy Venkatadri Institute of Technology, Nambur, 522508, India
| | - Madhusudana Rao Nalluri
- School of Computing, Amrita Vishwa Vidyapeetham, Amaravati, 522503, India.
- Department of Computer Science & Engineering, Faculty of Science and Technology (IcfaiTech), ICFAI Foundation for Higher Education, Hyderabad, India.
| | - R Manikandan
- School of Computing, SASTRA Deemed University, Thanjavur, 613401, India
| | - Jaehyuk Cho
- Department of Software Engineering & Division of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896, Republic of Korea.
| | | |
Collapse
|
35
|
Qi H, Wang W, Dang H, Chen Y, Jia M, Wang X. An Efficient Retinal Fluid Segmentation Network Based on Large Receptive Field Context Capture for Optical Coherence Tomography Images. ENTROPY (BASEL, SWITZERLAND) 2025; 27:60. [PMID: 39851680 PMCID: PMC11764744 DOI: 10.3390/e27010060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/28/2024] [Revised: 01/08/2025] [Accepted: 01/09/2025] [Indexed: 01/26/2025]
Abstract
Optical Coherence Tomography (OCT) is a crucial imaging modality for diagnosing and monitoring retinal diseases. However, the accurate segmentation of fluid regions and lesions remains challenging due to noise, low contrast, and blurred edges in OCT images. Although feature modeling with wide or global receptive fields offers a feasible solution, it typically leads to significant computational overhead. To address these challenges, we propose LKMU-Lite, a lightweight U-shaped segmentation method tailored for retinal fluid segmentation. LKMU-Lite integrates a Decoupled Large Kernel Attention (DLKA) module that captures both local patterns and long-range dependencies, thereby enhancing feature representation. Additionally, it incorporates a Multi-scale Group Perception (MSGP) module that employs Dilated Convolutions with varying receptive field scales to effectively predict lesions of different shapes and sizes. Furthermore, a novel Aggregating-Shift decoder is proposed, reducing model complexity while preserving feature integrity. With only 1.02 million parameters and a computational complexity of 3.82 G FLOPs, LKMU-Lite achieves state-of-the-art performance across multiple metrics on the ICF and RETOUCH datasets, demonstrating both its efficiency and generalizability compared to existing methods.
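Decoupled large-kernel attention in the spirit of the DLKA module can be built from a depthwise convolution, a dilated depthwise convolution, and a pointwise convolution (the VAN-style decomposition); the kernel sizes below are assumptions, not LKMU-Lite's configuration.

```python
# Sketch of a decomposed large-kernel attention: a large receptive field from
# depthwise + dilated depthwise + pointwise convolutions (VAN-style ~21x21).
import torch
import torch.nn as nn

class LargeKernelAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        self.dw_dilated = nn.Conv2d(channels, channels, 7, padding=9,
                                    dilation=3, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        attn = self.pw(self.dw_dilated(self.dw(x)))  # large effective field
        return x * attn                              # attention-weighted features

x = torch.randn(1, 32, 64, 64)
y = LargeKernelAttention(32)(x)  # same spatial size, long-range context mixed in
```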
Collapse
Affiliation(s)
- Hang Qi
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China; (H.Q.); (W.W.); (H.D.); (Y.C.); (M.J.)
| | - Weijiang Wang
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China; (H.Q.); (W.W.); (H.D.); (Y.C.); (M.J.)
- BIT Chongqing Institute of Microelectronics and Microsystems, Chongqing 401332, China
| | - Hua Dang
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China; (H.Q.); (W.W.); (H.D.); (Y.C.); (M.J.)
| | - Yueyang Chen
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China; (H.Q.); (W.W.); (H.D.); (Y.C.); (M.J.)
| | - Minli Jia
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China; (H.Q.); (W.W.); (H.D.); (Y.C.); (M.J.)
| | - Xiaohua Wang
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China; (H.Q.); (W.W.); (H.D.); (Y.C.); (M.J.)
- BIT Chongqing Institute of Microelectronics and Microsystems, Chongqing 401332, China
| |
Collapse
|
36
|
Liang J, Wang R, Rao S, Xu F, Xiang J, Wang B, Yan T. Dual-View Dual-Boundary Dual U-Nets for Multiscale Segmentation of Oral CBCT Images. LECTURE NOTES IN COMPUTER SCIENCE 2025:48-62. [DOI: 10.1007/978-981-97-8499-8_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/03/2025]
|
37
|
Sun Y, Zhang S, Li J, Han Q, Qin Y. CAISeg: A Clustering-Aided Interactive Network for Lesion Segmentation in 3D Medical Imaging. IEEE J Biomed Health Inform 2025; 29:371-382. [PMID: 39321004 DOI: 10.1109/jbhi.2024.3467279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/27/2024]
Abstract
Accurate lesion segmentation in medical imaging is critical for medical diagnosis and treatment. Lesions' diverse and heterogeneous characteristics often present a distinct long-tail distribution, posing difficulties for automatic methods. Interactive segmentation approaches have shown promise in improving accuracy but still struggle with tail features, creating a demand for strategies that exploit user interaction more effectively. To this end, we propose a novel point-based interactive segmentation model for 3D medical imaging called the Clustering-Aided Interactive Segmentation Network (CAISeg). A customized Interaction-Guided Module (IGM) adopts the concept of clustering to capture features that are semantically similar to the interaction points. These clustered features are then mapped to the head regions of the prompted category to facilitate more precise classification. Meanwhile, we put forward a Focus Guided Loss function that grants the network an inductive bias towards user interaction by assigning higher weights to voxels closer to the prompted points, thereby improving responsiveness to user guidance. Evaluation across brain tumor, colon cancer, lung cancer, and pancreas cancer segmentation tasks shows CAISeg's superiority over state-of-the-art methods. It outperforms fully automated segmentation models in accuracy, and achieves results comparable to or better than those of leading point-based interactive methods while requiring fewer prompt points. Furthermore, CAISeg exhibits good interpretability at various stages, which gives it potential clinical value.
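Illustrative sketch (an assumed form, not the paper's implementation): a focus-guided loss of the kind described can weight each voxel by its distance to the nearest prompted point; the weighting function and its parameters below are assumptions.

import numpy as np
import torch
import torch.nn.functional as F
from scipy.ndimage import distance_transform_edt

def focus_weights(prompt_mask: np.ndarray, alpha: float = 4.0, sigma: float = 10.0) -> np.ndarray:
    """Voxels near prompted points get higher weight; alpha and sigma are illustrative."""
    # distance (in voxels) from every voxel to the nearest prompted point
    dist = distance_transform_edt(prompt_mask == 0)
    return 1.0 + alpha * np.exp(-dist / sigma)

# toy 3D volume with a single prompt point
prompts = np.zeros((16, 16, 16)); prompts[8, 8, 8] = 1
w = torch.from_numpy(focus_weights(prompts)).float()

logits = torch.randn(1, 2, 16, 16, 16)           # network output (2 classes)
target = torch.randint(0, 2, (1, 16, 16, 16))    # toy ground truth
per_voxel = F.cross_entropy(logits, target, reduction="none")
loss = (w.unsqueeze(0) * per_voxel).mean()       # distance-weighted loss
print(float(loss))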
|
38
|
Yang X, Xu L, Yu S, Xia Q, Li H, Zhang S. Segmentation and Vascular Vectorization for Coronary Artery by Geometry-Based Cascaded Neural Network. IEEE TRANSACTIONS ON MEDICAL IMAGING 2025; 44:259-269. [PMID: 39078771 DOI: 10.1109/tmi.2024.3435714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/05/2025]
Abstract
Segmentation of the coronary artery is an important task for the quantitative analysis of coronary computed tomography angiography (CCTA) images and has been greatly advanced by deep learning. However, the coronary artery's complex structure, with its tiny and narrow branches, poses a great challenge. Coupled with the low resolution and poor contrast typical of medical images, this frequently leads to fragmented vessels in the predictions. Therefore, a geometry-based cascaded segmentation method is proposed for the coronary artery, with the following innovations: 1) Integrating geometric deformation networks, we design a cascaded network for segmenting the coronary artery and vectorizing the results. The generated meshes of the coronary artery are continuous and accurate even for twisted and intricate coronary structures, without fragmentation. 2) Unlike mesh annotations generated from voxel-based labels by the traditional marching cubes method, a finer vectorized mesh of the coronary artery is reconstructed with regularized morphology. The novel mesh annotation benefits the geometry-based segmentation network, avoiding bifurcation adhesion and point-cloud dispersion in intricate branches. 3) A dataset named CCA-200 is collected, consisting of 200 CCTA images with coronary artery disease. The ground truth for all 200 cases consists of coronary internal-diameter annotations by professional radiologists. Extensive experiments verify our method on our collected CCA-200 dataset and the public ASOCA dataset, with a Dice of 0.778 on CCA-200 and 0.895 on ASOCA, showing superior results. In particular, our geometry-based model generates accurate, intact, and smooth coronary arteries, free of vessel fragmentation.
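For context, the marching-cubes baseline that the authors improve upon can be reproduced in a few lines with scikit-image; the toy sphere volume below stands in for a vessel label.

import numpy as np
from skimage import measure

# toy voxel label: a solid sphere standing in for a vessel segment
z, y, x = np.ogrid[:64, :64, :64]
label = ((z - 32) ** 2 + (y - 32) ** 2 + (x - 32) ** 2 < 20 ** 2).astype(np.float32)

# classic marching cubes: voxel mask -> triangle mesh (the baseline the paper refines)
verts, faces, normals, values = measure.marching_cubes(label, level=0.5)
print(verts.shape, faces.shape)  # (N, 3) vertices and (M, 3) triangle indices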
|
39
|
Shen Q, Zheng B, Li W, Shi X, Luo K, Yao Y, Li X, Lv S, Tao J, Wei Q. MixUNETR: A U-shaped network based on W-MSA and depth-wise convolution with channel and spatial interactions for zonal prostate segmentation in MRI. Neural Netw 2025; 181:106782. [PMID: 39388995 DOI: 10.1016/j.neunet.2024.106782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2024] [Revised: 09/26/2024] [Accepted: 10/02/2024] [Indexed: 10/12/2024]
Abstract
Magnetic resonance imaging (MRI) plays a pivotal role in diagnosing and staging prostate cancer. Precise delineation of the peripheral zone (PZ) and transition zone (TZ) within prostate MRI is essential for accurate diagnosis and subsequent artificial intelligence-driven analysis. However, existing segmentation methods are limited by ambiguous boundaries, shape variations, and texture complexities between PZ and TZ. Moreover, they suffer from inadequate modeling capabilities and limited receptive fields. To address these challenges, we propose an Enhanced MixFormer, which integrates window-based multi-head self-attention (W-MSA) and depth-wise convolution with a parallel design and cross-branch bidirectional interaction. We further introduce MixUNETR, which uses multiple Enhanced MixFormers as its encoder to extract features from both PZ and TZ in prostate MRI. This design effectively enlarges the receptive field and enhances the modeling capability of W-MSA, ultimately improving the extraction of both global and local feature information from PZ and TZ, thereby addressing mis-segmentation and the difficulty of delineating the boundary between them. Extensive experiments compare MixUNETR with several state-of-the-art methods on the public Prostate158 and ProstateX datasets and a private dataset. The results consistently demonstrate the accuracy and robustness of MixUNETR in MRI prostate segmentation. Our code is available at https://github.com/skyous779/MixUNETR.git.
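Illustrative sketch (a simplification, not the authors' Enhanced MixFormer): a parallel W-MSA/depthwise-convolution block with gated cross-branch interaction; the window size, head count, and sigmoid gating below are assumptions.

import torch
import torch.nn as nn

class ParallelMixBlock(nn.Module):
    """Sketch of a parallel attention/conv block with gated cross-branch interaction."""
    def __init__(self, dim: int, window: int = 8, heads: int = 4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.dwconv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
        self.gate_a = nn.Conv2d(dim, dim, 1)  # conv branch modulates attention branch
        self.gate_c = nn.Conv2d(dim, dim, 1)  # attention branch modulates conv branch
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        ws = self.window
        # window-based self-attention (W-MSA): attend within non-overlapping windows
        t = x.reshape(b, c, h // ws, ws, w // ws, ws).permute(0, 2, 4, 3, 5, 1)
        t = t.reshape(-1, ws * ws, c)                 # (b * windows, tokens, dim)
        a, _ = self.attn(t, t, t)
        a = a.reshape(b, h // ws, w // ws, ws, ws, c).permute(0, 5, 1, 3, 2, 4)
        a = a.reshape(b, c, h, w)
        v = self.dwconv(x)                            # depthwise convolution branch
        # bidirectional interaction: each branch gates the other
        a, v = a * torch.sigmoid(self.gate_a(v)), v * torch.sigmoid(self.gate_c(a))
        return x + self.proj(a + v)

x = torch.randn(1, 32, 64, 64)
print(ParallelMixBlock(32)(x).shape)  # torch.Size([1, 32, 64, 64])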
Affiliation(s)
- Quanyou Shen
- School of Automation, Guangdong University of Technology, Guangzhou, 510006, China; Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control, Guangzhou, 510006, China; Guangdong-Hong Kong Joint Laboratory for Intelligent Decision and Cooperative Control, Guangzhou, 510006, China
| | - Bowen Zheng
- Department of Urology, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
| | - Wenhao Li
- School of Automation, Guangdong University of Technology, Guangzhou, 510006, China; Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control, Guangzhou, 510006, China; Guangdong-Hong Kong Joint Laboratory for Intelligent Decision and Cooperative Control, Guangzhou, 510006, China
| | - Xiaoran Shi
- Department of Urology, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
| | - Kun Luo
- School of Automation, Guangdong University of Technology, Guangzhou, 510006, China
| | - Yuqian Yao
- School of Automation, Guangdong University of Technology, Guangzhou, 510006, China; Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control, Guangzhou, 510006, China; Guangdong-Hong Kong Joint Laboratory for Intelligent Decision and Cooperative Control, Guangzhou, 510006, China
| | - Xinyan Li
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006, China; Guangdong Provincial Laboratory of Chemistry and Fine Chemical Engineering Jieyang Center, Jieyang, 515200, China
| | - Shidong Lv
- Department of Urology, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
| | - Jie Tao
- School of Automation, Guangdong University of Technology, Guangzhou, 510006, China; Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control, Guangzhou, 510006, China; Guangdong-Hong Kong Joint Laboratory for Intelligent Decision and Cooperative Control, Guangzhou, 510006, China.
| | - Qiang Wei
- Department of Urology, Guangdong Provincial People's Hospital, Southern Medical University, Guangzhou, 510515, China.
| |
|
40
|
Li S, Li X, Xu X, Cheng KT. Dynamic Subcluster-Aware Network for Few-Shot Skin Disease Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1872-1883. [PMID: 38090872 DOI: 10.1109/tnnls.2023.3336765] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/06/2025]
Abstract
This article addresses the problem of few-shot skin disease classification by introducing a novel approach called the subcluster-aware network (SCAN) that enhances accuracy in diagnosing rare skin diseases. The key insight motivating the design of SCAN is the observation that skin disease images within a class often exhibit multiple subclusters, characterized by distinct variations in appearance. To improve the performance of few-shot learning (FSL), we focus on learning a high-quality feature encoder that captures the unique subclustered representations within each disease class, enabling better characterization of feature distributions. Specifically, SCAN follows a dual-branch framework, where the first branch learns classwise features to distinguish different skin diseases, and the second branch aims to learn features, which can effectively partition each class into several groups so as to preserve the subclustered structure within each class. To achieve the objective of the second branch, we present a cluster loss to learn image similarities via unsupervised clustering. To ensure that the samples in each subcluster are from the same class, we further design a purity loss to refine the unsupervised clustering results. We evaluate the proposed approach on two public datasets for few-shot skin disease classification. The experimental results validate that our framework outperforms the state-of-the-art methods by around 2%-5% in terms of sensitivity, specificity, accuracy, and F1-score on the SD-198 and Derm7pt datasets.
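Illustrative sketch (an assumed offline surrogate, not the paper's differentiable losses): the purity idea can be approximated by clustering embeddings without labels and scoring each subcluster by its majority-class fraction.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 64))          # toy image embeddings
labels = rng.integers(0, 5, size=200)       # toy disease classes

# unsupervised clustering into subclusters (the second-branch objective)
assign = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(feats)

# purity: fraction of each subcluster belonging to its majority class; a purity
# loss would penalize (1 - purity) to keep subclusters class-consistent
purities = []
for k in np.unique(assign):
    members = labels[assign == k]
    purities.append(np.bincount(members).max() / len(members))
purity_loss = 1.0 - float(np.mean(purities))
print(round(purity_loss, 3))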
|
41
|
Zhang Z, Keles E, Durak G, Taktak Y, Susladkar O, Gorade V, Jha D, Ormeci AC, Medetalibeyoglu A, Yao L, Wang B, Isler IS, Peng L, Pan H, Vendrami CL, Bourhani A, Velichko Y, Gong B, Spampinato C, Pyrros A, Tiwari P, Klatte DCF, Engels M, Hoogenboom S, Bolan CW, Agarunov E, Harfouch N, Huang C, Bruno MJ, Schoots I, Keswani RN, Miller FH, Gonda T, Yazici C, Tirkes T, Turkbey B, Wallace MB, Bagci U. Large-scale multi-center CT and MRI segmentation of pancreas with deep learning. Med Image Anal 2025; 99:103382. [PMID: 39541706 PMCID: PMC11698238 DOI: 10.1016/j.media.2024.103382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2024] [Revised: 10/24/2024] [Accepted: 10/27/2024] [Indexed: 11/16/2024]
Abstract
Automated volumetric segmentation of the pancreas on cross-sectional imaging is needed for diagnosis and follow-up of pancreatic diseases. While CT-based pancreatic segmentation is more established, MRI-based segmentation methods are understudied, largely due to a lack of publicly available datasets, benchmarking research efforts, and domain-specific deep learning methods. In this retrospective study, we collected a large dataset (767 scans from 499 participants) of T1-weighted (T1W) and T2-weighted (T2W) abdominal MRI series from five centers between March 2004 and November 2022. We also collected CT scans of 1,350 patients from publicly available sources for benchmarking purposes. We introduced a new pancreas segmentation method, called PanSegNet, combining the strengths of nnUNet and a Transformer network with a new linear attention module enabling volumetric computation. We tested PanSegNet's accuracy in cross-modality (a total of 2,117 scans) and cross-center settings with Dice and Hausdorff distance (HD95) evaluation metrics. We used Cohen's kappa for intra- and inter-rater agreement and paired t-tests for volume and Dice comparisons. For segmentation accuracy, we achieved Dice coefficients of 88.3% (±7.2%, at case level) with CT, 85.0% (±7.9%) with T1W MRI, and 86.3% (±6.4%) with T2W MRI. There was a high correlation for pancreas volume prediction, with R² of 0.91, 0.84, and 0.85 for CT, T1W, and T2W, respectively. We found moderate inter-observer (0.624 and 0.638 for T1W and T2W MRI, respectively) and high intra-observer agreement scores. All MRI data is made available at https://osf.io/kysnj/. Our source code is available at https://github.com/NUBagciLab/PaNSegNet.
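Illustrative sketch (an assumed form; PanSegNet's exact module may differ): linear attention replaces softmax(QKᵀ)V with φ(Q)(φ(K)ᵀV), reducing the cost from quadratic to linear in the number of tokens, which is what makes volumetric computation tractable.

import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps: float = 1e-6):
    """O(N) attention via the kernel trick, with phi = elu + 1 (a common choice)."""
    q = F.elu(q) + 1.0                       # (B, N, D) non-negative features
    k = F.elu(k) + 1.0
    kv = torch.einsum("bnd,bne->bde", k, v)  # (B, D, E): aggregate keys/values once
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + eps)  # normalizer
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)

# toy volumetric tokens: 4096 voxels with 32-dim features
q = k = v = torch.randn(2, 4096, 32)
print(linear_attention(q, k, v).shape)  # torch.Size([2, 4096, 32])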
Affiliation(s)
- Zheyuan Zhang
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Elif Keles
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Gorkem Durak
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Yavuz Taktak
- Department of Internal Medicine, Istanbul University Faculty of Medicine, Istanbul, Turkey
| | - Onkar Susladkar
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Vandan Gorade
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Debesh Jha
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Asli C Ormeci
- Department of Internal Medicine, Istanbul University Faculty of Medicine, Istanbul, Turkey
| | - Alpay Medetalibeyoglu
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA; Department of Internal Medicine, Istanbul University Faculty of Medicine, Istanbul, Turkey
| | - Lanhong Yao
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Bin Wang
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Ilkin Sevgi Isler
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA; Department of Computer Science, University of Central Florida, Florida, FL, USA
| | - Linkai Peng
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Hongyi Pan
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Camila Lopes Vendrami
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Amir Bourhani
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Yury Velichko
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | | | | | - Ayis Pyrros
- Department of Radiology, Duly Health and Care and Department of Biomedical and Health Information Sciences, University of Illinois Chicago, Chicago, IL, USA
| | - Pallavi Tiwari
- Dept of Biomedical Engineering, University of Wisconsin-Madison, WI, USA
| | - Derk C F Klatte
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology and Metabolism, Amsterdam UMC, University of Amsterdam, Netherlands; Department of Radiology, Mayo Clinic, Jacksonville, FL, USA
| | - Megan Engels
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology and Metabolism, Amsterdam UMC, University of Amsterdam, Netherlands; Department of Radiology, Mayo Clinic, Jacksonville, FL, USA
| | - Sanne Hoogenboom
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology and Metabolism, Amsterdam UMC, University of Amsterdam, Netherlands; Department of Radiology, Mayo Clinic, Jacksonville, FL, USA
| | | | - Emil Agarunov
- Division of Gastroenterology and Hepatology, New York University, NY, USA
| | - Nassier Harfouch
- Department of Radiology, NYU Grossman School of Medicine, New York, NY, USA
| | - Chenchan Huang
- Department of Radiology, NYU Grossman School of Medicine, New York, NY, USA
| | - Marco J Bruno
- Departments of Gastroenterology and Hepatology, Erasmus Medical Center, Rotterdam, Netherlands
| | - Ivo Schoots
- Department of Radiology and Nuclear Medicine, Erasmus University Medical Center, Rotterdam, Netherlands
| | - Rajesh N Keswani
- Departments of Gastroenterology and Hepatology, Northwestern University, IL, USA
| | - Frank H Miller
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Tamas Gonda
- Division of Gastroenterology and Hepatology, New York University, NY, USA
| | - Cemal Yazici
- Division of Gastroenterology and Hepatology, University of Illinois at Chicago, Chicago, IL, USA
| | - Temel Tirkes
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Baris Turkbey
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Michael B Wallace
- Division of Gastroenterology and Hepatology, Mayo Clinic in Florida, Jacksonville, USA
| | - Ulas Bagci
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA.
| |
|
42
|
Zhao T, Gu Y, Yang J, Usuyama N, Lee HH, Kiblawi S, Naumann T, Gao J, Crabtree A, Abel J, Moung-Wen C, Piening B, Bifulco C, Wei M, Poon H, Wang S. A foundation model for joint segmentation, detection and recognition of biomedical objects across nine modalities. Nat Methods 2025; 22:166-176. [PMID: 39558098 DOI: 10.1038/s41592-024-02499-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2024] [Accepted: 10/02/2024] [Indexed: 11/20/2024]
Abstract
Biomedical image analysis is fundamental for biomedical discovery. Holistic image analysis comprises interdependent subtasks such as segmentation, detection and recognition, which are tackled separately by traditional approaches. Here, we propose BiomedParse, a biomedical foundation model that can jointly conduct segmentation, detection and recognition across nine imaging modalities. This joint learning improves the accuracy of the individual tasks and enables new applications such as segmenting all relevant objects in an image through a textual description. To train BiomedParse, we created a large dataset comprising over 6 million triples of image, segmentation mask and textual description by leveraging natural language labels or descriptions accompanying existing datasets. We showed that BiomedParse outperformed existing methods on image segmentation across nine imaging modalities, with larger improvements on objects with irregular shapes. We further showed that BiomedParse can simultaneously segment and label all objects in an image. In summary, BiomedParse is an all-in-one tool for biomedical image analysis on all major image modalities, paving the way for efficient and accurate image-based biomedical discovery.
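Illustrative sketch (file names and prompt templates are invented for illustration): triples of image, mask, and text of the kind used for training can be derived mechanically from an existing dataset's class labels.

import json

# toy example: turn an existing dataset's class labels into (image, mask, text)
# triples, the kind of weak text supervision scaled to millions of examples
samples = [
    {"image": "ct_001.nii.gz", "masks": {"liver": "ct_001_liver.nii.gz",
                                         "tumor": "ct_001_tumor.nii.gz"}},
]
templates = ["{name} in abdominal CT", "segmentation of the {name}"]

triples = []
for s in samples:
    for name, mask_path in s["masks"].items():
        for t in templates:
            triples.append({"image": s["image"], "mask": mask_path,
                            "text": t.format(name=name)})
print(json.dumps(triples[0], indent=2))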
Affiliation(s)
| | - Yu Gu
- Microsoft Research, Redmond, WA, USA
| | | | | | | | | | | | | | - Angela Crabtree
- Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
| | | | | | - Brian Piening
- Providence Genomics, Portland, OR, USA
- Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
| | - Carlo Bifulco
- Providence Genomics, Portland, OR, USA
- Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
| | - Mu Wei
- Microsoft Research, Redmond, WA, USA.
| | | | - Sheng Wang
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA.
- Department of Surgery, University of Washington, Seattle, WA, USA.
| |
|
43
|
Wei H, Zheng T, Zhang X, Zheng C, Jiang D, Wu Y, Lee JM, Bashir MR, Lerner E, Liu R, Wu B, Guo H, Chen Y, Yang T, Gong X, Jiang H, Song B. Deep learning-based 3D quantitative total tumor burden predicts early recurrence of BCLC A and B HCC after resection. Eur Radiol 2025; 35:127-139. [PMID: 39028376 PMCID: PMC11632001 DOI: 10.1007/s00330-024-10941-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2024] [Revised: 05/15/2024] [Accepted: 06/16/2024] [Indexed: 07/20/2024]
Abstract
OBJECTIVES This study aimed to evaluate the potential of deep learning (DL)-assisted automated three-dimensional quantitative tumor burden at MRI to predict postoperative early recurrence (ER) of hepatocellular carcinoma (HCC). MATERIALS AND METHODS This was a single-center retrospective study enrolling patients who underwent resection for BCLC A and B HCC and preoperative contrast-enhanced MRI. Quantitative total tumor volume (cm³) and total tumor burden (TTB, %) were obtained using a DL automated segmentation tool. Radiologists' visual assessment was used to ensure the quality control of the automated segmentation. The prognostic value of clinicopathological variables and tumor burden-related parameters for ER was determined by Cox regression analyses. RESULTS A total of 592 patients were included, with 525 and 67 patients assigned to BCLC A and B, respectively (2-year ER rate: 30.0% vs. 45.3%; hazard ratio (HR) = 1.8; p = 0.007). TTB was the most important predictor of ER (HR = 2.2; p < 0.001). Using a TTB threshold of 6.84%, two ER risk strata were obtained in the overall cohort (p < 0.001), in BCLC A patients (p < 0.001), and in BCLC B patients (p = 0.027). Low-TTB BCLC B patients had a risk of ER similar to that of BCLC A patients and were thus reassigned to a new stage, BCLC An; high-TTB BCLC B patients remained in a BCLC Bn stage. The 2-year ER rate was 30.5% for BCLC An patients vs. 58.1% for BCLC Bn patients (HR = 2.8; p < 0.001). CONCLUSIONS TTB determined by DL-based automated segmentation at MRI was a predictive biomarker for postoperative ER and facilitated refined subcategorization of patients within BCLC stages A and B. CLINICAL RELEVANCE STATEMENT Total tumor burden derived by deep learning-based automated segmentation at MRI may serve as an imaging biomarker for predicting early recurrence, thereby improving subclassification of Barcelona Clinic Liver Cancer A and B hepatocellular carcinoma patients after hepatectomy. KEY POINTS Total tumor burden (TTB) is important for Barcelona Clinic Liver Cancer (BCLC) staging, but is heterogeneous. TTB derived by deep learning-based automated segmentation was predictive of postoperative early recurrence. Incorporating TTB into the BCLC algorithm resulted in successful subcategorization of BCLC A and B patients.
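Illustrative sketch (toy data; lifelines is an assumed tooling choice, not the authors'): dichotomizing patients at the reported 6.84% TTB threshold and comparing recurrence curves with a log-rank test.

import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(1)
n = 200
tumor_vol = rng.uniform(1, 150, n)        # cm^3, from automated segmentation
liver_vol = rng.uniform(1200, 1800, n)    # cm^3
ttb = 100.0 * tumor_vol / liver_vol       # total tumor burden, %

high = ttb > 6.84                         # threshold reported in the paper
time = rng.exponential(24, n)             # toy follow-up, months
event = rng.integers(0, 2, n).astype(bool)  # toy recurrence indicator

res = logrank_test(time[~high], time[high],
                   event_observed_A=event[~high], event_observed_B=event[high])
print(res.p_value)  # toy data, so no real group difference is expected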
Affiliation(s)
- Hong Wei
- Department of Radiology, Functional, and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China
- Department of Radiology, Seoul National University Hospital, Seoul, 03080, Republic of Korea
| | - Tianying Zheng
- Department of Radiology, Functional, and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China
| | | | - Chao Zheng
- Shukun Technology Co., Ltd, Beijing, 100102, China
| | - Difei Jiang
- Shukun Technology Co., Ltd, Beijing, 100102, China
| | - Yuanan Wu
- Big Data Research Center, University of Electronic Science and Technology of China, Chengdu, Sichuan, 610000, China
| | - Jeong Min Lee
- Department of Radiology, Seoul National University Hospital, Seoul, 03080, Republic of Korea
- Department of Radiology, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
| | - Mustafa R Bashir
- Department of Radiology, Duke University Medical Center, Durham, NC, 27710, USA
- Center for Advanced Magnetic Resonance in Medicine, Duke University Medical Center, Durham, NC, 27705, USA
- Division of Gastroenterology, Department of Medicine, Duke University Medical Center, Durham, NC, 27710, USA
| | - Emily Lerner
- Department of Radiology, Duke University Medical Center, Durham, NC, 27710, USA
| | - Rongbo Liu
- Department of Radiology, Functional, and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China
| | - Botong Wu
- Center for Biomedical Imaging Research, Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, 100102, China
| | - Hua Guo
- Center for Biomedical Imaging Research, Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, 100102, China
| | - Yidi Chen
- Department of Radiology, Functional, and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China
| | - Ting Yang
- Department of Radiology, Functional, and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China
| | - Xiaoling Gong
- Department of Radiology, Functional, and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China
| | - Hanyu Jiang
- Department of Radiology, Functional, and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China.
| | - Bin Song
- Department of Radiology, Functional, and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, China.
- Department of Radiology, Sanya People's Hospital, Sanya, Hainan, 572000, China.
| |
|
44
|
Jiang X, Zhang D, Li X, Liu K, Cheng KT, Yang X. Labeled-to-unlabeled distribution alignment for partially-supervised multi-organ medical image segmentation. Med Image Anal 2025; 99:103333. [PMID: 39244795 DOI: 10.1016/j.media.2024.103333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2023] [Revised: 04/17/2024] [Accepted: 08/30/2024] [Indexed: 09/10/2024]
Abstract
Partially-supervised multi-organ medical image segmentation aims to develop a unified semantic segmentation model by utilizing multiple partially-labeled datasets, with each dataset providing labels for a single class of organs. However, the limited availability of labeled foreground organs and the absence of supervision to distinguish unlabeled foreground organs from the background pose a significant challenge, which leads to a distribution mismatch between labeled and unlabeled pixels. Although existing pseudo-labeling methods can be employed to learn from both labeled and unlabeled pixels, they are prone to performance degradation in this task, as they rely on the assumption that labeled and unlabeled pixels have the same distribution. In this paper, to address the problem of distribution mismatch, we propose a labeled-to-unlabeled distribution alignment (LTUDA) framework that aligns feature distributions and enhances discriminative capability. Specifically, we introduce a cross-set data augmentation strategy, which performs region-level mixing between labeled and unlabeled organs to reduce distribution discrepancy and enrich the training set. Besides, we propose a prototype-based distribution alignment method that implicitly reduces intra-class variation and increases the separation between the unlabeled foreground and background. This can be achieved by encouraging consistency between the outputs of two prototype classifiers and a linear classifier. Extensive experimental results on the AbdomenCT-1K dataset and a union of four benchmark datasets (including LiTS, MSD-Spleen, KiTS, and NIH82) demonstrate that our method outperforms the state-of-the-art partially-supervised methods by a considerable margin, and even surpasses the fully-supervised methods. The source code is publicly available at LTUDA.
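Illustrative sketch (the cube-shaped region and its size range are assumptions): cross-set region-level mixing pastes a region from a labeled volume into an unlabeled one, so supervision comes from true labels inside the region and pseudo-labels outside.

import numpy as np

def cross_set_mix(labeled_img, labeled_lbl, unlabeled_img, unlabeled_plbl, rng):
    """CutMix-style region swap between a labeled and an unlabeled volume."""
    d, h, w = labeled_img.shape
    sz = tuple(rng.integers(s // 4, s // 2 + 1) for s in (d, h, w))
    z, y, x = (rng.integers(0, s - c + 1) for s, c in zip((d, h, w), sz))
    img, lbl = unlabeled_img.copy(), unlabeled_plbl.copy()
    sl = np.s_[z:z + sz[0], y:y + sz[1], x:x + sz[2]]
    img[sl] = labeled_img[sl]   # paste the labeled region into the unlabeled volume
    lbl[sl] = labeled_lbl[sl]   # true labels inside, pseudo-labels outside
    return img, lbl

rng = np.random.default_rng(0)
li, ll = rng.normal(size=(64, 64, 64)), rng.integers(0, 5, (64, 64, 64))
ui, ul = rng.normal(size=(64, 64, 64)), rng.integers(0, 5, (64, 64, 64))
mixed_img, mixed_lbl = cross_set_mix(li, ll, ui, ul, rng)
print(mixed_img.shape, np.unique(mixed_lbl).size)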
Affiliation(s)
- Xixi Jiang
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Dong Zhang
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Xiang Li
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Kangyi Liu
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Kwang-Ting Cheng
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Xin Yang
- School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China.
| |
|
45
|
Liu H, Ren P, Yuan Y, Song C, Luo F. Uncertainty Global Contrastive Learning Framework for Semi-Supervised Medical Image Segmentation. IEEE J Biomed Health Inform 2025; 29:433-442. [PMID: 39504281 DOI: 10.1109/jbhi.2024.3492540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2024]
Abstract
Semi-supervised medical image segmentation suffers from fuzzy boundaries between segmented objects: with limited labeled data and interacting boundaries from different objects, classifying boundary pixels becomes challenging. To mitigate this issue, we propose an uncertainty global contrastive learning (UGCL) framework. Specifically, we propose a patch filtering method and a classification entropy filtering method to provide reliable pseudo-labels for unlabelled data, while separating fuzzy boundaries and high-entropy pixels as unreliable points. Considering that unreliable regions contain rich complementary information, we introduce an uncertainty global contrastive learning method to distinguish these challenging unreliable regions, enhancing intra-class compactness and inter-class separability at the global data level. Within our optimization framework, we also integrate consistency regularization techniques and select unreliable points as consistency targets. As demonstrated, the contrastive learning and consistency regularization applied to uncertain points allow us to glean valuable semantic information from unreliable data, which enhances segmentation accuracy. We evaluate our method on two publicly available medical image datasets, compare it with other state-of-the-art semi-supervised medical image segmentation methods, and show through a series of experiments that it achieves substantial improvements.
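Illustrative sketch (the threshold value is an assumption): classification-entropy filtering splits pixels into reliable points, which keep their pseudo-labels, and unreliable points, which feed the contrastive term.

import torch

def split_reliable(logits, tau: float = 0.4):
    """Split pixels into reliable/unreliable by normalized prediction entropy."""
    probs = torch.softmax(logits, dim=1)                      # (B, C, H, W)
    ent = -(probs * torch.log(probs.clamp_min(1e-8))).sum(1)  # per-pixel entropy
    ent = ent / torch.log(torch.tensor(float(logits.shape[1])))  # scale to [0, 1]
    reliable = ent < tau                # low entropy -> trusted pseudo-label
    pseudo = probs.argmax(1)
    return pseudo, reliable, ~reliable  # unreliable pixels feed the contrastive term

logits = torch.randn(2, 4, 64, 64)
pseudo, rel, unrel = split_reliable(logits)
print(rel.float().mean().item())  # fraction of pixels kept as reliable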
|
46
|
Guérendel C, Petrychenko L, Chupetlovska K, Bodalal Z, Beets-Tan RGH, Benson S. Generalizability, robustness, and correction bias of segmentations of thoracic organs at risk in CT images. Eur Radiol 2024:10.1007/s00330-024-11321-2. [PMID: 39738559 DOI: 10.1007/s00330-024-11321-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2024] [Revised: 10/28/2024] [Accepted: 11/28/2024] [Indexed: 01/02/2025]
Abstract
OBJECTIVE This study aims to assess and compare two state-of-the-art deep learning approaches for segmenting four thoracic organs at risk (OAR)-the esophagus, trachea, heart, and aorta-in CT images in the context of radiotherapy planning. MATERIALS AND METHODS We compare a multi-organ segmentation approach and the fusion of multiple single-organ models, each dedicated to one OAR. All were trained using nnU-Net with the default parameters and the full-resolution configuration. We evaluate their robustness to adversarial perturbations and their generalizability on external datasets, and we explore potential biases introduced by expert corrections compared with fully manual delineations. RESULTS The two approaches show excellent performance, with an average Dice score of 0.928 for the multi-class setting and 0.930 when fusing the four single-organ models. The evaluation on external datasets and under common procedural adversarial noise demonstrates the good generalizability of these models. In addition, expert corrections of both models show significant bias toward the original automated segmentations. The average Dice score between the two corrections is 0.93, ranging from 0.88 for the trachea to 0.98 for the heart. CONCLUSION Both approaches demonstrate excellent performance and generalizability in segmenting four thoracic OARs, potentially improving efficiency in radiotherapy planning. However, the multi-organ setting proves advantageous for its efficiency, requiring less training time and fewer resources, making it the preferable choice for this task. Moreover, corrections of AI segmentations by clinicians may introduce bias into the evaluation of AI approaches; a manually annotated test set should be used to assess the performance of such methods. KEY POINTS Question While manual delineation of thoracic organs at risk is labor-intensive, error-prone, and time-consuming, evaluation of AI models performing this task lacks robustness. Findings The deep-learning models using the nnU-Net framework showed excellent performance, generalizability, and robustness in segmenting thoracic organs in CT, enhancing radiotherapy planning efficiency. Clinical relevance Automatic segmentation of thoracic organs at risk can save clinicians time without compromising the quality of the delineations, and extensive evaluation across diverse settings demonstrates the potential of integrating such models into clinical practice.
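Illustrative sketch (the fusion rule is an assumption, not the paper's): merging four single-organ probability maps into one multi-class segmentation and scoring it with per-organ Dice.

import numpy as np

def fuse_single_organ(probs_per_organ, threshold: float = 0.5):
    """Fuse four binary (single-organ) probability maps into one multi-class map.
    Overlaps go to the most confident organ; 0 is background."""
    stacked = np.stack(probs_per_organ)   # (4, D, H, W)
    best = stacked.argmax(0) + 1          # 1..4: most confident organ per voxel
    return np.where(stacked.max(0) > threshold, best, 0)

def dice(pred, gt, cls):
    p, g = pred == cls, gt == cls
    return 2 * (p & g).sum() / max(p.sum() + g.sum(), 1)

rng = np.random.default_rng(0)
probs = [rng.random((8, 32, 32)) for _ in range(4)]  # esophagus/trachea/heart/aorta
gt = rng.integers(0, 5, (8, 32, 32))                 # toy multi-class ground truth
fused = fuse_single_organ(probs)
print([round(dice(fused, gt, c), 3) for c in range(1, 5)])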
Affiliation(s)
- Corentin Guérendel
- Department of Radiology, Antoni van Leeuwenhoek-The Netherlands Cancer Institute, Amsterdam, The Netherlands.
- GROW-Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands.
| | - Liliana Petrychenko
- Department of Radiology, Antoni van Leeuwenhoek-The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Kalina Chupetlovska
- Department of Radiology, Antoni van Leeuwenhoek-The Netherlands Cancer Institute, Amsterdam, The Netherlands
- University Hospital St. Ivan Rilski, Sofia, Bulgaria
| | - Zuhir Bodalal
- Department of Radiology, Antoni van Leeuwenhoek-The Netherlands Cancer Institute, Amsterdam, The Netherlands
- GROW-Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
| | - Regina G H Beets-Tan
- Department of Radiology, Antoni van Leeuwenhoek-The Netherlands Cancer Institute, Amsterdam, The Netherlands
- GROW-Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
| | - Sean Benson
- Department of Radiology, Antoni van Leeuwenhoek-The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Department of Cardiology, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands
| |
|
47
|
Zhang H, Lu Z, Gong P, Zhang S, Yang X, Li X, Feng Z, Li A, Xiao C. High-throughput mesoscopic optical imaging data processing and parsing using differential-guided filtered neural networks. Brain Inform 2024; 11:32. [PMID: 39692944 DOI: 10.1186/s40708-024-00246-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2024] [Accepted: 11/27/2024] [Indexed: 12/19/2024] Open
Abstract
High-throughput mesoscopic optical imaging technology has tremendously boosted the efficiency of procuring massive mesoscopic datasets from mouse brains. Constrained by the imaging field of view, the image strips obtained by such technologies typically require further processing, such as cross-sectional stitching, artifact removal, and signal-area cropping, to meet the requirements of subsequent analysis. However, a single batch of raw array mouse-brain data at a resolution of 0.65 × 0.65 × 3 μm³ can reach 220 TB, and cropping of the outer contour areas still relies on manual visual inspection, which consumes substantial computational resources and labor. In this paper, we design an efficient deep differential guided filtering module (DDGF) by fusing multi-scale iterative differential guided filtering with deep learning, which effectively refines image details while mitigating background noise. Subsequently, by combining DDGF with a deep learning network, we propose a lightweight deep differential guided filtering segmentation network (DDGF-SegNet), which demonstrates robust performance on our dataset, achieving a Dice of 0.92, precision of 0.98, recall of 0.91, and Jaccard index of 0.86. Building on the segmentation, we use connectivity analysis to ascertain the three-dimensional spatial orientation of each brain within the array. Furthermore, we streamline the entire processing workflow by developing an automated pipeline optimized for cluster-based message passing interface (MPI) parallel computation, which reduces the processing time for a mouse-brain dataset to a mere 1.1 h, improving manual efficiency 25-fold and overall data-processing efficiency 2.4-fold, paving the way for more efficient big-data processing and parsing in high-throughput mesoscopic optical imaging.
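For intuition, the basic guided-filtering step that DDGF builds on (He et al.'s guided filter) fits in a few lines; the iterative, multi-scale differential variant would repeat such a step at several radii.

import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, radius: int = 4, eps: float = 1e-2):
    """One guided-filtering step: edge-preserving smoothing of `src` steered by
    `guide`, via a local linear model src ~ a * guide + b in each window."""
    size = 2 * radius + 1
    mean = lambda a: uniform_filter(a, size)
    mg, ms = mean(guide), mean(src)
    cov = mean(guide * src) - mg * ms
    var = mean(guide * guide) - mg * mg
    a = cov / (var + eps)
    b = ms - a * mg
    return mean(a) * guide + mean(b)

rng = np.random.default_rng(0)
img = rng.random((128, 128))
smoothed = guided_filter(img, img)  # self-guided: edge-preserving denoising
print(smoothed.shape)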
Affiliation(s)
- Hong Zhang
- Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Sanya, 572025, China
| | - Zhikang Lu
- Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Sanya, 572025, China
| | - Peicong Gong
- Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Sanya, 572025, China
| | - Shilong Zhang
- Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Sanya, 572025, China
| | - Xiaoquan Yang
- Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Sanya, 572025, China
- HUST-Suzhou Institute for Brainsmatics, JITRI, Suzhou, 215123, China
| | - Xiangning Li
- Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Sanya, 572025, China
- HUST-Suzhou Institute for Brainsmatics, JITRI, Suzhou, 215123, China
| | - Zhao Feng
- Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Sanya, 572025, China
- HUST-Suzhou Institute for Brainsmatics, JITRI, Suzhou, 215123, China
| | - Anan Li
- Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Sanya, 572025, China
- HUST-Suzhou Institute for Brainsmatics, JITRI, Suzhou, 215123, China
| | - Chi Xiao
- Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Sanya, 572025, China.
| |
|
48
|
Li C, Wang R, He P, Chen W, Wu W, Wu Y. Segmentation prompts classification: A nnUNet-based 3D transfer learning framework with ROI tokenization and cross-task attention for esophageal cancer T-stage diagnosis. EXPERT SYSTEMS WITH APPLICATIONS 2024; 258:125067. [DOI: 10.1016/j.eswa.2024.125067] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/04/2025]
|
49
|
Azad R, Aghdam EK, Rauland A, Jia Y, Avval AH, Bozorgpour A, Karimijafarbigloo S, Cohen JP, Adeli E, Merhof D. Medical Image Segmentation Review: The Success of U-Net. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:10076-10095. [PMID: 39167505 DOI: 10.1109/tpami.2024.3435571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/23/2024]
Abstract
Automatic medical image segmentation is a crucial topic in the medical domain and, consequently, a critical component of the computer-aided diagnosis paradigm. U-Net is the most widespread image segmentation architecture due to its flexibility, optimized modular design, and success in all medical image modalities. Over the years, the U-Net model has received tremendous attention from academic and industrial researchers who have extended it to address the scale and complexity created by medical tasks. These extensions commonly enhance the U-Net's backbone, bottleneck, or skip connections, add representation learning, combine it with a Transformer architecture, or address probabilistic prediction of the segmentation map. A compendium of previously proposed U-Net variants makes it easier for machine learning researchers to identify relevant research questions and understand the demands that biological tasks place on the model. In this work, we discuss the practical aspects of the U-Net model and organize each variant into a taxonomy. Moreover, to measure the performance of these strategies in clinical application, we propose fair evaluations of some unique and famous designs on well-known datasets. Furthermore, we provide a comprehensive implementation library with trained models. In addition, for ease of future studies, we created an online list of U-Net papers with their official implementations where available.
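For reference, the canonical design this review is organized around fits in a short sketch: a minimal two-level U-Net with skip connections (depth, width, and the absence of normalization are illustrative choices).

import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Minimal U-Net: encoder, bottleneck, decoder with skip connections."""
    def __init__(self, in_ch=1, n_classes=2, width=16):
        super().__init__()
        self.enc1, self.enc2 = block(in_ch, width), block(width, width * 2)
        self.pool = nn.MaxPool2d(2)
        self.bott = block(width * 2, width * 4)
        self.up2 = nn.ConvTranspose2d(width * 4, width * 2, 2, stride=2)
        self.dec2 = block(width * 4, width * 2)
        self.up1 = nn.ConvTranspose2d(width * 2, width, 2, stride=2)
        self.dec1 = block(width * 2, width)
        self.head = nn.Conv2d(width, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bott(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return self.head(d1)

print(TinyUNet()(torch.randn(1, 1, 64, 64)).shape)  # torch.Size([1, 2, 64, 64])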
|
50
|
Verbakel J, Boot MR, van der Gaast N, Dunning H, Bakker M, Jaarsma RL, Doornberg JN, Edwards MJR, van de Groes SAW, Hermans E. Symmetry of the left and right tibial plafond; a comparison of 75 distal tibia pairs. Eur J Trauma Emerg Surg 2024; 50:2877-2882. [PMID: 38874625 PMCID: PMC11666608 DOI: 10.1007/s00068-024-02568-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2023] [Accepted: 05/30/2024] [Indexed: 06/15/2024]
Abstract
PURPOSE Tibial plafond or pilon fractures present a high level of complexity, making their surgical management challenging. Three-Dimensional Virtual Planning (3DVP) can assist in preoperative planning to achieve optimal fracture reduction. This study aimed to assess the symmetry of the left and right tibial plafond and whether left-right mirroring can be used reliably. METHODS Bilateral CT scans of the lower limbs of 75 patients without ankle problems or prior fractures of the lower limb were included. The CT images were segmented to create 3D surface models of the tibia. Subsequently, the left tibial models were mirrored and superimposed onto the right tibial models using a Coherent Point Drift surface-matching algorithm. The tibias were then cut to create bone models of the distal tibia with a height of 30 mm, and correspondence points were established. The Euclidean distance between correspondence points was calculated and visualized in a boxplot and heatmaps. The articulating surface was selected as a region of interest. RESULTS The median left-right difference was 0.57 mm (IQR, 0.38-0.85 mm) for the entire tibial plafond and 0.53 mm (IQR, 0.37-0.76 mm) for the articulating surface. The areas with the greatest left-right differences were the medial malleolus and the anterior tubercle of the tibial plafond. CONCLUSION The tibial plafond exhibits a high degree of bilateral symmetry. Therefore, the mirrored unfractured tibial plafond may be used as a template to optimize preoperative surgical reduction using 3DVP techniques in patients with pilon fractures.
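Illustrative sketch (toy point clouds; the study registers surfaces with Coherent Point Drift before measuring): mirroring one side across the sagittal plane and taking nearest-point Euclidean distances as a simple stand-in for correspondence-point distances.

import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
right = rng.normal(size=(2000, 3))       # toy vertices of a right distal tibia
left = right.copy(); left[:, 0] *= -1    # toy left side (perfectly symmetric here)

mirrored_left = left.copy(); mirrored_left[:, 0] *= -1  # mirror across sagittal plane
# after surface registration, measure nearest-point Euclidean distances
dists, _ = cKDTree(right).query(mirrored_left)
print(np.median(dists), np.percentile(dists, [25, 75]))  # cf. 0.57 mm (IQR 0.38-0.85)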
Affiliation(s)
- Joy Verbakel
- Department of Trauma Surgery, Radboud University Medical Center, Geert Grooteplein Zuid, 6525 GA, Nijmegen, The Netherlands.
| | - Miriam R Boot
- Orthopaedic Research Laboratory, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Nynke van der Gaast
- Department of Trauma Surgery, Radboud University Medical Center, Geert Grooteplein Zuid, 6525 GA, Nijmegen, The Netherlands
| | - Hans Dunning
- Orthopaedic Research Laboratory, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Max Bakker
- Orthopaedic Research Laboratory, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Ruurd L Jaarsma
- Department of Orthopaedic & Trauma Surgery, Flinders University and Flinders Medical Centre, Adelaide, Australia
| | - Job N Doornberg
- Department of Orthopaedic & Trauma Surgery, Flinders University and Flinders Medical Centre, Adelaide, Australia
- Department of Orthopaedic Surgery, University Medical Center Groningen, Groningen, The Netherlands
| | - Michael J R Edwards
- Department of Trauma Surgery, Radboud University Medical Center, Geert Grooteplein Zuid, 6525 GA, Nijmegen, The Netherlands
| | | | - Erik Hermans
- Department of Trauma Surgery, Radboud University Medical Center, Geert Grooteplein Zuid, 6525 GA, Nijmegen, The Netherlands
| |
|