1
Ramanarayanan S, G S R, Fahim MA, Ram K, Venkatesan R, Sivaprakasam M. SHFormer: Dynamic spectral filtering convolutional neural network and high-pass kernel generation transformer for adaptive MRI reconstruction. Neural Netw 2025;187:107334. [PMID: 40086134] [DOI: 10.1016/j.neunet.2025.107334]
Abstract
Attention Mechanism (AM) selectively focuses on essential information for imaging tasks and captures relationships between regions from distant pixel neighborhoods to compute feature representations. Accelerated magnetic resonance image (MRI) reconstruction can benefit from AM, as the imaging process involves acquiring Fourier domain measurements that influence the image representation in a non-local manner. However, AM-based models are more adept at capturing low-frequency information and have limited capacity in constructing high-frequency representations, restricting the models to smooth reconstruction. Secondly, AM-based models need mode-specific retraining for multimodal MRI data as their knowledge is restricted to local contextual variations within modes that might be inadequate to capture the diverse transferable features across heterogeneous data domains. To address these challenges, we propose a neuromodulation-based discriminative multi-spectral AM for scalable MRI reconstruction, that can (i) propagate the context-aware high-frequency details for high-quality image reconstruction, and (ii) capture features reusable to deviated unseen domains in multimodal MRI, to offer high practical value for the healthcare industry and researchers. The proposed network consists of a spectral filtering convolutional neural network to capture mode-specific transferable features to generalize to deviated MRI data domains and a dynamic high-pass kernel generation transformer that focuses on high-frequency details for improved reconstruction. We have evaluated our model on various aspects, such as comparative studies in supervised and self-supervised learning, diffusion model-based training, closed-set and open-set generalization under heterogeneous MRI data, and interpretation-based analysis. Our results show that the proposed method offers scalable and high-quality reconstruction with best improvement margins of ∼1 dB in PSNR and ∼0.01 in SSIM under unseen scenarios. 
Our code is available at https://github.com/sriprabhar/SHFormer.
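The high-frequency emphasis described above can be illustrated in isolation. The sketch below applies a fixed Gaussian high-pass mask in the Fourier domain of a 2D array; SHFormer instead generates such kernels dynamically per input, so the mask shape, `sigma`, and function names here are illustrative assumptions only, not the paper's code.

```python
import numpy as np

def highpass_filter(image: np.ndarray, sigma: float = 4.0) -> np.ndarray:
    """Suppress low spatial frequencies of a 2D image in the Fourier domain.

    A fixed Gaussian high-pass mask stands in for SHFormer's learned,
    input-dependent kernels (hypothetical simplification).
    """
    h, w = image.shape
    ky = np.fft.fftfreq(h)[:, None]          # vertical spatial frequencies
    kx = np.fft.fftfreq(w)[None, :]          # horizontal spatial frequencies
    radius2 = ky**2 + kx**2
    # Gaussian high-pass: 0 at DC, approaching 1 at high frequencies.
    mask = 1.0 - np.exp(-radius2 / (2 * (sigma / max(h, w)) ** 2))
    spectrum = np.fft.fft2(image)
    return np.real(np.fft.ifft2(spectrum * mask))

# A constant image carries only the DC component, so the output is ~0.
flat = np.ones((32, 32))
edges = highpass_filter(flat)
print(np.allclose(edges, 0.0, atol=1e-8))  # True
```

The zero output for a flat input is the point: only spatial detail survives the filter, which is exactly the information the abstract says attention-based models tend to lose.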
Affiliation(s)
- Sriprabha Ramanarayanan
- Department of Electrical Engineering, Indian Institute of Technology Madras (IITM), India; Healthcare Technology Innovation Centre, IITM, India.
- Rahul G S
- Department of Electrical Engineering, Indian Institute of Technology Madras (IITM), India
- Mohammad Al Fahim
- Department of Electrical Engineering, Indian Institute of Technology Madras (IITM), India; Healthcare Technology Innovation Centre, IITM, India
- Keerthi Ram
- Healthcare Technology Innovation Centre, IITM, India
- Mohanasankar Sivaprakasam
- Department of Electrical Engineering, Indian Institute of Technology Madras (IITM), India; Healthcare Technology Innovation Centre, IITM, India
2
Berezhnoy AK, Kalinin AS, Parshin DA, Selivanov AS, Demin AG, Zubov AG, Shaidullina RS, Aitova AA, Slotvitsky MM, Kalemberg AA, Kirillova VS, Syrovnev VA, Agladze KI, Tsvelaya VA. The impact of training image quality with a novel protocol on artificial intelligence-based LGE-MRI image segmentation for potential atrial fibrillation management. Comput Methods Programs Biomed 2025;264:108722. [PMID: 40112687] [DOI: 10.1016/j.cmpb.2025.108722]
Abstract
BACKGROUND Atrial fibrillation (AF) is the most common cardiac arrhythmia, affecting up to 2% of the population. Catheter ablation is a promising treatment for AF, particularly for paroxysmal AF patients, but it often has high recurrence rates. Developing in silico models of patients' atria during the ablation procedure using cardiac MRI data may help reduce these rates. OBJECTIVE This study aims to develop an effective automated deep learning-based segmentation pipeline by compiling a specialized dataset and employing standardized labeling protocols to improve segmentation accuracy and efficiency. In doing so, we aim to achieve the highest possible accuracy and generalization ability while minimizing the burden on clinicians involved in manual data segmentation. METHODS We collected LGE-MRI data from VMRC and the cDEMRIS database. Two specialists manually labeled the data using standardized protocols to reduce subjective errors. Neural network (nnU-Net and smpU-Net++) performance was evaluated using statistical tests, including sensitivity and specificity analysis. A new database of LGE-MRI images, based on manual segmentation, was created (VMRC). RESULTS Our approach with consistent labeling protocols achieved a Dice coefficient of 92.4% ± 0.8% for the cavity and 64.5% ± 1.9% for LA walls. Using the pre-trained RIFE model, we attained a Dice score of approximately 89.1% ± 1.6% for atrial LGE-MRI imputation, outperforming classical methods. Sensitivity and specificity values demonstrated substantial enhancement in the performance of neural networks trained with the new protocol. CONCLUSION Standardized labeling and RIFE applications significantly improved machine learning tool efficiency for constructing 3D LA models. This novel approach supports integrating state-of-the-art machine learning methods into broader in silico pipelines for predicting ablation outcomes in AF patients.
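The Dice coefficient reported above is a standard overlap metric between a predicted and a reference mask; a minimal sketch of the generic formula (not the paper's evaluation code):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Dice similarity coefficient between two binary masks: 2|A∩B| / (|A|+|B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))

# Two 4x4 masks with 4 positive pixels each, overlapping in 2 of them.
a = np.zeros((4, 4)); a[0, :4] = 1            # row 0 fully positive
b = np.zeros((4, 4)); b[0, 2:] = 1; b[1, :2] = 1  # 2 pixels overlap with a
print(round(dice_coefficient(a, b), 3))  # 0.5
```

A perfect match yields 1.0 and disjoint masks yield 0.0, which is why the ~92% cavity score versus ~64% wall score above indicates the thin LA walls remain the hard part.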
Affiliation(s)
- A K Berezhnoy
- Laboratory of Experimental and Cellular Medicine, Moscow Institute of Physics and Technology, Dolgoprudny, 141701, Russia; M. F. Vladimirsky Moscow Regional Research Clinical Institute, Moscow, 129110, Russia; ITMO University, Kronverksky Pr. 49, bldg. A, St. Petersburg, 197101, Russia.
- A S Kalinin
- Laboratory of Experimental and Cellular Medicine, Moscow Institute of Physics and Technology, Dolgoprudny, 141701, Russia
- D A Parshin
- ITMO University, Kronverksky Pr. 49, bldg. A, St. Petersburg, 197101, Russia
- A S Selivanov
- Laboratory of Experimental and Cellular Medicine, Moscow Institute of Physics and Technology, Dolgoprudny, 141701, Russia
- A G Demin
- ITMO University, Kronverksky Pr. 49, bldg. A, St. Petersburg, 197101, Russia
- A G Zubov
- ITMO University, Kronverksky Pr. 49, bldg. A, St. Petersburg, 197101, Russia
- R S Shaidullina
- ITMO University, Kronverksky Pr. 49, bldg. A, St. Petersburg, 197101, Russia
- A A Aitova
- Laboratory of Experimental and Cellular Medicine, Moscow Institute of Physics and Technology, Dolgoprudny, 141701, Russia; M. F. Vladimirsky Moscow Regional Research Clinical Institute, Moscow, 129110, Russia; ITMO University, Kronverksky Pr. 49, bldg. A, St. Petersburg, 197101, Russia
- M M Slotvitsky
- Laboratory of Experimental and Cellular Medicine, Moscow Institute of Physics and Technology, Dolgoprudny, 141701, Russia; M. F. Vladimirsky Moscow Regional Research Clinical Institute, Moscow, 129110, Russia; ITMO University, Kronverksky Pr. 49, bldg. A, St. Petersburg, 197101, Russia
- A A Kalemberg
- M. F. Vladimirsky Moscow Regional Research Clinical Institute, Moscow, 129110, Russia
- V S Kirillova
- Federal State Budgetary Institution "National Medical Research Center named after Academician E.N. Meshalkin" of the Ministry of Health of the Russian Federation, Novosibirsk, 630007, Russia
- V A Syrovnev
- Federal State Budgetary Institution "Clinical Hospital No. 1" of the Office of the President of the Russian Federation, Moscow, 121352, Russia
- K I Agladze
- Laboratory of Experimental and Cellular Medicine, Moscow Institute of Physics and Technology, Dolgoprudny, 141701, Russia; M. F. Vladimirsky Moscow Regional Research Clinical Institute, Moscow, 129110, Russia
- V A Tsvelaya
- Laboratory of Experimental and Cellular Medicine, Moscow Institute of Physics and Technology, Dolgoprudny, 141701, Russia; M. F. Vladimirsky Moscow Regional Research Clinical Institute, Moscow, 129110, Russia; ITMO University, Kronverksky Pr. 49, bldg. A, St. Petersburg, 197101, Russia.
3
Jia X, Wang W, Zhang M, Zhao B. Atten-Nonlocal Unet: Attention and Non-local Unet for medical image segmentation. Comput Biol Med 2025;191:110129. [PMID: 40239230] [DOI: 10.1016/j.compbiomed.2025.110129]
Abstract
The convolutional neural network (CNN)-based models have emerged as the predominant approach for medical image segmentation due to their effective inductive bias. However, their limitation lies in the lack of long-range information. In this study, we propose the Atten-Nonlocal Unet model that integrates CNN and transformer to overcome this limitation and precisely capture global context in 2D features. Specifically, we utilize the BCSM attention module and the Cross Non-local module to enhance feature representation, thereby improving the segmentation accuracy. Experimental results on the Synapse, ACDC, and AVT datasets show that Atten-Nonlocal Unet achieves DSC scores of 84.15%, 91.57%, and 86.94%, and 95% Hausdorff distances (HD95) of 15.17, 1.16, and 4.78, respectively. Compared to existing methods for medical image segmentation, the proposed method demonstrates superior segmentation performance, ensuring high accuracy in segmenting large organs while improving segmentation for small organs.
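A non-local operation in the spirit of the Cross Non-local module lets every spatial position attend to every other. The sketch below uses identity embeddings in place of the learned theta/phi/g projections, so it is a simplified stand-in for the general technique, not the paper's module:

```python
import numpy as np

def non_local_block(x: np.ndarray) -> np.ndarray:
    """Minimal non-local operation over a (C, H, W) feature map.

    Each of the N = H*W positions aggregates features from all others,
    weighted by softmax-normalized pairwise similarity (identity
    embeddings; a hypothetical simplification of the learned module).
    """
    c, h, w = x.shape
    feats = x.reshape(c, h * w).T               # (N, C) position features
    logits = feats @ feats.T / np.sqrt(c)       # pairwise similarity
    logits -= logits.max(axis=1, keepdims=True) # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)     # row-stochastic attention
    return (attn @ feats).T.reshape(c, h, w)    # globally aggregated map

x = np.random.default_rng(0).normal(size=(8, 4, 4))
y = non_local_block(x)
print(y.shape)  # (8, 4, 4)
```

The quadratic cost in N is why such blocks are usually inserted at coarse decoder stages rather than at full resolution.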
Affiliation(s)
- Xiaofen Jia
- School of Artificial Intelligence, Anhui University of Science and Technology, Huainan, 232001, China.
- Wenjie Wang
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, 232001, China.
- Mei Zhang
- Sleep Medicine Center in High-tech District Hospital and Department of Neurology, First Affiliated Hospital of Anhui University of Science and Technology, First People's Hospital of Huainan, Huainan, 232000, China.
- Baiting Zhao
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, 232001, China.
4
Qin J, Xiong J, Liang Z. CNN-Transformer gated fusion network for medical image super-resolution. Sci Rep 2025;15:15338. [PMID: 40316611] [PMCID: PMC12048642] [DOI: 10.1038/s41598-025-00119-x]
Abstract
To address image detail blurring and the underuse of global information in existing medical image super-resolution reconstruction, this paper proposes a dual-branch fusion network (CTGFSR) combining a residual Transformer network with a dynamic convolutional neural network. The network consists of two branches: a global branch based on the residual Transformer network and a local branch based on the dynamic convolutional neural network. The global branch uses the Transformer's self-attention mechanism to effectively mine large-scale global information in the image and improve overall image quality. The local branch exploits dynamic convolution, which adaptively adjusts the convolution kernel parameters, to strengthen multi-scale feature extraction and detail restoration without significantly increasing the network model size. Residual skip connections preserve detail information during super-resolution reconstruction. Finally, a bidirectional gated attention mechanism fuses the two branches to obtain the final super-resolution image. Performance is evaluated on two medical image datasets: the ACDC cardiac MR dataset associated with medical image segmentation and the L2R2022 lung CT dataset associated with registration. Experimental results show that CTGFSR outperforms mainstream super-resolution algorithms overall. At magnification factors of 2 and 4, CTGFSR improves both structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) over the CNN-based CFIPC, PDCNCF, ESPCN, FSRCNN, and VDSR, and over the Transformer-based ESRT and SwinIR.
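The gating idea can be sketched as a convex, element-wise combination of the two branch outputs. The per-channel gate weight `w_gate` and the single-gate form below are hypothetical simplifications of the paper's bidirectional gated attention:

```python
import numpy as np

def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(local_feat: np.ndarray, global_feat: np.ndarray,
                 w_gate: np.ndarray) -> np.ndarray:
    """Fuse CNN (local) and Transformer (global) branch features.

    A sigmoid gate in (0, 1) decides, per element, how much of each
    branch to keep — a simplified stand-in for bidirectional gated
    attention (w_gate is a hypothetical per-channel gate weight).
    """
    gate = sigmoid(w_gate * (local_feat + global_feat))
    return gate * local_feat + (1.0 - gate) * global_feat

rng = np.random.default_rng(1)
local_feat = rng.normal(size=(16, 8, 8))    # CNN branch output
global_feat = rng.normal(size=(16, 8, 8))   # Transformer branch output
fused = gated_fusion(local_feat, global_feat, w_gate=np.ones((16, 1, 1)))
print(fused.shape)  # (16, 8, 8)
```

Because the gate is a convex weight, every fused value lies between the two branch values, so neither branch can be entirely overwritten.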
Affiliation(s)
- Juanjuan Qin
- Department of Artificial Intelligence and Data Science, Guangzhou Xinhua University, Dongguan, 523133, Guangdong, China.
- Jian Xiong
- Department of Artificial Intelligence and Data Science, Guangzhou Xinhua University, Dongguan, 523133, Guangdong, China.
- Zhantu Liang
- Department of Artificial Intelligence and Data Science, Guangzhou Xinhua University, Dongguan, 523133, Guangdong, China
5
He A, Wu Y, Wang Z, Li T, Fu H. DVPT: Dynamic Visual Prompt Tuning of large pre-trained models for medical image analysis. Neural Netw 2025;185:107168. [PMID: 39827840] [DOI: 10.1016/j.neunet.2025.107168]
Abstract
Pre-training and fine-tuning have become popular due to the rich representations embedded in large pre-trained models, which can be leveraged for downstream medical tasks. However, existing methods typically either fine-tune all parameters or only task-specific layers of pre-trained models, overlooking the variability in input medical images. As a result, these approaches may lack efficiency or effectiveness. In this study, our goal is to explore parameter-efficient fine-tuning (PEFT) for medical image analysis. To address this challenge, we introduce a novel method called Dynamic Visual Prompt Tuning (DVPT). It can extract knowledge beneficial to downstream tasks from large models with only a few trainable parameters. First, the frozen features are transformed by a lightweight bottleneck layer to learn the domain-specific distribution of downstream medical tasks. Then, a few learnable visual prompts are employed as dynamic queries to conduct cross-attention with the transformed features, aiming to acquire sample-specific features. This DVPT module can be shared across different Transformer layers, further reducing the number of trainable parameters. We conduct extensive experiments with various pre-trained models on medical classification and segmentation tasks. We find that this PEFT method not only efficiently adapts pre-trained models to the medical domain but also enhances data efficiency with limited labeled data. For example, with only 0.5% additional trainable parameters, our method not only outperforms state-of-the-art PEFT methods but also surpasses full fine-tuning by more than 2.20% in Kappa score on the medical classification task. It can save up to 60% of labeled data and 99% of storage cost of ViT-B/16.
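The prompt-as-query step can be sketched as plain cross-attention in which a few learnable prompt tokens query the frozen backbone features; identity Q/K/V projections are used here, so names and shapes are illustrative assumptions, not DVPT's actual code:

```python
import numpy as np

def prompt_cross_attention(prompts: np.ndarray, feats: np.ndarray) -> np.ndarray:
    """Learnable prompts (P, D) act as queries over frozen features (N, D).

    Returns one sample-specific feature per prompt token. Identity
    projections replace the learned Q/K/V matrices — a hedged sketch of
    the cross-attention step, not the method's implementation.
    """
    d = prompts.shape[-1]
    logits = prompts @ feats.T / np.sqrt(d)       # (P, N) similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over positions
    return attn @ feats                           # (P, D) pooled features

rng = np.random.default_rng(2)
prompts = rng.normal(size=(4, 32))    # 4 learnable prompt tokens
feats = rng.normal(size=(196, 32))    # frozen patch features (e.g. 14x14)
out = prompt_cross_attention(prompts, feats)
print(out.shape)  # (4, 32)
```

Only the prompt tokens (and any small adapter layers) would be trained, which is how the parameter count stays a fraction of a percent of the backbone.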
Affiliation(s)
- Along He
- College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, Tianjin, 300350, China
- Yanlin Wu
- College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, Tianjin, 300350, China
- Zhihong Wang
- College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, Tianjin, 300350, China
- Tao Li
- College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, Tianjin, 300350, China.
- Huazhu Fu
- Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), 138632, Singapore
6
Tang X, Li J, Liu Q, Zhou C, Zeng P, Meng Y, Xu J, Tian G, Yang J. SWMA-UNet: Multi-Path Attention Network for Improved Medical Image Segmentation. IEEE J Biomed Health Inform 2025;29:3609-3618. [PMID: 40030824] [DOI: 10.1109/jbhi.2024.3523492]
Abstract
In recent years, deep learning has achieved significant advancements in medical image segmentation. Research has found that integrating Transformers and CNNs effectively addresses the limitations of CNNs in managing long-distance dependencies and understanding global information. However, existing models typically employ a serial approach to combine Transformers and CNNs, which complicates the simultaneous processing of global and local information. To address this, our study proposes a parallel multi-path attention architecture, SWMA-UNET, that integrates Transformers and CNNs. This architecture deeply mines features through parallel strategies while capturing both local details and global context, thereby enhancing the accuracy of medical image segmentation. Experimental results indicate that our method surpasses all previously reported methods in the literature on the Synapse, ACDC, ISIC 2018, and MoNuSeg datasets.
7
Koehler S, Kuhm J, Huffaker T, Young D, Tandon A, André F, Frey N, Greil G, Hussain T, Engelhardt S. Deep Learning-based Aligned Strain from Cine Cardiac MRI for Detection of Fibrotic Myocardial Tissue in Patients with Duchenne Muscular Dystrophy. Radiol Artif Intell 2025;7:e240303. [PMID: 40008976] [DOI: 10.1148/ryai.240303]
Abstract
Purpose To develop a deep learning (DL) model that derives aligned strain values from cine (noncontrast) cardiac MRI and evaluate performance of these values to predict myocardial fibrosis in patients with Duchenne muscular dystrophy (DMD). Materials and Methods This retrospective study included 139 male patients with DMD who underwent cardiac MRI at a single center between February 2018 and April 2023. A DL pipeline was developed to detect five key frames throughout the cardiac cycle and respective dense deformation fields, allowing for phase-specific strain analysis across patients and from one key frame to the next. Effectiveness of these strain values in identifying abnormal deformations associated with fibrotic segments was evaluated in 57 patients (mean age [± SD], 15.2 years ± 3.1), and reproducibility was assessed in 82 patients by comparing the study method with existing feature-tracking and DL-based methods. Statistical analysis compared strain values using t tests, mixed models, and more than 2000 machine learning models; accuracy, F1 score, sensitivity, and specificity are reported. Results DL-based aligned strain identified five times more differences (29 vs five; P < .01) between fibrotic and nonfibrotic segments compared with traditional strain values and identified abnormal diastolic deformation patterns often missed with traditional methods. In addition, aligned strain values enhanced performance of predictive models for myocardial fibrosis detection, improving specificity by 40%, overall accuracy by 17%, and accuracy in patients with preserved ejection fraction by 61%. Conclusion The proposed aligned strain technique enables motion-based detection of myocardial dysfunction at noncontrast cardiac MRI, facilitating detailed interpatient strain analysis and allowing precise tracking of disease progression in DMD. 
Keywords: Pediatrics, Image Postprocessing, Heart, Cardiac, Convolutional Neural Network (CNN), Duchenne Muscular Dystrophy. Supplemental material is available for this article. © RSNA, 2025.
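Strain can be read off a dense deformation field with the standard Green-Lagrange formula from continuum mechanics. The sketch below applies that generic definition to a synthetic 2D displacement field; it illustrates the strain-from-deformation-field step, not the paper's actual pipeline:

```python
import numpy as np

def green_lagrange_strain(ux: np.ndarray, uy: np.ndarray):
    """Green-Lagrange strain components E11, E12, E22 from a dense 2D
    displacement field (ux, uy), each of shape (H, W).

    F = I + du/dX is the deformation gradient; E = 0.5 * (F^T F - I).
    """
    dux_dy, dux_dx = np.gradient(ux)   # axis 0 = y, axis 1 = x
    duy_dy, duy_dx = np.gradient(uy)
    f11, f12 = 1.0 + dux_dx, dux_dy
    f21, f22 = duy_dx, 1.0 + duy_dy
    e11 = 0.5 * (f11 * f11 + f21 * f21 - 1.0)
    e22 = 0.5 * (f12 * f12 + f22 * f22 - 1.0)
    e12 = 0.5 * (f11 * f12 + f21 * f22)
    return e11, e12, e22

# 10% uniform stretch along x: E11 should be 0.5 * (1.1^2 - 1) = 0.105.
ys, xs = np.mgrid[0:16, 0:16].astype(float)
e11, e12, e22 = green_lagrange_strain(0.1 * xs, np.zeros_like(xs))
print(round(float(e11.mean()), 4))  # 0.105
```

Evaluating such tensors at aligned key frames is what makes phase-specific, interpatient strain comparison possible.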
Affiliation(s)
- Sven Koehler
- Department of Internal Medicine III, Heidelberg University Hospital, Im Neuenheimer Feld 410, 69120 Heidelberg, Germany
- German Centre for Cardiovascular Research (DZHK), Partner Sites Heidelberg and Mannheim, Germany
- Medical Faculty of University Heidelberg, Heidelberg University, Heidelberg, Germany
- Julian Kuhm
- Department of Internal Medicine III, Heidelberg University Hospital, Im Neuenheimer Feld 410, 69120 Heidelberg, Germany
- German Centre for Cardiovascular Research (DZHK), Partner Sites Heidelberg and Mannheim, Germany
- Tyler Huffaker
- Division of Pediatric Cardiology, Department of Pediatrics, UT Southwestern/Children's Health, Dallas, Tex
- Daniel Young
- Division of Pediatric Cardiology, Department of Pediatrics, UT Southwestern/Children's Health, Dallas, Tex
- Animesh Tandon
- Department of Heart, Vascular, and Thoracic, Children's Institute; Cleveland Clinic Children's Centre for Artificial Intelligence (C4AI); and Cardiovascular Innovation Research Centre, Cleveland Children's Clinic, Cleveland, Ohio
- Department of Biomedical Engineering, Case School of Engineering, Case Western Reserve University, Cleveland, Ohio
- Florian André
- Department of Internal Medicine III, Heidelberg University Hospital, Im Neuenheimer Feld 410, 69120 Heidelberg, Germany
- German Centre for Cardiovascular Research (DZHK), Partner Sites Heidelberg and Mannheim, Germany
- Medical Faculty of University Heidelberg, Heidelberg University, Heidelberg, Germany
- Norbert Frey
- Department of Internal Medicine III, Heidelberg University Hospital, Im Neuenheimer Feld 410, 69120 Heidelberg, Germany
- German Centre for Cardiovascular Research (DZHK), Partner Sites Heidelberg and Mannheim, Germany
- Medical Faculty of University Heidelberg, Heidelberg University, Heidelberg, Germany
- Gerald Greil
- Division of Pediatric Cardiology, Department of Pediatrics, UT Southwestern/Children's Health, Dallas, Tex
- Tarique Hussain
- Division of Pediatric Cardiology, Department of Pediatrics, UT Southwestern/Children's Health, Dallas, Tex
- Sandy Engelhardt
- Department of Internal Medicine III, Heidelberg University Hospital, Im Neuenheimer Feld 410, 69120 Heidelberg, Germany
- German Centre for Cardiovascular Research (DZHK), Partner Sites Heidelberg and Mannheim, Germany
- Medical Faculty of University Heidelberg, Heidelberg University, Heidelberg, Germany
8
Zhang L, Wu F, Bronik K, Papiez BW. DiffuSeg: Domain-Driven Diffusion for Medical Image Segmentation. IEEE J Biomed Health Inform 2025;29:3619-3631. [PMID: 40030962] [DOI: 10.1109/jbhi.2025.3526806]
Abstract
In recent years, the deployment of supervised machine learning techniques for segmentation tasks has significantly increased. Nonetheless, the annotation process for extensive datasets remains costly, labor-intensive, and error-prone. While acquiring sufficiently large datasets to train deep learning models is feasible, these datasets often experience a distribution shift relative to the actual test data. This problem is particularly critical in the domain of medical imaging, where it adversely affects the efficacy of automatic segmentation models. In this work, we introduce DiffuSeg, a novel conditional diffusion model developed for medical image data, which exploits existing labels to synthesize new images in the target domain. This opens a number of new research directions, including the segmentation task that motivates this work. Our method only requires label maps from any existing datasets and unlabelled images from the target domain for image diffusion. To learn the target domain knowledge, a feature factorization variational autoencoder is proposed to provide conditional information for the diffusion model. Consequently, the segmentation network can be trained with the given labels and the synthetic images, thus avoiding human annotations. Initially, we apply our method to the MNIST dataset and subsequently adapt it for use with medical image segmentation datasets, such as retinal fundus images for vessel segmentation and MRI images for heart segmentation. Our approach exhibits significant improvements over relevant baselines in both image generation and segmentation accuracy, especially in scenarios where annotations for the target dataset are unavailable during training. An open-source implementation of our approach will be released after review.
9
Huang K, Zhou T, Fu H, Zhang Y, Zhou Y, Gong C, Liang D. Learnable Prompting SAM-Induced Knowledge Distillation for Semi-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2025;44:2295-2306. [PMID: 40030924] [DOI: 10.1109/tmi.2025.3530097]
Abstract
The limited availability of labeled data has driven advancements in semi-supervised learning for medical image segmentation. Modern large-scale models tailored for general segmentation, such as the Segment Anything Model (SAM), have revealed robust generalization capabilities. However, applying these models directly to medical image segmentation still exposes performance degradation. In this paper, we propose a learnable prompting SAM-induced Knowledge distillation framework (KnowSAM) for semi-supervised medical image segmentation. Firstly, we propose a Multi-view Co-training (MC) strategy that trains two distinct sub-networks in a co-teaching paradigm, resulting in more robust outcomes. Secondly, we present a Learnable Prompt Strategy (LPS) to dynamically produce dense prompts and integrate an adapter to fine-tune SAM specifically for medical image segmentation tasks. Moreover, we propose SAM-induced Knowledge Distillation (SKD) to transfer useful knowledge from SAM to the two sub-networks, enabling them to learn from SAM's predictions and alleviate the effects of incorrect pseudo-labels during training. Notably, the predictions generated by our subnets are used to produce mask prompts for SAM, facilitating effective inter-module information exchange. Extensive experimental results on various medical segmentation tasks demonstrate that our model outperforms state-of-the-art semi-supervised segmentation approaches. Crucially, our SAM distillation framework can be seamlessly integrated into other semi-supervised segmentation methods to enhance performance. The code will be released upon acceptance of this manuscript at https://github.com/taozh2017/KnowSAM.
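Distilling a teacher's soft predictions into a student is classically done with a temperature-scaled KL divergence. The sketch below shows generic Hinton-style distillation as a stand-in for the SAM-induced term; the paper's exact loss may differ:

```python
import numpy as np

def softmax(z: np.ndarray, axis: int = -1) -> np.ndarray:
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits: np.ndarray,
                      teacher_logits: np.ndarray,
                      temperature: float = 2.0) -> float:
    """Soft-label distillation: mean KL(teacher || student) at temperature T.

    Generic knowledge-distillation loss, used here as an illustrative
    stand-in for transferring SAM's predictions to a sub-network.
    """
    t = softmax(teacher_logits / temperature)
    s = softmax(student_logits / temperature)
    kl = np.sum(t * (np.log(t + 1e-12) - np.log(s + 1e-12)), axis=-1)
    return float(np.mean(kl))

rng = np.random.default_rng(3)
teacher = rng.normal(size=(64, 2))   # per-pixel fg/bg logits from a teacher
print(distillation_loss(teacher, teacher) < 1e-9)  # True: identical → zero KL
```

A higher temperature softens both distributions, exposing the teacher's relative confidence between classes rather than only its argmax, which is what makes the soft targets more informative than hard pseudo-labels.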
10
Li J, Villar-Calle P, Chiu C, Reza M, Narula N, Li C, Zhang J, Nguyen TD, Wang Y, Zhang RS, Kim J, Weinsaft JW, Spincemaille P. Spiral cardiac quantitative susceptibility mapping for differential cardiac chamber oxygenation-Initial validation in relation to invasive blood sampling. Magn Reson Med 2025;93:2029-2039. [PMID: 39641910] [PMCID: PMC11893258] [DOI: 10.1002/mrm.30393]
Abstract
PURPOSE To develop a breath-hold cardiac quantitative susceptibility mapping (QSM) sequence for noninvasive measurement of differential cardiac chamber blood oxygen saturation (ΔSO2). METHODS A non-gated three-dimensional stack-of-spirals QSM sequence was implemented to continuously sample the data throughout the cardiac cycle. Measurements of ΔSO2 between the right and left heart chamber obtained by the proposed sequence and a previously validated navigator Cartesian QSM sequence were compared in three cohorts consisting of healthy volunteers, coronavirus disease 2019 survivors, and patients with pulmonary hypertension. In the pulmonary-hypertension cohort, Bland-Altman plots were used to assess the agreement of ΔSO2 values obtained by QSM and those obtained by invasive right heart catheterization (RHC). RESULTS Compared with navigator QSM (average acquisition time 419 ± 158 s), spiral QSM reduced the scan time on average by over 20-fold to a 20-s breath-hold. In all three cohorts, spiral QSM and navigator QSM yielded similar ΔSO2. Among healthy volunteers and coronavirus disease 2019 survivors, ΔSO2 was 17.41 ± 4.35% versus 17.67 ± 4.09% for spiral and navigator QSM, respectively. In pulmonary-hypertension patients, spiral QSM showed a slightly smaller ΔSO2 bias and narrower 95% limits of agreement than that obtained by navigator QSM (1.09% ± 6.47% vs. 2.79% ± 6.99%) when compared with right heart catheterization. CONCLUSION Breath-hold three-dimensional spiral cardiac QSM for measuring differential cardiac chamber blood oxygenation is feasible and provides values in good agreement with navigator cardiac QSM and with reference right heart catheterization.
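Bland-Altman agreement, as used above to compare QSM-derived and catheter-derived ΔSO2, reduces to a bias and 95% limits of agreement over paired differences. The sketch below uses synthetic numbers, not the study's data:

```python
import numpy as np

def bland_altman(a: np.ndarray, b: np.ndarray):
    """Bias and 95% limits of agreement between two measurement methods.

    Standard Bland-Altman quantities: mean difference (bias) and
    bias ± 1.96 * SD of the differences.
    """
    diff = a - b
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))   # sample SD of paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

rng = np.random.default_rng(4)
rhc = rng.normal(20.0, 4.0, size=30)        # synthetic catheter ΔSO2 (%)
qsm = rhc + rng.normal(1.0, 2.0, size=30)   # synthetic QSM: ~1% bias, 2% noise
bias, lo, hi = bland_altman(qsm, rhc)
print(lo < bias < hi)  # True
```

A smaller bias and narrower (lo, hi) interval is precisely the sense in which the abstract reports spiral QSM agreeing slightly better with right heart catheterization than navigator QSM.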
Affiliation(s)
- Jiahao Li
- Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, United States
- Radiology, Weill Cornell Medicine, New York, NY, United States
- Caitlin Chiu
- Medicine, Weill Cornell Medicine, New York, NY, United States
- Mahniz Reza
- Medicine, Weill Cornell Medicine, New York, NY, United States
- Nupoor Narula
- Medicine, Weill Cornell Medicine, New York, NY, United States
- Chao Li
- Radiology, Weill Cornell Medicine, New York, NY, United States
- School of Applied and Engineering Physics, Cornell University, Ithaca, NY, United States
- Jinwei Zhang
- Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, United States
- Radiology, Weill Cornell Medicine, New York, NY, United States
- Thanh D. Nguyen
- Radiology, Weill Cornell Medicine, New York, NY, United States
- Yi Wang
- Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, United States
- Radiology, Weill Cornell Medicine, New York, NY, United States
- Robert S. Zhang
- Medicine, Weill Cornell Medicine, New York, NY, United States
- Jiwon Kim
- Medicine, Weill Cornell Medicine, New York, NY, United States
11
Chen J, Ye Z, Zhang R, Li H, Fang B, Zhang LB, Wang W. Medical image translation with deep learning: Advances, datasets and perspectives. Med Image Anal 2025;103:103605. [PMID: 40311301] [DOI: 10.1016/j.media.2025.103605]
Abstract
Traditional medical image generation often lacks patient-specific clinical information, limiting its clinical utility despite enhancing downstream task performance. In contrast, medical image translation precisely converts images from one modality to another, preserving both anatomical structures and cross-modal features, thus enabling efficient and accurate modality transfer and offering unique advantages for model development and clinical practice. This paper reviews the latest advancements in deep learning (DL)-based medical image translation. Initially, it elaborates on the diverse tasks and practical applications of medical image translation. Subsequently, it provides an overview of fundamental models, including convolutional neural networks (CNNs), transformers, and state space models (SSMs). Additionally, it delves into generative models such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Autoregressive Models (ARs), Diffusion Models, and Flow Models. Evaluation metrics for assessing translation quality are discussed, emphasizing their importance. Commonly used datasets in this field are also analyzed, highlighting their unique characteristics and applications. Looking ahead, the paper identifies future trends and challenges, and proposes research directions and solutions in medical image translation. It aims to serve as a valuable reference and inspiration for researchers, driving continued progress and innovation in this area.
Affiliation(s)
- Junxin Chen: School of Software, Dalian University of Technology, Dalian 116621, China
- Zhiheng Ye: School of Software, Dalian University of Technology, Dalian 116621, China
- Renlong Zhang: Institute of Research and Clinical Innovations, Neusoft Medical Systems Co., Ltd., Beijing, China
- Hao Li: School of Computing Science, University of Glasgow, Glasgow G12 8QQ, United Kingdom
- Bo Fang: School of Computer Science, The University of Sydney, Sydney, NSW 2006, Australia
- Li-Bo Zhang: Department of Radiology, General Hospital of Northern Theater Command, Shenyang 110840, China
- Wei Wang: Guangdong-Hong Kong-Macao Joint Laboratory for Emotion Intelligence and Pervasive Computing, Artificial Intelligence Research Institute, Shenzhen MSU-BIT University, Shenzhen 518172, China; School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China
12
Liu J, Shen N, Wang W, Li X, Wang W, Yuan Y, Tian Y, Luo G, Wang K. Lightweight cross-resolution coarse-to-fine network for efficient deformable medical image registration. Med Phys 2025. PMID: 40280883; DOI: 10.1002/mp.17827.
Abstract
BACKGROUND Accurate and efficient deformable medical image registration is crucial in medical image analysis. While recent deep learning-based registration methods have achieved state-of-the-art accuracy, they often suffer from extensive network parameters and slow inference times, leading to inefficiency. Efforts to reduce model size and input resolution can improve computational efficiency but frequently result in suboptimal accuracy. PURPOSE To address the trade-off between high accuracy and efficiency, we propose a Lightweight Cross-Resolution Coarse-to-Fine registration framework, termed LightCRCF. METHODS Our method is built on an ultra-lightweight U-Net architecture with only 0.1 million parameters, offering remarkable efficiency. To mitigate the accuracy degradation caused by having fewer parameters while preserving the lightweight nature of the networks, LightCRCF introduces three key innovations: (1) an efficient cross-resolution coarse-to-fine (C2F) registration strategy, integrated into the lightweight network, that progressively decomposes the deformation field into multiresolution subfields to capture fine-grained deformations; (2) a Texture-aware Reparameterization (TaRep) module that integrates Sobel and Laplacian operators to extract rich textural information; (3) a Group-flow Reparameterization (GfRep) module that captures diverse deformation modes by decomposing the deformation field into multiple groups. Furthermore, we introduce a structural reparameterization technique that enhances training accuracy through the multibranch structures of the TaRep and GfRep modules, while maintaining efficient inference by equivalently transforming these multibranch structures into single-branch standard convolutions. RESULTS We evaluate LightCRCF against various methods on three public MRI datasets (LPBA, OASIS, and ACDC) and one CT dataset (abdominal CT). Following previous data-division protocols, the LPBA dataset comprises 30 training image pairs and nine testing image pairs. For the OASIS dataset, the training, validation, and testing data consist of 1275, 110, and 660 image pairs, respectively. Similarly, for the ACDC dataset, the training, validation, and testing data include 180, 20, and 100 image pairs, respectively. For intersubject registration on the abdominal CT dataset, there are 380 training pairs, six validation pairs, and 42 testing pairs. Compared to state-of-the-art C2F methods, LightCRCF achieves comparable accuracy scores (DSC, HD95, and MSE) while demonstrating significantly superior performance across all efficiency metrics (Params, VRAM, FLOPs, and inference time). Relative to efficiency-first approaches, LightCRCF significantly outperforms these methods on accuracy metrics. CONCLUSIONS Our LightCRCF method offers a favorable trade-off between accuracy and efficiency, maintaining high accuracy while achieving superior efficiency, thereby highlighting its potential for clinical applications. The code will be available at https://github.com/PerceptionComputingLab/LightCRCF.
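The structural reparameterization trick described above rests on the linearity of convolution: parallel fixed-kernel branches (such as Sobel and Laplacian operators) plus a learnable branch can be summed into one equivalent kernel at inference time. A minimal sketch of that equivalence, assuming a plain 2D valid convolution (illustrative only, not the authors' implementation):

```python
# Sketch: merging multibranch convolutions (Sobel + Laplacian + learned)
# into a single equivalent kernel, as in structural reparameterization.
def conv2d(img, k):
    """Valid 2D convolution of a 2D list `img` with kernel `k`."""
    kh, kw = len(k), len(k[0])
    H, W = len(img), len(img[0])
    return [[sum(img[i + u][j + v] * k[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(W - kw + 1)] for i in range(H - kh + 1)]

sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # fixed texture operator
laplace = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]     # fixed texture operator
learned = [[0.1] * 3 for _ in range(3)]          # stand-in learnable kernel

img = [[float((i * 7 + j * 3) % 5) for j in range(6)] for i in range(6)]

# Training-time view: three parallel branches, outputs summed.
branch_sum = [[a + b + c for a, b, c in zip(r1, r2, r3)]
              for r1, r2, r3 in zip(conv2d(img, sobel_x),
                                    conv2d(img, laplace),
                                    conv2d(img, learned))]

# Inference-time view: one merged kernel, one convolution.
merged = [[sobel_x[u][v] + laplace[u][v] + learned[u][v] for v in range(3)]
          for u in range(3)]
single = conv2d(img, merged)
```

The two views produce identical outputs (up to float rounding), which is why the multibranch structure can be collapsed without losing the accuracy it contributed during training.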
Affiliation(s)
- Jun Liu: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Nuo Shen: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Wenyi Wang: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Xiangyu Li: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Wei Wang: School of Computer Science and Technology, Harbin Institute of Technology Shenzhen, Shenzhen, Guangdong, China
- Yongfeng Yuan: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Ye Tian: Department of Cardiology, The First Affiliated Hospital, Cardiovascular Institute, Harbin Medical University, Harbin, Heilongjiang, China
- Gongning Luo: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Kuanquan Wang: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
13
Zhang L, Yin X, Liu X, Liu Z. Medical image segmentation by combining feature enhancement Swin Transformer and UperNet. Sci Rep 2025; 15:14565. PMID: 40281077; PMCID: PMC12032031; DOI: 10.1038/s41598-025-97779-6.
Abstract
Medical image segmentation plays a crucial role in assisting clinical diagnosis, yet existing models often struggle with handling diverse and complex medical data, particularly when dealing with multi-scale organ and tissue structures. This paper proposes a novel medical image segmentation model, FE-SwinUper, designed to address these challenges by integrating the strengths of the Swin Transformer and UPerNet architectures. The objective is to enhance multi-scale feature extraction and improve the fusion of hierarchical organ and tissue representations through a feature enhancement Swin Transformer (FE-ST) backbone and an adaptive feature fusion (AFF) module. The FE-ST backbone utilizes self-attention mechanisms to efficiently extract rich spatial and contextual features across different scales, while the AFF module adapts to multi-scale feature fusion, mitigating the loss of contextual information. We evaluate the model on two publicly available medical image segmentation datasets: the Synapse multi-organ segmentation dataset and the ACDC cardiac segmentation dataset. Our results show that FE-SwinUper outperforms existing state-of-the-art models in terms of Dice coefficient, pixel accuracy, and Hausdorff distance. The model achieves a Dice score of 91.58% on the Synapse dataset and 90.15% on the ACDC dataset. These results demonstrate the robustness and efficiency of the proposed model, indicating its potential for real-world clinical applications.
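The Dice coefficient reported above (and throughout this listing) measures overlap between a predicted mask and the ground truth. A minimal sketch on flattened binary masks, assuming illustrative names and data (not the authors' evaluation code):

```python
# Sketch: Dice coefficient on flattened binary segmentation masks.
def dice_score(pred, gt):
    """2*|P ∩ G| / (|P| + |G|); defined as 1.0 when both masks are empty."""
    inter = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    total = sum(pred) + sum(gt)
    return 2.0 * inter / total if total else 1.0

pred = [1, 1, 0, 1, 0, 0]  # toy predicted mask
gt   = [1, 0, 0, 1, 1, 0]  # toy ground-truth mask
print(dice_score(pred, gt))  # 2*2/(3+3) = 0.666...
```

Multi-class scores such as the 91.58% above are typically the mean of per-class Dice values computed this way.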
Affiliation(s)
- Lin Zhang: College of Computer Science, Weifang University of Science and Technology, Weifang, 262700, China
- Xiaochun Yin: College of Computer Science, Weifang University of Science and Technology, Weifang, 262700, China
- Xuqi Liu: College of Computer Science, Weifang University of Science and Technology, Weifang, 262700, China
- Zengguang Liu: School of Information Engineering, Shandong Vocational College of Science and Technology, Weifang, 261053, China
14
Chen Y, Zhang X, Huo Y, Wang S. Deep Learning-Based Estimation of Myocardial Material Parameters from Cardiac MRI. Bioengineering (Basel) 2025; 12:433. PMID: 40281793; PMCID: PMC12024853; DOI: 10.3390/bioengineering12040433.
Abstract
BACKGROUND Accurate estimation of myocardial material parameters is crucial to understand cardiac biomechanics and plays a key role in advancing computational modeling and clinical applications. Traditional inverse finite element (FE) methods rely on iterative optimization to infer these parameters, which is computationally expensive and time-consuming, limiting their clinical applicability. METHODS This study proposes a deep learning-based approach to rapidly and accurately estimate the left ventricular myocardial material parameters directly from routine cardiac magnetic resonance imaging (CMRI) data. A ResNet18-based model was trained on FEM-derived parameters from a dataset of 1288 healthy subjects. RESULTS The proposed model demonstrated high predictive accuracy on healthy subjects, achieving mean absolute errors of 0.0146 for Ca and 0.0139 for Cb, with mean relative errors below 5.00%. Additionally, we evaluated the model on a small pathological subset (including ARV and HCM cases). The results revealed that while the model maintained strong performance on healthy data, the prediction errors in the pathological samples were higher, indicating increased challenges in modeling diseased myocardial tissue. CONCLUSION This study establishes a computationally efficient and accurate deep learning framework for estimating myocardial material parameters, eliminating the need for time-consuming iterative FE optimization. While the model shows promising performance on healthy subjects, further validation and refinement are required to address its limitations in pathological conditions, thereby paving the way for personalized cardiac modeling and improved clinical decision-making.
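The mean absolute error and mean relative error figures quoted above are standard regression metrics. A minimal sketch, with illustrative values standing in for the Ca/Cb predictions (not the paper's data):

```python
# Sketch: MAE and MRE as used to report parameter-regression accuracy.
def mae(pred, true):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def mre(pred, true):
    """Mean relative error (targets assumed nonzero)."""
    return sum(abs(p - t) / abs(t) for p, t in zip(pred, true)) / len(true)

pred = [1.02, 0.98, 1.10]  # toy predicted parameters
true = [1.00, 1.00, 1.00]  # toy ground-truth parameters
```

With unit-scale targets the two metrics coincide here; on real Ca/Cb values they differ by the targets' magnitudes.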
Affiliation(s)
- Yunhe Chen: Department of Aeronautics and Astronautics, Fudan University, Shanghai 200433, China; Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai 200032, China
- Xiwen Zhang: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai 200032, China
- Yongzhong Huo: Department of Aeronautics and Astronautics, Fudan University, Shanghai 200433, China
- Shuo Wang: Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai 200032, China
15
Zhang W, Yang T, Fan J, Wang H, Ji M, Zhang H, Miao J. U-shaped network combining dual-stream fusion mamba and redesigned multilayer perceptron for myocardial pathology segmentation. Med Phys 2025. PMID: 40247150; DOI: 10.1002/mp.17812.
Abstract
BACKGROUND Cardiac magnetic resonance imaging (CMR) provides critical pathological information, such as scars and edema, which are vital for diagnosing myocardial infarction (MI). However, due to the limited pathological information in single-sequence CMR images and the small size of pathological regions, automatic segmentation of myocardial pathology remains a significant challenge. PURPOSE In this paper, we propose a novel two-stage anatomical-pathological segmentation framework combining Kolmogorov-Arnold Networks (KAN) and Mamba, aiming to effectively segment myocardial pathology in multi-sequence CMR images. METHODS First, in the coarse segmentation stage, we employed a multiline parallel MambaUnet as the anatomical structure segmentation network to obtain shape prior information. This approach effectively addresses the class imbalance issue and aids in subsequent pathological segmentation. In the fine segmentation stage, we introduced a novel U-shaped segmentation network, KANMambaNet, which features a Dual-Stream Fusion Mamba module. This module enhances the network's ability to capture long-range dependencies while improving its capability to distinguish different pathological features in small regions. Additionally, we developed a Kolmogorov-Arnold Network-based multilayer perceptron (KAN MLP) module that utilizes learnable activation functions instead of fixed nonlinear functions. This design enhances the network's flexibility in handling various pathological features, enabling more accurate differentiation of the pathological characteristics at the boundary between edema and scar regions. Our method achieves competitive segmentation performance compared to state-of-the-art models, particularly in terms of the Dice coefficient. RESULTS We validated our model's performance on the MyoPS2020 dataset, achieving a Dice score of 0.8041 ± 0.0751 for myocardial edema and 0.9051 ± 0.0240 for myocardial scar.
Compared to the baseline model MambaUnet, our edema segmentation performance improved by 0.1420, and scar segmentation performance improved by 0.1081. CONCLUSIONS We developed an innovative two-stage anatomical-pathological segmentation framework that integrates KAN and Mamba, effectively segmenting myocardial pathology in multi-sequence CMR images. The experimental results demonstrate that our proposed method achieves superior segmentation performance compared to other state-of-the-art methods.
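The KAN MLP idea above replaces fixed nonlinearities with learnable ones. A minimal sketch of a learnable 1D activation, assuming linear interpolation on a fixed knot grid as a simplified stand-in (real KANs parameterize edge functions with B-splines; names here are illustrative):

```python
# Sketch: a KAN-style learnable activation as a piecewise-linear function
# on fixed knots; the knot values (`coeffs`) would be trained, unlike a
# fixed nonlinearity such as ReLU.
def pw_activation(x, grid, coeffs):
    """Evaluate the piecewise-linear function defined by (grid, coeffs) at x."""
    if x <= grid[0]:
        return coeffs[0]
    if x >= grid[-1]:
        return coeffs[-1]
    for i in range(len(grid) - 1):
        if x <= grid[i + 1]:
            t = (x - grid[i]) / (grid[i + 1] - grid[i])
            return (1 - t) * coeffs[i] + t * coeffs[i + 1]

grid   = [-2.0, -1.0, 0.0, 1.0, 2.0]  # fixed knot positions
coeffs = [0.0, 0.2, 1.0, 0.3, 0.0]    # learnable values at the knots
```

Because the coefficients are parameters, gradient descent can shape a different nonlinearity per connection, which is the flexibility the abstract credits for sharper edema/scar boundaries.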
Affiliation(s)
- Wenjie Zhang: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Tiejun Yang: School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, China; Key Laboratory of Grain Information Processing and Control (HAUT), Ministry of Education, Zhengzhou, China; Henan Key Laboratory of Grain Photoelectric Detection and Control (HAUT), Zhengzhou, Henan, China
- Jiacheng Fan: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Heng Wang: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Mingzhu Ji: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Huiyao Zhang: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Jianyu Miao: School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, China
16
Fakhfakh M, Sarry L, Clarysse P. HALSR-Net: Improving CNN Segmentation of Cardiac Left Ventricle MRI with Hybrid Attention and Latent Space Reconstruction. Comput Med Imaging Graph 2025; 123:102546. PMID: 40245744; DOI: 10.1016/j.compmedimag.2025.102546.
Abstract
Accurate cardiac MRI segmentation is vital for detailed cardiac analysis, yet the manual process is labor-intensive and prone to variability. Despite advancements in MRI technology, there remains a significant need for automated methods that can reliably and efficiently segment cardiac structures. This paper introduces HALSR-Net, a novel multi-level segmentation architecture designed to improve the accuracy and reproducibility of cardiac segmentation from Cine-MRI acquisitions, focusing on the left ventricle (LV). The methodology consists of two main phases: first, the extraction of the region of interest (ROI) using a regression model that accurately predicts the location of a bounding box around the LV; second, the semantic segmentation step based on the HALSR-Net architecture. This architecture incorporates a Hybrid Attention Pooling Module (HAPM) that merges attention and pooling mechanisms to enhance feature extraction and capture contextual information. Additionally, a reconstruction module leverages latent space features to further improve segmentation accuracy. Experiments conducted on an in-house clinical dataset and two public datasets (ACDC and LVQuan19) demonstrate that HALSR-Net outperforms state-of-the-art architectures, achieving up to 98% accuracy and F1-score for the segmentation of the LV cavity and myocardium. The proposed approach effectively addresses the limitations of existing methods, offering a more accurate and robust solution for cardiac MRI segmentation, and is therefore likely to improve cardiac function analysis and patient care.
Affiliation(s)
- Mohamed Fakhfakh: Université Clermont Auvergne, CHU Clermont-Ferrand, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000, Clermont-Ferrand, France
- Laurent Sarry: Université Clermont Auvergne, CHU Clermont-Ferrand, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000, Clermont-Ferrand, France
- Patrick Clarysse: INSA-Lyon, Université Claude Bernard Lyon 1, CNRS, Inserm, CREATIS UMR 5220, U1294, F-69621, Lyon, France
17
Zhu Y, Li H, Cao B, Huang K, Liu J. A novel hybrid layer-based encoder-decoder framework for 3D segmentation in congenital heart disease. Sci Rep 2025; 15:11891. PMID: 40195399; PMCID: PMC11977193; DOI: 10.1038/s41598-025-96251-9.
Abstract
The segmentation of cardiac anatomy represents a crucial stage in accurate diagnosis and subsequent treatment planning for patients with congenital heart disease (CHD). However, current deep learning-based segmentation networks are ineffective when applied to 3D medical images of CHD because of the limited availability of training datasets and the inherent variability of cardiac and great-vessel tissues. To address this challenge, we propose a novel hybrid layer-based encoder-decoder framework for 3D CHD image segmentation. The model incorporates a global volume mixing module and a local volume-based multihead attention module, which uses a self-attention mechanism to explicitly capture the local and global dependencies of the 3D image segmentation process. This enables the model to more effectively learn the shape boundary features of organs, thereby facilitating accurate segmentation of the whole heart (WH) and great vessels. We compare our method with several popular networks on the public ImageCHD and HVSMR-2.0 datasets. The experimental results show that the proposed model achieves excellent performance in WH and great vessel segmentation tasks with high Dice coefficients and IoU indices.
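The self-attention mechanism underlying the modules above mixes information across all positions at once. A minimal sketch of scaled dot-product attention, assuming a toy single-head 2D case (the paper's volume-based modules operate on 3D features with multiple heads; names here are illustrative):

```python
import math

# Sketch: scaled dot-product attention, the core operation behind the
# local/global dependency modeling described in the abstract.
def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]

def attention(Q, K, V):
    """For each query, return the attention-weighted average of the values."""
    d = len(Q[0])
    out = []
    for q in Q:
        scores = softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                          for k in K])
        out.append([sum(w * v[j] for w, v in zip(scores, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0], [0.0, 1.0]]  # toy queries
K = [[1.0, 0.0], [0.0, 1.0]]  # toy keys
V = [[1.0, 2.0], [3.0, 4.0]]  # toy values
out = attention(Q, K, V)      # each output row mixes every row of V
```

Because every output position depends on every input position, attention captures the non-local anatomy relationships that fixed-size convolutions miss.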
Affiliation(s)
- Yaoxi Zhu: Department of Cardiovascular Surgery, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071, China; Hubei Provincial Engineering Research Center of Minimally Invasive Cardiovascular Surgery, Wuhan, 430071, China; Wuhan Clinical Research Center for Minimally Invasive Treatment of Structural Heart Disease, Wuhan, 430071, China
- Hongbo Li: Department of Clinical Medicine, HuanKui Academy, Nanchang University, Nanchang, 330031, China
- Bingxin Cao: Department of Cardiovascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Kun Huang: Department of Cardiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Jinping Liu: Department of Cardiovascular Surgery, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071, China; Hubei Provincial Engineering Research Center of Minimally Invasive Cardiovascular Surgery, Wuhan, 430071, China; Wuhan Clinical Research Center for Minimally Invasive Treatment of Structural Heart Disease, Wuhan, 430071, China
18
Murugesan B, Adiga Vasudeva S, Liu B, Lombaert H, Ben Ayed I, Dolz J. Neighbor-aware calibration of segmentation networks with penalty-based constraints. Med Image Anal 2025; 101:103501. PMID: 39978014; DOI: 10.1016/j.media.2025.103501.
Abstract
Ensuring reliable confidence scores from deep neural networks is of paramount significance in critical decision-making systems, particularly in real-world domains such as healthcare. Recent literature on calibrating deep segmentation networks has resulted in substantial progress. Nevertheless, these approaches are strongly inspired by advancements in classification tasks, and thus their uncertainty is usually modeled by leveraging the information of individual pixels, disregarding the local structure of the object of interest. Indeed, only the recent Spatially Varying Label Smoothing (SVLS) approach considers pixel spatial relationships across classes, by softening the pixel label assignments with a discrete spatial Gaussian kernel. In this work, we first present a constrained optimization perspective of SVLS and demonstrate that it enforces an implicit constraint on the soft class proportions of surrounding pixels. Furthermore, our analysis shows that SVLS lacks a mechanism to balance the contribution of the constraint with the primary objective, potentially hindering the optimization process. Based on these observations, we propose NACL (Neighbor Aware CaLibration), a principled and simple solution based on equality constraints on the logit values, which enables explicit control of both the enforced constraint and the weight of the penalty, offering more flexibility. Comprehensive experiments on a wide variety of well-known segmentation benchmarks demonstrate the superior calibration performance of the proposed approach, without affecting its discriminative power. Furthermore, ablation studies empirically show the model-agnostic nature of our approach, which can be used to train a wide span of deep segmentation networks. The code is available at https://github.com/Bala93/MarginLoss.
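The SVLS mechanism the abstract analyzes softens a pixel's one-hot label with a discrete Gaussian over its neighborhood, so the soft label equals a weighted class proportion of the surrounding pixels. A minimal sketch for one center pixel, assuming a 3x3 patch and sigma = 1 (illustrative, not the authors' code):

```python
import math

# Sketch of Spatially Varying Label Smoothing (SVLS): the center pixel's
# soft label is a Gaussian-weighted class histogram of its 3x3 neighborhood.
w = [[math.exp(-(u * u + v * v) / 2.0) for v in (-1, 0, 1)] for u in (-1, 0, 1)]
s = sum(map(sum, w))
w = [[x / s for x in row] for row in w]   # normalized Gaussian kernel

patch = [[0, 0, 1],
         [0, 1, 1],
         [1, 1, 1]]                       # class ids around the center pixel
num_classes = 2

soft = [0.0] * num_classes
for u in range(3):
    for v in range(3):
        soft[patch[u][v]] += w[u][v]      # soft label distribution for the center
```

The resulting `soft` vector is exactly the "soft class proportion of surrounding pixels" that the paper shows SVLS implicitly constrains, and that NACL instead enforces explicitly with a controllable penalty.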
19
Waida H, Yamazaki K, Tokuhisa A, Wada M, Wada Y. Investigating self-supervised image denoising with denaturation. Neural Netw 2025; 184:106966. PMID: 39700824; DOI: 10.1016/j.neunet.2024.106966.
Abstract
Self-supervised learning for image denoising in the presence of denatured noisy data is a crucial approach in machine learning. However, theoretical understanding of the performance of methods that use denatured data is lacking. To provide a better understanding of the approach, in this paper we analyze a self-supervised denoising algorithm that uses denatured data, through both theoretical analysis and numerical experiments. Through the theoretical analysis, we show that the algorithm finds desired solutions to the optimization problem with the population risk, while the guarantee for the empirical risk depends on the hardness of the denoising task in terms of denaturation levels. We also conduct several experiments to investigate the performance of an extended algorithm in practice. The results indicate that training with denatured images works, and the empirical performance aligns with the theoretical results. These results suggest directions for further improving self-supervised image denoising that uses denatured data.
Affiliation(s)
- Hiroki Waida: Department of Mathematical and Computing Science, Institute of Science Tokyo, 2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
- Kimihiro Yamazaki: Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa, 211-8588, Japan
- Atsushi Tokuhisa: RIKEN Center for Computational Science, 7-1-26 Minatojima-minami-machi, Chuo-ku, Kobe, Hyogo, 650-0047, Japan
- Mutsuyo Wada: Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa, 211-8588, Japan
- Yuichiro Wada: Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa, 211-8588, Japan; RIKEN Center for Advanced Intelligence Project, Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo, 103-0027, Japan
20
Li G, Xie J, Zhang L, Cheng G, Zhang K, Bai M. Dynamic graph consistency and self-contrast learning for semi-supervised medical image segmentation. Neural Netw 2025; 184:107063. PMID: 39700823; DOI: 10.1016/j.neunet.2024.107063.
Abstract
Semi-supervised medical image segmentation endeavors to exploit a limited set of labeled data in conjunction with a substantial corpus of unlabeled data, with the aim of training models that can match or even exceed the efficacy of fully supervised segmentation models. Despite the potential of this approach, most existing semi-supervised medical image segmentation techniques that employ consistency regularization predominantly focus on spatial consistency at the image level, often neglecting the crucial role of feature-level channel information. To address this limitation, we propose an innovative method that integrates graph convolutional networks with a consistency regularization framework to develop a dynamic graph consistency approach. This method imposes channel-level constraints across different decoders by leveraging high-level features within the network. Furthermore, we introduce a novel self-contrast learning strategy, which performs image-level comparison within the same batch and engages in pixel-level contrast learning based on pixel positions. This approach effectively overcomes traditional contrast learning challenges related to identifying positive and negative samples, reduces computational resource consumption, and significantly improves model performance. Our experimental evaluation on three distinct medical image segmentation datasets indicates that the proposed method demonstrates superior performance across a variety of test scenarios.
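The pixel-level contrastive learning described above pulls an anchor embedding toward a positive (e.g., the same pixel position in another view) and pushes it from negatives. A minimal sketch of an InfoNCE-style loss on toy vectors, assuming cosine similarity and a temperature of 0.1 (illustrative names, not the authors' implementation):

```python
import math

# Sketch: InfoNCE-style contrastive loss of the kind pixel-level
# self-contrast learning builds on.
def _norm(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def info_nce(anchor, positive, negatives, tau=0.1):
    """-log( exp(sim(a,p)/tau) / sum over {p} ∪ negatives of exp(sim/tau) )."""
    a, p = _norm(anchor), _norm(positive)
    sims = [sum(x * y for x, y in zip(a, p))]
    sims += [sum(x * y for x, y in zip(a, _norm(n))) for n in negatives]
    exps = [math.exp(s / tau) for s in sims]
    return -math.log(exps[0] / sum(exps))

close = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0]])  # positive is similar
far   = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1]])  # positive is dissimilar
```

The loss is near zero when the anchor already matches its positive and large otherwise, which is what drives embeddings at corresponding pixel positions together.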
Affiliation(s)
- Gang Li: College of Software, Taiyuan University of Technology, Taiyuan, China
- Jinjie Xie: College of Software, Taiyuan University of Technology, Taiyuan, China
- Ling Zhang: College of Software, Taiyuan University of Technology, Taiyuan, China
- Guijuan Cheng: College of Software, Taiyuan University of Technology, Taiyuan, China
- Kairu Zhang: College of Software, Taiyuan University of Technology, Taiyuan, China
- Mingqi Bai: College of Software, Taiyuan University of Technology, Taiyuan, China
21
Nield LE, Manlhiot C, Magor K, Freud L, Chinni B, Ims A, Melamed N, Nevo O, Van Mieghem T, Weisz D, Ronzoni S. Machine Learning to Predict Outcomes of Fetal Cardiac Disease: A Pilot Study. Pediatr Cardiol 2025; 46:895-901. PMID: 38724761; DOI: 10.1007/s00246-024-03512-x.
Abstract
Prediction of outcomes following a prenatal diagnosis of congenital heart disease (CHD) is challenging. Machine learning (ML) algorithms may be used to reduce clinical uncertainty and improve prognostic accuracy. We performed a pilot study to train ML algorithms to predict postnatal outcomes based on clinical data. Specific objectives were to predict (1) in utero or neonatal death, (2) high-acuity neonatal care and (3) favorable outcomes. We included all fetuses with cardiac disease at Sunnybrook Health Sciences Centre, Toronto, Canada, from 2012 to 2021. Prediction models were created using the XGBoost algorithm (tree-based) with fivefold cross-validation. Among 211 cases of fetal cardiac disease, 61 were excluded (39 terminations, 21 lost to follow-up, 1 isolated arrhythmia), leaving a cohort of 150 fetuses. Fifteen (10%) demised (10 neonates) and 65 (48%) of live births required high-acuity neonatal care. Of those with clinical follow-up, 60/87 (69%) had a favorable outcome. Prediction models for fetal or neonatal death, high-acuity neonatal care and favorable outcome had AUCs of 0.76, 0.84 and 0.73, respectively. The most important predictors of death were the presence of non-cardiac abnormalities combined with more severe CHD. High acuity of postnatal care was predicted by anti-Ro antibody status and more severe CHD. Favorable outcome was best predicted by the absence of right heart disease combined with genetic abnormalities and maternal medications. Prediction models using ML provide good discrimination of key prenatal and postnatal outcomes among fetuses with congenital heart disease.
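Fivefold cross-validation, as used to fit the models above, partitions the cohort into five folds and rotates each as a validation set. A minimal sketch of the index splitting, assuming a cohort of 150 as in the study (pure-Python split only; the gradient-boosted trees themselves would come from the xgboost library, not shown here):

```python
import random

# Sketch: k-fold cross-validation index splitting.
def kfold_indices(n, k=5, seed=0):
    """Yield (train_indices, val_indices) for each of k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # shuffle once, reproducibly
    folds = [idx[i::k] for i in range(k)]     # k disjoint folds
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val

splits = list(kfold_indices(150, k=5))  # 150 subjects, as in the cohort
```

Each subject appears in exactly one validation fold, so the reported AUCs average over predictions made on held-out cases.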
Affiliation(s)
- L E Nield: Sunnybrook Health Sciences Centre, 2075 Bayview Avenue, Toronto, ON, M4N 3M5, Canada
- C Manlhiot: Department of Pediatrics, Blalock-Taussig-Thomas Congenital Heart Center, Johns Hopkins University, Baltimore, MD, USA
- K Magor: University of Toronto, Toronto, Canada
- L Freud: The Hospital for Sick Children, Toronto, Canada
- B Chinni: Department of Pediatrics, Blalock-Taussig-Thomas Congenital Heart Center, Johns Hopkins University, Baltimore, MD, USA
- A Ims: Department of Pediatrics, Blalock-Taussig-Thomas Congenital Heart Center, Johns Hopkins University, Baltimore, MD, USA
- N Melamed: Sunnybrook Health Sciences Centre, 2075 Bayview Avenue, Toronto, ON, M4N 3M5, Canada
- O Nevo: Sunnybrook Health Sciences Centre, 2075 Bayview Avenue, Toronto, ON, M4N 3M5, Canada
- T Van Mieghem: Department of Obstetrics and Gynaecology, Mount Sinai Hospital Toronto, University of Toronto, Toronto, Canada
- D Weisz: Sunnybrook Health Sciences Centre, 2075 Bayview Avenue, Toronto, ON, M4N 3M5, Canada
- S Ronzoni: Sunnybrook Health Sciences Centre, 2075 Bayview Avenue, Toronto, ON, M4N 3M5, Canada
22
Qiu Y, Meng J, Li B. Semi-supervised Strong-Teacher Consistency Learning for few-shot cardiac MRI image segmentation. Comput Methods Programs Biomed 2025; 261:108613. PMID: 39893807; DOI: 10.1016/j.cmpb.2025.108613.
Abstract
BACKGROUND AND OBJECTIVE Cardiovascular disease is a leading cause of mortality worldwide. Automated analysis of heart structures in MRI is crucial for effective diagnostics. While supervised learning has advanced the field of medical image segmentation, it requires extensive labelled data, which is often limited for cardiac MRI. METHODS Drawing on the principle of consistency learning, we introduce a novel semi-supervised Strong-Teacher Consistency Network for few-shot multi-class cardiac MRI image segmentation that leverages widely available unlabelled data. The model incorporates a student-teacher architecture. A multi-teacher structure is introduced to learn diverse perspectives and avoid local optima when dealing with highly variable cardiac structures and anatomical features. It employs a hybrid loss that emphasizes consistency between student and teacher representations, alongside supervised losses (e.g., Dice and cross-entropy), tailored to the challenge of unlabelled data. Additionally, we introduce feature-space virtual adversarial training to enhance robust feature learning and model stability. RESULTS Evaluation and ablation studies on the MM-WHS and ACDC benchmark datasets show that the proposed model outperforms nine state-of-the-art semi-supervised methods, particularly with limited annotated data. It achieves 90.14% accuracy on MM-WHS and 78.45% accuracy on ACDC at labelling rates of 25% and 1%, respectively. The results also highlight its unique advantages over fully supervised and single-teacher approaches.
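As a rough, framework-free illustration of the hybrid loss described above (not the authors' implementation), the sketch below combines a supervised Dice term with student-teacher consistency terms and the exponential-moving-average teacher update common to mean-teacher designs; all function names here are invented for this example.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for one class: 1 - 2|P∩T| / (|P| + |T|)."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def consistency_loss(student, teacher):
    """Mean squared difference between student and teacher soft predictions."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)

def hybrid_loss(student, teachers, target, lam=0.5):
    """Supervised Dice on labelled data plus the average consistency
    to each teacher (the multi-teacher term) weighted by lam."""
    cons = sum(consistency_loss(student, t) for t in teachers) / len(teachers)
    return dice_loss(student, target) + lam * cons

def ema_update(teacher_w, student_w, alpha=0.99):
    """Exponential moving average: teacher weights trail the student's,
    giving a smoothed, more stable target for the consistency term."""
    return [alpha * tw + (1 - alpha) * sw for tw, sw in zip(teacher_w, student_w)]
```

With multiple teachers, each would receive its own EMA update (possibly with different seeds or perturbations), which is one plausible reading of how diverse perspectives are maintained.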
Affiliation(s)
- Yuting Qiu
- Department of Computer Science, Loughborough University, LE11 3TU, Leicestershire, UK.
- James Meng
- Norwich Medical School, University of East Anglia, NR4 7TJ, Norfolk, UK.
- Baihua Li
- Department of Computer Science, Loughborough University, LE11 3TU, Leicestershire, UK.
23
Yang Y, Sun G, Zhang T, Wang R, Su J. Semi-supervised medical image segmentation via weak-to-strong perturbation consistency and edge-aware contrastive representation. Med Image Anal 2025; 101:103450. [PMID: 39798528 DOI: 10.1016/j.media.2024.103450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2024] [Revised: 12/02/2024] [Accepted: 12/26/2024] [Indexed: 01/15/2025]
Abstract
Although supervised learning has demonstrated impressive accuracy in medical image segmentation, its reliance on large labeled datasets poses a challenge due to the effort and expertise required for data acquisition. Semi-supervised learning has emerged as a potential solution. However, it tends to yield satisfactory segmentation performance in the central region of the foreground but struggles in edge regions. In this paper, we propose an innovative framework that effectively leverages unlabeled data to improve segmentation performance, especially in edge regions. Our framework includes two novel designs. First, we introduce a weak-to-strong perturbation strategy with a corresponding feature-perturbed consistency loss to efficiently utilize unlabeled data and guide the framework toward learning reliable regions. Second, we propose an edge-aware contrastive loss that uses uncertainty to select positive pairs, thereby learning discriminative pixel-level features in edge regions from unlabeled data. In this way, the model minimizes the discrepancy among multiple predictions and improves its representation ability, ultimately yielding strong performance in both central and edge regions. We conducted a comparative analysis of segmentation results on the publicly available BraTS2020, LA, and 2017 ACDC datasets. Through extensive quantitative and visualization experiments under three standard semi-supervised settings, we demonstrate the effectiveness of our approach and set a new state of the art for semi-supervised medical image segmentation. Our code is released publicly at https://github.com/youngyzzZ/SSL-w2sPC.
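The weak-to-strong idea can be sketched in a few lines: confident predictions on a weakly perturbed view become pseudo-labels for the strongly perturbed view, and per-pixel uncertainty (entropy) gates which pixels are eligible as contrastive positives. This is a simplified, dependency-free illustration under assumed details, not the released code; `weak_to_strong_loss` and `select_edge_positives` are hypothetical names.

```python
import math

def entropy(p):
    """Binary prediction entropy; low entropy means a confident pixel."""
    p = min(max(p, 1e-7), 1 - 1e-7)
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def weak_to_strong_loss(weak_probs, strong_probs, tau=0.9):
    """Cross-entropy of strong-view predictions against confident
    pseudo-labels from the weak view; unconfident pixels are skipped."""
    losses = []
    for w, s in zip(weak_probs, strong_probs):
        conf = max(w, 1 - w)
        if conf < tau:          # uncertain pixel: provide no supervision
            continue
        label = 1 if w >= 0.5 else 0
        s = min(max(s, 1e-7), 1 - 1e-7)
        losses.append(-math.log(s) if label == 1 else -math.log(1 - s))
    return sum(losses) / len(losses) if losses else 0.0

def select_edge_positives(probs, thresh=0.3):
    """Indices of low-uncertainty pixels eligible as positive pairs
    for an edge-aware contrastive term."""
    return [i for i, p in enumerate(probs) if entropy(p) < thresh]
```

The thresholds `tau` and `thresh` stand in for whatever confidence criteria the actual method uses.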
Affiliation(s)
- Yang Yang
- School of Computer Science and Technology, Harbin Institute of Technology at Shenzhen, Shenzhen, 518055, China
- Guoying Sun
- School of Computer Science and Technology, Harbin Institute of Technology at Shenzhen, Shenzhen, 518055, China
- Tong Zhang
- Department of Network Intelligence, Peng Cheng Laboratory, Shenzhen, 518055, China
- Ruixuan Wang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510006, China; Department of Network Intelligence, Peng Cheng Laboratory, Shenzhen, 518055, China.
- Jingyong Su
- School of Computer Science and Technology, Harbin Institute of Technology at Shenzhen, Shenzhen, 518055, China; National Key Laboratory of Smart Farm Technologies and Systems, Harbin, 150001, China.
24
Yang D, Gao W. PointCHD: A Point Cloud Benchmark for Congenital Heart Disease Classification and Segmentation. IEEE J Biomed Health Inform 2025; 29:2683-2694. [PMID: 39514354 DOI: 10.1109/jbhi.2024.3495035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2024]
Abstract
Congenital heart disease (CHD) is one of the most common birth defects. Due to the scarcity of data and the difficulty of labeling, CHD datasets are rare. Previous studies focused on CT and other medical image modalities, while point clouds remain unexplored. Point clouds can intuitively model organ shapes, which offers clear advantages for medical analysis and diagnosis assistance. However, producing a medical point cloud dataset is more complex than producing an image dataset, as 3D models of internal organs must be reconstructed after scanning with high-precision instruments. We propose PointCHD, the first point cloud dataset for CHD diagnosis, comprising a large volume of precisely annotated data covering a wide range of categories. PointCHD includes different types of three-dimensional data with varying degrees of distortion and supports multiple analysis tasks, i.e., classification, segmentation, reconstruction, etc. We also construct a benchmark on PointCHD aimed at medical diagnosis: we design the analysis pipeline and compare the performance of mainstream point cloud analysis methods. In view of the complex internal and external structure of the heart point cloud, we propose a point cloud representation method based on manifold learning. By introducing normals to account for surface continuity when constructing an adaptive projection plane, the method fully extracts the structural features of the heart and achieves the best performance on every task of the PointCHD benchmark. Finally, we summarize the open problems in CHD point cloud analysis and discuss potential future research directions.
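The adaptive-projection-plane idea builds on per-point surface normals. A standard way to estimate a normal from a local point neighborhood (the classic PCA approach, shown here as an assumed building block rather than the paper's exact method) is to take the eigenvector of the neighborhood covariance with the smallest eigenvalue, then project points onto the resulting plane.

```python
import numpy as np

def estimate_normal(neighbors):
    """Estimate a surface normal as the eigenvector of the local
    covariance matrix with the smallest eigenvalue (PCA normal)."""
    pts = np.asarray(neighbors, dtype=float)
    centered = pts - pts.mean(axis=0)
    cov = centered.T @ centered / len(pts)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, 0]                     # direction of least variance

def project_to_plane(points, normal):
    """Project points onto the plane through the origin with the given
    normal: subtract each point's component along the normal."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    pts = np.asarray(points, dtype=float)
    return pts - np.outer(pts @ n, n)
```

For points sampled from a flat patch, `estimate_normal` recovers the patch's perpendicular up to sign, which is why downstream uses typically only need the unsigned direction.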
25
Lyu J, Qin C, Wang S, Wang F, Li Y, Wang Z, Guo K, Ouyang C, Tänzer M, Liu M, Sun L, Sun M, Li Q, Shi Z, Hua S, Li H, Chen Z, Zhang Z, Xin B, Metaxas DN, Yiasemis G, Teuwen J, Zhang L, Chen W, Zhao Y, Tao Q, Pang Y, Liu X, Razumov A, Dylov DV, Dou Q, Yan K, Xue Y, Du Y, Dietlmeier J, Garcia-Cabrera C, Al-Haj Hemidi Z, Vogt N, Xu Z, Zhang Y, Chu YH, Chen W, Bai W, Zhuang X, Qin J, Wu L, Yang G, Qu X, Wang H, Wang C. The state-of-the-art in cardiac MRI reconstruction: Results of the CMRxRecon challenge in MICCAI 2023. Med Image Anal 2025; 101:103485. [PMID: 39946779 DOI: 10.1016/j.media.2025.103485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2024] [Revised: 09/09/2024] [Accepted: 01/27/2025] [Indexed: 03/05/2025]
Abstract
Cardiac magnetic resonance imaging (MRI) provides detailed and quantitative evaluation of the heart's structure, function, and tissue characteristics with high-resolution spatial-temporal imaging. However, its slow imaging speed and motion artifacts are notable limitations. Undersampling reconstruction, especially with data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly undersampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation platforms hinders the development of data-driven reconstruction algorithms. To address this issue, we organized the Cardiac MRI Reconstruction Challenge (CMRxRecon) in 2023, in collaboration with the 26th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI). CMRxRecon presented an extensive k-space dataset comprising cine and mapping raw data, accompanied by detailed annotations of cardiac anatomical structures. With overwhelming participation, the challenge attracted more than 285 teams and over 600 participants. Among them, 22 teams successfully submitted Docker containers for the testing phase, with 7 teams submitting to both the cine and mapping tasks. All teams used deep learning-based approaches, indicating that deep learning has become the predominant solution for this problem. The first-place winner of both tasks used the E2E-VarNet architecture as its backbone. More broadly, U-Net remained the most popular backbone for both multi-coil and single-coil reconstruction. This paper provides a comprehensive overview of the challenge design, presents a summary of the submitted results, reviews the employed methods, and offers an in-depth discussion that aims to inspire future advancements in cardiac MRI reconstruction models. The summary emphasizes the effective strategies observed in cardiac MRI reconstruction, including backbone architecture, loss function, pre-processing techniques, physical modeling, and model complexity, thereby providing valuable insights for further developments in this field.
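The reconstruction problem the challenge targets can be demonstrated in a few lines: keep only a subset of k-space samples and apply a zero-filled inverse FFT, the naive baseline that learned methods such as E2E-VarNet improve upon. This is an illustrative single-coil baseline under assumed conventions, not a challenge submission.

```python
import numpy as np

def undersample_and_reconstruct(image, mask):
    """Simulate Cartesian undersampling: forward FFT to k-space, keep only
    the samples selected by `mask` (1 = acquired, 0 = skipped), then
    zero-filled inverse FFT as the simplest possible reconstruction."""
    kspace = np.fft.fftshift(np.fft.fft2(image))
    kspace_u = kspace * mask            # unacquired samples become zero
    recon = np.fft.ifft2(np.fft.ifftshift(kspace_u))
    return np.abs(recon)
```

With a full mask the image is recovered exactly; dropping k-space lines introduces the aliasing artifacts that data-driven reconstruction networks are trained to remove.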
Affiliation(s)
- Jun Lyu
- School of Computer and Control Engineering, Yantai University, Yantai, China
- Chen Qin
- Department of Electrical and Electronic Engineering & I-X, Imperial College London, United Kingdom
- Shuo Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Fanwen Wang
- Department of Bioengineering & Imperial-X, Imperial College London, London W12 7SL, UK; Cardiovascular Magnetic Resonance Unit, Royal Brompton Hospital, Guy's and St Thomas' NHS Foundation Trust, London SW3 6NP, UK
- Yan Li
- Department of Radiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Zi Wang
- Department of Bioengineering & Imperial-X, Imperial College London, London W12 7SL, UK; Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Institute of Artificial Intelligence, Xiamen University, Xiamen 361102, China
- Kunyuan Guo
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Institute of Artificial Intelligence, Xiamen University, Xiamen 361102, China
- Cheng Ouyang
- Department of Computing, Imperial College London, London SW7 2AZ, UK; Department of Brain Sciences, Imperial College London, London SW7 2AZ, UK
- Michael Tänzer
- Cardiovascular Magnetic Resonance Unit, Royal Brompton Hospital, Guy's and St Thomas' NHS Foundation Trust, London SW3 6NP, UK; Department of Computing, Imperial College London, London SW7 2AZ, UK
- Meng Liu
- Shanghai Pudong Hospital and Human Phenome Institute, Fudan University, Shanghai, China; International Human Phenome Institute (Shanghai), Shanghai, China
- Longyu Sun
- Shanghai Pudong Hospital and Human Phenome Institute, Fudan University, Shanghai, China; International Human Phenome Institute (Shanghai), Shanghai, China
- Mengting Sun
- Shanghai Pudong Hospital and Human Phenome Institute, Fudan University, Shanghai, China; International Human Phenome Institute (Shanghai), Shanghai, China
- Qing Li
- Shanghai Pudong Hospital and Human Phenome Institute, Fudan University, Shanghai, China; International Human Phenome Institute (Shanghai), Shanghai, China
- Zhang Shi
- Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China
- Sha Hua
- Department of Cardiovascular Medicine, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Hao Li
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Zhensen Chen
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Zhenlin Zhang
- Department of Electrical and Electronic Engineering & I-X, Imperial College London, United Kingdom
- Bingyu Xin
- Department of Computer Science, Rutgers University, New Brunswick, NJ 08901, USA
- Dimitris N Metaxas
- Department of Computer Science, Rutgers University, New Brunswick, NJ 08901, USA
- George Yiasemis
- AI for Oncology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, Netherlands
- Jonas Teuwen
- AI for Oncology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, Netherlands
- Liping Zhang
- CUHK Lab of AI in Radiology (CLAIR), Department of Imaging and Interventional Radiology, The Chinese University of Hong Kong, China
- Weitian Chen
- CUHK Lab of AI in Radiology (CLAIR), Department of Imaging and Interventional Radiology, The Chinese University of Hong Kong, China
- Yidong Zhao
- Department of Imaging Physics, Delft University of Technology, Lorentzweg 1, 2628CN, Delft, Netherlands
- Qian Tao
- Department of Imaging Physics, Delft University of Technology, Lorentzweg 1, 2628CN, Delft, Netherlands
- Yanwei Pang
- TJK-BIIT Lab, School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
- Xiaohan Liu
- Institute of Applied Physics and Computational Mathematics, Beijing, 100094, China
- Artem Razumov
- Skolkovo Institute Of Science And Technology, Center for Artificial Intelligence Technology, 30/1 Bolshoy blvd., 121205 Moscow, Russia
- Dmitry V Dylov
- Skolkovo Institute Of Science And Technology, Center for Artificial Intelligence Technology, 30/1 Bolshoy blvd., 121205 Moscow, Russia; Artificial Intelligence Research Institute, 32/1 Kutuzovsky pr., Moscow, 121170, Russia
- Quan Dou
- Department of Biomedical Engineering, University of Virginia, 415 Lane Rd., Charlottesville, VA 22903, United States
- Kang Yan
- Department of Biomedical Engineering, University of Virginia, 415 Lane Rd., Charlottesville, VA 22903, United States
- Yuyang Xue
- Institute for Imaging, Data and Communications, University of Edinburgh, EH9 3FG, UK
- Yuning Du
- Institute for Imaging, Data and Communications, University of Edinburgh, EH9 3FG, UK
- Julia Dietlmeier
- Insight SFI Research Centre for Data Analytics, Dublin City University, Glasnevin Dublin 9, Ireland
- Carles Garcia-Cabrera
- ML-Labs SFI Centre for Research Training in Machine Learning, Dublin City University, Glasnevin Dublin 9, Ireland
- Ziad Al-Haj Hemidi
- Institute of Medical Informatics, Universität zu Lübeck, Ratzeburger Alle 160, 23562 Lübeck, Germany
- Nora Vogt
- IADI, INSERM U1254, Université de Lorraine, Rue du Morvan, 54511 Nancy, France
- Ziqiang Xu
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Yajing Zhang
- Science & Technology Organization, GE Healthcare, Beijing, China
- Wenjia Bai
- Department of Computing, Imperial College London, London SW7 2AZ, UK; Department of Brain Sciences, Imperial College London, London SW7 2AZ, UK
- Xiahai Zhuang
- School of Data Science, Fudan University, Shanghai, China
- Jing Qin
- School of Nursing, The Hong Kong Polytechnic University, Hong Kong, China
- Lianming Wu
- Department of Radiology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China.
- Guang Yang
- Department of Bioengineering & Imperial-X, Imperial College London, London W12 7SL, UK; Cardiovascular Magnetic Resonance Unit, Royal Brompton Hospital, Guy's and St Thomas' NHS Foundation Trust, London SW3 6NP, UK; School of Biomedical Engineering & Imaging Sciences, King's College London, London WC2R 2LS, UK.
- Xiaobo Qu
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, National Institute for Data Science in Health and Medicine, Institute of Artificial Intelligence, Xiamen University, Xiamen 361102, China.
- He Wang
- Human Phenome Institute, Fudan University, 825 Zhangheng Road, Pudong New District, Shanghai, 201203, China; Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China.
- Chengyan Wang
- Shanghai Pudong Hospital and Human Phenome Institute, Fudan University, Shanghai, China; International Human Phenome Institute (Shanghai), Shanghai, China.
26
Liu Y, Lin L, Wong KKY, Tang X. ProCNS: Progressive Prototype Calibration and Noise Suppression for Weakly-Supervised Medical Image Segmentation. IEEE J Biomed Health Inform 2025; 29:2845-2858. [PMID: 40030827 DOI: 10.1109/jbhi.2024.3522958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/05/2025]
Abstract
Weakly-supervised segmentation (WSS) has emerged as a solution to mitigate the conflict between annotation cost and model performance by adopting sparse annotation formats (e.g., point, scribble, block, etc.). Typical approaches attempt to exploit anatomical and topological priors to directly expand sparse annotations into pseudo-labels. However, due to a lack of attention to the ambiguous boundaries in medical images and insufficient exploitation of the sparse supervision, existing approaches tend to generate erroneous and overconfident pseudo proposals in noisy regions, leading to cumulative model error and performance degradation. In this work, we propose a novel WSS approach, named ProCNS, encompassing two synergistic modules built on the principles of progressive prototype calibration and noise suppression. Specifically, we design a Prototype-based Regional Spatial Affinity (PRSA) loss to maximize the pair-wise affinities between spatial and semantic elements, providing the model with more reliable guidance. The affinities are derived from the input images and the prototype-refined predictions. Meanwhile, we propose an Adaptive Noise Perception and Masking (ANPM) module to obtain richer and more representative prototype representations; it adaptively identifies and masks noisy regions within the pseudo proposals, reducing potential erroneous interference during prototype computation. Furthermore, we generate specialized soft pseudo-labels for the noisy regions identified by ANPM, providing supplementary supervision. Extensive experiments on six medical image segmentation tasks involving different modalities demonstrate that the proposed framework significantly outperforms representative state-of-the-art methods.
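The core of noise-masked prototype computation can be sketched simply: a class prototype is the mean feature of pixels pseudo-labelled with that class, with pixels flagged as noisy excluded so they cannot skew the prototype. This is a schematic illustration under assumed details (flat feature lists, a hypothetical `masked_prototype` helper), not the ProCNS code.

```python
def masked_prototype(features, pseudo_labels, noise_mask, cls):
    """Class prototype: mean feature vector over pixels whose pseudo-label
    equals `cls`, excluding pixels the noise-perception step flagged as
    noisy. Returns None when no reliable pixel of the class remains."""
    selected = [f for f, y, noisy in zip(features, pseudo_labels, noise_mask)
                if y == cls and not noisy]
    if not selected:
        return None
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]
```

Refining predictions against such prototypes and then recomputing them is the progressive-calibration loop the abstract alludes to.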
27
Li S, Li X, Wang P, Liu K, Wei B, Cong J. An enhanced visual state space model for myocardial pathology segmentation in multi-sequence cardiac MRI. Med Phys 2025. [PMID: 40108817 DOI: 10.1002/mp.17761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2024] [Revised: 02/14/2025] [Accepted: 03/01/2025] [Indexed: 03/22/2025] Open
Abstract
BACKGROUND Myocardial pathology (scar and edema) segmentation plays a crucial role in the diagnosis, treatment, and prognosis of myocardial infarction (MI). However, current mainstream models for myocardial pathology segmentation have the following limitations when faced with cardiac magnetic resonance (CMR) images containing multiple objects and large changes in object scale: the long-range modeling ability of convolutional neural networks is insufficient, and the computational complexity of transformers is high, which makes myocardial pathology segmentation challenging. PURPOSE This study aims to develop a novel model that addresses the image characteristics and algorithmic challenges of the myocardial pathology segmentation task and improves the accuracy and efficiency of myocardial pathology segmentation. METHODS We developed a novel visual state space (VSS)-based deep neural network, MPS-Mamba. To extract CMR image features accurately and thoroughly, the encoder employs a dual-branch structure that captures both global and local features of the image. The VSS branch overcomes the limitations of current mainstream models by modeling long-range relationships with linear computational complexity, while the convolution-based branch provides complementary local information. Given the distinct properties of the two branches, we design a modular dual-branch fusion module to fuse them and enhance the feature representation of the dual encoder. To improve the modeling of objects at different scales in CMR images, a multi-scale feature fusion (MSF) module is designed to achieve effective integration and fine-grained expression of multi-scale information. To further incorporate anatomical knowledge into the segmentation results, a decoder with three branches outputs segmentation results for scar, edema, and myocardium, respectively. In addition, multiple sets of constraint functions are used not only to improve the segmentation accuracy of myocardial pathology but also to effectively model the spatial relationships among myocardium, scar, and edema. RESULTS The proposed method was comprehensively evaluated on the MyoPS 2020 dataset. MPS-Mamba achieved an average Dice score of 0.717 ± 0.169 for myocardial scar segmentation, superior to current mainstream methods. MPS-Mamba also performed well on the edema segmentation task, with an average Dice score of 0.735 ± 0.073. The experimental results further demonstrate the effectiveness of MPS-Mamba in segmenting myocardial pathologies in multi-sequence CMR images, verifying its advantages for myocardial pathology segmentation tasks. CONCLUSIONS Given its effectiveness and superiority, MPS-Mamba is expected to become a useful myocardial pathology segmentation tool that can effectively assist clinical diagnosis.
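The abstract does not specify the form of the spatial constraint functions; one plausible sketch (purely illustrative, with a hypothetical `containment_penalty` helper) penalizes predictions that violate the anatomical prior that scar and edema lie within the myocardium:

```python
def containment_penalty(inner_mask, outer_mask):
    """Fraction of `inner` pixels (e.g. predicted scar) that fall outside
    `outer` (e.g. predicted myocardium); zero when the anatomical
    containment prior is fully satisfied."""
    inner_total = sum(inner_mask)
    if inner_total == 0:
        return 0.0
    violations = sum(1 for i, o in zip(inner_mask, outer_mask) if i and not o)
    return violations / inner_total
```

Added to the segmentation loss, such a term pushes the three decoder branches toward mutually consistent outputs.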
Affiliation(s)
- Shuning Li
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Xiang Li
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Pingping Wang
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Kunmeng Liu
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Benzheng Wei
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Jinyu Cong
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
28
Liu T, Tan Z, Jiang H, Huang K. Stagger Network: Rethinking information loss in medical image segmentation with various-sized targets. Neural Netw 2025; 188:107386. [PMID: 40147135 DOI: 10.1016/j.neunet.2025.107386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2024] [Revised: 01/10/2025] [Accepted: 03/10/2025] [Indexed: 03/29/2025]
Abstract
Medical image segmentation presents the challenge of segmenting targets of various sizes, demanding that the model effectively capture both local and global information. Despite recent efforts using CNNs and ViTs to predict annotations at different scales, these approaches often struggle to balance the detection of targets across varying sizes. Simply combining local information from CNNs and global relationships from ViTs, without considering the potentially large divergence in their latent feature distributions, may result in substantial information loss. To address this issue, this paper introduces a novel Stagger Network (SNet) and argues that a well-designed fusion structure can mitigate the divergence in latent feature distributions between CNNs and ViTs, thereby reducing information loss. Specifically, to emphasize both global dependencies and local focus, we design a Parallel Module to bridge the semantic gap. Meanwhile, we propose the Stagger Module, which fuses selected features that are more semantically similar. An Information Recovery Module is further adopted to recover complementary information for the network. As a key contribution, we theoretically show that the proposed parallel and stagger strategies lead to less information loss, certifying SNet's rationale. Experimental results clearly show that SNet outperforms recent state-of-the-art methods on the Synapse dataset, where targets vary widely in size. It also demonstrates superiority on the ACDC and MoNuSeg datasets, where targets have more consistent dimensions.
Affiliation(s)
- Tianyi Liu
- School of Robotics, XJTLU Entrepreneur College (Taicang), Xi'an Jiaotong-Liverpool University, 111 Taicang Road, Taicang, Suzhou, 215123, Jiangsu, China; Department of Computer Science, University of Liverpool, Brownlow Hill, Liverpool, L697ZX, United Kingdom.
- Zhaorui Tan
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, 111 Ren'ai Road, Suzhou Industrial Park, Suzhou, 215123, Jiangsu, China; Department of Computer Science, University of Liverpool, Brownlow Hill, Liverpool, L697ZX, United Kingdom.
- Haochuan Jiang
- School of Robotics, XJTLU Entrepreneur College (Taicang), Xi'an Jiaotong-Liverpool University, 111 Taicang Road, Taicang, Suzhou, 215123, Jiangsu, China.
- Kaizhu Huang
- Data Science Research Center, Duke Kunshan University, No. 8 Duke Avenue, Suzhou, 215316, Jiangsu, China.
29
Reisdorf P, Gavrysh J, Ammann C, Fenski M, Kolbitsch C, Lange S, Hennemuth A, Schulz-Menger J, Hadler T. Lumos: Software for Multi-level Multi-reader Comparison of Cardiovascular Magnetic Resonance Late Gadolinium Enhancement Scar Quantification. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2025:10.1007/s10278-025-01437-2. [PMID: 40097767 DOI: 10.1007/s10278-025-01437-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/11/2024] [Revised: 01/17/2025] [Accepted: 01/31/2025] [Indexed: 03/19/2025]
Abstract
Cardiovascular magnetic resonance imaging (CMR) offers state-of-the-art myocardial tissue differentiation. The CMR technique of late gadolinium enhancement (LGE) currently provides the noninvasive gold standard for the detection of myocardial fibrosis. Typically, thresholding methods are used to quantify fibrotic scar tissue. A major challenge for standardized CMR assessment is the large variation in estimated scar across methods. To improve quality assurance for LGE scar quantification, a multi-reader comparison tool, "Lumos", was developed to support quality control of scar quantification methods. The thresholding methods and an exact rasterization approach were implemented, along with a graphical user interface (GUI) featuring statistical and case-specific tabs. Twenty LGE cases were considered, half of them including artifacts, and clinical results were computed for eight scar quantification methods. Lumos was successfully implemented as a multi-level multi-reader comparison software, and differences between methods are visible in the statistical results. Histograms visualize the confounding effects of different methods. Connecting the statistical level with the case level allows statistical differences to be traced back to sources of difference in the threshold calculation. Visualizing the underlying groundwork of the different methods in the myocardial histogram makes it possible to identify causes of differing thresholds. Lumos showed the differences in clinical results between cases with and without artifacts. A video demonstration of Lumos is offered as supplementary material 1. Lumos allows a multi-reader comparison for LGE scar quantification that offers insights into the origin of reader differences.
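The abstract does not name the eight quantification methods, but the n-SD family (scar threshold = mean + n·SD of remote, healthy myocardium) is a widely used LGE thresholding technique and illustrates why method choice changes the estimated scar. The sketch below is an assumed, simplified version of that family, not Lumos code.

```python
def n_sd_threshold(remote_intensities, n=5.0):
    """Scar threshold as mean + n standard deviations of remote
    (healthy) myocardial intensities; n = 2, 5, 6 are common choices."""
    m = sum(remote_intensities) / len(remote_intensities)
    var = sum((x - m) ** 2 for x in remote_intensities) / len(remote_intensities)
    return m + n * var ** 0.5

def scar_fraction(myocardium_intensities, threshold):
    """Scar burden: share of myocardial pixels above the threshold."""
    scar = sum(1 for x in myocardium_intensities if x > threshold)
    return scar / len(myocardium_intensities)
```

Because each method computes its threshold differently, the same case yields different scar fractions per method, which is exactly the variation a multi-reader comparison tool makes visible.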
Affiliation(s)
- Philine Reisdorf
- Working Group on Cardiovascular Magnetic Resonance, Experimental and Clinical Research Center (ECRC), a joint cooperation between the Charité - Universitätsmedizin Berlin and the Max-Delbrück-Center for Molecular Medicine, Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, ECRC Experimental and Clinical Research Center, Lindenberger Weg 80, 13125, Berlin, Germany
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
- Jonathan Gavrysh
- Working Group on Cardiovascular Magnetic Resonance, Experimental and Clinical Research Center (ECRC), a joint cooperation between the Charité - Universitätsmedizin Berlin and the Max-Delbrück-Center for Molecular Medicine, Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, ECRC Experimental and Clinical Research Center, Lindenberger Weg 80, 13125, Berlin, Germany
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
- Clemens Ammann
- Working Group on Cardiovascular Magnetic Resonance, Experimental and Clinical Research Center (ECRC), a joint cooperation between the Charité - Universitätsmedizin Berlin and the Max-Delbrück-Center for Molecular Medicine, Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, ECRC Experimental and Clinical Research Center, Lindenberger Weg 80, 13125, Berlin, Germany
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
- Department of Cardiology and Nephrology, HELIOS Hospital Berlin-Buch, Berlin, Germany
- Maximilian Fenski
- Working Group on Cardiovascular Magnetic Resonance, Experimental and Clinical Research Center (ECRC), a joint cooperation between the Charité - Universitätsmedizin Berlin and the Max-Delbrück-Center for Molecular Medicine, Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, ECRC Experimental and Clinical Research Center, Lindenberger Weg 80, 13125, Berlin, Germany
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
- Christoph Kolbitsch
- Physikalisch-Technische Bundesanstalt (PTB), Braunschweig and Berlin, Germany
- Steffen Lange
- Department of Computer Sciences, Hochschule Darmstadt - University of Applied Sciences, Darmstadt, Germany
- Anja Hennemuth
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Fraunhofer MEVIS, Bremen, Germany
- Deutsches Herzzentrum der Charité (DHZC), Institute of Computer-assisted Cardiovascular Medicine, Augustenburger Platz 1, Berlin, Germany
- Jeanette Schulz-Menger
- Working Group on Cardiovascular Magnetic Resonance, Experimental and Clinical Research Center (ECRC), a joint cooperation between the Charité - Universitätsmedizin Berlin and the Max-Delbrück-Center for Molecular Medicine, Berlin, Germany.
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, ECRC Experimental and Clinical Research Center, Lindenberger Weg 80, 13125, Berlin, Germany.
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany.
- Department of Cardiology and Nephrology, HELIOS Hospital Berlin-Buch, Berlin, Germany.
| | - Thomas Hadler
- Working Group on Cardiovascular Magnetic Resonance, Experimental and Clinical Research Center (ECRC), a joint cooperation between the Charité - Universitätsmedizin Berlin and the Max-Delbrück-Center for Molecular Medicine, Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, ECRC Experimental and Clinical Research Center, Lindenberger Weg 80, 13125, Berlin, Germany
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
30
Zhou D, Zhao M, Liu W, Gu X. HADCN: a hierarchical ascending densely connected network for enhanced medical image segmentation. Med Biol Eng Comput 2025:10.1007/s11517-025-03342-w. [PMID: 40085394 DOI: 10.1007/s11517-025-03342-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2024] [Accepted: 02/25/2025] [Indexed: 03/16/2025]
Abstract
Medical image segmentation is a key component in computer-aided diagnostic technology. In the past few years, U-shaped hierarchical architectures have become the mainstream approach; however, they often fail to provide accurate results due to the loss of detailed features. To address this issue, this paper proposes a hierarchical ascending densely connected network, called HADCNet, to capture both local short-range and global long-range pathological features in a hierarchically organized network for more accurate segmentation. First, HADCNet devises a cross-scale ascending densely connected structure with a multi-path attention gate (MAG) to achieve full-scale interaction of global pathological features. Then, spatial and channel reconstruction units (SRU and CRU) are introduced to decrease redundant computation and facilitate representative feature learning. Finally, multi-scale outputs are aggregated to produce the final segmentation. Extensive experiments demonstrate that the method achieves an average DSC of 84.45% and HD95 of 17.55 mm on the Synapse dataset (multi-organ segmentation), with similarly strong performance on the ACDC (cardiac diagnosis) and ISIC2018 (lesion segmentation) datasets. Additionally, HADCNet can be flexibly incorporated into existing backbone networks for better performance; e.g., combining it with TransUnet and SwinUnet leads to 3.28% and 2.53% Dice score improvements, respectively.
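The DSC figures quoted above follow the standard Dice similarity coefficient for segmentation overlap, DSC = 2|A ∩ B| / (|A| + |B|); a minimal sketch of that generic metric on binary masks:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice similarity coefficient for binary masks:
    DSC = 2*|A ∩ B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum() + eps)

pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
# intersection = 2, |pred| = 3, |gt| = 3  →  DSC = 4/6 ≈ 0.667
score = dice_score(pred, gt)
```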
Affiliation(s)
- Dibin Zhou
- School of Information Science and Technology, Hangzhou Normal University, Street, Hangzhou, 311121, Zhejiang, China
| | - Mingxuan Zhao
- School of Information Science and Technology, Hangzhou Normal University, Street, Hangzhou, 311121, Zhejiang, China
| | - Wenhao Liu
- School of Information Science and Technology, Hangzhou Normal University, Street, Hangzhou, 311121, Zhejiang, China.
| | - Xirui Gu
- School of Information Science and Technology, Hangzhou Normal University, Street, Hangzhou, 311121, Zhejiang, China
31
Szűcs ÁI, Kári B, Pártos O. Myocardial perfusion imaging SPECT left ventricle segmentation with graphs. EJNMMI Phys 2025; 12:21. [PMID: 40063231 PMCID: PMC11893936 DOI: 10.1186/s40658-025-00728-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2024] [Accepted: 02/17/2025] [Indexed: 03/14/2025] Open
Abstract
PURPOSE Various specialized and general collimators are used for myocardial perfusion imaging (MPI) with single-photon emission computed tomography (SPECT) to assess different types of coronary artery disease (CAD). Alongside the wide variability in imaging characteristics, a priori "learnt" information about left ventricular (LV) shape can affect the final diagnosis of the imaging protocol. This study evaluates the effect of incorporating prior information into the segmentation process, compared to deep learning (DL) approaches, as well as the differences between 4 collimation techniques on 5 different datasets. METHODS The study was conducted on a database of 80 patients: 40 patients came from mixed black-box collimators, and 10 each from multi-pinhole (MPH), low energy high resolution (LEHR), CardioC, and CardioD collimators. A new continuous graph-based approach was evaluated, which automatically segments the left ventricular volume using prior information on cardiac geometry. The technique is based on the continuous max-flow (CMF) min-cut algorithm, whose performance was evaluated with precision, recall, IoU, and Dice score metrics. RESULTS Testing showed that the developed method improved over deep learning, reaching higher scores in most of the evaluation metrics. Further investigating the different collimation techniques, evaluation of receiver operating characteristic (ROC) curves showed different stabilities across collimators. A Wilcoxon signed-rank test on the LV outlines showed differentiability between the collimation procedures. To further investigate these phenomena, the model parameters of the LVs were reconstructed and evaluated with the uniform manifold approximation and projection (UMAP) method, which further showed that collimators can be differentiated based on the projected LV shapes alone.
CONCLUSIONS The results show that incorporating prior information can enhance the performance of segmentation methods and that collimation strategies have a strong effect on the projected cardiac geometry.
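The four evaluation metrics named in the abstract (precision, recall, IoU, Dice) all derive from the same confusion counts on binary masks; a generic sketch, not tied to the paper's implementation:

```python
import numpy as np

def segmentation_metrics(pred, target):
    """Precision, recall, IoU and Dice computed from TP/FP/FN counts
    on binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    return {
        "precision": tp / (tp + fp),
        "recall":    tp / (tp + fn),
        "iou":       tp / (tp + fp + fn),
        "dice":      2 * tp / (2 * tp + fp + fn),
    }

pred = np.array([1, 1, 0, 0, 1])
gt   = np.array([1, 0, 0, 1, 1])
m = segmentation_metrics(pred, gt)
# tp=2, fp=1, fn=1 → precision = recall = 2/3, iou = 0.5, dice = 2/3
```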
Affiliation(s)
- Ádám István Szűcs
- Computer Algebra, Eötvös Loránd University, Pázmány Péter blvd. 1/c, Budapest, Pest, 1117, Hungary.
| | - Béla Kári
- Nuclear Medicine, Semmelweis University, Üllői street 78b, Budapest, Pest, 1083, Hungary
| | - Oszkár Pártos
- Nuclear Medicine, Semmelweis University, Üllői street 78b, Budapest, Pest, 1083, Hungary
32
Kächele J, Zenk M, Rokuss M, Ulrich C, Wald T, Maier-Hein KH. Enhanced nnU-Net Architectures for Automated MRI Segmentation of Head and Neck Tumors in Adaptive Radiation Therapy. HEAD AND NECK TUMOR SEGMENTATION FOR MR-GUIDED APPLICATIONS : FIRST MICCAI CHALLENGE, HNTS-MRG 2024, HELD IN CONJUNCTION WITH MICCAI 2024, MARRAKESH, MOROCCO, OCTOBER 17, 2024, PROCEEDINGS 2025; 15273:50-64. [PMID: 40291013 PMCID: PMC12023904 DOI: 10.1007/978-3-031-83274-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 04/30/2025]
Abstract
The increasing utilization of MRI in radiation therapy planning for head and neck cancer (HNC) highlights the need for precise tumor segmentation to enhance treatment efficacy and reduce side effects. This work presents segmentation models developed for the HNTS-MRG 2024 challenge by the team mic-dkfz, focusing on automated segmentation of HNC tumors from MRI at two radiotherapy (RT) stages: before (pre-RT) and 2-4 weeks into RT (mid-RT). For Task 1 (pre-RT segmentation), we built upon the nnU-Net framework, enhancing it with the larger Residual Encoder architecture. We incorporated extensive data augmentation and applied transfer learning by pre-training the model on a diverse set of public 3D medical imaging datasets. For Task 2 (mid-RT segmentation), we adopted a longitudinal approach by integrating registered pre-RT images and their segmentations as additional inputs into the nnU-Net framework. On the test set, our models achieved mean aggregated Dice Similarity Coefficient (aggDSC) scores of 81.2 for Task 1 and 72.7 for Task 2. Segmentation of the primary tumor (GTVp) in particular remains challenging and presents potential for further optimization. These results demonstrate the effectiveness of combining advanced architectures, transfer learning, and longitudinal data integration for automated tumor segmentation in MRI-guided adaptive radiation therapy.
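The aggregated Dice (aggDSC) used to score this kind of challenge pools intersections and volumes across all cases before forming the ratio, rather than averaging per-case Dice scores, which makes it less sensitive to cases with tiny or empty structures. A hedged sketch of that pooling idea (the challenge's exact definition may differ in details):

```python
import numpy as np

def aggregated_dice(preds, targets):
    """Pooled Dice across cases: sum intersections and volumes first,
    then take 2*sum(inter) / sum(|pred| + |gt|)."""
    inter = sum(np.logical_and(p, t).sum() for p, t in zip(preds, targets))
    total = sum(p.sum() + t.sum() for p, t in zip(preds, targets))
    return 2.0 * inter / total

# Two toy cases: one perfect, one empty prediction on a small lesion
p1, t1 = np.ones((2, 2), bool), np.ones((2, 2), bool)
p2, t2 = np.zeros((2, 2), bool), np.eye(2, dtype=bool)
# pooled: inter = 4, total = 4 + 4 + 0 + 2 = 10 → aggDSC = 0.8
score = aggregated_dice([p1, p2], [t1, t2])
```

Note the contrast with the per-case mean: averaging the two case-level Dice scores here would give (1.0 + 0.0) / 2 = 0.5, while the pooled aggDSC is 0.8.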
Affiliation(s)
- Jessica Kächele
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
- German Cancer Consortium (DKTK), DKFZ, core center Heidelberg, Heidelberg, Germany
| | - Maximilian Zenk
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
| | - Maximilian Rokuss
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
| | - Constantin Ulrich
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
- National Center for Tumor Diseases (NCT), NCT Heidelberg, A partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany
| | - Tassilo Wald
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Helmholtz Imaging, DKFZ, Heidelberg, Germany
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
| | - Klaus H Maier-Hein
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Helmholtz Imaging, DKFZ, Heidelberg, Germany
- Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany
33
She Q, Sun S, Ma Y, Li R, Zhang Y. LUCF-Net: Lightweight U-Shaped Cascade Fusion Network for Medical Image Segmentation. IEEE J Biomed Health Inform 2025; 29:2088-2099. [PMID: 40030349 DOI: 10.1109/jbhi.2024.3506829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/08/2025]
Abstract
The performance of modern U-shaped neural networks for medical image segmentation has been significantly enhanced by incorporating Transformer layers. Although Transformer architectures are powerful at extracting global information, their ability to capture local information is limited due to their high complexity. To address this challenge, we propose a new lightweight U-shaped cascade fusion network (LUCF-Net) for medical image segmentation. It utilizes an asymmetrical structural design and incorporates both local and global modules to enhance its capacity for local and global modeling. Additionally, a multi-layer cascade fusion decoding network is designed to further bolster the network's information fusion capabilities. Validation on open-source CT, MRI, and dermatology datasets demonstrated that the proposed model outperformed other state-of-the-art methods in handling local-global information, achieving an improvement of 1.46% in Dice coefficient and 2.98 mm in Hausdorff distance on multi-organ segmentation. Furthermore, as a network that combines Convolutional Neural Network and Transformer architectures, it achieves competitive segmentation performance with only 6.93 million parameters and 6.6 giga floating-point operations (GFLOPs), without the need for pre-training. In summary, the proposed method demonstrates enhanced performance while retaining a simpler model design than other Transformer-based segmentation networks.
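The Hausdorff-distance figures reported here (and the HD95 variant used elsewhere in this list) measure boundary agreement: each boundary point is matched to its nearest neighbour in the other contour, and the 95th percentile of those distances is taken to suppress outliers. A generic NumPy sketch on 2D point sets:

```python
import numpy as np

def hd95(points_a, points_b):
    """95th-percentile symmetric Hausdorff distance between two point
    sets of shape (N, 2) and (M, 2), e.g. boundary pixel coordinates."""
    a = np.asarray(points_a, float)
    b = np.asarray(points_b, float)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise
    d_ab = d.min(axis=1)  # each point in A to its nearest point in B
    d_ba = d.min(axis=0)  # each point in B to its nearest point in A
    return np.percentile(np.concatenate([d_ab, d_ba]), 95)

# Two unit-square boundaries offset by one pixel in x
a = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
b = a + np.array([1, 0])
dist = hd95(a, b)  # → 1.0
```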
34
Sun J, Wang T, Wang M, Li X, Xu Y. Semi-supervised medical image segmentation network based on mutual learning. Med Phys 2025; 52:1589-1600. [PMID: 39636526 DOI: 10.1002/mp.17547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2024] [Revised: 10/23/2024] [Accepted: 11/07/2024] [Indexed: 12/07/2024] Open
Abstract
BACKGROUND Semi-supervised learning provides an effective means to address the challenge of insufficient labeled data in medical image segmentation tasks. However, when a semi-supervised segmentation model is overfitted and exhibits cognitive bias, its performance deteriorates. Errors stemming from cognitive bias can quickly amplify and become difficult to correct during neural network training, resulting in the continuous accumulation of erroneous knowledge. PURPOSE To address the issue of error accumulation, a novel learning strategy is required to enhance the accuracy of medical image segmentation. METHODS This paper proposes a semi-supervised medical image segmentation network based on mutual learning (MLNet) to alleviate the continuous accumulation of erroneous knowledge. MLNet adopts a teacher-student network as the backbone framework, training the student and teacher networks on labeled data and mutually updating their parameter weights, enabling the two models to learn from each other. Additionally, an image partial exchange (IPE) algorithm is proposed as a perturbation method that reduces the introduction of erroneous information and the disruption of the image's contextual information. RESULTS In the 10% labeled experiment on the ACDC dataset, our Dice coefficient reached 89.48%, a 9.28% improvement over the baseline model. In the 10% labeled experiment on the BraTS2019 dataset, the proposed method still performs exceptionally well, achieving a Dice coefficient of 84.56% and surpassing other comparative methods. CONCLUSIONS Compared with other methods, experimental results demonstrate that our approach achieves optimal performance across all metrics, proving its effectiveness and reliability.
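In teacher-student frameworks of this kind, a common way to propagate student knowledge into the teacher (though not necessarily MLNet's exact update rule) is an exponential moving average (EMA) of the parameters, which smooths the teacher and damps the propagation of any one erroneous update; a minimal sketch:

```python
def ema_update(teacher_params, student_params, alpha=0.99):
    """Exponential-moving-average teacher update, applied element-wise:
    theta_t <- alpha * theta_t + (1 - alpha) * theta_s."""
    return [alpha * t + (1 - alpha) * s
            for t, s in zip(teacher_params, student_params)]

# Toy scalar "parameters"
teacher = [1.0, 0.0]
student = [0.0, 1.0]
teacher = ema_update(teacher, student, alpha=0.9)
# → [0.9, 0.1]
```

With alpha close to 1, the teacher changes slowly, so a transiently biased student prediction cannot immediately corrupt the teacher's pseudo-labels.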
Affiliation(s)
- Junmei Sun
- School of Information Science and Technology, Hangzhou Normal University, Hangzhou, China
| | - Tianyang Wang
- School of Information Science and Technology, Hangzhou Normal University, Hangzhou, China
| | - Meixi Wang
- School of Information Science and Technology, Hangzhou Normal University, Hangzhou, China
| | - Xiumei Li
- School of Information Science and Technology, Hangzhou Normal University, Hangzhou, China
35
Yu Q, Ning H, Yang J, Li C, Qi Y, Qu M, Li H, Sun S, Cao P, Feng C. CMR-BENet: A confidence map refinement boundary enhancement network for left ventricular myocardium segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2025; 260:108544. [PMID: 39709745 DOI: 10.1016/j.cmpb.2024.108544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/25/2024] [Revised: 11/06/2024] [Accepted: 12/02/2024] [Indexed: 12/24/2024]
Abstract
BACKGROUND AND OBJECTIVE Left ventricular myocardium segmentation is of great significance for clinical diagnosis, treatment, and prognosis. However, myocardium segmentation is challenging as the medical image quality is disturbed by various factors such as motion, artifacts, and noise. Its accuracy largely depends on the accurate identification of edges and structures. Most existing encoder-decoder based segmentation methods capture limited contextual information and ignore the awareness of myocardial shape and structure, often producing unsatisfactory boundary segmentation results in noisy scenes. Moreover, these methods fail to assess the reliability of the predictions, which is crucial for clinical decisions and applications in medical tasks. Therefore, this study explores how to effectively combine contextual information with myocardial edge structure and confidence maps to improve segmentation performance in an end-to-end network. METHODS In this paper, we propose an end-to-end confidence map refinement boundary enhancement network (CMR-BENet) for left ventricular myocardium segmentation. CMR-BENet has three components: a layer semantic-aware module (LSA), an edge information enhancement module (EIE), and a confidence map-based refinement module (CMR). Specifically, LSA first adaptively fuses high- and low-level semantic information across hierarchical layers to mitigate the bias of single-layer features affected by noise. EIE then improves the edge and structure recognition by designing the edge and mask guidance module (EMG) and the edge structure-aware module (ESA). Finally, CMR provides a simple and efficient way to estimate confidence maps and effectively combines the encoder features to refine the segmentation results. 
RESULTS Experiments on two echocardiography datasets and one cardiac MRI dataset show that the proposed CMR-BENet outperforms its rivals in the left ventricular myocardium segmentation task with Dice (DI) of 87.71%, 79.33%, and 89.11%, respectively. CONCLUSION This paper utilizes edge information to characterize the shape and structure of the myocardium and introduces learnable confidence maps to evaluate and refine the segmentation results. Our findings provide strong support and reference for physicians in diagnosis and treatment.
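CMR-BENet's learnable confidence maps are part of its architecture, but a simple, widely used baseline for per-pixel confidence is the maximum softmax probability over classes; the sketch below illustrates only that generic notion, not the paper's CMR module:

```python
import numpy as np

def softmax_confidence(logits):
    """Per-pixel confidence as the maximum softmax probability over
    classes; logits has shape (C, H, W), the result (H, W)."""
    z = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    p = np.exp(z)
    p /= p.sum(axis=0, keepdims=True)
    return p.max(axis=0)  # values lie in (1/C, 1]

# 2 classes, 1x2 image: first pixel confidently class 0, second ambiguous
logits = np.array([[[4.0, 0.0]],
                   [[0.0, 0.1]]])
conf = softmax_confidence(logits)
```

Low-confidence regions in such a map tend to concentrate near boundaries and noisy areas, which is exactly where refinement is most useful.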
Affiliation(s)
- Qi Yu
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| | - Hongxia Ning
- Department of Cardiovascular Ultrasound, The First Hospital of China Medical University, Shenyang, China; Clinical Medical Research Center of Imaging in Liaoning Province, Shenyang, China
| | - Jinzhu Yang
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China.
| | - Chen Li
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
| | - Yiqiu Qi
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| | - Mingjun Qu
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| | - Honghe Li
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| | - Song Sun
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| | - Peng Cao
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
| | - Chaolu Feng
- Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, China
36
Sufian MA, Niu M. Hybrid deep learning for computational precision in cardiac MRI segmentation: Integrating Autoencoders, CNNs, and RNNs for enhanced structural analysis. Comput Biol Med 2025; 186:109597. [PMID: 39967188 DOI: 10.1016/j.compbiomed.2024.109597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2024] [Revised: 12/07/2024] [Accepted: 12/17/2024] [Indexed: 02/20/2025]
Abstract
Recent advancements in cardiac imaging have been significantly enhanced by integrating deep learning models, offering transformative potential in early diagnosis and patient care. The research paper explores the application of hybrid deep learning methodologies, focusing on the roles of Autoencoders, Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs) in enhancing cardiac image analysis. The study implements a comprehensive approach, combining traditional algorithms such as Sobel, Watershed, and Otsu's Thresholding with advanced deep learning models to achieve precise and accurate imaging outcomes. The Autoencoder model, developed for image enhancement and feature extraction, achieved a notable accuracy of 99.66% on the test data. Optimized for image recognition tasks, the CNN model demonstrated a high precision rate of 98.9%. The RNN model, utilized for sequential data analysis, showed a prediction accuracy of 98%, further underscoring the robustness of the hybrid framework. The research drew upon a diverse range of academic databases and pertinent publications within cardiac imaging and deep learning, focusing on peer-reviewed articles and studies published in the past five years. Models were implemented using the TensorFlow and Keras frameworks. The proposed methodology was evaluated in the clinical validation phase using advanced imaging protocols, including the QuickScan technique and balanced steady-state free precession (bSSFP) imaging. The validation metrics were promising: the Signal-to-Noise Ratio (SNR) was improved by 15%, the Contrast-to-Noise Ratio (CNR) saw an enhancement of 12%, and the ejection fraction (EF) analysis provided a 95% correlation with manually segmented data. These metrics confirm the efficacy of the models, showing significant improvements in image quality and diagnostic accuracy. 
The integration of adversarial defense strategies, such as adversarial training and model ensembling, has been analyzed to enhance model robustness against malicious inputs. The model's reliability has been investigated to ensure that clinical integrity is maintained even under adversarial attacks that could otherwise compromise segmentation outcomes. These findings indicate that integrating Autoencoders, CNNs, and RNNs within a hybrid deep-learning framework is promising for enhancing cardiac MRI segmentation and early diagnosis. The study contributes to the field by demonstrating the applicability of these advanced techniques in clinical settings, paving the way for improved patient outcomes through more accurate and timely diagnoses.
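The SNR and CNR improvements reported in the validation follow their standard image-quality definitions: mean signal over noise standard deviation, and mean contrast between two tissues over noise standard deviation. A generic sketch (the ROI names are illustrative, not the paper's protocol):

```python
import numpy as np

def snr(signal_roi, noise_roi):
    """Signal-to-noise ratio: mean signal intensity / noise std."""
    return signal_roi.mean() / noise_roi.std()

def cnr(roi_a, roi_b, noise_roi):
    """Contrast-to-noise ratio: |mean(A) - mean(B)| / noise std."""
    return abs(roi_a.mean() - roi_b.mean()) / noise_roi.std()

noise = np.array([-1.0, 1.0, -1.0, 1.0])  # background samples, std = 1
myocardium = np.array([50.0, 50.0])
blood_pool = np.array([80.0, 80.0])
s = snr(myocardium, noise)                 # → 50.0
c = cnr(myocardium, blood_pool, noise)     # → 30.0
```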
Affiliation(s)
- Md Abu Sufian
- Shaanxi International Innovation Center for Transportation-Energy-Information Fusion and Sustainability, Chang'an University, Xi'an 710064, China; IVR Low-Carbon Research Institute, School of Energy and Electrical Engineering, Chang'an University, Xi'an 710064, China
| | - Mingbo Niu
- Shaanxi International Innovation Center for Transportation-Energy-Information Fusion and Sustainability, Chang'an University, Xi'an 710064, China; IVR Low-Carbon Research Institute, School of Energy and Electrical Engineering, Chang'an University, Xi'an 710064, China.
37
Shaikh MFW, Mama MS, Proddaturi SH, Vidal J, Gnanasekaran P, Kumar MS, Clarke CJ, Reddy KS, Bello HM, Raquib N, Morani Z. The Role of Artificial Intelligence in the Prediction, Diagnosis, and Management of Cardiovascular Diseases: A Narrative Review. Cureus 2025; 17:e81332. [PMID: 40291312 PMCID: PMC12034035 DOI: 10.7759/cureus.81332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/27/2025] [Indexed: 04/30/2025] Open
Abstract
Cardiovascular diseases (CVDs) remain the leading global cause of mortality, and the prevalence of cardiac conditions, including premature deaths, has continued to increase over recent decades. Early detection and management of these conditions are challenging, given their complexity, the scale of affected populations, the dynamic nature of the disease process, and the treatment approach. Artificial Intelligence (AI), specifically machine learning (ML) and deep learning technologies, brings transformative potential to analyze massive datasets, improve diagnostic accuracy, and optimize treatment strategies. Recent advancements in AI-based frameworks, such as personalized decision-support systems for customized medicine and automated image assessment, substantially increase the precision and efficiency of healthcare professionals. However, implementing AI faces obstacles, including regulation, privacy, and validation across populations. Additionally, despite the desire to incorporate AI into clinical routines, concerns remain about interoperability and clinician acceptance. Despite these challenges, further research and development are essential to overcoming these hurdles. This review explores the use of AI in cardiovascular care, its current limitations, and future integration toward better patient outcomes.
Affiliation(s)
| | | | | | - Juan Vidal
- Medicine, Universidad del Azuay, Cuenca, ECU
| | | | - Mekala S Kumar
- Internal Medicine, Sri Venkata Sai (SVS) Medical College, Hyderabad, IND
| | - Cleve J Clarke
- College of Oral Health Sciences, University of Technology, Jamaica, Kingston, JAM
| | - Kalva S Reddy
- Internal Medicine, Sri Venkata Sai (SVS) Medical College, Hyderabad, IND
| | | | - Naama Raquib
- Obstetrics and Gynecology, Grange University Hospital, Newport, GBR
| | - Zoya Morani
- Family Medicine, Washington University of Health and Science, San Pedro, BLZ
38
Lu Z, Zhang J, Cai B, Wu Y, Li D, Liu M, Zhang L. A multi-scale information fusion medical image segmentation network based on convolutional kernel coupled updata mechanism. Comput Biol Med 2025; 187:109723. [PMID: 39879883 DOI: 10.1016/j.compbiomed.2025.109723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2024] [Revised: 12/24/2024] [Accepted: 01/18/2025] [Indexed: 01/31/2025]
Abstract
Medical image segmentation is pivotal in disease diagnosis and treatment. This paper presents a novel network architecture for medical image segmentation, termed TransDLNet, which is engineered to enhance the efficiency of multi-scale information utilization. TransDLNet integrates convolutional neural networks and Transformers, facilitating cross-level multi-scale information fusion for complex medical images. Key to its innovation is the attention-dilated depthwise convolution (ADDC) module, utilizing depthwise convolution (DWConv) with varied dilation rates to enhance local detail capture. A convolution kernel coupled update mechanism and channel information compensation method ensure robust feature representation. Furthermore, the cross-level grouped attention merge (CGAM) module in both encoder and decoder enhances feature interaction and integration across scales, boosting comprehensive representation. We conducted a comprehensive experimental analysis and quantitative evaluation on four datasets representing diverse modalities. The results indicate that the proposed method has good segmentation performance and generalization ability.
Affiliation(s)
- Zhihao Lu
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, 610059, China.
| | - Jinglan Zhang
- West China College of Stomatology, Sichuan University, Chengdu, 610041, China.
| | - Biao Cai
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, 610059, China.
| | - Yuanyuan Wu
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, 610059, China.
| | - Dongfen Li
- College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, 610059, China.
| | - Mingzhe Liu
- School of Data Science and Artifical Intelligence, Wenzhou University of Technology, Wenzhou, 325035, China.
| | - Lan Zhang
- State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu, 610041, China.
39
Nguyen-Tat TB, Vo HA, Dang PS. QMaxViT-Unet+: A query-based MaxViT-Unet with edge enhancement for scribble-supervised segmentation of medical images. Comput Biol Med 2025; 187:109762. [PMID: 39919665 DOI: 10.1016/j.compbiomed.2025.109762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2024] [Revised: 01/17/2025] [Accepted: 01/27/2025] [Indexed: 02/09/2025]
Abstract
The deployment of advanced deep learning models for medical image segmentation is often constrained by the requirement for extensively annotated datasets. Weakly-supervised learning, which allows less precise labels, has become a promising solution to this challenge. Building on this approach, we propose QMaxViT-Unet+, a novel framework for scribble-supervised medical image segmentation. This framework is built on the U-Net architecture, with the encoder and decoder replaced by Multi-Axis Vision Transformer (MaxViT) blocks. These blocks enhance the model's ability to learn local and global features efficiently. Additionally, our approach integrates a query-based Transformer decoder to refine features and an edge enhancement module to compensate for the limited boundary information in the scribble label. We evaluate the proposed QMaxViT-Unet+ on four public datasets focused on cardiac structures, colorectal polyps, and breast cancer: ACDC, MS-CMRSeg, SUN-SEG, and BUSI. Evaluation metrics include the Dice similarity coefficient (DSC) and the 95th percentile of Hausdorff distance (HD95). Experimental results show that QMaxViT-Unet+ achieves 89.1% DSC and 1.316 mm HD95 on ACDC, 88.4% DSC and 2.226 mm HD95 on MS-CMRSeg, 71.4% DSC and 4.996 mm HD95 on SUN-SEG, and 69.4% DSC and 50.122 mm HD95 on BUSI. These results demonstrate that our method outperforms existing approaches in terms of accuracy, robustness, and efficiency while remaining competitive with fully-supervised learning approaches. This makes it ideal for medical image analysis, where high-quality annotations are often scarce and require significant effort and expense. The code is available at https://github.com/anpc849/QMaxViT-Unet.
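The DSC figures reported above are the standard overlap metric for comparing a predicted segmentation mask against a reference. As background, a minimal Python sketch of the Dice similarity coefficient (a generic illustration, not code from the cited paper):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity coefficient between two binary masks: 2|A∩B| / (|A|+|B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))

# Toy 4x4 masks: 3 foreground pixels each, 2 of them overlapping
a = np.zeros((4, 4)); a[0, :3] = 1
b = np.zeros((4, 4)); b[0, 1:4] = 1
score = dice_coefficient(a, b)  # 2*2 / (3+3) ≈ 0.667
```

A DSC of 1.0 means perfect overlap; the small `eps` only guards against empty masks.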
Affiliation(s)
- Thien B Nguyen-Tat
- University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam.
- Hoang-An Vo
- University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
- Phuoc-Sang Dang
- University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
40
Yuan Y, Wang X, Yang X, Heng PA. Effective Semi-Supervised Medical Image Segmentation With Probabilistic Representations and Prototype Learning. IEEE TRANSACTIONS ON MEDICAL IMAGING 2025; 44:1181-1193. [PMID: 39437272 DOI: 10.1109/tmi.2024.3484166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/25/2024]
Abstract
Label scarcity, class imbalance and data uncertainty are three primary challenges commonly encountered in semi-supervised medical image segmentation. In this work, we focus on the data uncertainty issue that is overlooked by previous literature. To address this issue, we propose a probabilistic prototype-based classifier that introduces uncertainty estimation into the entire pixel classification process, including probabilistic representation formulation, probabilistic pixel-prototype proximity matching, and distribution prototype update, leveraging principles from probability theory. By explicitly modeling data uncertainty at the pixel level, the robustness of our proposed framework to tricky pixels, such as ambiguous boundaries and noise, is greatly enhanced compared to its deterministic counterpart and other uncertainty-aware strategies. Empirical evaluations on three publicly available datasets that exhibit severe boundary ambiguity show the superiority of our method over several competitors. Moreover, our method also demonstrates stronger robustness to simulated noisy data. Code is available at https://github.com/IsYuchenYuan/PPC.
41
Jiang C, Wang Y, Yuan Q, Qu P, Li H. A 3D medical image segmentation network based on gated attention blocks and dual-scale cross-attention mechanism. Sci Rep 2025; 15:6159. [PMID: 39979447 PMCID: PMC11842799 DOI: 10.1038/s41598-025-90339-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2024] [Accepted: 02/12/2025] [Indexed: 02/22/2025] Open
Abstract
In the field of multi-organ 3D medical image segmentation, Convolutional Neural Networks (CNNs) are limited to extracting local feature information, while Transformer-based architectures suffer from high computational complexity and inadequate extraction of spatial and channel layer information. Moreover, the large number and varying sizes of organs to be segmented result in suboptimal model robustness and segmentation outcomes. To address these challenges, this paper introduces a novel network architecture, DS-UNETR++, specifically designed for 3D medical image segmentation. The proposed network features a dual-branch feature encoding mechanism that categorizes images into coarse-grained and fine-grained types before processing them through the encoding blocks. Each encoding block comprises a downsampling layer and a Gated Shared Weighted Pairwise Attention (G-SWPA) submodule, which dynamically adjusts the influence of spatial and channel attention on feature extraction. Additionally, a Gated Dual-Scale Cross-Attention Module (G-DSCAM) is incorporated at the bottleneck stage. This module employs dimensionality reduction techniques to cross-attend coarse-grained and fine-grained features, using a gating mechanism to dynamically balance the ratio of these two types of feature information, thereby achieving effective multi-scale feature fusion. Finally, comprehensive evaluations were conducted on four public medical datasets. Experimental results demonstrate that DS-UNETR++ achieves good segmentation performance, highlighting the effectiveness and significance of the proposed method and offering new insights for various organ segmentation tasks.
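The gating idea described for G-SWPA and G-DSCAM, dynamically balancing the contribution of two feature streams, can be illustrated with a toy element-wise gated fusion. This is a hypothetical sketch, not the paper's module; the scalars `w` and `b` stand in for learned parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(coarse: np.ndarray, fine: np.ndarray, w: float, b: float) -> np.ndarray:
    """Blend two same-shape feature maps with a per-element gate in (0, 1)."""
    g = sigmoid(w * (coarse + fine) + b)     # gate computed from both streams
    return g * coarse + (1.0 - g) * fine     # convex combination per element

c = np.full((2, 2), 1.0)                     # toy "coarse" features
f = np.full((2, 2), 3.0)                     # toy "fine" features
out = gated_fusion(c, f, w=0.0, b=0.0)       # gate = 0.5 everywhere → elementwise mean
```

With non-zero `w` and `b` the gate varies per element, letting the network favor one scale where it is more informative.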
Grants
- No.52165063 The National Natural Science Foundation of China
- No. [2022]K024 The Science Foundation of Guizhou Province, China
- YKJP202306 the Graduate Student Science and Technology Competition Cultivation Project of Guizhou University, Guizhou Province, China
- SYS-KF2024-079 Funding for Open Laboratory Projects at Guizhou University
Affiliation(s)
- Chunhui Jiang
- Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, 550025, China
- Yi Wang
- Guiyang First People's Hospital, Guiyang, 550002, China.
- Qingni Yuan
- Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, 550025, China.
- Pengju Qu
- Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, 550025, China
- Heng Li
- Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, 550025, China
42
Papageorgiou VE, Petmezas G, Dogoulis P, Cordy M, Maglaveras N. Uncertainty CNNs: A path to enhanced medical image classification performance. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2025; 22:528-553. [PMID: 40083281 DOI: 10.3934/mbe.2025020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/16/2025]
Abstract
The automated detection of tumors using medical imaging data has garnered significant attention over the past decade due to the critical need for early and accurate diagnoses. This interest is fueled by advancements in computationally efficient modeling techniques and enhanced data storage capabilities. However, methodologies that account for the uncertainty of predictions remain relatively uncommon in medical imaging. Uncertainty quantification (UQ) is important as it helps decision-makers gauge their confidence in predictions and consider variability in the model inputs. Numerous deterministic deep learning (DL) methods have been developed to serve as reliable medical imaging tools, with convolutional neural networks (CNNs) being the most widely used approach. In this paper, we introduce a low-complexity uncertainty-based CNN architecture for medical image classification, particularly focused on tumor and heart failure (HF) detection. The model's predictive (aleatoric) uncertainty is quantified through a test-set augmentation technique, which generates multiple surrogates of each test image. This process enables the construction of empirical distributions for each image, which allows for the calculation of mean estimates and credible intervals. Importantly, this methodology not only provides UQ, but also significantly improves the model's classification performance. This paper represents the first effort to demonstrate that test-set augmentation can significantly improve the classification performance of medical images. The proposed DL model was evaluated using three datasets: (a) brain magnetic resonance imaging (MRI), (b) lung computed tomography (CT) scans, and (c) cardiac MRI. The low-complexity design of the model enhances its robustness against overfitting, while it is also easily re-trainable in case out-of-distribution data is encountered, due to the reduced computational resources required by the introduced architecture.
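The test-set augmentation scheme described above can be sketched generically: each test image is perturbed many times, a fixed classifier is run on every surrogate, and the empirical distribution of predictions yields a mean estimate and a credible interval. The `model_predict` function below is a stand-in placeholder, not the paper's CNN, and Gaussian noise is an assumed perturbation:

```python
import numpy as np

rng = np.random.default_rng(0)

def model_predict(image: np.ndarray) -> float:
    """Stand-in classifier: maps mean intensity to a 'tumor' probability."""
    return float(1.0 / (1.0 + np.exp(-(image.mean() - 0.5) * 10)))

def tta_predict(image: np.ndarray, n_aug: int = 100, noise_std: float = 0.05):
    """Predict on n_aug noisy surrogates; return mean and 95% credible interval."""
    probs = np.array([model_predict(image + rng.normal(0.0, noise_std, image.shape))
                      for _ in range(n_aug)])
    lo, hi = np.percentile(probs, [2.5, 97.5])
    return float(probs.mean()), (float(lo), float(hi))

image = rng.random((32, 32))                 # toy "scan"
mean_p, (lo, hi) = tta_predict(image)
```

The width of the interval (`hi - lo`) serves as a simple per-image uncertainty score; the abstract's claim is that averaging over surrogates also improves the point prediction itself.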
Affiliation(s)
- Georgios Petmezas
- School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Maxime Cordy
- SerVal, University of Luxembourg, Luxembourg City, Luxembourg
- Nicos Maglaveras
- School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
43
Huang S, Ge Y, Liu D, Hong M, Zhao J, Loui AC. Rethinking Copy-Paste for Consistency Learning in Medical Image Segmentation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2025; 34:1060-1074. [PMID: 40031728 DOI: 10.1109/tip.2025.3536208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/05/2025]
Abstract
Semi-supervised learning based on consistency learning offers significant promise for enhancing medical image segmentation. Current approaches use copy-paste as an effective data perturbation technique to facilitate weak-to-strong consistency learning. However, these techniques often lead to a decrease in the accuracy of synthetic labels corresponding to the synthetic data and introduce excessive perturbations to the distribution of the training data. Such over-perturbation causes the data distribution to stray from its true distribution, thereby impairing the model's generalization capabilities as it learns the decision boundaries. We propose a weak-to-strong consistency learning framework that integrally addresses these issues with two primary designs: 1) it emphasizes the use of highly reliable data to enhance the quality of labels in synthetic datasets through cross-copy-pasting between labeled and unlabeled datasets; 2) it employs uncertainty estimation and foreground region constraints to meticulously filter the regions for copy-pasting, thus the copy-paste technique implemented introduces a beneficial perturbation to the training data distribution. Our framework expands the copy-paste method by addressing its inherent limitations, and amplifying the potential of data perturbations for consistency learning. We extensively validated our model using six publicly available medical image segmentation datasets across different diagnostic tasks, including the segmentation of cardiac structures, prostate structures, brain structures, skin lesions, and gastrointestinal polyps. The results demonstrate that our method significantly outperforms state-of-the-art models. For instance, on the PROMISE12 dataset for the prostate structure segmentation task, using only 10% labeled data, our method achieves a 15.31% higher Dice score compared to the baseline models. Our experimental code will be made publicly available at https://github.com/slhuang24/RCP4CL.
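As background for the perturbation technique discussed above, here is a minimal sketch of copy-paste augmentation between two image/mask pairs. It is a generic illustration only; the paper's uncertainty estimation and foreground-region filtering are omitted:

```python
import numpy as np

def copy_paste(src_img, src_mask, dst_img, dst_mask):
    """Paste the foreground region of (src_img, src_mask) onto the destination pair."""
    fg = src_mask.astype(bool)                 # where the source has foreground
    out_img, out_mask = dst_img.copy(), dst_mask.copy()
    out_img[fg] = src_img[fg]                  # copy pixels
    out_mask[fg] = 1                           # copy labels
    return out_img, out_mask

# Toy example: a 2x2 foreground patch pasted onto an empty destination
src_img = np.full((4, 4), 9.0)
src_mask = np.zeros((4, 4), dtype=int); src_mask[1:3, 1:3] = 1
dst_img = np.zeros((4, 4))
dst_mask = np.zeros((4, 4), dtype=int)
img, mask = copy_paste(src_img, src_mask, dst_img, dst_mask)
```

The framework above refines exactly this step: it chooses *which* regions to paste (reliable, foreground-constrained ones) so the synthetic pair perturbs the training distribution without corrupting its labels.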
44
Tölle M, Garthe P, Scherer C, Seliger JM, Leha A, Krüger N, Simm S, Martin S, Eble S, Kelm H, Bednorz M, André F, Bannas P, Diller G, Frey N, Groß S, Hennemuth A, Kaderali L, Meyer A, Nagel E, Orwat S, Seiffert M, Friede T, Seidler T, Engelhardt S. Real world federated learning with a knowledge distilled transformer for cardiac CT imaging. NPJ Digit Med 2025; 8:88. [PMID: 39915633 PMCID: PMC11802793 DOI: 10.1038/s41746-025-01434-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2024] [Accepted: 01/02/2025] [Indexed: 02/09/2025] Open
Abstract
Federated learning is a renowned technique for utilizing decentralized data while preserving privacy. However, real-world applications often face challenges like partially labeled datasets, where only a few locations have certain expert annotations, leaving large portions of unlabeled data unused. Leveraging these could enhance transformer architectures' ability in regimes with small and diversely annotated sets. We conduct the largest federated cardiac CT analysis to date (n = 8,104) in a real-world setting across eight hospitals. Our two-step semi-supervised strategy distills knowledge from task-specific CNNs into a transformer. First, CNNs predict on unlabeled data per label type and then the transformer learns from these predictions with label-specific heads. This improves predictive accuracy and enables simultaneous learning of all partial labels across the federation, and outperforms UNet-based models in generalizability on downstream tasks. Code and model weights are made openly available for leveraging future cardiac CT analysis.
Affiliation(s)
- Malte Tölle
- DZHK (German Centre for Cardiovascular Research), partner site Heidelberg/Mannheim, Heidelberg, Germany.
- Department of Cardiology, Angiology and Pneumology, Heidelberg University Hospital, Heidelberg, Germany.
- Heidelberg University, Heidelberg, Germany.
- Informatics for Life Institute, Heidelberg, Germany.
- Philipp Garthe
- Clinic for Cardiology III, University Hospital Münster, Münster, Germany
- Clemens Scherer
- DZHK (German Centre for Cardiovascular Research), partner site Munich, Munich, Germany
- Department of Medicine I, LMU University Hospital, LMU Munich, Munich, Germany
- Jan Moritz Seliger
- DZHK (German Centre for Cardiovascular Research), partner site Hamburg/Kiel/Lübeck, Hamburg, Germany
- Department of Diagnostic and Interventional Radiology and Nuclear Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Andreas Leha
- DZHK (German Centre for Cardiovascular Research), partner site Lower Saxony, Göttingen, Germany
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
- Nina Krüger
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
- Deutsches Herzzentrum der Charité (DHZC), Institute of Computer-assisted Cardiovascular Medicine, Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
- Stefan Simm
- DZHK (German Centre for Cardiovascular Research), partner site Greifswald, Greifswald, Germany
- Institute of Bioinformatics, University Medicine Greifswald, Greifswald, Germany
- Simon Martin
- DZHK (German Centre for Cardiovascular Research), partner site RhineMain, Frankfurt, Germany
- Institute for Experimental and Translational Cardiovascular Imaging, Goethe University, Frankfurt am Main, Germany
- Sebastian Eble
- Department of Cardiology, Angiology and Pneumology, Heidelberg University Hospital, Heidelberg, Germany
- Halvar Kelm
- Department of Cardiology, Angiology and Pneumology, Heidelberg University Hospital, Heidelberg, Germany
- Moritz Bednorz
- Department of Cardiology, Angiology and Pneumology, Heidelberg University Hospital, Heidelberg, Germany
- Florian André
- DZHK (German Centre for Cardiovascular Research), partner site Heidelberg/Mannheim, Heidelberg, Germany
- Department of Cardiology, Angiology and Pneumology, Heidelberg University Hospital, Heidelberg, Germany
- Heidelberg University, Heidelberg, Germany
- Informatics for Life Institute, Heidelberg, Germany
- Peter Bannas
- DZHK (German Centre for Cardiovascular Research), partner site Hamburg/Kiel/Lübeck, Hamburg, Germany
- Department of Diagnostic and Interventional Radiology and Nuclear Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Gerhard Diller
- Clinic for Cardiology III, University Hospital Münster, Münster, Germany
- Norbert Frey
- DZHK (German Centre for Cardiovascular Research), partner site Heidelberg/Mannheim, Heidelberg, Germany
- Department of Cardiology, Angiology and Pneumology, Heidelberg University Hospital, Heidelberg, Germany
- Heidelberg University, Heidelberg, Germany
- Informatics for Life Institute, Heidelberg, Germany
- Stefan Groß
- DZHK (German Centre for Cardiovascular Research), partner site Greifswald, Greifswald, Germany
- Institute of Bioinformatics, University Medicine Greifswald, Greifswald, Germany
- Anja Hennemuth
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
- Deutsches Herzzentrum der Charité (DHZC), Institute of Computer-assisted Cardiovascular Medicine, Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
- Lars Kaderali
- DZHK (German Centre for Cardiovascular Research), partner site Greifswald, Greifswald, Germany
- Institute of Bioinformatics, University Medicine Greifswald, Greifswald, Germany
- Alexander Meyer
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
- Deutsches Herzzentrum der Charité (DHZC), Institute of Computer-assisted Cardiovascular Medicine, Berlin, Germany
- Eike Nagel
- DZHK (German Centre for Cardiovascular Research), partner site RhineMain, Frankfurt, Germany
- Institute for Experimental and Translational Cardiovascular Imaging, Goethe University, Frankfurt am Main, Germany
- Stefan Orwat
- Clinic for Cardiology III, University Hospital Münster, Münster, Germany
- Moritz Seiffert
- DZHK (German Centre for Cardiovascular Research), partner site Hamburg/Kiel/Lübeck, Hamburg, Germany
- Department of Cardiology, University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Tim Friede
- DZHK (German Centre for Cardiovascular Research), partner site Lower Saxony, Göttingen, Germany
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
- Tim Seidler
- DZHK (German Centre for Cardiovascular Research), partner site Lower Saxony, Göttingen, Germany
- Department of Cardiology, University Medicine Göttingen, Göttingen, Germany
- Department of Cardiology, Campus Kerckhoff of the Justus-Liebig-University at Gießen, Kerckhoff-Clinic, Gießen, Germany
- Sandy Engelhardt
- DZHK (German Centre for Cardiovascular Research), partner site Heidelberg/Mannheim, Heidelberg, Germany
- Department of Cardiology, Angiology and Pneumology, Heidelberg University Hospital, Heidelberg, Germany
- Heidelberg University, Heidelberg, Germany
- Informatics for Life Institute, Heidelberg, Germany
45
Li H, Yuan Q, Wang Y, Qu P, Jiang C, Kuang H. An algorithm for cardiac disease detection based on the magnetic resonance imaging. Sci Rep 2025; 15:4053. [PMID: 39901039 PMCID: PMC11790828 DOI: 10.1038/s41598-025-88567-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2024] [Accepted: 01/29/2025] [Indexed: 02/05/2025] Open
Abstract
In experiments to detect heart disease in cardiac magnetic resonance imaging (MRI) medical images, existing object detection models face several challenges, including low accuracy and unreliable detection results. To tackle these issues, this article proposes an innovative object detection method for cardiac MRI medical images called SA-YOLO. This method is based on the YOLOv8 model but introduces several key modifications. First, the standard Spatial Pyramid Pooling Fast module is replaced with a Multi-Channel Spatial Pyramid Pooling module. Second, an attention mechanism combining the ideas of Squeeze-Excitation and Coordinate Attention was designed and integrated into the Neck part of the baseline model. Subsequently, the model's CIoU bounding-box regression loss was replaced with the iSD-IoU loss, which combines shape loss and distance loss. Finally, comparative experiments on the Automated Cardiac Diagnosis Challenge cardiac MRI image dataset show that SA-YOLO achieves better results in detecting cardiac pathologies, with improvements of 7.4% in mAP0.5 and 5.1% in mAP0.5-0.95 over the baseline model.
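The CIoU and iSD-IoU losses mentioned above both build on plain bounding-box IoU. A minimal sketch of IoU for axis-aligned boxes (generic background only, not the iSD-IoU formulation from the paper):

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    # Intersection rectangle (empty if boxes do not overlap)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))  # intersection 1, union 7
```

IoU-family losses (1 - IoU plus penalty terms for center distance, aspect ratio, or shape) refine this quantity so that gradients remain informative even for non-overlapping boxes.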
Affiliation(s)
- Heng Li
- Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, 550025, China
- Qingni Yuan
- Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, 550025, China.
- Yi Wang
- The First People's Hospital of Guiyang, Guiyang, 550002, China
- Pengju Qu
- Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, 550025, China
- Chunhui Jiang
- Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, 550025, China
- Hu Kuang
- Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang, 550025, China
46
Peng B, Fan C. IEA-Net: Internal and External Dual-Attention Medical Segmentation Network with High-Performance Convolutional Blocks. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2025; 38:602-614. [PMID: 39105850 PMCID: PMC11811337 DOI: 10.1007/s10278-024-01217-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/23/2024] [Revised: 07/15/2024] [Accepted: 07/29/2024] [Indexed: 08/07/2024]
Abstract
Currently, deep learning is developing rapidly in the field of image segmentation, and medical image segmentation is one of the key applications in this field. Conventional CNNs have achieved great success in general medical image segmentation tasks, but they suffer feature loss during feature extraction and lack the ability to explicitly model long-range dependencies, which makes them difficult to adapt to the task of human organ segmentation. Although methods containing attention mechanisms have made good progress in the field of semantic segmentation, most current attention mechanisms are limited to a single sample; since the number of human organ image samples is large, ignoring the correlation between samples is not conducive to image segmentation. To solve these problems, an internal and external dual-attention segmentation network (IEA-Net) is proposed in this paper, in which the ICSwR (interleaved convolutional system with residual) module and the IEAM module are designed. The ICSwR contains interleaved convolutions and skip connections, which are used for the initial extraction of features in the encoder part. The IEAM module (internal and external dual-attention module) consists of the LGGW-SA (local-global Gaussian-weighted self-attention) module and the EA module in a tandem structure. The LGGW-SA module focuses on learning local-global feature correlations within individual samples for efficient feature extraction, while the EA module is designed to capture inter-sample connections, addressing multi-sample complexities. Additionally, skip connections are incorporated into each IEAM module in both the encoder and decoder to reduce feature loss. We tested our method on the Synapse multi-organ segmentation dataset and the ACDC cardiac segmentation dataset, and the experimental results show that the proposed method achieves better performance than other state-of-the-art methods.
Affiliation(s)
- Bincheng Peng
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, China.
- Chao Fan
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, China
47
Kabir MM, Rahman A, Hasan MN, Mridha MF. Computer vision algorithms in healthcare: Recent advancements and future challenges. Comput Biol Med 2025; 185:109531. [PMID: 39675214 DOI: 10.1016/j.compbiomed.2024.109531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2023] [Revised: 10/05/2024] [Accepted: 12/03/2024] [Indexed: 12/17/2024]
Abstract
Computer vision has emerged as a promising technology with numerous applications in healthcare. This systematic review provides an overview of advancements and challenges associated with computer vision in healthcare. The review highlights the application areas where computer vision has made significant strides, including medical imaging, surgical assistance, remote patient monitoring, and telehealth. Additionally, it addresses the challenges related to data quality, privacy, model interpretability, and integration with existing healthcare systems. Ethical and legal considerations, such as patient consent and algorithmic bias, are also discussed. The review concludes by identifying future directions and opportunities for research, emphasizing the potential impact of computer vision on healthcare delivery and outcomes. Overall, this systematic review underscores the importance of understanding both the advancements and challenges in computer vision to facilitate its responsible implementation in healthcare.
Affiliation(s)
- Md Mohsin Kabir
- School of Innovation, Design and Engineering, Mälardalens University, Västerås, 722 20, Sweden.
- Ashifur Rahman
- Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Mirpur-2, Dhaka, 1216, Bangladesh.
- Md Nahid Hasan
- Department of Computer Science, University of Wisconsin-Milwaukee, Milwaukee, WI 53211, United States.
- M F Mridha
- Department of Computer Science, American International University-Bangladesh, Dhaka, 1229, Dhaka, Bangladesh.
48
Pham TV, Vu TN, Le HMQ, Pham VT, Tran TT. CapNet: An Automatic Attention-Based with Mixer Model for Cardiovascular Magnetic Resonance Image Segmentation. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2025; 38:94-123. [PMID: 38980628 PMCID: PMC11811363 DOI: 10.1007/s10278-024-01191-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/24/2023] [Revised: 05/21/2024] [Accepted: 05/22/2024] [Indexed: 07/10/2024]
Abstract
Deep neural networks have shown excellent performance in medical image segmentation, especially for cardiac images. Transformer-based models, though having advantages over convolutional neural networks due to their ability to learn long-range dependencies, still have shortcomings such as a large number of parameters and high computational cost. Additionally, for better results, they are often pretrained on larger datasets, thus requiring large memory and increasing resource expenses. In this study, we propose a new lightweight but efficient model, namely CapNet, based on convolutions and mixing modules, for cardiac segmentation from magnetic resonance images (MRI) that can be trained from scratch with a small number of parameters. To handle the varying sizes and shapes that often occur across cardiac systolic and diastolic phases, we propose attention modules for pooling, spatial, and channel information. We also propose a novel loss, the Tversky Shape Power Distance function, based on the shape dissimilarity between labels and predictions, which shows promising performance compared to other losses. Experiments on three public datasets, the ACDC benchmark, the Sunnybrook data, and the MS-CMR challenge, are conducted and compared with other state-of-the-art (SOTA) methods. For binary segmentation, the proposed CapNet obtained Dice similarity coefficients (DSC) of 94% and 95.93% for the Endocardium and Epicardium regions, respectively, with the Sunnybrook dataset, and 94.49% for the Endocardium and 96.82% for the Epicardium with the ACDC data. Regarding the multiclass case, the average DSC by CapNet is 93.05% for the ACDC data, and the DSC scores for the MS-CMR are 94.59%, 92.22%, and 93.99% for the bSSFP, T2-SPAIR, and LGE sequences, respectively.
Moreover, statistical significance tests (p-value < 0.05) against transformer-based methods and several CNN-based approaches demonstrated that CapNet, though having fewer training parameters, achieves statistically significant improvements. The promising evaluation metrics show comparative results in both Dice and IoU indices against SOTA CNN-based and Transformer-based architectures.
Affiliation(s)
- Tien Viet Pham
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
- Tu Ngoc Vu
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
- Hoang-Minh-Quang Le
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
- Van-Truong Pham
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
- Thi-Thao Tran
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam.
49
Long J, Liu Y, Ren Y. Semi-supervised medical image segmentation with dual-branch mixup-decoupling confidence training. Med Eng Phys 2025; 136:104285. [PMID: 39979008 DOI: 10.1016/j.medengphy.2025.104285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2024] [Revised: 10/30/2024] [Accepted: 01/05/2025] [Indexed: 02/22/2025]
Abstract
Semi-supervised medical image segmentation algorithms hold significant research and practical value due to their ability to reduce labeling dependency and annotation costs. However, most current algorithms lack diverse regularization methods to effectively exploit robust knowledge from unlabeled data. The pseudo-label filtering methods employed are often overly simplistic, which exacerbates the serious category imbalance problem in medical images. Additionally, these algorithms fail to provide robust semantic representations for contrastive learning in multi-scenario settings, making it challenging for the model to learn more discriminative semantic information. To address these issues, we propose a semi-supervised medical image segmentation algorithm that utilizes dual-branch mixup-decoupling confidence training to establish a dual-stream semantic link between labeled and unlabeled images, thereby alleviating semantic ambiguity. Furthermore, we design a bidirectional confidence contrastive learning method to maximize the consistency between similar pixels and the distinction between dissimilar pixels in both directions across different feature embeddings in dual views. This enables the model to learn the key features of intra-class similarity and inter-class separability. We conduct a series of experiments on both 2D and 3D datasets, and the experimental results demonstrate that the proposed algorithm achieves notable segmentation performance, outperforming other recent state-of-the-art algorithms.
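The mixup-based coupling of labeled and unlabeled images described above follows the standard mixup recipe; a minimal sketch is given below (our own naming, not the authors' code, and the decoupling and confidence machinery is omitted):

```python
import random

def mixup_pair(x_labeled, x_unlabeled, alpha=0.75, rng=random):
    """Convexly combine a labeled and an unlabeled image (as flat lists).

    lam is drawn from Beta(alpha, alpha), as in standard mixup; taking
    max(lam, 1 - lam) keeps the labeled image dominant in the mixture.
    """
    lam = rng.betavariate(alpha, alpha)
    lam = max(lam, 1 - lam)
    mixed = [lam * a + (1 - lam) * b for a, b in zip(x_labeled, x_unlabeled)]
    return mixed, lam
```

The same lam would then mix the corresponding targets (ground-truth label and pseudo-label), giving the dual-stream semantic link between labeled and unlabeled data that the abstract refers to.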
Affiliation(s)
- Jianwu Long, College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, 400054, China
- Yuanqin Liu, College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, 400054, China
- Yan Ren, College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, 400054, China
50
You X, He J, Yang J, Gu Y. Learning With Explicit Shape Priors for Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2025; 44:927-940. [PMID: 39331543 DOI: 10.1109/tmi.2024.3469214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/29/2024]
Abstract
Medical image segmentation is a fundamental task for medical image analysis and surgical planning. In recent years, UNet-based networks have prevailed in the field of medical image segmentation. However, convolutional neural networks (CNNs) suffer from limited receptive fields, which fail to model the long-range dependency of organs or tumors. Besides, these models are heavily dependent on the training of the final segmentation head, and existing methods cannot address both limitations simultaneously. Hence, in our work, we propose a novel shape prior module (SPM), which explicitly introduces shape priors to promote the segmentation performance of UNet-based models. The explicit shape priors consist of global and local shape priors. The former, with coarse shape representations, equips networks to model global contexts. The latter, with finer shape information, serves as additional guidance to relieve the heavy dependence on the learnable prototype in the segmentation head. To evaluate the effectiveness of SPM, we conduct experiments on three challenging public datasets, and our proposed model achieves state-of-the-art performance. Furthermore, SPM can serve as a plug-and-play module for classic CNN and Transformer-based backbones, facilitating the segmentation task on different datasets. Source codes are available at https://github.com/AlexYouXin/Explicit-Shape-Priors.