51
You X, He J, Yang J, Gu Y. Learning With Explicit Shape Priors for Medical Image Segmentation. IEEE Trans Med Imaging 2025; 44:927-940. [PMID: 39331543] [DOI: 10.1109/tmi.2024.3469214]
Abstract
Medical image segmentation is a fundamental task in medical image analysis and surgical planning. In recent years, UNet-based networks have prevailed in the field. However, convolutional neural networks (CNNs) suffer from limited receptive fields and fail to model the long-range dependencies of organs or tumors. In addition, these models depend heavily on the training of the final segmentation head, and existing methods cannot address both limitations simultaneously. Hence, we propose a novel shape prior module (SPM) that explicitly introduces shape priors to improve the segmentation performance of UNet-based models. The explicit shape priors consist of global and local shape priors: the former, with coarse shape representations, equips networks to model global contexts, while the latter, with finer shape information, serves as additional guidance that relieves the heavy dependence on the learnable prototype in the segmentation head. To evaluate the effectiveness of SPM, we conduct experiments on three challenging public datasets, where our proposed model achieves state-of-the-art performance. Furthermore, SPM can serve as a plug-and-play module for classic CNN and Transformer-based backbones, facilitating segmentation across different datasets. Source code is available at https://github.com/AlexYouXin/Explicit-Shape-Priors.
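The core idea of an explicit prior, a coarse shape map that modulates backbone features, can be illustrated independently of the paper's SPM. A minimal sketch (the function name and the sigmoid-gating scheme are hypothetical illustrations, not the authors' design):

```python
import numpy as np

def inject_shape_prior(features, prior_logits):
    """Gate decoder feature maps with a coarse shape prior (hypothetical sketch).

    features:     (C, H, W) decoder feature maps
    prior_logits: (H, W) coarse shape-prior logits (e.g. a low-resolution
                  mask estimate, upsampled); sigmoid turns them into a
                  soft spatial attention gate.
    """
    gate = 1.0 / (1.0 + np.exp(-prior_logits))   # soft gate in [0, 1]
    return features * gate[None, :, :]           # broadcast over channels

feats = np.ones((4, 8, 8))
prior = np.full((8, 8), -10.0)   # prior says "background" everywhere...
prior[2:6, 2:6] = 10.0           # ...except a square "organ" region
refined = inject_shape_prior(feats, prior)
```

In a real network the gate would be learned end-to-end; here it simply suppresses features outside the prior's support.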
52
Sun F, Zhou Y, Hu L, Li Y, Zhao D, Chen Y, He Y. EDSRNet: An Enhanced Decoder Semantic Recovery Network for 2D Medical Image Segmentation. IEEE J Biomed Health Inform 2025; 29:1113-1124. [PMID: 40030272] [DOI: 10.1109/jbhi.2024.3504829]
Abstract
In recent years, with the advancement of medical imaging technology, medical image segmentation has played a key role in assisting diagnosis and treatment planning. Current deep learning-based medical image segmentation methods mainly adopt an encoder-decoder architecture and have received wide attention. However, these methods still have some limitations: (1) they are often hindered by a significant semantic gap when supplementing features for the decoder, and (2) they do not consider global and local information interaction simultaneously during decoding, resulting in ineffective semantic recovery. Therefore, this paper proposes a novel Enhanced Decoder Semantic Recovery Network to address these challenges. Firstly, the Multi-Level Semantic Fusion (MLSF) module is introduced, which effectively fuses low-level features of the original image, encoder features, high-level features of the deepest network layer, and decoder features, and assigns weights based on semantic gaps. Secondly, the Multiscale Spatial Attention (MSSA) and Cross Convolution Channel Attention (CCCA) modules are employed to obtain richer feature information. Finally, the Global-Local Semantic Recovery (GLSR) module is designed to achieve better semantic recovery. Experiments on the public BUSI, CVC-ClinicDB, and Kvasir-SEG datasets show that the proposed model improves IoU over the second-best algorithms by 0.81%, 0.85%, and 1.98%, respectively, significantly enhancing 2D medical image segmentation. This method provides effective technical support for further development in the field of medical imaging.
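The gains above are reported in IoU; for reference, the metric itself is straightforward to compute for binary masks (a generic sketch, not the paper's evaluation code):

```python
import numpy as np

def iou(pred, target, eps=1e-7):
    """Intersection-over-Union for binary masks (boolean or {0,1} arrays)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)   # eps guards the empty-mask case

a = np.zeros((4, 4), dtype=int); a[:2, :] = 1   # top half predicted
b = np.zeros((4, 4), dtype=int); b[:, :2] = 1   # left half is ground truth
score = iou(a, b)   # intersection = 4 pixels, union = 12 pixels
```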
53
Chen J, Liu Y, Wei S, Bian Z, Subramanian S, Carass A, Prince JL, Du Y. A survey on deep learning in medical image registration: New technologies, uncertainty, evaluation metrics, and beyond. Med Image Anal 2025; 100:103385. [PMID: 39612808] [PMCID: PMC11730935] [DOI: 10.1016/j.media.2024.103385]
Abstract
Deep learning technologies have dramatically reshaped the field of medical image registration over the past decade. The initial developments, such as regression-based and U-Net-based networks, established the foundation for deep learning in image registration. Subsequent progress has been made in various aspects of deep learning-based registration, including similarity measures, deformation regularizations, network architectures, and uncertainty estimation. These advancements have not only enriched the field of image registration but have also facilitated its application in a wide range of tasks, including atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D registration. In this paper, we present a comprehensive overview of the most recent advancements in deep learning-based image registration. We begin with a concise introduction to the core concepts of deep learning-based image registration. Then, we delve into innovative network architectures, loss functions specific to registration, and methods for estimating registration uncertainty. Additionally, this paper explores appropriate evaluation metrics for assessing the performance of deep learning models in registration tasks. Finally, we highlight the practical applications of these novel techniques in medical imaging and discuss the future prospects of deep learning-based image registration.
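Among the intensity similarity measures such surveys cover, normalized cross-correlation is one of the most common; a minimal whole-image sketch (generic NCC, not any specific registration package's windowed variant):

```python
import numpy as np

def ncc(fixed, moving, eps=1e-8):
    """Normalized cross-correlation between two images of equal shape.

    Returns a value in [-1, 1]: 1 for identical images (up to affine
    intensity scaling), -1 for inverted contrast.
    """
    f = fixed - fixed.mean()
    m = moving - moving.mean()
    return float((f * m).sum() / (np.sqrt((f ** 2).sum() * (m ** 2).sum()) + eps))

img = np.random.default_rng(0).normal(size=(16, 16))
same = ncc(img, img)    # identical images
flip = ncc(img, -img)   # contrast-inverted copy
```

Registration losses typically use a local (windowed) NCC so the measure is robust to spatially varying intensity; the global version above is the simplest instance.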
Affiliation(s)
- Junyu Chen
- Department of Radiology and Radiological Science, Johns Hopkins School of Medicine, MD, USA
- Yihao Liu
- Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Shuwen Wei
- Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Zhangxing Bian
- Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Shalini Subramanian
- Department of Radiology and Radiological Science, Johns Hopkins School of Medicine, MD, USA
- Aaron Carass
- Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
- Yong Du
- Department of Radiology and Radiological Science, Johns Hopkins School of Medicine, MD, USA
54
Tian Y, Liang Y, Chen Y, Zhang J, Bian H. Multilevel support-assisted prototype optimization network for few-shot medical segmentation of lung lesions. Sci Rep 2025; 15:3290. [PMID: 39865124] [PMCID: PMC11770124] [DOI: 10.1038/s41598-025-87829-4]
Abstract
Medical image annotation is scarce and costly. Few-shot segmentation, which learns from only a few annotated examples, has therefore been widely applied to medical images. However, research on lesion segmentation for lung diseases remains limited, especially for pulmonary aspergillosis. Lesion areas usually have complex shapes and blurred edges, so lesion segmentation must cope with the diversity and uncertainty of lesions. To address this challenge, we propose MSPO-Net, a multilevel support-assisted prototype optimization network designed for few-shot lesion segmentation in computed tomography (CT) images of lung diseases. MSPO-Net learns lesion prototypes from low-level to high-level features. A self-attention threshold learning strategy focuses on global information and obtains an optimal threshold for CT images. Our model refines prototypes through a support-assisted prototype optimization module, adaptively enhancing their representativeness for diverse lesions and adapting better to unseen ones. Because CT is more practical than X-rays in clinical examinations, we established a small-scale CT image dataset covering three lung diseases, annotated by experienced doctors. Experiments demonstrate that MSPO-Net improves segmentation performance and robustness for lung disease lesions, achieving state-of-the-art performance in both single-disease and unseen-disease settings and indicating its potential to reduce doctors' workload and improve diagnostic accuracy. Code is available at https://github.com/Tian-Yuan-ty/MSPO-Net.
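Prototype-based few-shot segmentation of the kind MSPO-Net builds on typically pools support features under the lesion mask and matches query pixels by cosine similarity. A toy sketch of that baseline idea (not MSPO-Net itself; function names and the fixed threshold are illustrative assumptions):

```python
import numpy as np

def masked_average_prototype(feat, mask):
    """Prototype = mean of support features inside the lesion mask.

    feat: (C, H, W) support feature map; mask: (H, W) binary lesion mask.
    """
    m = mask.astype(float)
    return (feat * m).sum(axis=(1, 2)) / (m.sum() + 1e-8)

def segment_by_prototype(feat, proto, thresh=0.5):
    """Label query pixels whose cosine similarity to the prototype exceeds thresh."""
    norm_f = feat / (np.linalg.norm(feat, axis=0, keepdims=True) + 1e-8)
    norm_p = proto / (np.linalg.norm(proto) + 1e-8)
    sim = np.tensordot(norm_p, norm_f, axes=([0], [0]))   # (H, W) cosine map
    return sim > thresh

rng = np.random.default_rng(1)
support = rng.normal(size=(8, 6, 6))      # toy C=8 feature map
mask = np.zeros((6, 6)); mask[1:4, 1:4] = 1
proto = masked_average_prototype(support, mask)
pred = segment_by_prototype(support, proto)
```

MSPO-Net's contribution is precisely to replace the fixed threshold and the single static prototype with learned, support-refined versions.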
Affiliation(s)
- Yuan Tian
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
- Yongquan Liang
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
- Yufeng Chen
- Shandong Provincial Public Health Clinical Center, Shandong University, Jinan, 250013, Shandong, China
- Jingjing Zhang
- Shandong Provincial Public Health Clinical Center, Shandong University, Jinan, 250013, Shandong, China
- Hongyang Bian
- Shandong Provincial Public Health Clinical Center, Shandong University, Jinan, 250013, Shandong, China
55
Chen J, Huang W, Zhang J, Debattista K, Han J. Addressing inconsistent labeling with cross image matching for scribble-based medical image segmentation. IEEE Trans Image Process 2025; PP:842-853. [PMID: 40031274] [DOI: 10.1109/tip.2025.3530787]
Abstract
In recent years, there has been a notable surge in the adoption of weakly-supervised learning for medical image segmentation, using scribble annotation as a means to reduce annotation costs. However, the inherent characteristics of scribble labeling, marked by incompleteness, subjectivity, and a lack of standardization, introduce inconsistencies into the annotations. These inconsistencies pose significant challenges for the network's learning process and ultimately affect segmentation performance. To address this challenge, we propose creating a reference set to guide pixel-level feature matching, constructed from class-specific tokens and pixel-level features extracted from various images. Serving as a repository of diverse pixel styles and classes, the reference set becomes the cornerstone of a pixel-level feature matching strategy. This strategy enables the effective comparison of unlabeled pixels, offering guidance particularly in learning scenarios characterized by inconsistent and incomplete scribbles. The proposed strategy incorporates smoothing and regression techniques to align pixel-level features across different images. By leveraging the diversity of pixel sources, our matching approach enhances the network's ability to learn consistent patterns from the reference set, which in turn mitigates the impact of inconsistent and incomplete labeling and improves segmentation outcomes. Extensive experiments on three publicly available datasets demonstrate the superiority of our approach over state-of-the-art methods in terms of segmentation accuracy and stability. The code will be made publicly available at https://github.com/jingkunchen/scribble-medical-segmentation.
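The reference-set matching idea, comparing unlabeled pixels against class-specific tokens pooled from other images, can be sketched as nearest-token pseudo-labeling (a simplified illustration, not the authors' full smoothing/regression strategy; all names are hypothetical):

```python
import numpy as np

def match_to_reference(pixel_feats, ref_tokens, ref_labels):
    """Assign each unlabeled pixel the class of its most similar reference token.

    pixel_feats: (N, D) features of unlabeled pixels
    ref_tokens:  (K, D) class-specific tokens pooled from various images
    ref_labels:  (K,)   class id of each token
    """
    p = pixel_feats / (np.linalg.norm(pixel_feats, axis=1, keepdims=True) + 1e-8)
    r = ref_tokens / (np.linalg.norm(ref_tokens, axis=1, keepdims=True) + 1e-8)
    sim = p @ r.T                          # (N, K) cosine similarities
    return ref_labels[np.argmax(sim, axis=1)]

tokens = np.array([[1.0, 0.0], [0.0, 1.0]])   # token 0: background, token 1: organ
labels = np.array([0, 1])
pixels = np.array([[0.9, 0.1], [0.2, 2.0]])    # two unlabeled pixel features
pseudo = match_to_reference(pixels, tokens, labels)
```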
56
Ouyang Y, Li P, Zhang H, Hu X. Semi-Supervised Medical Image Segmentation Based on Frequency Domain Aware Stable Consistency Regularization. J Imaging Inform Med 2025. [PMID: 39843719] [DOI: 10.1007/s10278-025-01397-7]
Abstract
With the advancement of deep learning, such models have been successfully applied to semi-supervised medical image segmentation, where few annotated medical images are available alongside a large number of unlabeled ones. A representative approach is consistency regularization, which improves model training by imposing consistency constraints (perturbations) on unlabeled data. However, the perturbations in these methods are often designed manually, which may introduce biases that hinder model learning in medical image segmentation. Moreover, most such methods overlook supervision in the encoder stage of training and focus primarily on outcomes in the later stages, which can lead to chaotic learning in the initial phase and subsequently impair the model's later learning. They also miss the intrinsic spatial-frequency information of the images. Therefore, in this study, we propose a new semi-supervised medical image segmentation approach based on frequency-domain-aware stable consistency regularization. Specifically, to avoid the bias introduced by manually set perturbations, we first use the inherent frequency-domain information of images, including both high and low frequencies, as the consistency constraint. Secondly, we incorporate supervision in the encoder stage of model training to ensure that the model does not fail to learn when strong augmentation disrupts the original feature space. Finally, extensive experiments validate the effectiveness of our semi-supervised approach.
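The low/high-frequency decomposition used as a consistency signal rests on a standard FFT split; a minimal sketch (a circular low-pass mask is one common choice; the cutoff radius here is an assumption, not the paper's setting):

```python
import numpy as np

def frequency_split(img, radius=0.25):
    """Split a 2D image into low- and high-frequency components via an FFT mask.

    radius is the low-pass cutoff as a fraction of the spectrum half-width.
    Returns (low, high) with low + high == img by construction.
    """
    F = np.fft.fftshift(np.fft.fft2(img))         # centered spectrum
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    low_mask = dist <= radius * min(h, w) / 2     # keep only central frequencies
    low = np.fft.ifft2(np.fft.ifftshift(F * low_mask)).real
    high = img - low                              # complementary residual
    return low, high

img = np.random.default_rng(2).normal(size=(32, 32))
low, high = frequency_split(img)
```

The two components can then be fed to the network as naturally paired "views" for a consistency loss, avoiding hand-designed perturbations.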
Affiliation(s)
- Yihao Ouyang
- Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei University of Technology, Hefei, 230009, Anhui, China
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, 230009, Anhui, China
- Peipei Li
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, 230009, Anhui, China
- Center for Big Data and Population Health of IHM, Anhui Medical University, Hefei, Anhui, China
- Haixiang Zhang
- Center for Big Data and Population Health of IHM, Anhui Medical University, Hefei, Anhui, China
- Computer Centre, The Second People's Hospital of Hefei, Hefei, 230011, Anhui, China
- Xuegang Hu
- Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei University of Technology, Hefei, 230009, Anhui, China
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, 230009, Anhui, China
- Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei University of Technology, Hefei, 230009, Anhui, China
57
Trinh MN, Tran TT, Nham DHN, Lo MT, Pham VT. GLAC-Unet: Global-Local Active Contour Loss with an Efficient U-Shaped Architecture for Multiclass Medical Image Segmentation. J Imaging Inform Med 2025. [PMID: 39821780] [DOI: 10.1007/s10278-025-01387-9]
Abstract
The field of medical image segmentation powered by deep learning has recently received substantial attention, with a significant focus on developing novel architectures and designing effective loss functions. Traditional loss functions, such as Dice loss and Cross-Entropy loss, predominantly rely on global metrics to compare predictions with labels. However, these global measures often struggle with challenges such as occlusion and nonuniform intensity. To overcome these issues, we propose a novel loss function, termed Global-Local Active Contour (GLAC) loss, which integrates both global and local image features, reformulated within the Mumford-Shah framework and extended to multiclass segmentation. This approach enables the neural network to be trained end-to-end while segmenting multiple classes simultaneously. In addition, we enhance the U-Net architecture by incorporating Dense Layers, Convolutional Block Attention Modules, and DropBlock. These improvements enable the model to combine contextual information across layers more effectively, capture richer semantic details, and mitigate overfitting, resulting in more precise segmentation. We validate the proposed method, namely GLAC-Unet, which couples the GLAC loss with our modified U-shaped architecture, on three biomedical segmentation datasets spanning two-dimensional and three-dimensional modalities, including dermoscopy, cardiac magnetic resonance imaging, and brain magnetic resonance imaging. Extensive experiments demonstrate the promising performance of our approach, with Dice scores (DSC) of 0.9125 on the ISIC-2018 dataset, 0.9260 on the Automated Cardiac Diagnosis Challenge (ACDC) 2017, and 0.927 on the Infant Brain MRI Segmentation Challenge 2019. Furthermore, statistical significance tests with p-values consistently below 0.05 on the ISIC-2018 and ACDC datasets confirm the superior performance of the proposed method over other state-of-the-art models. These results highlight the robustness and effectiveness of our multiclass segmentation technique, underscoring its potential for biomedical image analysis. Our code will be made available at https://github.com/minhnhattrinh312/Active-Contour-Loss-based-on-Global-and-Local-Intensity.
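The global term of an active contour loss in the Mumford-Shah/Chan-Vese family penalizes intensity variance inside and outside the (soft) region; a toy sketch of that global region energy (not the GLAC loss itself, which also adds a local term and multiclass handling):

```python
import numpy as np

def global_region_energy(img, soft_mask):
    """Global (Chan-Vese-style) region term of an active contour loss.

    img:       (H, W) intensities; soft_mask: (H, W) probabilities in [0, 1].
    c1, c2 are the mean intensities inside/outside the soft region; the
    energy is low when the mask separates homogeneous regions.
    """
    p = soft_mask
    c1 = (img * p).sum() / (p.sum() + 1e-8)
    c2 = (img * (1 - p)).sum() / ((1 - p).sum() + 1e-8)
    return float((p * (img - c1) ** 2 + (1 - p) * (img - c2) ** 2).mean())

img = np.zeros((8, 8)); img[:, 4:] = 1.0   # two homogeneous halves
good = np.zeros((8, 8)); good[:, 4:] = 1.0  # mask aligned with the true edge
bad = np.full((8, 8), 0.5)                  # uninformative mask
e_good = global_region_energy(img, good)    # near zero
e_bad = global_region_energy(img, bad)      # strictly larger
```

Because c1 and c2 are differentiable functions of the prediction, this term can be minimized by gradient descent alongside a Dice or cross-entropy loss.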
Affiliation(s)
- Minh-Nhat Trinh
- Center of Marine Sciences, University of Algarve, Faro, Portugal
- Thi-Thao Tran
- School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
- Do-Hai-Ninh Nham
- Department of Mathematics, The University of Kaiserslautern-Landau (RPTU), Kaiserslautern, Germany
- Men-Tzung Lo
- Department of Biomedical Sciences and Engineering, National Central University, Taoyuan City, Taiwan
- Van-Truong Pham
- School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
58
Bravo D, Frias J, Vera F, Trejos J, Martínez C, Gómez M, González F, Romero E. GastroHUN an Endoscopy Dataset of Complete Systematic Screening Protocol for the Stomach. Sci Data 2025; 12:102. [PMID: 39824869] [PMCID: PMC11742658] [DOI: 10.1038/s41597-025-04401-5]
Abstract
Endoscopy is vital for detecting and diagnosing gastrointestinal diseases. Systematic examination protocols are key to enhancing detection, particularly for the early identification of premalignant conditions. Publicly available endoscopy image databases are crucial for machine learning research, yet challenges persist, particularly in identifying upper gastrointestinal anatomical landmarks to ensure effective and precise endoscopic procedures. However, many existing datasets have inconsistent labeling and limited accessibility, leading to biased models and reduced generalizability. This paper introduces GastroHUN, an open dataset documenting stomach screening procedures based on a systematic protocol. GastroHUN includes 8,834 images from 387 patients and 4,729 labeled video sequences, all annotated by four experts. The dataset covers 22 anatomical landmarks in the stomach and includes an additional category for unqualified images, making it a valuable resource for AI model development. By providing a robust public dataset and baseline deep learning models for image and sequence classification, GastroHUN serves as a benchmark for future research and aids in the development of more effective algorithms.
Affiliation(s)
- Diego Bravo
- Universidad Nacional de Colombia, Bogotá, 1100111, Colombia
- Computer Imaging and Medical Applications Laboratory (CIM@LAB), Bogotá, 1100111, Colombia
- Juan Frias
- Universidad Nacional de Colombia, Medicina Interna, Bogotá, 1100111, Colombia
- Hospital Universitario Nacional de Colombia, Gastroenterology, Bogotá, 1100111, Colombia
- Felipe Vera
- Universidad Nacional de Colombia, Medicina Interna, Bogotá, 1100111, Colombia
- Hospital Universitario Nacional de Colombia, Gastroenterology, Bogotá, 1100111, Colombia
- Juan Trejos
- Universidad Nacional de Colombia, Medicina Interna, Bogotá, 1100111, Colombia
- Hospital Universitario Nacional de Colombia, Gastroenterology, Bogotá, 1100111, Colombia
- Carlos Martínez
- Universidad Nacional de Colombia, Medicina Interna, Bogotá, 1100111, Colombia
- Hospital Universitario Nacional de Colombia, Gastroenterology, Bogotá, 1100111, Colombia
- Martín Gómez
- Universidad Nacional de Colombia, Medicina Interna, Bogotá, 1100111, Colombia
- Hospital Universitario Nacional de Colombia, Gastroenterology, Bogotá, 1100111, Colombia
- Fabio González
- Universidad Nacional de Colombia, Bogotá, 1100111, Colombia
- Machine Learning, Perception and Discovery Lab (MindLab), Bogotá, 1100111, Colombia
- Eduardo Romero
- Universidad Nacional de Colombia, Bogotá, 1100111, Colombia
- Computer Imaging and Medical Applications Laboratory (CIM@LAB), Bogotá, 1100111, Colombia
59
Leivaditis V, Beltsios E, Papatriantafyllou A, Grapatsas K, Mulita F, Kontodimopoulos N, Baikoussis NG, Tchabashvili L, Tasios K, Maroulis I, Dahm M, Koletsis E. Artificial Intelligence in Cardiac Surgery: Transforming Outcomes and Shaping the Future. Clin Pract 2025; 15:17. [PMID: 39851800] [PMCID: PMC11763739] [DOI: 10.3390/clinpract15010017]
Abstract
Background: Artificial intelligence (AI) has emerged as a transformative technology in healthcare, with its integration into cardiac surgery offering significant advancements in precision, efficiency, and patient outcomes. However, a comprehensive understanding of AI's applications, benefits, challenges, and future directions in cardiac surgery is needed to inform its safe and effective implementation. Methods: A systematic review was conducted following PRISMA guidelines. Literature searches were performed in PubMed, Scopus, Cochrane Library, Google Scholar, and Web of Science, covering publications from January 2000 to November 2024. Studies focusing on AI applications in cardiac surgery, including risk stratification, surgical planning, intraoperative guidance, and postoperative management, were included. Data extraction and quality assessment were conducted using standardized tools, and findings were synthesized narratively. Results: A total of 121 studies were included in this review. AI demonstrated superior predictive capabilities in risk stratification, with machine learning models outperforming traditional scoring systems in mortality and complication prediction. Robotic-assisted systems enhanced surgical precision and minimized trauma, while computer vision and augmented cognition improved intraoperative guidance. Postoperative AI applications showed potential in predicting complications, supporting patient monitoring, and reducing healthcare costs. However, challenges such as data quality, validation, ethical considerations, and integration into clinical workflows remain significant barriers to widespread adoption. Conclusions: AI has the potential to revolutionize cardiac surgery by enhancing decision making, surgical accuracy, and patient outcomes. Addressing limitations related to data quality, bias, validation, and regulatory frameworks is essential for its safe and effective implementation. Future research should focus on interdisciplinary collaboration, robust testing, and the development of ethical and transparent AI systems to ensure equitable and sustainable advancements in cardiac surgery.
Affiliation(s)
- Vasileios Leivaditis
- Department of Cardiothoracic and Vascular Surgery, WestpfalzKlinikum, 67655 Kaiserslautern, Germany
- Eleftherios Beltsios
- Department of Anesthesiology and Intensive Care, Hannover Medical School, 30625 Hannover, Germany
- Athanasios Papatriantafyllou
- Department of Cardiothoracic and Vascular Surgery, WestpfalzKlinikum, 67655 Kaiserslautern, Germany
- Konstantinos Grapatsas
- Department of Thoracic Surgery and Thoracic Endoscopy, Ruhrlandklinik, West German Lung Center, University Hospital Essen, University Duisburg-Essen, 45141 Essen, Germany
- Francesk Mulita
- Department of General Surgery, General University Hospital of Patras, 26504 Patras, Greece
- Nikolaos Kontodimopoulos
- Department of Economics and Sustainable Development, Harokopio University, 17778 Athens, Greece
- Nikolaos G. Baikoussis
- Department of Cardiac Surgery, Ippokrateio General Hospital of Athens, 11527 Athens, Greece
- Levan Tchabashvili
- Department of General Surgery, General University Hospital of Patras, 26504 Patras, Greece
- Konstantinos Tasios
- Department of General Surgery, General University Hospital of Patras, 26504 Patras, Greece
- Ioannis Maroulis
- Department of General Surgery, General University Hospital of Patras, 26504 Patras, Greece
- Manfred Dahm
- Department of Cardiothoracic and Vascular Surgery, WestpfalzKlinikum, 67655 Kaiserslautern, Germany
- Efstratios Koletsis
- Department of Cardiothoracic Surgery, General University Hospital of Patras, 26504 Patras, Greece
60
Paulauskaite-Taraseviciene A, Siaulys J, Jankauskas A, Jakuskaite G. A Robust Blood Vessel Segmentation Technique for Angiographic Images Employing Multi-Scale Filtering Approach. J Clin Med 2025; 14:354. [PMID: 39860360] [PMCID: PMC11765955] [DOI: 10.3390/jcm14020354]
Abstract
Background: This study focuses on the critical task of blood vessel segmentation in medical image analysis, essential for diagnosing cardiovascular diseases and enabling effective treatment planning. Although deep learning architectures often produce very high segmentation results in medical images, coronary computed tomography angiography (CTA) images are more challenging than invasive coronary angiography (ICA) images due to noise and the complexity of vessel structures. Methods: Classical architectures for medical images, such as U-Net, achieve only moderate accuracy, with an average Dice score of 0.722. Results: This study introduces Morpho-U-Net, an enhanced U-Net architecture that integrates advanced morphological operations, including Gaussian blurring, thresholding, and morphological opening/closing, to improve vascular integrity, reduce noise, and achieve a higher Dice score of 0.9108, a precision of 0.9341, and a recall of 0.8872. These enhancements demonstrate superior robustness to noise and intricate vessel geometries. Conclusions: This pre-processing filter effectively reduces noise by grouping neighboring pixels with similar intensity values, allowing the model to focus on relevant anatomical structures, thus outperforming traditional methods in handling the challenges posed by CTA images.
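The morphological opening/closing that Morpho-U-Net applies can be built from dilation and erosion alone; a pure-NumPy sketch with a 3x3 cross structuring element (the element shape and the zero-padded border handling are simplifying assumptions, not the paper's exact pipeline):

```python
import numpy as np

def dilate(mask, it=1):
    """Binary dilation with a 3x3 cross structuring element."""
    m = mask.astype(bool)
    for _ in range(it):
        p = np.pad(m, 1)   # zero-pad so shifts stay in-bounds
        m = (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
             | p[1:-1, :-2] | p[1:-1, 2:])
    return m

def erode(mask, it=1):
    """Erosion expressed as dilation of the complement."""
    return ~dilate(~mask.astype(bool), it)

def opening(mask):
    """Erosion then dilation: removes speckle smaller than the element."""
    return dilate(erode(mask))

def closing(mask):
    """Dilation then erosion: fills small gaps inside vessel masks."""
    return erode(dilate(mask))

noisy = np.zeros((9, 9), dtype=bool)
noisy[2:7, 2:7] = True   # a solid "vessel" blob
noisy[0, 0] = True       # an isolated noise pixel
cleaned = opening(noisy)  # the lone pixel disappears, the blob survives
```

In practice `scipy.ndimage` provides these operations directly; the point here is only that opening/closing suppress pixel-scale noise while preserving connected vessel structure.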
Affiliation(s)
- Agne Paulauskaite-Taraseviciene
- Artificial Intelligence Centre, Faculty of Informatics, Kaunas University of Technology, 51423 Kaunas, Lithuania
- Centre of Excellence for Sustainable Living and Working (SustAInLivWork), 51423 Kaunas, Lithuania
- Julius Siaulys
- Artificial Intelligence Centre, Faculty of Informatics, Kaunas University of Technology, 51423 Kaunas, Lithuania
- Centre of Excellence for Sustainable Living and Working (SustAInLivWork), 51423 Kaunas, Lithuania
- Antanas Jankauskas
- Centre of Excellence for Sustainable Living and Working (SustAInLivWork), 51423 Kaunas, Lithuania
- Faculty of Medicine, Lithuanian University of Health Sciences, 44307 Kaunas, Lithuania
- Gabriele Jakuskaite
- Centre of Excellence for Sustainable Living and Working (SustAInLivWork), 51423 Kaunas, Lithuania
- Faculty of Medicine, Lithuanian University of Health Sciences, 44307 Kaunas, Lithuania
61
Duan T, Chen W, Ruan M, Zhang X, Shen S, Gu W. Unsupervised deep learning-based medical image registration: a survey. Phys Med Biol 2025; 70:02TR01. [PMID: 39667278] [DOI: 10.1088/1361-6560/ad9e69]
Abstract
In recent decades, medical image registration technology has undergone significant development, becoming one of the core technologies in medical image analysis. With the rise of deep learning, deep learning-based medical image registration methods have achieved revolutionary improvements in processing speed and automation, showing great potential, especially in unsupervised learning. This paper briefly introduces the core concepts of deep learning-based unsupervised image registration, followed by an in-depth discussion of innovative network architectures and a detailed review of these studies, highlighting their unique contributions. Additionally, this paper explores commonly used loss functions, datasets, and evaluation metrics. Finally, we discuss the main challenges faced by various categories and propose potential future research topics. This paper surveys the latest advancements in unsupervised deep neural network-based medical image registration methods, aiming to help active readers interested in this field gain a deep understanding of this exciting area.
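At the heart of the unsupervised registration networks this survey covers is warping the moving image by a predicted displacement field and scoring the result against the fixed image. A minimal nearest-neighbor version for 2D (real frameworks use differentiable linear interpolation so gradients flow through the warp):

```python
import numpy as np

def warp_nearest(img, flow):
    """Warp a 2D image with a dense displacement field (nearest neighbor).

    flow[..., 0] / flow[..., 1] are per-pixel row / column displacements,
    using the backward-warping convention: output(x) = input(x + flow(x)).
    Coordinates are clipped at the image border.
    """
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    src_y = np.clip(np.round(yy + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xx + flow[..., 1]).astype(int), 0, w - 1)
    return img[src_y, src_x]

img = np.arange(16, dtype=float).reshape(4, 4)
shift = np.zeros((4, 4, 2))
shift[..., 1] = 1.0   # every pixel samples one column to its right
warped = warp_nearest(img, shift)
```

An unsupervised network would predict `flow` from the image pair and minimize a similarity loss between `warped` and the fixed image, plus a smoothness regularizer on `flow`.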
Affiliation(s)
- Taisen Duan
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, People's Republic of China
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, People's Republic of China
- Wenkang Chen
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, People's Republic of China
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, People's Republic of China
- Meilin Ruan
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, People's Republic of China
- Xuejun Zhang
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, People's Republic of China
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, People's Republic of China
- Shaofei Shen
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, People's Republic of China
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, People's Republic of China
- Weiyu Gu
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, People's Republic of China
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, People's Republic of China
Collapse
|
62
|
Li Y, Jiang S, Yang Z, Wang L, Wang L, Zhou Z. Data-Oriented Octree Inverse Hierarchical Order Aggregation Hybrid Transformer-CNN for 3D Medical Segmentation. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2025:10.1007/s10278-024-01299-0. [PMID: 39777616 DOI: 10.1007/s10278-024-01299-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/28/2024] [Revised: 09/06/2024] [Accepted: 10/07/2024] [Indexed: 01/11/2025]
Abstract
Hybrid CNN-transformer structures harness the global contextualization of transformers together with the local feature acuity of CNNs, propelling medical image segmentation to the next level. However, the majority of research has focused on the design and composition of hybrid structures while neglecting the data structure itself, even though data-oriented design can enhance segmentation performance, optimize resource efficiency, and bolster model generalization and interpretability. In this work, we propose a data-oriented octree inverse hierarchical order aggregation hybrid transformer-CNN (nnU-OctTN), which delves deeply into the data itself to identify and harness its potential. The nnU-OctTN employs U-Net as a foundational framework, with a node aggregation transformer serving as the encoder. Data features are stored in an octree data structure, with each node computed autonomously yet interconnected through a block-to-block local information exchange mechanism. For multi-resolution feature map learning, a cross-fusion module is designed that associates the encoder and decoder in a staggered vertical and horizontal manner. Inspired by nnU-Net, our framework automatically adapts network parameters to the dataset instead of using pre-trained weights for initialization. The nnU-OctTN method was evaluated on the BTCV, ACDC, and BraTS datasets and achieved excellent performance, with Dice similarity coefficients (DSC) of 86.95, 92.82, and 90.61, respectively, demonstrating its generalizability and effectiveness. The effectiveness of the cross-fusion module and the scalability of the model are validated through ablation experiments on BTCV and Kidney. Extensive qualitative and quantitative experimental results demonstrate that nnU-OctTN achieves high-quality 3D medical segmentation competitive with current state-of-the-art methods, providing a promising approach for clinical applications.
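The DSC figures quoted above follow the standard overlap definition; a minimal NumPy sketch of the metric (an illustration, not the authors' evaluation code) is:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary segmentation masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy 3D volumes: identical masks give DSC = 1.0
a = np.zeros((4, 4, 4), dtype=bool)
a[1:3, 1:3, 1:3] = True
print(round(dice_coefficient(a, a), 3))  # 1.0
```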
Collapse
Affiliation(s)
- Yuhua Li
- Mechanical Engineering Department, Tianjin University, No. 135, Yaguan Road, Haihe Education Park, Jinnan District, Tianjin City, 300350, China
- Shan Jiang
- Mechanical Engineering Department, Tianjin University, No. 135, Yaguan Road, Haihe Education Park, Jinnan District, Tianjin City, 300350, China
- Zhiyong Yang
- Mechanical Engineering Department, Tianjin University, No. 135, Yaguan Road, Haihe Education Park, Jinnan District, Tianjin City, 300350, China
- Lixiang Wang
- Mechanical Engineering Department, Tianjin University, No. 135, Yaguan Road, Haihe Education Park, Jinnan District, Tianjin City, 300350, China
- Liwen Wang
- Mechanical Engineering Department, Tianjin University, No. 135, Yaguan Road, Haihe Education Park, Jinnan District, Tianjin City, 300350, China
- Zeyang Zhou
- Mechanical Engineering Department, Tianjin University, No. 135, Yaguan Road, Haihe Education Park, Jinnan District, Tianjin City, 300350, China
Collapse
|
63
|
Zhang L, Li W, Bi K, Li P, Zhang L, Liu H. FDDSeg: Unleashing the Power of Scribble Annotation for Cardiac MRI Images Through Feature Decomposition Distillation. IEEE J Biomed Health Inform 2025; 29:285-296. [PMID: 38787661 DOI: 10.1109/jbhi.2024.3404884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/26/2024]
Abstract
Cardiovascular diseases can be diagnosed with computer assistance using magnetic resonance imaging (MRI). Deep learning-based scribble-supervised MRI image segmentation has recently demonstrated impressive results. However, the majority of current approaches have an excessive number of model parameters and do not fully utilize the scribble annotations. We therefore developed a feature decomposition distillation deep learning method, FDDSeg, for scribble-supervised cardiac MRI image segmentation. The public ACDC and MSCMR cardiac MRI datasets were used to evaluate the segmentation performance of FDDSeg. FDDSeg adopts a scribble annotation reuse policy to help provide accurate boundaries, and the intermediate features are split into class regions and class-free regions using pseudo-labels to further improve feature learning. Effective distillation knowledge is then captured by feature decomposition. FDDSeg was compared with seven state-of-the-art methods (MAAG, ShapePU, CycleMix, Dual-Branch, ZscribbleSeg, Perturbation Dual-Branch, and ScribbleVC) on both the ACDC and MSCMR datasets. FDDSeg performed best on the DSC (89.05% and 88.75%), JC (80.30% and 79.78%), and HD95 (5.76 and 4.44) metrics with only 2.01 M parameters. FDDSeg can thus segment cardiac MRI images more precisely from scribble annotations alone at lower computational cost, which may improve the efficiency of quantitative cardiac analysis.
Collapse
|
64
|
Arega TW, Bricq S, Meriaudeau F. Post-hoc out-of-distribution detection for cardiac MRI segmentation. Comput Med Imaging Graph 2025; 119:102476. [PMID: 39700904 DOI: 10.1016/j.compmedimag.2024.102476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2024] [Revised: 10/29/2024] [Accepted: 12/04/2024] [Indexed: 12/21/2024]
Abstract
In real-world scenarios, medical image segmentation models encounter input images that may deviate from the training images in various ways. These differences can arise from changes in image scanners and acquisition protocols, or even the images can come from a different modality or domain. When the model encounters these out-of-distribution (OOD) images, it can behave unpredictably. Therefore, it is important to develop a system that handles such out-of-distribution images to ensure the safe usage of the models in clinical practice. In this paper, we propose a post-hoc out-of-distribution (OOD) detection method that can be used with any pre-trained segmentation model. Our method utilizes multi-scale representations extracted from the encoder blocks of the segmentation model and employs Mahalanobis distance as a metric to measure the similarity between the input image and the in-distribution images. The segmentation model is pre-trained on a publicly available cardiac short-axis cine MRI dataset. The detection performance of the proposed method is evaluated on 13 different OOD datasets, which can be categorized as near, mild, and far OOD datasets based on their similarity to the in-distribution dataset. The results show that our method outperforms state-of-the-art feature space-based and uncertainty-based OOD detection methods across the various OOD datasets. Our method successfully detects near, mild, and far OOD images with high detection accuracy, showcasing the advantage of using the multi-scale and semantically rich representations of the encoder. In addition to the feature-based approach, we also propose a Dice coefficient-based OOD detection method, which demonstrates superior performance for adversarial OOD detection and shows a high correlation with segmentation quality. 
The uncertainty-based methods, despite correlating strongly with segmentation quality on the near-OOD datasets, failed to detect mild and far OOD images, indicating their weakness when the images are more dissimilar. Future work will explore combining Mahalanobis distance and uncertainty scores for improved detection of challenging OOD images that are difficult to segment.
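The Mahalanobis-distance scoring described above can be illustrated with a simplified NumPy sketch. It assumes a single Gaussian fit to pooled feature vectors; the paper aggregates such scores over multi-scale encoder blocks, and the feature extraction from the pre-trained segmentation encoder is omitted here:

```python
import numpy as np

def fit_gaussian(feats):
    """Fit mean and (regularized) inverse covariance to in-distribution features.
    feats: (n_samples, dim) array standing in for pooled encoder activations."""
    mu = feats.mean(axis=0)
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
    return mu, np.linalg.inv(cov)

def mahalanobis_score(x, mu, cov_inv):
    """Distance of one feature vector from the in-distribution Gaussian."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

rng = np.random.default_rng(0)
in_dist = rng.normal(0.0, 1.0, size=(500, 8))  # synthetic in-distribution features
mu, cov_inv = fit_gaussian(in_dist)

near = mu.copy()   # at the in-distribution mean: minimal OOD score
far = mu + 6.0     # strongly shifted features: large OOD score
print(mahalanobis_score(near, mu, cov_inv) < mahalanobis_score(far, mu, cov_inv))  # True
```

Thresholding such scores (calibrated on held-out in-distribution data) then flags inputs as OOD.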
Collapse
|
65
|
Pan NY, Huang TY, Yu JJ, Peng HH, Chuang TC, Lin YR, Chung HW, Wu MT. Virtual MOLLI Target: Generative Adversarial Networks Toward Improved Motion Correction in MRI Myocardial T1 Mapping. J Magn Reson Imaging 2025; 61:209-219. [PMID: 38563660 DOI: 10.1002/jmri.29373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2023] [Revised: 03/21/2024] [Accepted: 03/21/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND The modified Look-Locker inversion recovery (MOLLI) sequence is commonly used for myocardial T1 mapping. However, it acquires images with different inversion times, which causes difficulty in motion correction for respiratory-induced misregistration to a given target image. HYPOTHESIS Using a generative adversarial network (GAN) to produce virtual MOLLI images with consistent heart positions can reduce respiratory-induced misregistration of MOLLI datasets. STUDY TYPE Retrospective. POPULATION 1071 MOLLI datasets from 392 human participants. FIELD STRENGTH/SEQUENCE Modified Look-Locker inversion recovery sequence at 3 T. ASSESSMENT A GAN model with a single inversion time image as input was trained to generate virtual MOLLI target (VMT) images at different inversion times which were subsequently used in an image registration algorithm. Four VMT models were investigated and the best performing model compared with the standard vendor-provided motion correction (MOCO) technique. STATISTICAL TESTS The effectiveness of the motion correction technique was assessed using the fitting quality index (FQI), mutual information (MI), and Dice coefficients of motion-corrected images, plus subjective quality evaluation of T1 maps by three independent readers using Likert score. Wilcoxon signed-rank test with Bonferroni correction for multiple comparison. Significance levels were defined as P < 0.01 for highly significant differences and P < 0.05 for significant differences. RESULTS The best performing VMT model with iterative registration demonstrated significantly better performance (FQI 0.88 ± 0.03, MI 1.78 ± 0.20, Dice 0.84 ± 0.23, quality score 2.26 ± 0.95) compared to other approaches, including the vendor-provided MOCO method (FQI 0.86 ± 0.04, MI 1.69 ± 0.25, Dice 0.80 ± 0.27, quality score 2.16 ± 1.01). DATA CONCLUSION Our GAN model generating VMT images improved motion correction, which may assist reliable T1 mapping in the presence of respiratory motion. 
Its robust performance, even with considerable respiratory-induced heart displacements, may be beneficial for patients who have difficulty holding their breath. LEVEL OF EVIDENCE 3 TECHNICAL EFFICACY: Stage 1.
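Mutual information, one of the registration-quality metrics reported above, can be estimated from a joint intensity histogram; an illustrative NumPy sketch (not the vendor's or the authors' implementation; bin count is an assumption):

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Histogram-based mutual information between two images (in nats)."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()                      # joint intensity distribution
    px = pxy.sum(axis=1, keepdims=True)            # marginal of img_a
    py = pxy.sum(axis=0, keepdims=True)            # marginal of img_b
    nz = pxy > 0                                   # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(1)
img = rng.random((64, 64))
noise = rng.random((64, 64))
# An image shares more information with itself than with unrelated noise,
# which is why MI rises as two frames become better aligned.
print(mutual_information(img, img) > mutual_information(img, noise))  # True
```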
Collapse
Affiliation(s)
- Nai-Yu Pan
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Teng-Yi Huang
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Jui-Jung Yu
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Hsu-Hsia Peng
- Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Tzu-Chao Chuang
- Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan
- Yi-Ru Lin
- Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Hsiao-Wen Chung
- Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
- Ming-Ting Wu
- Department of Radiology, Kaohsiung Veterans General Hospital, Kaohsiung, Taiwan
- School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
Collapse
|
66
|
Wang Y, Huang G, Lu Z, Wang Y, Chen X, Yuan X, Li Y, Liu J, Huang Y. HEDN: multi-oriented hierarchical extraction and dual-frequency decoupling network for 3D medical image segmentation. Med Biol Eng Comput 2025; 63:267-291. [PMID: 39316283 DOI: 10.1007/s11517-024-03192-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2024] [Accepted: 08/28/2024] [Indexed: 09/25/2024]
Abstract
Previous 3D encoder-decoder segmentation architectures struggled with fine-grained feature decomposition, resulting in unclear feature hierarchies when fused across layers. Furthermore, the blurred nature of contour boundaries in medical imaging limits the focus on high-frequency contour features. To address these challenges, we propose a Multi-oriented Hierarchical Extraction and Dual-frequency Decoupling Network (HEDN), which consists of three modules: Encoder-Decoder Module (E-DM), Multi-oriented Hierarchical Extraction Module (Multi-HEM), and Dual-frequency Decoupling Module (Dual-DM). The E-DM performs the basic encoding and decoding tasks, while Multi-HEM decomposes and fuses spatial and slice-level features in 3D, enriching the feature hierarchy by weighting them through 3D fusion. Dual-DM separates high-frequency features from the reconstructed network using self-supervision. Finally, the self-supervised high-frequency features separated by Dual-DM are inserted into the process following Multi-HEM, enhancing interactions and complementarities between contour features and hierarchical features, thereby mutually reinforcing both aspects. On the Synapse dataset, HEDN outperforms existing methods, boosting Dice Similarity Score (DSC) by 1.38% and decreasing 95% Hausdorff Distance (HD95) by 1.03 mm. Likewise, on the Automatic Cardiac Diagnosis Challenge (ACDC) dataset, HEDN achieves 0.5% performance gains across all categories.
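Dual-DM learns its high-frequency separation self-supervisedly; purely as an illustration of the underlying idea of frequency decoupling (not the paper's learned module), a fixed Fourier-domain split of an image into low- and high-frequency parts looks like:

```python
import numpy as np

def split_frequencies(img, cutoff=0.25):
    """Crude frequency decoupling with an ideal low-pass filter in the
    Fourier domain; the high-frequency part carries the sharp contours.
    The cutoff fraction is an arbitrary choice for this sketch."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)  # normalized radius
    low = np.fft.ifft2(np.fft.ifftshift(f * (r <= cutoff))).real
    high = img - low
    return low, high

img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0                  # sharp-edged square: strong contour content
low, high = split_frequencies(img)
print(np.allclose(low + high, img))    # the two parts reconstruct the input
```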
Collapse
Affiliation(s)
- Yu Wang
- Public Courses Department, Hunan Traditional Chinese Medical College, Zhuzhou, 412012, Hunan, China
- Guoheng Huang
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Zeng Lu
- Guangzhou Interesting Pill Network Technology Co., Ltd., Guangzhou, 510630, Guangdong, China
- Ying Wang
- Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR, 999078, China
- Xuhang Chen
- School of Computer Science and Engineering, Huizhou University, Huizhou, 516001, Guangdong, China
- Xiaochen Yuan
- Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR, 999078, China
- Yan Li
- Shenzhen Polytechnic University, Shenzhen, 518000, Guangdong, China
- Jieni Liu
- No. 8 Second Ring South Road, Ningxiang Traditional Chinese Medicine Hospital, Ningxiang, 410699, Hunan, China
- Yingping Huang
- Department of Radiation Oncology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou, 510006, Guangdong, China
Collapse
|
67
|
Liu L, Aviles-Rivero AI, Schonlieb CB. Contrastive Registration for Unsupervised Medical Image Segmentation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:147-159. [PMID: 37983143 DOI: 10.1109/tnnls.2023.3332003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/22/2023]
Abstract
Medical image segmentation is an important task in medical imaging, as it serves as the first step for clinical diagnosis and treatment planning. While major success has been reported using supervised deep learning techniques, they assume a large and well-representative labeled set. This is a strong assumption in the medical domain, where annotations are expensive, time-consuming, and subject to human bias. To address this problem, unsupervised segmentation techniques have been proposed in the literature. Yet, none of the existing unsupervised techniques reaches accuracies that come even close to the state of the art of supervised segmentation methods. In this work, we present a novel optimization model framed in a new convolutional neural network (CNN)-based contrastive registration architecture for unsupervised medical image segmentation, called CLMorph. The core idea of our approach is to exploit image-level registration and feature-level contrastive learning to perform registration-based segmentation. First, we propose an architecture that captures the image-to-image transformation mapping via registration for unsupervised medical image segmentation. Second, we embed a contrastive learning mechanism in the registration architecture to enhance the discriminative capacity of the network at the feature level. We show that our proposed CLMorph technique mitigates the major drawbacks of existing unsupervised techniques. We demonstrate, through numerical and visual experiments, that our technique substantially outperforms the current state-of-the-art unsupervised segmentation methods on two major medical image datasets.
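Feature-level contrastive learning of this kind is typically built on an InfoNCE-style loss; a standalone NumPy sketch on raw feature vectors (the paper applies the loss inside the registration network, which is omitted here, and the temperature value is an assumption):

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss for one anchor feature vector:
    pull the positive view close, push the negatives away."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cos(anchor, positive)] + [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()  # numerical stability before softmax
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))

rng = np.random.default_rng(0)
anchor = rng.normal(size=16)
positive = anchor + 0.05 * rng.normal(size=16)       # slightly perturbed view
negatives = [rng.normal(size=16) for _ in range(8)]  # unrelated features
loss_aligned = info_nce(anchor, positive, negatives)
loss_misaligned = info_nce(anchor, negatives[0], [positive] + negatives[1:])
print(loss_aligned < loss_misaligned)  # aligned pairs give lower loss
```

Minimizing such a loss over registered feature pairs is what sharpens the network's discriminative capacity at the feature level.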
Collapse
|
68
|
Zhao T, Gu Y, Yang J, Usuyama N, Lee HH, Kiblawi S, Naumann T, Gao J, Crabtree A, Abel J, Moung-Wen C, Piening B, Bifulco C, Wei M, Poon H, Wang S. A foundation model for joint segmentation, detection and recognition of biomedical objects across nine modalities. Nat Methods 2025; 22:166-176. [PMID: 39558098 DOI: 10.1038/s41592-024-02499-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2024] [Accepted: 10/02/2024] [Indexed: 11/20/2024]
Abstract
Biomedical image analysis is fundamental for biomedical discovery. Holistic image analysis comprises interdependent subtasks such as segmentation, detection and recognition, which are tackled separately by traditional approaches. Here, we propose BiomedParse, a biomedical foundation model that can jointly conduct segmentation, detection and recognition across nine imaging modalities. This joint learning improves the accuracy for individual tasks and enables new applications such as segmenting all relevant objects in an image through a textual description. To train BiomedParse, we created a large dataset comprising over 6 million triples of image, segmentation mask and textual description by leveraging natural language labels or descriptions accompanying existing datasets. We showed that BiomedParse outperformed existing methods on image segmentation across nine imaging modalities, with larger improvement on objects with irregular shapes. We further showed that BiomedParse can simultaneously segment and label all objects in an image. In summary, BiomedParse is an all-in-one tool for biomedical image analysis on all major image modalities, paving the path for efficient and accurate image-based biomedical discovery.
Collapse
Affiliation(s)
- Yu Gu
- Microsoft Research, Redmond, WA, USA
- Angela Crabtree
- Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
- Brian Piening
- Providence Genomics, Portland, OR, USA
- Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
- Carlo Bifulco
- Providence Genomics, Portland, OR, USA
- Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
- Mu Wei
- Microsoft Research, Redmond, WA, USA
- Sheng Wang
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Department of Surgery, University of Washington, Seattle, WA, USA
Collapse
|
69
|
Liu H, Ren P, Yuan Y, Song C, Luo F. Uncertainty Global Contrastive Learning Framework for Semi-Supervised Medical Image Segmentation. IEEE J Biomed Health Inform 2025; 29:433-442. [PMID: 39504281 DOI: 10.1109/jbhi.2024.3492540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2024]
Abstract
In semi-supervised medical image segmentation, the issue of fuzzy boundaries for segmented objects arises. With limited labeled data and the interaction of boundaries from different segmented objects, classifying segmentation boundaries becomes challenging. To mitigate this issue, we propose an uncertainty global contrastive learning (UGCL) framework. Specifically, we propose a patch filtering method and a classification entropy filtering method to provide reliable pseudo-labels for unlabelled data, while separating fuzzy boundaries and high-entropy pixel points as unreliable points. Considering that unreliable regions contain rich complementary information, we introduce an uncertainty global contrast learning method to distinguish these challenging unreliable regions, enhancing intra-class compactness and inter-class separability at the global data level. Within our optimization framework, we also integrate consistency regularization techniques and select unreliable points as targets for consistency. As demonstrated, the contrastive learning and consistency regularization applied to uncertain points enable us to glean valuable semantic information from unreliable data, which enhances segmentation accuracy. We evaluate our method on two publicly available medical image datasets and compare it with other state-of-the-art semi-supervised medical image segmentation methods, and a series of experimental results show that our method has achieved substantial improvements.
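The classification-entropy filtering can be illustrated with a toy NumPy sketch that splits pixels into reliable pseudo-labels and unreliable (high-entropy, fuzzy-boundary) points; the threshold value here is an assumption for illustration, not a parameter from the paper:

```python
import numpy as np

def entropy_filter(probs, threshold=0.5):
    """Split pixels into reliable / unreliable by prediction entropy.
    probs: (n_pixels, n_classes) softmax outputs."""
    eps = 1e-12
    ent = -(probs * np.log(probs + eps)).sum(axis=1)  # per-pixel entropy (nats)
    reliable = ent < threshold        # confident pixels keep their pseudo-label
    pseudo = probs.argmax(axis=1)
    return pseudo, reliable

probs = np.array([
    [0.97, 0.02, 0.01],   # confident -> reliable pseudo-label
    [0.40, 0.35, 0.25],   # near-uniform (fuzzy boundary) -> unreliable
])
pseudo, reliable = entropy_filter(probs)
print(pseudo.tolist(), reliable.tolist())  # [0, 0] [True, False]
```

In the paper's framework, the unreliable set is not discarded but becomes the target of the global contrastive and consistency objectives.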
Collapse
|
70
|
Elizar E, Muharar R, Zulkifley MA. DeSPPNet: A Multiscale Deep Learning Model for Cardiac Segmentation. Diagnostics (Basel) 2024; 14:2820. [PMID: 39767181 PMCID: PMC11674640 DOI: 10.3390/diagnostics14242820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2024] [Revised: 12/05/2024] [Accepted: 12/06/2024] [Indexed: 01/11/2025] Open
Abstract
BACKGROUND Cardiac magnetic resonance imaging (MRI) plays a crucial role in monitoring disease progression and evaluating the effectiveness of treatment interventions. Cardiac MRI allows medical practitioners to assess cardiac function accurately by providing comprehensive and quantitative information about cardiac structure and function, making it an indispensable tool for monitoring disease and treatment response. Deep learning-based segmentation enables the precise delineation of cardiac structures, including the myocardium, right ventricle, and left ventricle. Accurate segmentation of these structures helps in diagnosing heart failure, assessing the cardiac functional response to therapies, and understanding the state of heart function after treatment. OBJECTIVES The objective of this study is to develop a multiscale deep learning model to segment cardiac organs from MRI data. Good segmentation performance is difficult to achieve due to the complex nature of cardiac anatomy, which includes a variety of chambers, arteries, and tissues. Furthermore, the heart is constantly beating, leading to motion artifacts that reduce image clarity and consistency. A multiscale method is therefore explored to overcome these challenges in segmenting cardiac MRI images. METHODS This paper proposes DeSPPNet, a multiscale deep learning network. It follows an encoder-decoder architecture that utilizes a Spatial Pyramid Pooling (SPP) layer to improve cardiac semantic segmentation. The SPP layer pools features from densely connected convolutional layers at different scales, which are combined to retain spatial information across resolutions.
By processing features at different spatial resolutions, the multiscale densely connected layer, in the form of the Pyramid Pooling Dense Module (PPDM), helps the network capture both local and global context, preserving finer details of the cardiac structure while also capturing the broader context required to accurately segment larger cardiac structures. The PPDM is incorporated into the deeper layers of the encoder to allow it to recognize complex semantic features. RESULTS An analysis of multiple PPDM placement scenarios and structural variations revealed that the 3-path PPDM, positioned at encoder layer 5, yielded the best segmentation performance, achieving Dice, intersection over union (IoU), and accuracy scores of 0.859, 0.800, and 0.993, respectively. CONCLUSIONS Different PPDM configurations affect the network differently: a shallower placement, such as encoder layer 4, retains more spatial detail and needs more parallel paths to gather the optimal set of multiscale features. In contrast, deeper layers contain more informative features at a lower spatial resolution, which reduces the number of parallel paths required to provide optimal multiscale context.
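The SPP pooling described under METHODS can be sketched for a single-channel feature map as follows; this illustrates the generic SPP idea (pool over grids of several sizes, concatenate into a fixed-length multiscale descriptor), not DeSPPNet's exact layer:

```python
import numpy as np

def spatial_pyramid_pool(feat, levels=(1, 2, 4)):
    """Max-pool a (H, W) feature map over 1x1, 2x2, and 4x4 grids and
    concatenate the results into one fixed-length multiscale descriptor."""
    h, w = feat.shape
    pooled = []
    for n in levels:
        ys = np.linspace(0, h, n + 1, dtype=int)  # grid cell boundaries
        xs = np.linspace(0, w, n + 1, dtype=int)
        for i in range(n):
            for j in range(n):
                pooled.append(feat[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max())
    return np.array(pooled)

feat = np.arange(64, dtype=float).reshape(8, 8)
desc = spatial_pyramid_pool(feat)
print(desc.shape)  # (21,) -> 1 + 4 + 16 pooled values
```

The coarse 1x1 cell carries global context while the 4x4 cells preserve local detail, which is the multiscale trade-off the abstract describes.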
Collapse
Affiliation(s)
- Elizar Elizar
- Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
- Department of Electrical and Computer Engineering, Faculty of Engineering, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
- Rusdha Muharar
- Department of Electrical and Computer Engineering, Faculty of Engineering, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
- Mohd Asyraf Zulkifley
- Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
Collapse
|
71
|
Chen H, Gao J, Chen Z, Gao C, Huo S, Jiang M, Pu J, Hu C. Improve myocardial strain estimation based on deformable groupwise registration with a locally low-rank dissimilarity metric. BMC Med Imaging 2024; 24:330. [PMID: 39639206 PMCID: PMC11619273 DOI: 10.1186/s12880-024-01519-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2024] [Accepted: 11/26/2024] [Indexed: 12/07/2024] Open
Abstract
BACKGROUND Current mainstream cardiovascular magnetic resonance-feature tracking (CMR-FT) methods, including optical flow and pairwise registration, often suffer from the drift effect caused by accumulative tracking errors. Here, we developed a CMR-FT method based on deformable groupwise registration with a locally low-rank (LLR) dissimilarity metric to improve myocardial tracking and strain estimation accuracy. METHODS The proposed method, Groupwise-LLR, performs feature tracking by iteratively updating the entire displacement field across all cardiac phases to minimize the sum of the patchwise signal ranks of the deformed movie. The method was compared with alternative CMR-FT methods including the Farneback optical flow, a sequentially pairwise registration method, and a global low rankness-based groupwise registration method via a simulated dataset (n = 20), a public cine data set (n = 100), and an in-house tagging-MRI patient dataset (n = 16). The proposed method was also compared with two general groupwise registration methods, nD + t B-Splines and pTVreg, in simulations and in vivo tracking. RESULTS On the simulated dataset, Groupwise-LLR achieved the lowest point tracking errors (p = 0.13 against pTVreg for the temporally averaged point tracking errors in the long-axis view, and p < 0.05 for all other cases), voxelwise strain errors (all p < 0.05), and global strain errors (p = 0.05 against pTVreg for the longitudinal global strain errors, and p < 0.05 for all other cases). On the public dataset, Groupwise-LLR achieved the lowest contour tracking errors (all p < 0.05), reduced the drift effect in late-diastole, and preserved similar inter-observer reproducibility as the alternative methods. On the patient dataset, Groupwise-LLR correlated better with tagging-MRI for radial strains than the other CMR-FT methods in multiple myocardial segments and levels. 
CONCLUSIONS The proposed Groupwise-LLR reduces the drift effect and provides more accurate myocardial tracking and strain estimation than the alternative methods. The method may thus facilitate a more accurate estimation of myocardial strains for clinical assessments of cardiac function.
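The locally low-rank dissimilarity can be illustrated with the nuclear norm as a convex surrogate for the patchwise signal rank; this NumPy sketch (patch size and the surrogate choice are assumptions for illustration, and the iterative displacement-field update is omitted) shows why a well-aligned cine movie scores lower:

```python
import numpy as np

def llr_dissimilarity(movie, patch=4):
    """Locally low-rank metric: for each spatial patch, stack its pixels
    across all frames into a Casorati matrix and sum the nuclear norms
    (sum of singular values, a convex surrogate for rank)."""
    t, h, w = movie.shape
    total = 0.0
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            block = movie[:, y:y + patch, x:x + patch].reshape(t, -1)
            total += np.linalg.svd(block, compute_uv=False).sum()
    return total

rng = np.random.default_rng(2)
frame = rng.random((8, 8))
aligned = np.stack([frame] * 5)  # identical frames: each patch matrix is rank 1
jittered = np.stack([np.roll(frame, s, axis=1) for s in range(5)])  # simulated motion
print(llr_dissimilarity(aligned) < llr_dissimilarity(jittered))  # True
```

Groupwise registration then seeks the displacement field that minimizes this metric jointly over all cardiac phases, avoiding the drift of sequential pairwise tracking.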
Collapse
Affiliation(s)
- Haiyang Chen
- National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Juan Gao
- National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Zhuo Chen
- National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Chenhao Gao
- National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Sirui Huo
- National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Meng Jiang
- Division of Cardiology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Jun Pu
- Division of Cardiology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Chenxi Hu
- National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
Collapse
|
72
|
Azad R, Aghdam EK, Rauland A, Jia Y, Avval AH, Bozorgpour A, Karimijafarbigloo S, Cohen JP, Adeli E, Merhof D. Medical Image Segmentation Review: The Success of U-Net. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:10076-10095. [PMID: 39167505 DOI: 10.1109/tpami.2024.3435571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/23/2024]
Abstract
Automatic medical image segmentation is a crucial topic in the medical domain and a critical counterpart in the computer-aided diagnosis paradigm. U-Net is the most widespread image segmentation architecture due to its flexibility, optimized modular design, and success across all medical image modalities. Over the years, the U-Net model has received tremendous attention from academic and industrial researchers, who have extended it to address the scale and complexity created by medical tasks. These extensions commonly enhance the U-Net's backbone, bottleneck, or skip connections, include representation learning, combine it with a Transformer architecture, or address probabilistic prediction of the segmentation map. Having a compendium of previously proposed U-Net variants makes it easier for machine learning researchers to identify relevant research questions and to understand the challenges of the biological tasks the model must address. In this work, we discuss the practical aspects of the U-Net model and organize each variant into a taxonomy. Moreover, to measure the performance of these strategies in clinical applications, we propose fair evaluations of several distinctive and well-known designs on well-known datasets. Furthermore, we provide a comprehensive implementation library with trained models. In addition, for ease of future study, we have created an online list of U-Net papers with their possible official implementations.
Collapse
|
73
|
Rauby B, Xing P, Gasse M, Provost J. Deep Learning in Ultrasound Localization Microscopy: Applications and Perspectives. IEEE TRANSACTIONS ON ULTRASONICS, FERROELECTRICS, AND FREQUENCY CONTROL 2024; 71:1765-1784. [PMID: 39288061 DOI: 10.1109/tuffc.2024.3462299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/19/2024]
Abstract
Ultrasound localization microscopy (ULM) is a novel super-resolution imaging technique that can image the vasculature in vivo at depth, with resolution far beyond the conventional diffraction limit. By relying on the localization and tracking of clinically approved microbubbles injected into the bloodstream, ULM can provide not only anatomical visualization but also hemodynamic quantification of the microvasculature. Several deep learning approaches have been proposed to address challenges in ULM, including denoising, improving microbubble localization, estimating blood flow velocity, and performing aberration correction. Proposed deep learning methods often outperform their conventional counterparts by improving image quality and reducing processing time. In addition, their robustness to high concentrations of microbubbles can lead to reduced acquisition times, addressing a major hindrance to ULM's clinical application. Herein, we present a comprehensive review of the diversity of deep learning applications in ULM, focusing on approaches that assume a sparse microbubble distribution. We first provide an overview of how existing studies vary in the constitution of their datasets and in the tasks targeted by the deep learning model. We then take a deeper look at the numerous approaches proposed to improve the localization of microbubbles, since they differ greatly in their formulation of the optimization problem, their evaluation, and their network architectures. We finally discuss the current limitations and challenges of these methods, as well as the promise and potential of deep learning for ULM in the future.
Collapse
|
74
|
Le Y, Zhao C, An J, Zhou J, Deng D, He Y. Progress in the Clinical Application of Artificial Intelligence for Left Ventricle Analysis in Cardiac Magnetic Resonance. Rev Cardiovasc Med 2024; 25:447. [PMID: 39742214 PMCID: PMC11683706 DOI: 10.31083/j.rcm2512447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2024] [Revised: 08/08/2024] [Accepted: 08/15/2024] [Indexed: 01/03/2025] Open
Abstract
Cardiac magnetic resonance (CMR) imaging enables a one-stop assessment of heart structure and function. Artificial intelligence (AI) can simplify and automate workflows and improve image post-processing speed and diagnostic accuracy; thus, it greatly affects many aspects of CMR. This review highlights the application of AI for left heart analysis in CMR, including quality control, image segmentation, and global and regional functional assessment. Most recent research has focused on segmentation of the left ventricular myocardium and blood pool. Although many algorithms have shown a level comparable to that of human experts, some problems, such as poor performance of basal and apical segmentation and false identification of myocardial structure, remain. Segmentation of myocardial fibrosis is another research hotspot, and most patient cohorts in such studies have hypertrophic cardiomyopathy. Whether the above methods are applicable to other patient groups requires further study. The use of automated CMR interpretation for the diagnosis and prognosis assessment of cardiovascular diseases demonstrates great clinical potential. However, prospective large-scale clinical trials are needed to investigate the real-world application of AI technology in clinical practice.
Collapse
Affiliation(s)
- Yinghui Le
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, 100050 Beijing, China
| | - Chongshang Zhao
- Key Laboratory for Biomedical Engineering of Ministry of Education, Institute of Biomedical Engineering, Zhejiang University, 310058 Hangzhou, Zhejiang, China
| | - Jing An
- Siemens Shenzhen Magnetic Resonance, MR Collaboration NE Asia, 518000 Shenzhen, Guangdong, China
| | - Jiali Zhou
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, 100050 Beijing, China
| | - Dongdong Deng
- School of Biomedical Engineering, Dalian University of Technology, 116024 Dalian, Liaoning, China
| | - Yi He
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, 100050 Beijing, China
| |
Collapse
|
75
|
Wu G, Ji H. RETRACTED ARTICLE: Short-term memory neural network-based cognitive computing in sports training complexity pattern recognition. Soft comput 2024; 28:439. [PMID: 35035279 PMCID: PMC8747855 DOI: 10.1007/s00500-021-06568-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/11/2021] [Indexed: 11/30/2022]
Affiliation(s)
- Guang Wu
- College of Physical Education, Chongqing Technology and Business University, Chongqing, 400067 Nan’an, China
| | - Hang Ji
- Shijiazhuang School of the Arts, Shijiazhuang, 050800 Hebei, China
| |
Collapse
|
76
|
Kobayashi H, Nakata N, Izuka S, Hongo K, Nishikawa M. Using artificial intelligence and promoter-level transcriptome analysis to identify a biomarker as a possible prognostic predictor of cardiac complications in male patients with Fabry disease. Mol Genet Metab Rep 2024; 41:101152. [PMID: 39484074 PMCID: PMC11525769 DOI: 10.1016/j.ymgmr.2024.101152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2024] [Revised: 10/06/2024] [Accepted: 10/07/2024] [Indexed: 11/03/2024] Open
Abstract
Fabry disease is the most frequently occurring lysosomal disease in Japan and is characterized by a wide variety of conditions. The three major concerns associated with Fabry disease in adulthood that must be prevented are central nervous system, renal, and cardiac complications. Cardiac complications, such as cardiomyopathy, cardiac muscle fibrosis, and severe arrhythmia, are the most common causes of mortality in patients with Fabry disease. To predict cardiac complications of Fabry disease, we extracted RNA from the venous blood of patients for cap analysis of gene expression (CAGE), performed likelihood ratio tests on each RNA expression dataset obtained from individuals with and without cardiac complications, and analyzed the correlation between RNA expression and cardiac functional factors observed in magnetic resonance imaging data extracted using artificial intelligence algorithms. Our findings showed that CHN1 expression was significantly higher in male Fabry disease patients with cardiac complications and that it could be associated with many cardiac functional factors. CHN1 encodes a GTPase-activating protein, chimerin 1, which is specific to the GTP-binding protein Rac (involved in oxidative stress generation and the promotion of myocardial fibrosis). Thus, CHN1 is a potential predictive biomarker of cardiac complications in Fabry disease; however, further studies are required to confirm this observation.
Collapse
Affiliation(s)
- Hiroshi Kobayashi
- Division of Gene Therapy, Research Center for Medical Sciences, The Jikei University of Medicine, 3-25-8, Nishi-shimbashi, Minato-ku, Tokyo 105-8461, Japan
- Department of Pediatrics, The Jikei University of Medicine, 3-25-8, Nishi-shimbashi, Minato-ku, Tokyo 105-8461, Japan
| | - Norio Nakata
- Division of Artificial Intelligence Medicine, Research Center for Medical Sciences, The Jikei University of Medicine, 3-25-8, Nishi-shimbashi, Minato-ku, Tokyo 105-8461, Japan
- Department of Radiology, The Jikei University of Medicine, 3-25-8, Nishi-shimbashi, Minato-ku, Tokyo 105-8461, Japan
| | - Sayoko Izuka
- Division of Gene Therapy, Research Center for Medical Sciences, The Jikei University of Medicine, 3-25-8, Nishi-shimbashi, Minato-ku, Tokyo 105-8461, Japan
| | - Kenichi Hongo
- Division of Cardiology, Department of Internal Medicine, The Jikei University of Medicine, 3-25-8, Nishi-shimbashi, Minato-ku, Tokyo 105-8461, Japan
| | - Masako Nishikawa
- Clinical Research Support Center, The Jikei University of Medicine, 3-25-8, Nishi-shimbashi, Minato-ku, Tokyo 105-8461, Japan
| |
Collapse
|
77
|
Wang R, Kou Q, Dou L. LIT-Unet: a lightweight and effective model for medical image segmentation. Radiol Phys Technol 2024; 17:878-887. [PMID: 39302610 DOI: 10.1007/s12194-024-00844-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2024] [Revised: 08/31/2024] [Accepted: 09/04/2024] [Indexed: 09/22/2024]
Abstract
This study aimed to design a simple and efficient automatic segmentation model for medical images, helping doctors make more accurate diagnoses and treatment plans. A hybrid lightweight network, LIT-Unet, with a symmetric encoder-decoder U-shaped architecture is proposed. The Synapse multi-organ segmentation dataset and the automated cardiac diagnosis challenge (ACDC) dataset were used to test the segmentation performance of the method. Two indices, Dice similarity coefficient (DSC ↑) and 95% Hausdorff distance (HD95 ↓), were used to evaluate and compare the segmentation ability with current advanced methods. Ablation experiments were conducted to demonstrate the lightweight nature and effectiveness of our model. For the Synapse dataset, our model achieves a higher DSC score (80.40%), an improvement of 3.8% over the typical hybrid model (TransUnet), with a low HD95 value of 20.67. For the ACDC dataset, LIT-Unet achieves the optimal average DSC (%) of 91.84 compared with the other networks listed. Compared to patch expanding, the DSC of our model is improved by 1.62% with the help of deformable token merging (DTM). These results show that the proposed hierarchical LIT-Unet achieves significant accuracy and is expected to provide a reliable basis for clinical diagnosis and treatment.
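The DSC and HD95 metrics reported above can be computed directly from binary masks; below is a minimal NumPy/SciPy sketch of the standard definitions, not the authors' evaluation code (function names and the toy masks are illustrative):

```python
import numpy as np
from scipy.spatial.distance import cdist

def dice_coefficient(pred, gt):
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return 2.0 * intersection / (pred.sum() + gt.sum())

def hd95(pred, gt):
    """95th-percentile symmetric Hausdorff distance (in pixels)."""
    p = np.argwhere(pred.astype(bool))   # foreground coordinates
    g = np.argwhere(gt.astype(bool))
    d = cdist(p, g)                      # all pairwise distances
    forward = d.min(axis=1)              # pred -> gt nearest distances
    backward = d.min(axis=0)             # gt -> pred nearest distances
    return np.percentile(np.hstack([forward, backward]), 95)

# Toy example: two overlapping 4x4 squares shifted by one column.
pred = np.zeros((8, 8)); pred[2:6, 2:6] = 1
gt = np.zeros((8, 8)); gt[2:6, 3:7] = 1
print(dice_coefficient(pred, gt), hd95(pred, gt))
```

In practice HD95 is reported in millimeters by scaling the pixel distances with the image spacing.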
Collapse
Affiliation(s)
- Ru Wang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Department of Radiology, Xuzhou Central Hospital, Xuzhou, 221009, China
| | - Qiqi Kou
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
| | - Lina Dou
- Department of Radiology, Xuzhou Central Hospital, Xuzhou, 221009, China.
| |
Collapse
|
78
|
Jafari R, Kandpal A, Verma R, Aggarwal V, Gupta RK, Singh A. Automatic pipeline for segmentation of LV myocardium on quantitative MR T1 maps using deep learning model and computation of radial T1 and ECV values. NMR IN BIOMEDICINE 2024; 37:e5230. [PMID: 39097976 DOI: 10.1002/nbm.5230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/27/2024] [Revised: 07/16/2024] [Accepted: 07/18/2024] [Indexed: 08/06/2024]
Abstract
Native T1 mapping is a non-invasive technique used for early detection of diffuse myocardial abnormalities, and it provides baseline tissue characterization. Post-contrast T1 mapping enhances tissue differentiation, enables extracellular volume (ECV) calculation, and improves myocardial viability assessment. Accurate and precise segmentation of the left ventricular (LV) myocardium on T1 maps is crucial for assessing myocardial tissue characteristics and diagnosing cardiovascular diseases (CVD). This study presents a deep learning (DL)-based pipeline for automatically segmenting the LV myocardium on T1 maps and automatically computing radial T1 and ECV values. The study employs a multicentric dataset consisting of retrospective multiparametric MRI data of 332 subjects to develop and assess the performance of the proposed method. The study compared the DL architectures U-Net and Deep Res U-Net for LV myocardium segmentation, which achieved Dice similarity coefficients of 0.84 ± 0.43 and 0.85 ± 0.03, respectively. The Dice similarity coefficients computed for radial sub-segmentation of the LV myocardium on basal, mid-cavity, and apical slices were 0.77 ± 0.21, 0.81 ± 0.17, and 0.61 ± 0.14, respectively. The t-test performed between ground-truth vs. predicted values of native T1, post-contrast T1, and ECV showed no statistically significant difference (p > 0.05) for any of the radial sub-segments. The proposed DL method leverages quantitative T1 maps for automatic LV myocardium segmentation and accurate computation of radial T1 and ECV values, highlighting its potential for assisting radiologists in objective cardiac assessment and, hence, in CVD diagnostics.
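The ECV computation such a pipeline automates conventionally follows the standard formula combining native and post-contrast T1 of myocardium and blood with the hematocrit; a hedged sketch of that formula, with illustrative values rather than the paper's data:

```python
def ecv(t1_myo_native, t1_myo_post, t1_blood_native, t1_blood_post, hematocrit):
    """Extracellular volume fraction from T1 values (ms) and hematocrit.

    Standard definition: ECV = (1 - Hct) * (dR1_myo / dR1_blood),
    where dR1 = 1/T1_post - 1/T1_native (R1 is the relaxation rate).
    """
    d_r1_myo = 1.0 / t1_myo_post - 1.0 / t1_myo_native
    d_r1_blood = 1.0 / t1_blood_post - 1.0 / t1_blood_native
    return (1.0 - hematocrit) * (d_r1_myo / d_r1_blood)

# Illustrative (not patient-derived) values in ms, hematocrit as a fraction.
print(round(ecv(1000.0, 500.0, 1600.0, 350.0, 0.42), 3))
```

A per-pixel version of the same arithmetic, applied inside the segmented myocardium, yields the radial ECV values described above.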
Collapse
Affiliation(s)
- Raufiya Jafari
- Centre for Biomedical Engineering, Indian Institute of Technology Delhi, New Delhi, India
| | - Ankit Kandpal
- Centre for Biomedical Engineering, Indian Institute of Technology Delhi, New Delhi, India
| | - Radhakrishan Verma
- Department of Radiology, Fortis Memorial Research Institute, Gurugram, India
| | - Vinayak Aggarwal
- Department of Cardiology, Fortis Memorial Research Institute, Gurugram, India
| | - Rakesh Kumar Gupta
- Department of Radiology, Fortis Memorial Research Institute, Gurugram, India
| | - Anup Singh
- Centre for Biomedical Engineering, Indian Institute of Technology Delhi, New Delhi, India
- Department of Biomedical Engineering, AIIMS, New Delhi, India
- Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, New Delhi, India
| |
Collapse
|
79
|
Barmak O, Krak I, Yakovlev S, Manziuk E, Radiuk P, Kuznetsov V. Toward explainable deep learning in healthcare through transition matrix and user-friendly features. Front Artif Intell 2024; 7:1482141. [PMID: 39654544 PMCID: PMC11625760 DOI: 10.3389/frai.2024.1482141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2024] [Accepted: 11/06/2024] [Indexed: 12/12/2024] Open
Abstract
Modern artificial intelligence (AI) solutions often face challenges due to the "black box" nature of deep learning (DL) models, which limits their transparency and trustworthiness in critical medical applications. In this study, we propose and evaluate a scalable approach based on a transition matrix to enhance the interpretability of DL models in medical signal and image processing by translating complex model decisions into user-friendly and justifiable features for healthcare professionals. The criteria for choosing interpretable features were clearly defined, incorporating clinical guidelines and expert rules to align model outputs with established medical standards. The proposed approach was tested on two medical datasets: electrocardiography (ECG) for arrhythmia detection and magnetic resonance imaging (MRI) for heart disease classification. The performance of the DL models was compared with expert annotations using Cohen's Kappa coefficient to assess agreement, achieving coefficients of 0.89 for the ECG dataset and 0.80 for the MRI dataset. These results demonstrate strong agreement, underscoring the reliability of the approach in providing accurate, understandable, and justifiable explanations of DL model decisions. The scalability of the approach suggests its potential applicability across various medical domains, enhancing the generalizability and utility of DL models in healthcare while addressing practical challenges and ethical considerations.
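Cohen's Kappa, used above to quantify model-expert agreement, corrects the observed agreement rate for the agreement expected by chance; a small self-contained sketch of the standard computation (the label sequences are hypothetical, not the study's data):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected) / (1 - expected) agreement."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum(ca[label] * cb[label] for label in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1 - expected)

model = ["arrhythmia", "normal", "normal", "arrhythmia", "normal"]
expert = ["arrhythmia", "normal", "arrhythmia", "arrhythmia", "normal"]
print(round(cohens_kappa(model, expert), 3))
```

Values near 1 indicate agreement well beyond chance, which is how the reported coefficients of 0.89 and 0.80 are read.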
Collapse
Affiliation(s)
- Oleksander Barmak
- Department of Computer Science, Khmelnytskyi National University, Khmelnytskyi, Ukraine
| | - Iurii Krak
- Department of Theoretical Cybernetics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- Laboratory of Communicative Information Technologies, V.M. Glushkov Institute of Cybernetics, Kyiv, Ukraine
| | - Sergiy Yakovlev
- Department of Mathematical Modeling and Artificial Intelligence, National Aerospace University “Kharkiv Aviation Institute”, Kharkiv, Ukraine
- Institute of Computer Science and Artificial Intelligence, V.N. Karazin Kharkiv National University, Kharkiv, Ukraine
| | - Eduard Manziuk
- Department of Computer Science, Khmelnytskyi National University, Khmelnytskyi, Ukraine
| | - Pavlo Radiuk
- Department of Computer Science, Khmelnytskyi National University, Khmelnytskyi, Ukraine
| | - Vladislav Kuznetsov
- Department of Theoretical Cybernetics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- Laboratory of Communicative Information Technologies, V.M. Glushkov Institute of Cybernetics, Kyiv, Ukraine
| |
Collapse
|
80
|
Li C, Zheng Z, Wu D. Shape-Aware Adversarial Learning for Scribble-Supervised Medical Image Segmentation with a MaskMix Siamese Network: A Case Study of Cardiac MRI Segmentation. Bioengineering (Basel) 2024; 11:1146. [PMID: 39593806 PMCID: PMC11592347 DOI: 10.3390/bioengineering11111146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2024] [Revised: 11/12/2024] [Accepted: 11/12/2024] [Indexed: 11/28/2024] Open
Abstract
The transition in medical image segmentation from fine-grained to coarse-grained annotation methods, notably scribble annotation, offers a practical and efficient way to prepare data for deep learning applications. However, these methods often compromise segmentation precision and result in irregular contours. This study targets the enhancement of scribble-supervised segmentation to match the accuracy of fine-grained annotation. Capitalizing on the consistency of target shapes across unpaired datasets, this study introduces a shape-aware scribble-supervised learning framework (MaskMixAdv) addressing two critical tasks: (1) pseudo-label generation, where a mixup-based masking strategy enables image-level and feature-level data augmentation to enrich coarse-grained scribble annotations, and a dual-branch Siamese network is proposed to generate fine-grained pseudo-labels; (2) pseudo-label optimization, where a CNN-based discriminator is proposed to refine pseudo-label contours by distinguishing them from external unpaired masks during model fine-tuning. MaskMixAdv works under constrained annotation conditions as a label-efficient learning approach for medical image segmentation. A case study on public cardiac MRI datasets demonstrated that the proposed MaskMixAdv outperformed state-of-the-art methods and narrowed the performance gap between scribble-supervised and mask-supervised segmentation. This innovation cuts annotation time by at least 95%, with only a minor impact on Dice performance, specifically a 2.6% reduction. The experimental outcomes indicate that employing efficient and cost-effective scribble annotation can achieve high segmentation accuracy, significantly reducing the typical requirement for fine-grained annotations.
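Mixup-based masking of the kind described can be pictured as blending two inputs under a binary block mask; the sketch below is a generic CutMix-style illustration under an assumed patch size, not the authors' exact MaskMixAdv strategy:

```python
import numpy as np

def mask_mixup(img_a, img_b, patch=4, rng=None):
    """Blend two equally sized images under a random binary block mask.

    A coarse 0/1 grid is drawn at patch resolution and upsampled to the
    pixel grid, so whole patches come from one image or the other.
    """
    rng = rng or np.random.default_rng(0)
    grid = rng.integers(0, 2, size=(img_a.shape[0] // patch,
                                    img_a.shape[1] // patch))
    mask = np.kron(grid, np.ones((patch, patch)))   # upsample to pixels
    return mask * img_a + (1 - mask) * img_b, mask

a = np.ones((8, 8))
b = np.zeros((8, 8))
mixed, m = mask_mixup(a, b, patch=4)
assert set(np.unique(mixed)) <= {0.0, 1.0}   # each patch is all-a or all-b
```

In the paper the same idea is applied at both image level and feature level to augment scribble-annotated training data.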
Collapse
Affiliation(s)
| | - Zhong Zheng
- College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; (C.L.); (D.W.)
| | | |
Collapse
|
81
|
Han T, Cao H, Yang Y. AS2LS: Adaptive Anatomical Structure-Based Two-Layer Level Set Framework for Medical Image Segmentation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2024; 33:6393-6408. [PMID: 39446550 DOI: 10.1109/tip.2024.3483563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/26/2024]
Abstract
Medical images often exhibit intricate structures, inhomogeneous intensity, significant noise and blurred edges, presenting challenges for medical image segmentation. Several segmentation algorithms grounded in mathematics, computer science, and medical domains have been proposed to address this matter; nevertheless, there is still considerable scope for improvement. This paper proposes a novel adaptive anatomical structure-based two-layer level set framework (AS2LS) for segmenting organs with concentric structures, such as the left ventricle and the fundus. By adaptive fitting region and edge intensity information, the AS2LS achieves high accuracy in segmenting complex medical images characterized by inhomogeneous intensity, blurred boundaries and interference from surrounding organs. Moreover, we introduce a novel two-layer level set representation based on anatomical structures, coupled with a two-stage level set evolution algorithm. Experimental results demonstrate the superior accuracy of AS2LS in comparison to representative level set methods and deep learning methods.
Collapse
|
82
|
Li Z, Zhang J, Wei S, Gao Y, Cao C, Wu Z. TPAFNet: Transformer-Driven Pyramid Attention Fusion Network for 3D Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:6803-6814. [PMID: 39283776 DOI: 10.1109/jbhi.2024.3460745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2024]
Abstract
The field of 3D medical image segmentation is witnessing a growing trend toward combined networks that integrate convolutional neural networks (CNNs) and Transformers. Nevertheless, prevailing hybrid networks are limited by their straightforward serial or parallel combination methods and lack an effective mechanism to fuse channel and spatial feature attention. To address these limitations, we present a robust multi-scale 3D medical image segmentation network, the Transformer-Driven Pyramid Attention Fusion Network (TPAFNet), leveraging a hybrid CNN-Transformer structure. Within this framework, we exploit the characteristics of atrous convolution to extract multi-scale information effectively, thereby enhancing the encoding results of the Transformer. Furthermore, we introduce the TPAF block in the encoder to seamlessly fuse channel and spatial feature attention from multi-scale feature inputs. In contrast to conventional skip connections that simply concatenate or add features, our decoder is enriched with a TPAF connection, elevating the integration of feature attention between low-level and high-level features. Additionally, we propose a low-level encoding shortcut from the original input to the decoder output, preserving more original image features and contributing to enhanced results. Finally, deep supervision is implemented using a novel CNN-based voxel-wise classifier to facilitate better network convergence. Experimental results demonstrate that TPAFNet significantly outperforms other state-of-the-art networks on two public datasets, indicating that our research can effectively improve the accuracy of medical image segmentation, thereby assisting doctors in making more precise diagnoses.
Collapse
|
83
|
Huang W, Zhang L, Wang Z, Wang L. Exploring Inherent Consistency for Semi-Supervised Anatomical Structure Segmentation in Medical Imaging. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:3731-3741. [PMID: 38743533 DOI: 10.1109/tmi.2024.3400840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2024]
Abstract
Due to the exorbitant expense of obtaining labeled data in the field of medical image analysis, semi-supervised learning has emerged as a favorable method for the segmentation of anatomical structures. Although semi-supervised learning techniques have shown great potential in this field, existing methods only utilize image-level spatial consistency to impose unsupervised regularization on data in label space. Considering that anatomical structures often possess inherent anatomical properties that have not been focused on in previous works, this study introduces the inherent consistency into semi-supervised anatomical structure segmentation. First, the prediction and the ground-truth are projected into an embedding space to obtain latent representations that encapsulate the inherent anatomical properties of the structures. Then, two inherent consistency constraints are designed to leverage these inherent properties by aligning these latent representations. The proposed method is plug-and-play and can be seamlessly integrated with existing methods, thereby collaborating to improve segmentation performance and enhance the anatomical plausibility of the results. To evaluate the effectiveness of the proposed method, experiments are conducted on three public datasets (ACDC, LA, and Pancreas). Extensive experimental results demonstrate that the proposed method exhibits good generalizability and outperforms several state-of-the-art methods.
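The idea of aligning latent representations of the prediction and the ground truth can be sketched as projecting both into a shared embedding space and penalizing their distance; this is a toy linear-projection version of a latent-consistency constraint, not the paper's actual projector or loss design:

```python
import numpy as np

def consistency_loss(pred, gt, w):
    """Project prediction and ground truth into an embedding space and
    align them with an MSE penalty. pred, gt: (N, D); w: (D, K)."""
    z_pred, z_gt = pred @ w, gt @ w      # latent representations
    return float(np.mean((z_pred - z_gt) ** 2))

rng = np.random.default_rng(0)
w = rng.normal(size=(10, 4))             # toy projection matrix
pred = rng.normal(size=(5, 10))
loss_same = consistency_loss(pred, pred, w)          # perfectly aligned
loss_diff = consistency_loss(pred, rng.normal(size=(5, 10)), w)
assert loss_same == 0.0 and loss_diff > 0.0
```

The loss vanishes only when the two embeddings coincide, which is the mechanism that nudges predictions toward anatomically plausible shapes.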
Collapse
|
84
|
Zhang M, Zhang Y, Liu S, Han Y, Cao H, Qiao B. Dual-attention transformer-based hybrid network for multi-modal medical image segmentation. Sci Rep 2024; 14:25704. [PMID: 39465274 PMCID: PMC11514281 DOI: 10.1038/s41598-024-76234-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2024] [Accepted: 10/11/2024] [Indexed: 10/29/2024] Open
Abstract
Accurate medical image segmentation plays a vital role in clinical practice. Convolutional neural networks and Transformers are the mainstream architectures for this task. However, convolutional neural networks lack the ability to model global dependency, while Transformers cannot extract local details. In this paper, we propose DATTNet (Dual ATTention Network), an encoder-decoder deep learning model for medical image segmentation. DATTNet is built in a hierarchical fashion with two novel components: (1) a Dual Attention module designed to model global dependency in the spatial and channel dimensions, and (2) a Context Fusion Bridge presented to remix feature maps at multiple scales and construct their correlations. Experiments on the ACDC, Synapse and Kvasir-SEG datasets were conducted to evaluate the performance of DATTNet. Our proposed model shows superior performance, effectiveness and robustness compared to SOTA methods, with mean Dice Similarity Coefficient scores of 92.2%, 84.5% and 89.1% on the cardiac, abdominal organ and gastrointestinal polyp segmentation tasks, respectively. The quantitative and qualitative results demonstrate that DATTNet performs well across different modalities (MRI, CT, and endoscopy) and can be generalized to various tasks; it therefore holds promise for practical clinical application. The code has been released at https://github.com/MhZhang123/DATTNet/tree/main .
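Spatial and channel attention of the kind the Dual Attention module combines can be illustrated with simple gating on a (C, H, W) feature map; this NumPy sketch shows the general mechanism (squeeze-and-excitation-style gates), not DATTNet's specific module:

```python
import numpy as np

def channel_attention(x):
    """Reweight channels by a sigmoid gate on their global average.
    x: (C, H, W) feature map."""
    pooled = x.mean(axis=(1, 2))                 # (C,) channel descriptor
    weights = 1.0 / (1.0 + np.exp(-pooled))      # sigmoid gate in (0, 1)
    return x * weights[:, None, None]

def spatial_attention(x):
    """Reweight pixels by a sigmoid gate on the channel-wise mean.
    x: (C, H, W) feature map."""
    pooled = x.mean(axis=0)                      # (H, W) spatial descriptor
    weights = 1.0 / (1.0 + np.exp(-pooled))
    return x * weights[None, :, :]

x = np.random.default_rng(0).normal(size=(4, 8, 8))
y = spatial_attention(channel_attention(x))      # dual attention in series
```

Because both gates lie in (0, 1), the composition selectively attenuates features along the channel and spatial dimensions, which is the effect the abstract describes.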
Collapse
Affiliation(s)
- Menghui Zhang
- Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, China
| | - Yuchen Zhang
- Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, China
| | - Shuaibing Liu
- Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, China
| | - Yahui Han
- Department of Pediatric Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, China
| | - Honggang Cao
- Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, China
| | - Bingbing Qiao
- Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, China.
| |
Collapse
|
85
|
Zeng X, Abdullah N, Sumari P. Self-supervised learning framework application for medical image analysis: a review and summary. Biomed Eng Online 2024; 23:107. [PMID: 39465395 PMCID: PMC11514943 DOI: 10.1186/s12938-024-01299-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2024] [Accepted: 10/17/2024] [Indexed: 10/29/2024] Open
Abstract
Manual annotation of medical image datasets is labor-intensive and prone to biases. Moreover, the rate at which image data accumulate significantly outpaces the speed of manual annotation, posing a challenge to the advancement of machine learning, particularly in the realm of supervised learning. Self-supervised learning is an emerging field that capitalizes on unlabeled data for training, thereby circumventing the need for extensive manual labeling. This learning paradigm generates synthetic pseudo-labels through pretext tasks, compelling the network to acquire image representations in a pseudo-supervised manner, and subsequently fine-tunes with a limited set of annotated data to achieve enhanced performance. This review begins with an overview of prevalent types and advancements in self-supervised learning, followed by an exhaustive and systematic examination of methodologies within the medical imaging domain from 2018 to September 2024. The review encompasses a range of medical image modalities, including CT, MRI, X-ray, histology, and ultrasound. It addresses specific tasks, such as classification, localization, segmentation, reduction of false positives, improvement of model performance, and enhancement of image quality. The analysis reveals a descending order in the volume of related studies, with CT and MRI leading the list, followed by X-ray, histology, and ultrasound. Except for CT and MRI, studies focusing on contrastive learning methods are more prevalent than those using generative learning approaches. The performance of MRI/ultrasound classification, and of segmentation across all image types, still has room for further exploration. Overall, this review can provide conceptual guidance for medical professionals seeking to combine self-supervised learning with their research.
Collapse
Affiliation(s)
- Xiangrui Zeng
- School of Computer Sciences, Universiti Sains Malaysia, USM, 11800, Pulau Pinang, Malaysia.
| | - Nibras Abdullah
- Faculty of Computer Studies, Arab Open University, Jeddah, Saudi Arabia.
| | - Putra Sumari
- School of Computer Sciences, Universiti Sains Malaysia, USM, 11800, Pulau Pinang, Malaysia
| |
Collapse
|
86
|
Xu Z, Li J, Yao Q, Li H, Zhao M, Zhou SK. Addressing fairness issues in deep learning-based medical image analysis: a systematic review. NPJ Digit Med 2024; 7:286. [PMID: 39420149 PMCID: PMC11487181 DOI: 10.1038/s41746-024-01276-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2024] [Accepted: 10/03/2024] [Indexed: 10/19/2024] Open
Abstract
Deep learning algorithms have demonstrated remarkable efficacy in various medical image analysis (MedIA) applications. However, recent research highlights a performance disparity in these algorithms when applied to specific subgroups, such as exhibiting poorer predictive performance in elderly females. Addressing this fairness issue has become a collaborative effort involving AI scientists and clinicians seeking to understand its origins and develop solutions for mitigation within MedIA. In this survey, we thoroughly examine the current advancements in addressing fairness issues in MedIA, focusing on methodological approaches. We introduce the basics of group fairness and subsequently categorize studies on fair MedIA into fairness evaluation and unfairness mitigation. Detailed methods employed in these studies are presented too. Our survey concludes with a discussion of existing challenges and opportunities in establishing a fair MedIA and healthcare system. By offering this comprehensive review, we aim to foster a shared understanding of fairness among AI researchers and clinicians, enhance the development of unfairness mitigation methods, and contribute to the creation of an equitable MedIA society.
Affiliation(s)
- Zikang Xu
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, PR China
- Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu, PR China
- Jun Li
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, PR China
- Qingsong Yao
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, PR China
- Han Li
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, PR China
- Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu, PR China
- Mingyue Zhao
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, PR China
- Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu, PR China
- S Kevin Zhou
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, PR China.
- Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu, PR China.
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, PR China.
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui, PR China.
87
Wang J, Liu C, Zhong Y, Liu X, Wang J. Deep plug-and-play MRI reconstruction based on multiple complementary priors. Magn Reson Imaging 2024; 115:110244. [PMID: 39419362 DOI: 10.1016/j.mri.2024.110244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2024] [Revised: 09/08/2024] [Accepted: 09/29/2024] [Indexed: 10/19/2024]
Abstract
Magnetic resonance imaging (MRI) is widely used in clinical diagnosis as a safe, non-invasive, high-resolution medical imaging technology, but long scanning times remain a major challenge. Undersampled reconstruction has become an important means of accelerating MRI by reducing the data sampling rate while maintaining high-quality imaging. However, traditional undersampled reconstruction techniques such as compressed sensing rely mainly on a single type of prior, such as sparsity or low-rankness, which limits their ability to capture the comprehensive features of images and leaves reconstructed images lacking in detail and key information. In this paper, we propose a deep plug-and-play MRI reconstruction model with multiple complementary priors, which combines traditional low-rank matrix recovery methods with deep learning and integrates global, local, and nonlocal priors to improve reconstruction quality. Specifically, we capture the global features of the image through the matrix nuclear norm, and use the deep convolutional neural network denoiser Swin-Conv-UNet (SCUNet) and the block-matching and 3-D filtering (BM3D) algorithm to preserve the local details and structural texture of the image, respectively. In addition, we use an efficient half-quadratic splitting (HQS) algorithm to solve the proposed model. Experimental results show that our proposed method outperforms existing popular methods in both visual quality and numerical results.
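The half-quadratic splitting scheme used to solve such models alternates a data-consistency step with a prior (proximal) step. A toy scalar sketch with an l1 prior standing in for the paper's nuclear-norm and denoiser priors (the fixed penalty weight and iteration count are illustrative choices):

```python
def soft(v, t):
    """Soft-thresholding: the proximal operator of t*|.|."""
    return max(v - t, 0.0) if v > 0 else min(v + t, 0.0)

def hqs_prox_l1(y, lam, mu=1.0, iters=60):
    """Half-quadratic splitting for min_x 0.5*(x - y)**2 + lam*|x|.

    Alternates a quadratic data-consistency step for x with a
    prior (proximal) step for the auxiliary variable z.
    """
    z = 0.0
    x = y
    for _ in range(iters):
        x = (y + mu * z) / (1.0 + mu)  # data-consistency step
        z = soft(x, lam / mu)          # prior step
    return x

print(hqs_prox_l1(2.0, 0.5))  # converges to soft(2.0, 0.5) = 1.5
```

In the full model the prior step is replaced by the nuclear-norm proximal operator (singular value thresholding) or a call to the SCUNet/BM3D denoiser, and the data-consistency step enforces agreement with the undersampled k-space measurements.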
Affiliation(s)
- Jianmin Wang
- School of Mathematics and Statistics, Southwest University, Chongqing 400715, China
- Chunyan Liu
- School of Mathematics and Statistics, Southwest University, Chongqing 400715, China
- Yuxiang Zhong
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Xinling Liu
- Key Laboratory of Optimization Theory and Applications at China West Normal University of Sichuan Province, Sichuan 637001, China
- Jianjun Wang
- School of Mathematics and Statistics, Southwest University, Chongqing 400715, China.
88
Jafari R, Verma R, Aggarwal V, Gupta RK, Singh A. Deep learning-based segmentation of left ventricular myocardium on dynamic contrast-enhanced MRI: a comprehensive evaluation across temporal frames. Int J Comput Assist Radiol Surg 2024; 19:2055-2062. [PMID: 38965165 DOI: 10.1007/s11548-024-03221-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Accepted: 06/24/2024] [Indexed: 07/06/2024]
Abstract
PURPOSE Cardiac perfusion MRI is vital for disease diagnosis, treatment planning, and risk stratification, with anomalies serving as markers of underlying ischemic pathologies. AI-assisted methods and tools enable accurate and efficient left ventricular (LV) myocardium segmentation on all DCE-MRI timeframes, offering a solution to the challenges posed by the multidimensional nature of the data. This study aims to develop and assess an automated method for LV myocardial segmentation on DCE-MRI data from a local hospital. METHODS The study uses retrospective DCE-MRI data from 55 subjects acquired at the local hospital with a 1.5 T MRI scanner. The dataset included subjects with and without cardiac abnormalities. The timepoint of the reference frame (post-contrast LV myocardium) was identified using the standard deviation across the temporal sequences. Iterative image registration of the other temporal images to this reference image was performed using Maxwell's demons algorithm. The registered stack was fed to a model built on the U-Net framework to predict the LV myocardium at all timeframes of the DCE-MRI. RESULTS The mean and standard deviation of the Dice similarity coefficient (DSC) for myocardial segmentation using the pre-trained network Net_cine is 0.78 ± 0.04; for the fine-tuned network Net_dyn, which predicts masks on all timeframes individually, it is 0.78 ± 0.03. The DSC for Net_dyn ranged from 0.71 to 0.93. The average DSC achieved for the reference frame is 0.82 ± 0.06. CONCLUSION The study proposed a fast and fully automated AI-assisted method to segment the LV myocardium on all timeframes of DCE-MRI data. The method is robust, its performance is independent of the intra-temporal sequence registration, and it can easily accommodate timeframes with potential registration errors.
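The Dice similarity coefficient reported above measures the overlap between a predicted and a reference mask. A minimal sketch on flattened binary masks (the toy masks are illustrative):

```python
def dice(pred, ref):
    """Dice similarity coefficient between two binary masks (flattened 0/1 lists)."""
    inter = sum(p * r for p, r in zip(pred, ref))
    total = sum(pred) + sum(ref)
    return 2.0 * inter / total if total else 1.0  # both masks empty -> perfect overlap

pred = [0, 1, 1, 1, 0, 0]
ref  = [0, 1, 1, 0, 1, 0]
print(dice(pred, ref))  # 2*2 / (3 + 3) = 0.666...
```

Per-timeframe DSC values like those reported for Net_dyn are obtained by applying this overlap measure to each temporal frame's mask independently and then averaging.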
Affiliation(s)
- Raufiya Jafari
- Centre for Biomedical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, 110016, India
- Radhakrishan Verma
- Department of Radiology, Fortis Memorial Research Institute, Gurugram, India
- Vinayak Aggarwal
- Department of Cardiology, Fortis Memorial Research Institute, Gurugram, India
- Rakesh Kumar Gupta
- Department of Radiology, Fortis Memorial Research Institute, Gurugram, India
- Anup Singh
- Centre for Biomedical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, 110016, India.
- Department of Biomedical Engineering, All India Institute of Medical Sciences, New Delhi, Delhi, India.
- Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, New Delhi, Delhi, India.
89
D'Angelo T, Bucolo GM, Kamareddine T, Yel I, Koch V, Gruenewald LD, Martin S, Alizadeh LS, Mazziotti S, Blandino A, Vogl TJ, Booz C. Accuracy and time efficiency of a novel deep learning algorithm for Intracranial Hemorrhage detection in CT Scans. LA RADIOLOGIA MEDICA 2024; 129:1499-1506. [PMID: 39123064 PMCID: PMC11480174 DOI: 10.1007/s11547-024-01867-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/01/2023] [Accepted: 08/01/2024] [Indexed: 08/12/2024]
Abstract
PURPOSE To evaluate a deep learning-based pipeline using a Dense-UNet architecture for the assessment of acute intracranial hemorrhage (ICH) on non-contrast computed tomography (NCCT) head scans after traumatic brain injury (TBI). MATERIALS AND METHODS This retrospective study used a prototype algorithm to evaluate 502 NCCT head scans with ICH in the context of TBI. Four board-certified radiologists evaluated the CT scans in consensus to establish the standard of reference for hemorrhage presence and ICH type. All CT scans were then independently analyzed by the algorithm and by a board-certified radiologist to assess the presence and type of ICH. Additionally, the time to diagnosis was measured for both methods. RESULTS A total of 405/502 patients presented with ICH, classified into the following types: intraparenchymal (n = 172); intraventricular (n = 26); subarachnoid (n = 163); subdural (n = 178); and epidural (n = 15). The algorithm showed high diagnostic accuracy (91.24%) for the assessment of ICH, with a sensitivity of 90.37% and a specificity of 94.85%. In distinguishing the different ICH types, the algorithm had a sensitivity of 93.47% and a specificity of 99.79%, with an accuracy of 98.54%. For detecting midline shift, the algorithm had a sensitivity of 100%. In terms of processing time, the algorithm was significantly faster than the radiologist's time to first diagnosis (15.37 ± 1.85 vs 277 ± 14 s, p < 0.001). CONCLUSION A novel deep learning algorithm can provide high diagnostic accuracy for the identification and classification of ICH on unenhanced CT scans, combined with short processing times. This has the potential to assist and improve radiologists' ICH assessment on NCCT scans, especially in emergency scenarios where time efficiency is needed.
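The reported sensitivity, specificity, and accuracy all derive from one confusion matrix. A sketch using a set of counts consistent with the reported rates (the per-cell counts are our reconstruction from the 405 positive and 97 negative scans, not values stated in the paper):

```python
def confusion_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and overall accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Hypothetical counts consistent with the reported rates:
# 405 ICH-positive and 97 ICH-negative scans out of 502.
sens, spec, acc = confusion_metrics(tp=366, fp=5, tn=92, fn=39)
print(round(100 * sens, 2), round(100 * spec, 2), round(100 * acc, 2))  # 90.37 94.85 91.24
```

Note that with 405 of 502 scans positive, accuracy is dominated by sensitivity; specificity rests on only 97 negative scans, so its confidence interval is correspondingly wider.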
Affiliation(s)
- Tommaso D'Angelo
- Diagnostic and Interventional Radiology Unit, BIOMORF Department, University of Messina, Messina, Italy.
- Department of Radiology and Nuclear Medicine, Erasmus MC, 3015 GD, Rotterdam, The Netherlands.
- Giuseppe M Bucolo
- Diagnostic and Interventional Radiology Unit, BIOMORF Department, University of Messina, Messina, Italy
- Division of Experimental Imaging, Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Tarek Kamareddine
- Division of Experimental Imaging, Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Ibrahim Yel
- Division of Experimental Imaging, Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Vitali Koch
- Division of Experimental Imaging, Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Leon D Gruenewald
- Division of Experimental Imaging, Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Simon Martin
- Division of Experimental Imaging, Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Leona S Alizadeh
- Division of Experimental Imaging, Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Department of Diagnostic and Interventional Radiology and Neuroradiology, Bundeswehr Central Hospital Koblenz, Koblenz, Germany
- Silvio Mazziotti
- Diagnostic and Interventional Radiology Unit, BIOMORF Department, University of Messina, Messina, Italy
- Alfredo Blandino
- Diagnostic and Interventional Radiology Unit, BIOMORF Department, University of Messina, Messina, Italy
- Thomas J Vogl
- Division of Experimental Imaging, Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Christian Booz
- Division of Experimental Imaging, Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt Am Main, Germany
90
Wang R, Mu Z, Wang J, Wang K, Liu H, Zhou Z, Jiao L. ASF-LKUNet: Adjacent-scale fusion U-Net with large kernel for multi-organ segmentation. Comput Biol Med 2024; 181:109050. [PMID: 39205343 DOI: 10.1016/j.compbiomed.2024.109050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2024] [Revised: 08/17/2024] [Accepted: 08/19/2024] [Indexed: 09/04/2024]
Abstract
In the multi-organ segmentation task of medical images, there are challenging issues such as complex backgrounds, blurred boundaries between organs, and large differences in organ volume. Because of the local receptive fields of conventional convolution operations, it is difficult to obtain desirable results by using them directly for multi-organ segmentation. While Transformer-based models capture global information, their high computational demands create a significant dependency on hardware. Meanwhile, depthwise convolutions with large kernels can capture global information at a lower computational cost. Therefore, to leverage a large receptive field while reducing model complexity, we propose a novel CNN-based approach, the adjacent-scale fusion U-Net with large kernel (ASF-LKUNet), for multi-organ segmentation. We use a u-shaped encoder-decoder as the base architecture of ASF-LKUNet. In the encoder path, we design a large-kernel residual block that combines large and small kernels and can simultaneously capture global and local features. Furthermore, for the first time, we propose an adjacent-scale fusion and large-kernel GRN channel attention mechanism that incorporates low-level details with high-level semantics through adjacent-scale features and then adaptively focuses on the more global and meaningful channel information. Extensive experiments and interpretability analyses are conducted on the Synapse multi-organ dataset (Synapse) and the ACDC cardiac multi-structure dataset (ACDC). Our proposed ASF-LKUNet achieves 88.41% and 89.45% DSC scores on the Synapse and ACDC datasets, respectively, with 17.96M parameters and 29.14 GFLOPs. These results show that our method achieves superior performance with lower complexity than ten competing approaches. Code and the trained models have been released on GitHub.
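The cost argument for large-kernel depthwise convolution can be made concrete by counting weights: a depthwise k x k layer over C channels needs C*k*k weights, versus C*C*k*k for a standard convolution. A quick sketch (the 13 x 13 kernel size and 64-channel width are illustrative choices, not necessarily those used in ASF-LKUNet):

```python
def conv_params(c_in, c_out, k, depthwise=False):
    """Weight count of a 2-D convolution layer (biases ignored)."""
    if depthwise:
        return c_in * k * k          # one k x k filter per input channel
    return c_in * c_out * k * k      # a full filter bank per output channel

c = 64
standard_3x3 = conv_params(c, c, 3)
large_depthwise = conv_params(c, c, 13, depthwise=True)
print(standard_3x3, large_depthwise)  # 36864 10816
```

Even with a kernel more than four times wider per side, the depthwise layer carries fewer weights than a standard 3 x 3 convolution, which is why large-kernel designs can enlarge the receptive field without a hardware-heavy parameter budget.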
Affiliation(s)
- Rongfang Wang
- School of Artificial Intelligence, Xidian University, China.
- Zhaoshan Mu
- School of Artificial Intelligence, Xidian University, China
- Jing Wang
- Department of Radiation Oncology, UTSW, United States of America
- Kai Wang
- Department of Radiation Oncology, UMMC, United States of America
- Hui Liu
- Department of Biostatistics Data Science, KUMC, United States of America
- Zhiguo Zhou
- Department of Biostatistics Data Science, KUMC, United States of America
- Licheng Jiao
- School of Artificial Intelligence, Xidian University, China
91
Li W, Bian R, Zhao W, Xu W, Yang H. Diversity matters: Cross-head mutual mean-teaching for semi-supervised medical image segmentation. Med Image Anal 2024; 97:103302. [PMID: 39154618 DOI: 10.1016/j.media.2024.103302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2023] [Revised: 08/08/2024] [Accepted: 08/09/2024] [Indexed: 08/20/2024]
Abstract
Semi-supervised medical image segmentation (SSMIS) has witnessed substantial advancements by leveraging limited labeled data and abundant unlabeled data. Nevertheless, existing state-of-the-art (SOTA) methods encounter challenges in accurately predicting labels for the unlabeled data, giving rise to disruptive noise during training and susceptibility to overfitting erroneous information. Moreover, applying perturbations to inaccurate predictions further impedes consistent learning. To address these concerns, we propose a novel cross-head mutual mean-teaching network (CMMT-Net) incorporating weak-strong data augmentations, thereby benefiting both co-training and consistency learning. More concretely, our CMMT-Net extends the cross-head co-training paradigm by introducing two auxiliary mean-teacher models, which yield more accurate predictions and provide supplementary supervision. The predictions derived from weakly augmented samples generated by one mean teacher are leveraged to guide the training of another student with strongly augmented samples. Furthermore, two distinct yet synergistic data perturbations at the pixel and region levels are introduced. We propose mutual virtual adversarial training (MVAT) to smooth the decision boundary and enhance feature representations, and a cross-set CutMix strategy to generate more diverse training samples for capturing inherent structural data information. Notably, CMMT-Net simultaneously implements data, feature, and network perturbations, amplifying model diversity and generalization performance. Experimental results on three publicly available datasets indicate that our approach yields remarkable improvements over previous SOTA methods across various semi-supervised scenarios. The code is available at https://github.com/Leesoon1984/CMMT-Net.
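Mean-teacher models like those at the core of CMMT-Net are maintained as exponential moving averages (EMA) of student weights rather than trained directly. A minimal sketch of the update (the decay value and toy one-parameter "models" are illustrative):

```python
def ema_update(teacher, student, alpha=0.99):
    """One EMA step: teacher <- alpha * teacher + (1 - alpha) * student."""
    return {k: alpha * teacher[k] + (1 - alpha) * student[k] for k in teacher}

# Toy one-parameter "models": the teacher drifts slowly toward the student.
teacher, student = {"w": 1.0}, {"w": 0.0}
for _ in range(3):
    teacher = ema_update(teacher, student)
print(round(teacher["w"], 6))  # 0.99**3 = 0.970299
```

Because the teacher averages over many past student states, its predictions are smoother and less noisy, which is what makes them usable as supervision targets for the strongly augmented student branch.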
Affiliation(s)
- Wei Li
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Ruifeng Bian
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Wenyi Zhao
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Weijin Xu
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Huihua Yang
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, 100876, China.
92
Han M, Luo X, Xie X, Liao W, Zhang S, Song T, Wang G, Zhang S. DMSPS: Dynamically mixed soft pseudo-label supervision for scribble-supervised medical image segmentation. Med Image Anal 2024; 97:103274. [PMID: 39043109 DOI: 10.1016/j.media.2024.103274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2023] [Revised: 05/11/2024] [Accepted: 07/09/2024] [Indexed: 07/25/2024]
Abstract
The high performance of deep learning on medical image segmentation relies on large-scale, pixel-level dense annotations, which impose a substantial burden on medical experts because the annotation process is laborious and time-consuming, particularly for 3D images. To reduce labeling cost while maintaining relatively satisfactory segmentation performance, weakly-supervised learning with sparse labels has attracted increasing attention. In this work, we present a scribble-based framework for medical image segmentation, called Dynamically Mixed Soft Pseudo-label Supervision (DMSPS). Concretely, we extend a backbone with an auxiliary decoder to form a dual-branch network that enhances the feature capture capability of the shared encoder. Considering that most pixels have no labels and that hard pseudo-labels tend to be over-confident, resulting in poor segmentation, we propose to use soft pseudo-labels, generated by dynamically mixing the decoders' predictions, as auxiliary supervision. To further enhance the model's performance, we adopt a two-stage approach in which the sparse scribbles are expanded based on low-uncertainty predictions from the first-stage model, providing more annotated pixels to train the second-stage model. Experiments on the ACDC dataset for cardiac structure segmentation, the WORD dataset for 3D abdominal organ segmentation, and the BraTS2020 dataset for 3D brain tumor segmentation showed that: (1) compared with the baseline, our method improved the average DSC from 50.46% to 89.51%, from 75.46% to 87.56%, and from 52.61% to 76.53% on the three datasets, respectively; (2) DMSPS achieved better performance than five state-of-the-art scribble-supervised segmentation methods and generalizes to different segmentation backbones. The code is available online at: https://github.com/HiLab-git/DMSPS.
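The soft pseudo-labels come from mixing the two decoders' class probabilities with a dynamically drawn weight. A minimal sketch (the uniform sampling of the mixing weight is an assumption; DMSPS's exact mixing schedule may differ):

```python
import random

def mixed_soft_pseudo_label(p_main, p_aux, alpha=None):
    """Mix two decoders' class probabilities into one soft pseudo-label.

    A mixing weight redrawn at each training step keeps the target soft,
    avoiding the over-confidence of hard (argmax) pseudo-labels.
    """
    if alpha is None:
        alpha = random.random()  # dynamic mixing weight
    return [alpha * a + (1 - alpha) * b for a, b in zip(p_main, p_aux)]

# Two decoders disagree on a pixel's class probabilities.
label = mixed_soft_pseudo_label([0.7, 0.3], [0.5, 0.5], alpha=0.6)
print(label)  # ~[0.62, 0.38], still a valid distribution
```

A hard pseudo-label here would be class 0 with full confidence; the mixed soft target preserves the decoders' disagreement as uncertainty in the auxiliary supervision signal.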
Affiliation(s)
- Meng Han
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Xiangde Luo
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China
- Xiangjiang Xie
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Wenjun Liao
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, Sichuan Cancer Center, Chengdu, China; School of Medicine, University of Electronic Science and Technology of China, Chengdu, China
- Shichuan Zhang
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, Sichuan Cancer Center, Chengdu, China
- Tao Song
- SenseTime Research, Shanghai, China
- Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China.
- Shaoting Zhang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China.
93
Wyburd MK, Dinsdale NK, Jenkinson M, Namburete AIL. Anatomically plausible segmentations: Explicitly preserving topology through prior deformations. Med Image Anal 2024; 97:103222. [PMID: 38936222 DOI: 10.1016/j.media.2024.103222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2022] [Revised: 05/23/2024] [Accepted: 05/25/2024] [Indexed: 06/29/2024]
Abstract
Since the rise of deep learning, new medical segmentation methods have rapidly been proposed with extremely promising results, often reporting marginal improvements on the previous state-of-the-art (SOTA) method. However, visual inspection often reveals errors, such as topological mistakes (e.g. holes or folds), that are not detected by traditional evaluation metrics. Incorrect topology can often lead to errors in clinically required downstream image processing tasks. There is therefore a need for new methods that focus on ensuring segmentations are topologically correct. In this work, we present TEDS-Net: a segmentation network that preserves anatomical topology whilst maintaining segmentation performance competitive with SOTA baselines. Further, we show how current SOTA segmentation methods can introduce problematic topological errors. TEDS-Net achieves anatomically plausible segmentation by using learnt topology-preserving fields to deform a prior. Traditionally, topology-preserving fields are described in the continuous domain and begin to break down when working in the discrete domain. Here, we introduce additional modifications that more strictly enforce topology preservation. We illustrate our method on an open-source medical heart dataset, performing both single- and multi-structure segmentation, and show that the generated fields contain no folding voxels, which corresponds to full topology preservation on individual structures, whilst vastly outperforming the other baselines on overall scene topology. The code is available at: https://github.com/mwyburd/TEDS-Net.
Affiliation(s)
- Madeleine K Wyburd
- Oxford Machine Learning Neuroimaging Lab (OMNI) Computer Science Department, University of Oxford, Oxford, OX1 3QG, United Kingdom.
- Nicola K Dinsdale
- Oxford Machine Learning Neuroimaging Lab (OMNI) Computer Science Department, University of Oxford, Oxford, OX1 3QG, United Kingdom
- Mark Jenkinson
- Wellcome Centre for Integrative Neuroimaging, Oxford, United Kingdom; Australian Institute for Machine Learning (AIML), Department of Computer Science, University of Adelaide, Adelaide, Australia; South Australian Health and Medical Research Institute (SAHMRI), North Terrace, Australia
- Ana I L Namburete
- Oxford Machine Learning Neuroimaging Lab (OMNI) Computer Science Department, University of Oxford, Oxford, OX1 3QG, United Kingdom; Wellcome Centre for Integrative Neuroimaging, Oxford, United Kingdom
94
Fortuni F, Ciliberti G, De Chiara B, Conte E, Franchin L, Musella F, Vitale E, Piroli F, Cangemi S, Cornara S, Magnesa M, Spinelli A, Geraci G, Nardi F, Gabrielli D, Colivicchi F, Grimaldi M, Oliva F. Advancements and applications of artificial intelligence in cardiovascular imaging: a comprehensive review. EUROPEAN HEART JOURNAL. IMAGING METHODS AND PRACTICE 2024; 2:qyae136. [PMID: 39776818 PMCID: PMC11705385 DOI: 10.1093/ehjimp/qyae136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/28/2024] [Accepted: 11/20/2024] [Indexed: 01/11/2025]
Abstract
Artificial intelligence (AI) is transforming cardiovascular imaging by offering advancements across multiple modalities, including echocardiography, cardiac computed tomography (CCT), cardiovascular magnetic resonance (CMR), interventional cardiology, nuclear medicine, and electrophysiology. This review explores the clinical applications of AI within each of these areas, highlighting its ability to improve patient selection, reduce image acquisition time, enhance image optimization, facilitate the integration of data from different imaging modalities and clinical sources, and improve diagnosis and risk stratification. Moreover, we illustrate both the advantages and the limitations of AI across these modalities, acknowledging that while AI can significantly aid in diagnosis, risk stratification, and workflow efficiency, it cannot replace the expertise of cardiologists. Instead, AI serves as a powerful tool to streamline routine tasks, allowing clinicians to focus on complex cases where human judgement remains essential. By accelerating image interpretation and improving diagnostic accuracy, AI holds great potential to improve patient care and clinical decision-making in cardiovascular imaging.
Affiliation(s)
- Federico Fortuni
- Cardiology and Cardiovascular Pathophysiology, S. Maria Della Misericordia Hospital, University of Perugia, Piazzale Giorgio Menghini, 3, 06129 Perugia, Italy
- Benedetta De Chiara
- Cardiology IV, ‘A. De Gasperis’ Department, ASST GOM Niguarda Ca’ Granda, University of Milano-Bicocca, Milan, Italy
- Edoardo Conte
- Clinical Cardiology and Cardiovascular Imaging Unit, Galeazzi-Sant'Ambrogio Hospital IRCCS, Milan, Italy
- Luca Franchin
- Department of Cardiology, Ospedale Santa Maria Della Misericordia, Azienda Sanitaria Universitaria Friuli Centrale, Udine, Italy
- Francesca Musella
- Dipartimento di Cardiologia, Ospedale Santa Maria Delle Grazie, Napoli, Italy
- Enrica Vitale
- U.O.C. Cardiologia, Azienda Ospedaliero-Universitaria Senese, Siena, Italy
- Francesco Piroli
- S.O.C. Cardiologia Ospedaliera, Presidio Ospedaliero Arcispedale Santa Maria Nuova, Azienda USL di Reggio Emilia—IRCCS, Reggio Emilia, Italy
- Stefano Cangemi
- U.O.S. Emodinamica, U.O.C. Cardiologia. Ospedale San Antonio Abate, Erice, Italy
- Stefano Cornara
- S.C. Cardiologia Levante, P.O. Levante—Ospedale San Paolo, Savona, Italy
- Michele Magnesa
- U.O.C. Cardiologia-UTIC, Ospedale ‘Monsignor R. Dimiccoli’, Barletta, Italy
- Antonella Spinelli
- U.O.C. Cardiologia Clinica e Riabilitativa, Presidio Ospedaliero San Filippo Neri—ASL Roma 1, Roma, Italy
- Giovanna Geraci
- U.O.C. Cardiologia, Ospedale San Antonio Abate, Erice, Italy
- Federico Nardi
- S.C. Cardiology, Santo Spirito Hospital, Casale Monferrato, AL 15033, Italy
- Domenico Gabrielli
- Department of Cardio-Thoraco-Vascular Sciences, Division of Cardiology, A.O. San Camillo-Forlanini, Rome, Italy
- Furio Colivicchi
- U.O.C. Cardiologia Clinica e Riabilitativa, Presidio Ospedaliero San Filippo Neri—ASL Roma 1, Roma, Italy
- Massimo Grimaldi
- U.O.C. Cardiologia, Ospedale Generale Regionale ‘F. Miulli’, Acquaviva Delle Fonti, Italy
- Fabrizio Oliva
- Cardiologia 1-Emodinamica, Dipartimento Cardiotoracovascolare ‘A. De Gasperis’, ASST Grande Ospedale Metropolitano Niguarda, Milano, Italy
- Presidente ANMCO (Associazione Nazionale Medici Cardiologi Ospedalieri), Firenze, Italy
- Consigliere Delegato per la Ricerca Fondazione per il Tuo cuore (Heart Care Foundation), Firenze, Italy
95
Wang H, Cao P, Yang J, Zaiane O. Narrowing the semantic gaps in U-Net with learnable skip connections: The case of medical image segmentation. Neural Netw 2024; 178:106546. [PMID: 39053196 DOI: 10.1016/j.neunet.2024.106546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2023] [Revised: 04/13/2024] [Accepted: 07/14/2024] [Indexed: 07/27/2024]
Abstract
Current state-of-the-art medical image segmentation techniques predominantly employ the encoder-decoder architecture. Despite its widespread use, this U-shaped framework exhibits limitations in effectively capturing multi-scale features through simple skip connections. In this study, we conduct a thorough analysis of the potential weaknesses of skip connections across various segmentation tasks, and identify two key semantic gaps that must be considered: the gap among multi-scale features at different encoding stages, and the gap between the encoder and the decoder. To bridge these semantic gaps, we introduce a novel segmentation framework that incorporates a Dual Attention Transformer (DAT) module for capturing channel-wise and spatial-wise relationships, and a Decoder-guided Recalibration Attention module for fusing DAT tokens and decoder features. These modules establish a principle of learnable connections that resolves the semantic gaps, leading to a high-performance segmentation model for medical images. Furthermore, this provides a new paradigm for effectively incorporating the attention mechanism into the traditional convolution-based architecture. Comprehensive experimental results demonstrate that our model achieves consistent, significant gains and outperforms state-of-the-art methods with relatively fewer parameters. This study contributes to the advancement of medical image segmentation by offering a more effective and efficient framework that addresses the limitations of current encoder-decoder architectures. Code: https://github.com/McGregorWwww/UDTransNet.
Affiliation(s)
- Haonan Wang
- School of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China.
- Peng Cao
- School of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China.
- Jinzhu Yang
- School of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China.
|
96
|
Androshchuk V, Montarello N, Lahoti N, Hill SJ, Zhou C, Patterson T, Redwood S, Niederer S, Lamata P, De Vecchi A, Rajani R. Evolving capabilities of computed tomography imaging for transcatheter valvular heart interventions - new opportunities for precision medicine. THE INTERNATIONAL JOURNAL OF CARDIOVASCULAR IMAGING 2024:10.1007/s10554-024-03247-z. [PMID: 39347934 DOI: 10.1007/s10554-024-03247-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/03/2024] [Accepted: 09/16/2024] [Indexed: 10/01/2024]
Abstract
The last decade has witnessed substantial growth in percutaneous treatment options for heart valve disease. The development of these innovative therapies has been mirrored by advances in multi-detector computed tomography (MDCT). MDCT plays a central role in obtaining detailed pre-procedural anatomical information, helping to inform clinical decisions surrounding procedural planning, improve clinical outcomes, and prevent potential complications. Improvements in MDCT image acquisition and processing techniques have led to increased application of advanced analytics in routine clinical care. Workflow implementation of patient-specific computational modeling, fluid dynamics, 3D printing, extended reality, extracellular volume mapping, and artificial intelligence is shaping the landscape for delivering patient-specific care. This review provides an insight into key innovations in the field of MDCT for planning transcatheter heart valve interventions.
Affiliation(s)
- Vitaliy Androshchuk
- School of Cardiovascular Medicine & Sciences, Faculty of Life Sciences & Medicine, King's College London, London, UK.
- Guy's & St Thomas' NHS Foundation Trust, King's College London, St Thomas' Hospital, The Reyne Institute, 4th Floor, Lambeth Wing, London, SE1 7EH, UK.
- Natalie Montarello
- Cardiovascular Department, St Thomas' Hospital, King's College London, London, UK.
- Nishant Lahoti
- Cardiovascular Department, St Thomas' Hospital, King's College London, London, UK.
- Samuel Joseph Hill
- School of Cardiovascular Medicine & Sciences, Faculty of Life Sciences & Medicine, King's College London, London, UK.
- Can Zhou
- Cardiovascular Department, St Thomas' Hospital, King's College London, London, UK.
- Tiffany Patterson
- Cardiovascular Department, St Thomas' Hospital, King's College London, London, UK.
- Simon Redwood
- School of Cardiovascular Medicine & Sciences, Faculty of Life Sciences & Medicine, King's College London, London, UK.
- Steven Niederer
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences & Medicine, King's College London, London, UK.
- Pablo Lamata
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences & Medicine, King's College London, London, UK.
- Adelaide De Vecchi
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences & Medicine, King's College London, London, UK.
- Ronak Rajani
- Cardiovascular Department, St Thomas' Hospital, King's College London, London, UK.
- School of Biomedical Engineering and Imaging Sciences, Faculty of Life Sciences & Medicine, King's College London, London, UK.
|
97
|
Bian C, Hu C, Cao N. Exploiting K-Space in Magnetic Resonance Imaging Diagnosis: Dual-Path Attention Fusion for K-Space Global and Image Local Features. Bioengineering (Basel) 2024; 11:958. [PMID: 39451334 PMCID: PMC11504126 DOI: 10.3390/bioengineering11100958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2024] [Revised: 09/04/2024] [Accepted: 09/21/2024] [Indexed: 10/26/2024] Open
Abstract
Magnetic resonance imaging (MRI) diagnosis, enhanced by deep learning methods, plays a crucial role in medical image processing, facilitating precise clinical diagnosis and optimal treatment planning. Current methodologies predominantly focus on feature extraction from the image domain, which often results in the loss of global features during down-sampling. The unique global representational capacity of MRI K-space, however, is often overlooked. In this paper, we present a novel MRI K-space-based global feature extraction and dual-path attention fusion network. Our proposed method extracts global features from MRI K-space data and fuses them with local features from the image domain using a dual-path attention mechanism, thereby achieving accurate MRI segmentation for diagnosis. Specifically, our method consists of four main components: an image-domain feature extraction module, a K-space-domain feature extraction module, a dual-path attention feature fusion module, and a decoder. We conducted ablation studies and comprehensive comparisons on the Brain Tumor Segmentation (BraTS) MRI dataset to validate the effectiveness of each module. The results demonstrate that our method exhibits superior segmentation performance, outperforming state-of-the-art methods with improvements of up to 63.82% on the HD95 distance metric. Furthermore, we performed generalization testing and complexity analysis on the Automated Cardiac Diagnosis Challenge (ACDC) MRI cardiac segmentation dataset. The findings indicate robust performance across different datasets, highlighting strong generalizability and favorable algorithmic complexity. Collectively, these results suggest that our proposed method holds significant potential for practical clinical applications.
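The global character of K-space exploited here follows from the Fourier relationship between K-space and the image domain: every K-space sample aggregates information from all image pixels. A minimal NumPy sketch of simulating K-space from an image-domain slice (illustrative only; the paper's feature-extraction modules are learned networks, and the helper names are assumptions):

```python
import numpy as np

def kspace_of(image):
    """Simulate K-space from a real-valued image-domain MRI slice via 2D FFT.
    fftshift moves the low-frequency (global-structure) components to the center."""
    return np.fft.fftshift(np.fft.fft2(image))

def kspace_global_feature(image):
    """Log-magnitude of K-space: a simple, fixed global representation of the slice."""
    return np.log1p(np.abs(kspace_of(image)))
```

Because each K-space coefficient depends on every pixel, even a shallow network operating on such features sees global context without deep down-sampling.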
Affiliation(s)
- Congchao Bian
- College of Information Science and Engineering, Hohai University, Nanjing 210098, China.
- Can Hu
- College of Computer Science and Software Engineering, Hohai University, Nanjing 210098, China.
- Ning Cao
- College of Information Science and Engineering, Hohai University, Nanjing 210098, China.
|
98
|
You C, Dai W, Liu F, Min Y, Dvornek NC, Li X, Clifton DA, Staib L, Duncan JS. Mine Your Own Anatomy: Revisiting Medical Image Segmentation With Extremely Limited Labels. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; PP:11136-11151. [PMID: 39269798 PMCID: PMC11903367 DOI: 10.1109/tpami.2024.3461321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/15/2024]
Abstract
Recent studies on contrastive learning have achieved remarkable performance in medical image segmentation by leveraging only a few labels. Existing methods mainly focus on instance discrimination and invariant mapping (i.e., pulling positive samples closer and pushing negative samples apart in the feature space). However, they face three common pitfalls: (1) tailness: medical image data usually follow an implicit long-tail class distribution, so blindly leveraging all pixels in training can lead to data imbalance and deteriorated performance; (2) consistency: it remains unclear whether a segmentation model has learned meaningful yet consistent anatomical features, owing to the intra-class variation between different anatomical features; and (3) diversity: the intra-slice correlations within the entire dataset have received significantly less attention. This motivates us to seek a principled approach for strategically making use of the dataset itself to discover similar yet distinct samples from different anatomical views. In this paper, we introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA) and make three contributions. First, prior work argues that every pixel matters equally to model training; we observe empirically that this alone is unlikely to define meaningful anatomical features, mainly due to the lack of a supervision signal. We present two simple solutions for learning invariances: stronger data augmentations and nearest neighbors. Second, we construct a set of objectives that encourage the model to decompose medical images into a collection of anatomical features in an unsupervised manner. Lastly, we demonstrate, both empirically and theoretically, the efficacy of MONA on three benchmark datasets, achieving new state-of-the-art results under different labeled semi-supervised settings. MONA makes minimal assumptions on domain expertise and hence constitutes a practical and versatile solution for medical image analysis. We provide PyTorch-like pseudo-code in the supplementary material.
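The instance-discrimination objective underlying such contrastive frameworks can be sketched with a generic InfoNCE-style loss. This is a standard formulation, not MONA's exact set of objectives, and the embeddings here stand in for per-pixel or per-region features:

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss: pull the positive embedding toward the anchor,
    push the negatives away.
    anchor, positive: (D,) vectors; negatives: (N, D); tau: temperature."""
    def unit(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    a, p, n = unit(anchor), unit(positive), unit(negatives)
    logits = np.concatenate(([a @ p], n @ a)) / tau  # cosine similarities
    logits -= logits.max()                           # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())
```

A well-aligned positive (e.g. a strong augmentation or nearest neighbor of the anchor, as the abstract suggests) yields a lower loss than a mismatched one, which is what drives the features toward anatomical consistency.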
|
99
|
Tilborghs S, Liang T, Raptis S, Ishikita A, Budts W, Dresselaers T, Bogaert J, Maes F, Wald RM, Van De Bruaene A. Automated biventricular quantification in patients with repaired tetralogy of Fallot using a three-dimensional deep learning segmentation model. J Cardiovasc Magn Reson 2024; 26:101092. [PMID: 39270800 DOI: 10.1016/j.jocmr.2024.101092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2024] [Revised: 08/22/2024] [Accepted: 09/03/2024] [Indexed: 09/15/2024] Open
Abstract
BACKGROUND Deep learning is the state-of-the-art approach for automated segmentation of the left ventricle (LV) and right ventricle (RV) in cardiovascular magnetic resonance (CMR) images. However, these models have mostly been trained and validated on CMR datasets of structurally normal hearts or cases with acquired cardiac disease, and are therefore not well suited to cases with congenital cardiac disease such as tetralogy of Fallot (TOF). We aimed to develop and validate a dedicated model with improved performance for LV and RV cavity and myocardium quantification in patients with repaired TOF. METHODS We trained a three-dimensional (3D) convolutional neural network (CNN) with 5-fold cross-validation using manually delineated end-diastolic (ED) and end-systolic (ES) short-axis image stacks obtained from a public dataset containing patients with no or acquired cardiac pathology (n = 100), an institutional dataset of TOF patients (n = 96), or both datasets mixed. Our method allows for missing labels in the training images to accommodate different ED and ES phases for LV and RV, as is commonly the case in TOF. The best-performing model was applied to all frames of a separate test set of TOF cases (n = 36), and ED and ES phases were automatically determined for LV and RV separately. The model was evaluated against a commercial software package (suiteHEART®, NeoSoft, Pewaukee, Wisconsin, US). RESULTS Training on the mixture of both datasets yielded the best agreement with the manual ground truth for the TOF cases, achieving median Dice similarity coefficients of (93.8%, 89.8%) for the LV cavity and (92.9%, 90.9%) for the RV cavity at (ED, ES), respectively, and of 80.9% and 61.8% for LV and RV myocardium at ED. The offset in automated ED and ES frame selection was 0.56 and 0.89 frames on average for LV and RV, respectively. No statistically significant differences were found between our model and the commercial software for LV quantification (two-sided Wilcoxon signed-rank test, p < 5%), while RV quantification was significantly improved with our model, which achieved a mean absolute error of 12 ml for the RV cavity compared to 36 ml for the commercial software. CONCLUSION We developed and validated a fully automatic segmentation and quantification approach for LV and RV, including RV mass, in patients with repaired TOF. Compared to a commercial software package, our approach is superior for RV quantification, indicating its potential in clinical practice.
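The Dice similarity coefficient reported above is, for binary masks, twice the intersection divided by the sum of the two mask volumes. A minimal sketch (a generic definition, not the study's evaluation code):

```python
import numpy as np

def dice_coefficient(pred, gt, eps=1e-6):
    """Dice similarity coefficient between two binary masks of equal shape.
    Returns a value in (0, 1]; eps avoids division by zero for empty masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)
```

Identical masks score ~1.0, disjoint masks ~0.0, and a prediction covering half of a two-pixel ground truth scores 2/3, which matches the intuition that Dice weights the overlap against both volumes.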
Affiliation(s)
- Sofie Tilborghs
- Department of Electrical Engineering, Division of Processing Speech and Images (ESAT/PSI), KU Leuven, Leuven, Belgium; Medical Imaging Research Center, UZ Leuven, Leuven, Belgium.
- Tiffany Liang
- Division of Cardiology, Peter Munk Cardiac Centre, University of Toronto, Toronto, Canada.
- Stavroula Raptis
- Division of Cardiology, Peter Munk Cardiac Centre, University of Toronto, Toronto, Canada.
- Ayako Ishikita
- Division of Cardiology, Peter Munk Cardiac Centre, University of Toronto, Toronto, Canada.
- Werner Budts
- Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium.
- Tom Dresselaers
- Department of Imaging and Pathology, Division of Radiology, KU Leuven, Leuven, Belgium.
- Jan Bogaert
- Department of Imaging and Pathology, Division of Radiology, KU Leuven, Leuven, Belgium.
- Frederik Maes
- Department of Electrical Engineering, Division of Processing Speech and Images (ESAT/PSI), KU Leuven, Leuven, Belgium; Medical Imaging Research Center, UZ Leuven, Leuven, Belgium.
- Rachel M Wald
- Division of Cardiology, Peter Munk Cardiac Centre, University of Toronto, Toronto, Canada; Department of Medical Imaging, Toronto General Hospital, University of Toronto, Toronto, Canada.
|
100
|
Qu Y, Lu T, Zhang S, Wang G. ScribSD+: Scribble-supervised medical image segmentation based on simultaneous multi-scale knowledge distillation and class-wise contrastive regularization. Comput Med Imaging Graph 2024; 116:102416. [PMID: 39018640 DOI: 10.1016/j.compmedimag.2024.102416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2024] [Revised: 06/16/2024] [Accepted: 07/04/2024] [Indexed: 07/19/2024]
Abstract
Although deep learning has achieved state-of-the-art performance for automatic medical image segmentation, it often requires a large amount of pixel-level manual annotation for training. Obtaining such high-quality annotations is time-consuming and requires specialized knowledge, which hinders the widespread adoption of models that depend on them for good segmentation performance. Scribble annotations can substantially reduce the annotation cost but often lead to poor segmentation performance due to insufficient supervision. In this work, we propose a novel framework named ScribSD+ that is based on multi-scale knowledge distillation and class-wise contrastive regularization for learning from scribble annotations. For a student network supervised by scribbles and a teacher network updated by an Exponential Moving Average (EMA) of the student, we first introduce multi-scale prediction-level Knowledge Distillation (KD), which leverages soft predictions of the teacher network to supervise the student at multiple scales, and then propose class-wise contrastive regularization, which encourages feature similarity within the same class and dissimilarity across different classes, thereby effectively improving the segmentation performance of the student network. Experimental results on the ACDC dataset for heart structure segmentation and a fetal MRI dataset for placenta and fetal brain segmentation demonstrate that our method significantly improves the student's performance and outperforms five state-of-the-art scribble-supervised learning methods. Consequently, the method has potential for reducing the annotation cost of developing deep learning models for clinical diagnosis.
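The teacher-student setup described above can be sketched in a few lines: an EMA teacher update, a partial cross-entropy that supervises only scribble-annotated pixels, and a prediction-level distillation term. The function names and flattened-pixel shapes are illustrative assumptions; the actual ScribSD+ losses operate on multi-scale network predictions:

```python
import numpy as np

def ema_update(teacher, student, alpha=0.99):
    """Update teacher weights as an exponential moving average of the student's.
    teacher, student: dicts mapping parameter name -> ndarray."""
    return {k: alpha * teacher[k] + (1 - alpha) * student[k] for k in teacher}

def partial_cross_entropy(prob, scribble, ignore=-1):
    """Cross-entropy on scribble-labeled pixels only.
    prob: (N, K) per-pixel class probabilities; scribble: (N,) labels,
    with `ignore` marking unlabeled pixels."""
    mask = scribble != ignore
    return -np.mean(np.log(prob[mask, scribble[mask]] + 1e-9))

def kd_loss(student_prob, teacher_prob):
    """Prediction-level distillation: cross-entropy of the student's
    probabilities against the teacher's soft labels."""
    return -np.mean(np.sum(teacher_prob * np.log(student_prob + 1e-9), axis=1))
```

In training, the student minimizes the partial cross-entropy plus the KD term (applied at several scales in the paper), while the teacher is only ever updated through `ema_update`, never by gradients.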
Affiliation(s)
- Yijie Qu
- University of Electronic Science and Technology of China, Chengdu, China.
- Tao Lu
- Sichuan Provincial People's Hospital, Chengdu, China.
- Shaoting Zhang
- University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI lab, Shanghai, China.
- Guotai Wang
- University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI lab, Shanghai, China.
|