1. Shi Z, Zhang R, Wei X, Yu C, Xie H, Hu Z, Chen X, Zhang Y, Xie B, Luo Z, Peng W, Xie X, Li F, Long X, Li L, Hu L. LUNETR: Language-Infused UNETR for precise pancreatic tumor segmentation in 3D medical image. Neural Netw 2025;187:107414. [PMID: 40117980] [DOI: 10.1016/j.neunet.2025.107414]
Abstract
The identification of early micro-lesions and adjacent blood vessels in CT scans plays a pivotal role in the clinical diagnosis of pancreatic cancer, considering its aggressive nature and high fatality rate. Despite the widespread application of deep learning methods for this task, several challenges persist: (1) the complex background environment in abdominal CT scans complicates the accurate localization of potential micro-tumors; (2) the subtle contrast between micro-lesions within pancreatic tissue and the surrounding tissues makes it challenging for models to capture these features accurately; and (3) tumors that invade adjacent blood vessels pose significant barriers to surgical procedures. To address these challenges, we propose LUNETR (Language-Infused UNETR), an advanced multimodal encoder model that combines textual and image information for precise medical image segmentation. The integration of an autoencoding language model with cross-attention enables our model to effectively leverage semantic associations between textual and image data, thereby facilitating precise localization of potential pancreatic micro-tumors. Additionally, we designed a Multi-scale Aggregation Attention (MSAA) module to comprehensively capture both spatial and channel characteristics of global multi-scale image data, enhancing the model's capacity to extract features from micro-lesions embedded within pancreatic tissue. Furthermore, to facilitate precise segmentation of pancreatic tumors and nearby blood vessels and to address the scarcity of multimodal medical datasets, we collaborated with Zhuzhou Central Hospital to construct a multimodal dataset comprising CT images and corresponding pathology reports from 135 pancreatic cancer patients. Our experimental results surpass current state-of-the-art models, with the incorporation of the semantic encoder improving the average Dice score for pancreatic tumor segmentation by 2.23%. For the Medical Segmentation Decathlon (MSD) liver and lung cancer datasets, our model achieved average Dice score improvements of 4.31% and 3.67%, respectively, demonstrating the efficacy of LUNETR.
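For orientation, the text-image fusion described here can be sketched with a generic cross-attention block in which volumetric patch tokens query report-token embeddings; all dimensions and names below are illustrative assumptions, not the authors' LUNETR code.

```python
import torch
import torch.nn as nn

class TextImageCrossAttention(nn.Module):
    """Minimal sketch: image patch tokens attend to text-report embeddings.

    Hypothetical dimensions; not the LUNETR implementation.
    """
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens, text_tokens):
        # img_tokens: (B, N_img, dim) from a 3D patch encoder
        # text_tokens: (B, N_txt, dim) from an autoencoding language model
        fused, _ = self.attn(query=img_tokens, key=text_tokens, value=text_tokens)
        return self.norm(img_tokens + fused)  # residual connection

# toy usage
img = torch.randn(2, 64, 256)   # e.g., 64 volumetric patch tokens
txt = torch.randn(2, 32, 256)   # e.g., 32 report tokens
print(TextImageCrossAttention()(img, txt).shape)  # torch.Size([2, 64, 256])
```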
Affiliation(s)
- Ziyang Shi
  - School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China
- Ruopeng Zhang
  - School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China
- Xiajun Wei
  - Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China
- Cheng Yu
  - Department of Radiology, The Second Xiangya Hospital, Central South University, Changsha 410011, China
- Haojie Xie
  - School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China
- Zhen Hu
  - Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China
- Xili Chen
  - School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China
- Yongzhong Zhang
  - School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China
- Bin Xie
  - School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China
- Zhengmao Luo
  - Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China
- Wanxiang Peng
  - Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China
- Xiaochun Xie
  - Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China
- Fang Li
  - Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China
- Xiaoli Long
  - Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China
- Lin Li
  - School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China
- Linan Hu
  - Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China
2. Han Z, Zhang Y, Liu L, Zhang Y. UltraNet: Unleashing the Power of Simplicity for Accurate Medical Image Segmentation. Interdiscip Sci 2025;17:375-389. [PMID: 39729189] [DOI: 10.1007/s12539-024-00682-3]
Abstract
The development of point-of-care diagnosis demanding accurate and rapid medical image segmentation has become increasingly urgent in recent years. Although some pioneering work has applied complex modules to improve segmentation performance, the resulting models are often heavy, which is impractical for the modern clinical setting of point-of-care diagnosis. To address these challenges, we propose UltraNet, a state-of-the-art lightweight model that achieves competitive performance in segmenting multiple parts of medical images with the fewest parameters and the lowest computational complexity. To extract sufficient feature information while replacing cumbersome modules, the Shallow Focus Float Block (ShalFoFo) and the Dual-stream Synergy Feature Extraction (DuSem) are proposed at the shallow and deep levels, respectively. ShalFoFo is designed to capture finer-grained features containing more pixels, while DuSem extracts distinct deep semantic features from two different perspectives. By jointly utilizing them, the accuracy and stability of UltraNet's segmentation results are enhanced. UltraNet's generalization ability was assessed on five datasets with different tasks. Compared to UNet, UltraNet reduces parameter count and computational complexity by factors of 46 and 26, respectively. Experimental results demonstrate that UltraNet achieves a state-of-the-art balance among parameters, computational complexity, and segmentation performance. Code is available at https://github.com/Ziii1/UltraNet.
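The reported parameter savings are straightforward to verify for any pair of models with a generic counting helper; this is a routine sketch, not code from the linked UltraNet repository.

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Total number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# e.g., comparing any two segmentation models:
# ratio = count_params(unet) / count_params(ultranet)
```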
Affiliation(s)
- Ziyi Han
  - School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266520, China
- Yuanyuan Zhang
  - School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266520, China
- Lin Liu
  - School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266520, China
- Yulin Zhang
  - College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, 266590, China
3. Govindharaj I, Ramesh T, Poongodai A, Senthilkumar KP, Udayasankaran P, Ravichandran S. Grey wolf optimization technique with U-shaped and capsule networks - A novel framework for glaucoma diagnosis. MethodsX 2025;14:103285. [PMID: 40236793] [PMCID: PMC11999292] [DOI: 10.1016/j.mex.2025.103285]
Abstract
Glaucoma is a leading cause of blindness worldwide, so accurate early diagnosis is essential for preventing major vision deterioration. Current glaucoma screening methods require expert handling and are time-intensive, delaying diagnosis and treatment. Our system addresses these difficulties through an automated glaucoma screening platform that combines advanced segmentation and classification approaches. A hybrid segmentation method combines the Grey Wolf Optimization Algorithm (GWOA) with U-shaped networks to precisely extract the optic disc region from retinal fundus images. Through GWOA, the network achieves optimal segmentation by adopting wolf-inspired behaviors such as circular and jumping movements to identify diverse image textures. Glaucoma classification relies on CapsNet, a deep learning model with strong image-recognition capability, to ensure precise diagnosis. The combined method delivers 96.01% segmentation and classification accuracy, outstripping traditional approaches and indicating strong potential for discovering glaucoma at early stages. This automated diagnosis system raises clinical accuracy by overcoming the limitations of manual screening, and its improved detection accuracy can help reduce glaucoma-induced blindness worldwide and demonstrates its capabilities in real clinical environments.
- Hybrid GWOA-UNet++ for precise optic disc segmentation.
- CapsNet-based classification for robust glaucoma detection.
- Achieved 96.01% accuracy, surpassing existing methods.
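The canonical grey wolf position update that GWO-based methods build on (the paper's GWOA adds further behaviors) can be sketched as follows; the objective and all hyperparameters are illustrative.

```python
import numpy as np

def gwo_step(wolves, fitness, a):
    """One canonical Grey Wolf Optimizer update (minimization).

    wolves: (n, d) candidate positions; a: scalar decreasing 2 -> 0.
    Standard GWO, not the paper's modified GWOA.
    """
    order = np.argsort([fitness(w) for w in wolves])
    alpha, beta, delta = wolves[order[:3]]          # three best leaders
    new = np.empty_like(wolves)
    for i, x in enumerate(wolves):
        estimates = []
        for leader in (alpha, beta, delta):
            r1, r2 = np.random.rand(x.size), np.random.rand(x.size)
            A, C = 2 * a * r1 - a, 2 * r2
            D = np.abs(C * leader - x)              # encircling distance
            estimates.append(leader - A * D)        # move relative to leader
        new[i] = np.mean(estimates, axis=0)         # average of the three pulls
    return new

# toy run on the sphere function
rng = np.random.default_rng(0)
wolves = rng.uniform(-5, 5, size=(20, 4))
for t in range(50):
    wolves = gwo_step(wolves, lambda w: np.sum(w**2), a=2 * (1 - t / 50))
```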
Affiliation(s)
- Govindharaj I
  - Department of Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Tamil Nadu, 600062, India
- Ramesh T
  - Department of Computer Science and Engineering, R.M.K Engineering College, Thiruvallur, Tamil Nadu, 601206, India
- Poongodai A
  - Department of Computer Science and Engineering (Artificial Intelligence), Madanapalle Institute of Technology & Science, Andhra Pradesh, 517325, India
- Senthilkumar K. P
  - Department of Artificial Intelligence and Data Science, Kings Engineering College, Chennai, Tamil Nadu, 602117, India
- Udayasankaran P
  - Department of Artificial Intelligence and Data Science, Kings Engineering College, Chennai, Tamil Nadu, 602117, India
- Ravichandran S
  - Department of Artificial Intelligence and Machine Learning, Kings Engineering College, Chennai, Tamil Nadu, 602117, India
4. Yu B, Ozdemir S, Dong Y, Shao W, Pan T, Shi K, Gong K. Robust whole-body PET image denoising using 3D diffusion models: evaluation across various scanners, tracers, and dose levels. Eur J Nucl Med Mol Imaging 2025;52:2549-2562. [PMID: 39912940] [PMCID: PMC12119227] [DOI: 10.1007/s00259-025-07122-4]
Abstract
PURPOSE Whole-body PET imaging plays an essential role in cancer diagnosis and treatment but suffers from low image quality. Traditional deep learning-based denoising methods work well for a specific acquisition but are less effective in handling diverse PET protocols. In this study, we proposed and validated a 3D Denoising Diffusion Probabilistic Model (3D DDPM) as a robust and universal solution for whole-body PET image denoising. METHODS The proposed 3D DDPM gradually injected noise into the images during the forward diffusion phase, allowing the model to learn to reconstruct the clean data during the reverse diffusion process. A 3D convolutional network was trained using high-quality data from the Biograph Vision Quadra PET/CT scanner to generate the score function, enabling the model to capture accurate PET distribution information extracted from the total-body datasets. The trained 3D DDPM was evaluated on datasets from four scanners, four tracer types, and six dose levels representing a broad spectrum of clinical scenarios. RESULTS The proposed 3D DDPM consistently outperformed 2D DDPM, 3D UNet, and 3D GAN, demonstrating its superior denoising performance across all tested conditions. Additionally, the model's uncertainty maps exhibited lower variance, reflecting its higher confidence in its outputs. CONCLUSIONS The proposed 3D DDPM can effectively handle various clinical settings, including variations in dose levels, scanners, and tracers, establishing it as a promising foundational model for PET image denoising. The trained 3D DDPM model of this work can be utilized off the shelf by researchers as a whole-body PET image denoising solution. The code and model are available at https://github.com/Miche11eU/PET-Image-Denoising-Using-3D-Diffusion-Model.
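The forward diffusion referred to here has the standard closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps; a minimal sketch with an illustrative linear noise schedule (the linked repository is the authoritative implementation):

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # linear schedule (illustrative)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)    # cumulative product alpha_bar_t

def q_sample(x0, t, noise=None):
    """Sample x_t ~ q(x_t | x_0) for a batch of 3D PET volumes."""
    if noise is None:
        noise = torch.randn_like(x0)
    ab = alpha_bar[t].view(-1, 1, 1, 1, 1)       # broadcast over (B, C, D, H, W)
    return ab.sqrt() * x0 + (1 - ab).sqrt() * noise

x0 = torch.randn(2, 1, 32, 64, 64)               # toy volumes
xt = q_sample(x0, t=torch.tensor([10, 500]))
```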
Affiliation(s)
- Boxiao Yu
  - J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL, USA
- Savas Ozdemir
  - Department of Radiology, University of Florida, Jacksonville, FL, USA
- Yafei Dong
  - Yale PET Center, Yale School of Medicine, New Haven, CT, USA
- Wei Shao
  - Department of Medicine, University of Florida, Gainesville, FL, USA
- Tinsu Pan
  - Department of Imaging Physics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Kuangyu Shi
  - Department of Nuclear Medicine, University of Bern, Bern, Switzerland
- Kuang Gong
  - J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL, USA
5. Jayapradha J, Haw SC, Palanichamy N, Ng KW, Thillaigovindhan SK. IM-LTS: An Integrated Model for Lung Tumor Segmentation using Neural Networks and IoMT. MethodsX 2025;14:103201. [PMID: 40026592] [PMCID: PMC11869539] [DOI: 10.1016/j.mex.2025.103201]
Abstract
In recent years, Internet of Medical Things (IoMT) and deep learning (DL) techniques have been broadly used in medical data processing and decision-making. Lung tumours, among the most dangerous medical conditions, require early diagnosis with a high precision rate. With that concern, this work develops an Integrated Model (IM-LTS) for Lung Tumor Segmentation using Neural Networks (NN) and the Internet of Medical Things (IoMT). The model integrates two architectures, MobileNetV2 and U-NET, for classifying the input lung data. The input CT lung images are pre-processed using Z-score normalization, and semantic features are extracted based on texture, intensity, and shape to inform the training network.
- The transfer learning technique is incorporated: a pre-trained NN serves as the encoder of the U-NET model for segmentation, and a Support Vector Machine classifies input lung data as benign or malignant.
- Results are measured with specificity, sensitivity, precision, accuracy, and F-score on benchmark datasets. Compared to existing lung tumor segmentation and classification models, the proposed model provides better results and supports earlier disease diagnosis.
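Z-score normalization of the input volumes, as mentioned above, is a one-line operation; a minimal sketch (the epsilon guard against constant images is an added assumption):

```python
import numpy as np

def zscore(volume: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Z-score normalization: zero mean, unit variance."""
    return (volume - volume.mean()) / (volume.std() + eps)
```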
Affiliation(s)
- Jayapradha J
  - Department of Computing Technologies, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu 603203, India
  - Faculty of Computing and Informatics, Multimedia University, Jalan Multimedia, 63100 Cyberjaya, Malaysia
- Su-Cheng Haw
  - Faculty of Computing and Informatics, Multimedia University, Jalan Multimedia, 63100 Cyberjaya, Malaysia
- Naveen Palanichamy
  - Faculty of Computing and Informatics, Multimedia University, Jalan Multimedia, 63100 Cyberjaya, Malaysia
- Kok-Why Ng
  - Faculty of Computing and Informatics, Multimedia University, Jalan Multimedia, 63100 Cyberjaya, Malaysia
- Senthil Kumar Thillaigovindhan
  - Department of Computing Technologies, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu 603203, India
6. Zhao Z, Li W, Ding X, Sun J, Xu LX. TTGA U-Net: Two-stage two-stream graph attention U-Net for hepatic vessel connectivity enhancement. Comput Med Imaging Graph 2025;122:102514. [PMID: 40020507] [DOI: 10.1016/j.compmedimag.2025.102514]
Abstract
Accurate segmentation of hepatic vessels is pivotal for guiding preoperative planning in ablation surgery utilizing CT images. While non-contrast CT images often lack observable vessels, we focus on segmenting hepatic vessels within preoperative MR images. However, the vascular structures depicted in MR images are susceptible to noise, leading to challenges in connectivity. To address this issue, we propose a two-stage two-stream graph attention U-Net (i.e., TTGA U-Net) for hepatic vessel segmentation. Specifically, the first-stage network employs a CNN or Transformer-based architecture to preliminarily locate the vessel position, followed by an improved superpixel segmentation method to generate graph structures based on the positioning results. The second-stage network extracts graph node features through two parallel branches of a graph spatial attention network (GAT) and a graph channel attention network (GCT), employing self-attention mechanisms to balance these features. The graph pooling operation is utilized to aggregate node information. Moreover, we introduce a feature fusion module instead of skip connections to merge the two graph attention features, providing additional information to the decoder effectively. We establish a novel well-annotated high-quality MR image dataset for hepatic vessel segmentation and validate the vessel connectivity enhancement network's effectiveness on this dataset and the public dataset 3D IRCADB. Experimental results demonstrate that our TTGA U-Net outperforms state-of-the-art methods, notably enhancing vessel connectivity.
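For orientation, the attention computation inside a generic single-head graph attention layer, the building block a GAT branch relies on, can be sketched as follows; this is not the paper's two-stream TTGA implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head graph attention sketch (Velickovic et al. style)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) binary adjacency
        z = self.W(h)                                      # (N, out_dim)
        n = z.size(0)
        pair = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                          z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pair).squeeze(-1), 0.2)    # raw scores
        e = e.masked_fill(adj == 0, float('-inf'))         # keep edges only
        att = torch.softmax(e, dim=-1)                     # per-node weights
        return att @ z                                     # aggregate neighbors

# toy chain graph with self-loops
h = torch.randn(5, 16)
adj = torch.eye(5) + torch.diag(torch.ones(4), 1) + torch.diag(torch.ones(4), -1)
out = GraphAttentionLayer(16, 8)(h, adj)
```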
Affiliation(s)
- Ziqi Zhao
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China
- Wentao Li
  - Fudan University Shanghai Cancer Center, Shanghai 200030, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200030, China
- Xiaoyi Ding
  - Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200030, China
- Jianqi Sun
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China; National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy (NERC-AMRT), and Med-X Research Institute, Shanghai Jiao Tong University, Shanghai 200030, China
- Lisa X Xu
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China; National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy (NERC-AMRT), and Med-X Research Institute, Shanghai Jiao Tong University, Shanghai 200030, China
7. Xena-Bosch C, Kodali S, Sahi N, Chard D, Llufriu S, Toosy AT, Martinez-Heras E, Prados F. Advances in MRI optic nerve segmentation. Mult Scler Relat Disord 2025;98:106437. [PMID: 40220726] [DOI: 10.1016/j.msard.2025.106437]
Abstract
Understanding optic nerve structure and monitoring changes within it can provide insights into neurodegenerative diseases like multiple sclerosis, in which optic nerves are often damaged by inflammatory episodes of optic neuritis. Over the past decades, interest in the optic nerve has increased, particularly with advances in magnetic resonance technology and the advent of deep learning solutions. These advances have significantly improved the visualisation and analysis of optic nerves, making it possible to detect subtle changes that aid the early diagnosis and treatment of optic nerve-related diseases and support the planning of radiotherapy interventions. Effective segmentation techniques, therefore, are crucial for enhancing the accuracy of predictive models and for planning interventions and treatment strategies. This comprehensive review of 27 peer-reviewed articles published between 2007 and 2024 examines the evolution of optic nerve magnetic resonance imaging segmentation over this period, tracing the development from intensity-based methods to the latest deep learning algorithms, including multi-atlas solutions using single or multiple image modalities.
Affiliation(s)
- Carla Xena-Bosch
  - e-Health Center, Universitat Oberta de Catalunya, Barcelona, Spain
- Srikirti Kodali
  - Queen Square MS Centre, Department of Neuroinflammation, UCL Institute of Neurology, Faculty of Brain Sciences, University College London, London, United Kingdom
- Nitin Sahi
  - Queen Square MS Centre, Department of Neuroinflammation, UCL Institute of Neurology, Faculty of Brain Sciences, University College London, London, United Kingdom
- Declan Chard
  - Queen Square MS Centre, Department of Neuroinflammation, UCL Institute of Neurology, Faculty of Brain Sciences, University College London, London, United Kingdom; National Institute for Health Research (NIHR) University College London Hospitals (UCLH) Biomedical Research Centre, United Kingdom
- Sara Llufriu
  - Neuroimmunology and Multiple Sclerosis Unit, Hospital Clínic de Barcelona, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS) and Universitat de Barcelona, Barcelona, Spain
- Ahmed T Toosy
  - Queen Square MS Centre, Department of Neuroinflammation, UCL Institute of Neurology, Faculty of Brain Sciences, University College London, London, United Kingdom
- Eloy Martinez-Heras
  - Neuroimmunology and Multiple Sclerosis Unit, Hospital Clínic de Barcelona, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS) and Universitat de Barcelona, Barcelona, Spain
- Ferran Prados
  - e-Health Center, Universitat Oberta de Catalunya, Barcelona, Spain; Queen Square MS Centre, Department of Neuroinflammation, UCL Institute of Neurology, Faculty of Brain Sciences, University College London, London, United Kingdom; Centre for Medical Image Computing, Department of Medical Physics and Biomedical Engineering, University College London, London, United Kingdom
8. Zhang X, Luan Y, Cui Y, Zhang Y, Lu C, Zhou Y, Zhang Y, Li H, Ju S, Tang T. SDS-Net: A Synchronized Dual-Stage Network for Predicting Patients Within 4.5-h Thrombolytic Treatment Window Using MRI. J Imaging Inform Med 2025;38:1681-1689. [PMID: 39466508] [DOI: 10.1007/s10278-024-01308-2]
Abstract
Timely and precise identification of acute ischemic stroke (AIS) within 4.5 h is imperative for effective treatment decision-making. This study aims to construct a novel network that utilizes limited datasets to recognize AIS patients within this critical window. We conducted a retrospective analysis of 265 AIS patients who underwent both fluid-attenuated inversion recovery (FLAIR) and diffusion-weighted imaging (DWI) within 24 h of symptom onset. Patients were categorized based on the time since stroke onset (TSS) into two groups: TSS ≤ 4.5 h and TSS > 4.5 h. The TSS was calculated as the time from stroke onset to MRI completion. We proposed a synchronized dual-stage network (SDS-Net) and a sequential dual-stage network (Dual-stage Net), which were comprised of infarct voxel identification and TSS classification stages. The models were trained on 181 patients and validated on an independent external cohort of 84 patients using metrics of area under the curve (AUC), sensitivity, specificity, and accuracy. A DeLong test was used to statistically compare the performance of the two models. SDS-Net achieved an accuracy of 0.844 with an AUC of 0.914 in the validation dataset, outperforming the Dual-stage Net, which had an accuracy of 0.822 and an AUC of 0.846. In the external test dataset, SDS-Net further demonstrated superior performance with an accuracy of 0.800 and an AUC of 0.879, compared to the accuracy of 0.694 and AUC of 0.744 of Dual-stage Net (P = 0.049). SDS-Net is a robust and reliable tool for identifying AIS patients within a 4.5-h treatment window using MRI. This model can assist clinicians in making timely treatment decisions, potentially improving patient outcomes.
Affiliation(s)
- Xiaoyu Zhang
  - Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
- Ying Luan
  - Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
- Ying Cui
  - Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
- Yi Zhang
  - Center of Interventional Radiology & Vascular Surgery, Department of Radiology, Zhongda Hospital, Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
- Chunqiang Lu
  - Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
- Yujie Zhou
  - Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
- Ying Zhang
  - Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
- Huiming Li
  - Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
- Shenghong Ju
  - Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
- Tianyu Tang
  - Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing, 210009, China
9. Chen J, Zhu Q, Xie B, Li T. ToPoMesh: accurate 3D surface reconstruction from CT volumetric data via topology modification. Med Biol Eng Comput 2025. [PMID: 40423893] [DOI: 10.1007/s11517-025-03381-3]
Abstract
Traditional computed tomography (CT) methods for 3D reconstruction face resolution limitations and require time-consuming post-processing workflows. While deep learning techniques improve segmentation accuracy, traditional voxel-based segmentation and surface reconstruction pipelines tend to introduce artifacts such as disconnected regions, topological inconsistencies, and stepped distortions. To overcome these challenges, we propose ToPoMesh, an end-to-end deep learning framework for direct reconstruction of high-fidelity surface meshes from CT volume data. Our approach introduces three core innovations: (1) accurate local and global shape modeling that preserves and enhances local feature information through residual connectivity and self-attention mechanisms in graph convolutional networks; (2) an adaptive variant density (Avd) mesh de-pooling strategy that dynamically optimizes the vertex distribution; and (3) a topology modification module that iteratively prunes erroneous surfaces and smooths boundaries via variable regularity terms to obtain finer mesh surfaces. Experiments on the LiTS, MSD pancreas tumor, MSD hippocampus, and MSD spleen datasets demonstrate that ToPoMesh outperforms state-of-the-art methods. Quantitative evaluations show a 57.4% reduction in Chamfer distance (liver) and a 0.47% improvement in F-score compared to end-to-end 3D reconstruction methods, while qualitative results confirm enhanced fidelity for thin structures and complex anatomical topologies versus segmentation frameworks. Importantly, our method eliminates the need for manual post-processing, enables direct reconstruction of 3D meshes from images, and can provide precise guidance for surgical planning and diagnosis.
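The Chamfer distance used in the evaluation can be computed between sampled surface point sets with k-d trees; normalization conventions vary across papers, so treat this sketch as one common variant.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets (N, 3) and (M, 3),
    using mean squared nearest-neighbor distances."""
    d_pq, _ = cKDTree(q).query(p)   # nearest neighbor in q for each p
    d_qp, _ = cKDTree(p).query(q)   # nearest neighbor in p for each q
    return float(np.mean(d_pq**2) + np.mean(d_qp**2))

p = np.random.rand(1000, 3)
q = np.random.rand(1200, 3)
print(chamfer_distance(p, q))
```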
Affiliation(s)
- Junjia Chen
  - College of Computer Science, Beijing University of Technology, Beijing, 100124, China
- Qing Zhu
  - College of Computer Science, Beijing University of Technology, Beijing, 100124, China
- Bowen Xie
  - Department of Urology, Peking University Third Hospital, Beijing, 100191, China
- Tianxing Li
  - College of Computer Science, Beijing University of Technology, Beijing, 100124, China
10. Kot WY, Au Yeung SY, Leung YY, Leung PH, Yang WF. Evolution of deep learning tooth segmentation from CT/CBCT images: a systematic review and meta-analysis. BMC Oral Health 2025;25:800. [PMID: 40420051] [DOI: 10.1186/s12903-025-05984-6]
Abstract
BACKGROUND Deep learning has been utilized to segment teeth from computed tomography (CT) or cone-beam CT (CBCT). However, its performance remains unclear owing to the multiplicity of models and diverse evaluation metrics. This systematic review and meta-analysis aims to evaluate the evolution and performance of deep learning in tooth segmentation. METHODS We systematically searched PubMed, Web of Science, Scopus, IEEE Xplore, arXiv.org, and ACM for studies investigating deep learning in human tooth segmentation from CT/CBCT. Included studies were assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Data were extracted for meta-analyses by random-effects models. RESULTS A total of 30 studies were included in the systematic review, and 28 of them were included in meta-analyses. Deep learning algorithms were categorized according to the backbone network: single-stage convolutional models, convolutional models with U-Net architecture, Transformer models, convolutional models with attention mechanisms, and combinations of multiple models. Convolutional models with U-Net architecture were the most commonly used, and the integration of attention mechanisms within convolutional models has become a new topic. Twenty-nine evaluation metrics were identified, with the Dice Similarity Coefficient (DSC) being the most popular. The pooled results were 0.93 [0.93, 0.93] for DSC, 0.86 [0.85, 0.87] for Intersection over Union (IoU), 0.22 [0.19, 0.24] for Average Symmetric Surface Distance (ASSD), 0.92 [0.90, 0.94] for sensitivity, 0.71 [0.26, 1.17] for 95% Hausdorff distance, and 0.96 [0.93, 0.98] for precision. No significant difference was observed in the segmentation of single-rooted versus multi-rooted teeth, and no obvious correlation between sample size and segmentation performance was observed. CONCLUSIONS Multiple deep learning algorithms have been successfully applied to tooth segmentation from CT/CBCT, and their evolution has been summarized and categorized according to backbone structure. In the future, studies with standardized protocols and open labelled datasets are needed.
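The two most frequently pooled metrics above, DSC and IoU, follow directly from mask overlaps; a minimal sketch for binary masks:

```python
import numpy as np

def dice_iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8):
    """Dice Similarity Coefficient and Intersection over Union
    for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = (2 * inter + eps) / (pred.sum() + gt.sum() + eps)
    iou = (inter + eps) / (np.logical_or(pred, gt).sum() + eps)
    return dice, iou
```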
Affiliation(s)
- Wai Ying Kot
  - Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- Sum Yin Au Yeung
  - Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- Yin Yan Leung
  - Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- Pui Hang Leung
  - Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
  - Division of Oral & Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
- Wei-Fa Yang
  - Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
  - Division of Oral & Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
11. Xu L, Halike A, Sen G, Sha M. Medical image segmentation model based on local enhancement driven global optimization. Sci Rep 2025;15:18281. [PMID: 40414982] [DOI: 10.1038/s41598-025-02393-1]
Abstract
In medical image segmentation, accurately identifying and locating the boundaries of pathological tissue is a challenging task. To address this issue, this paper proposes a medical image segmentation model named Local Enhancement Driven Global Optimization Network (LEGO-Net) and develops a Detail and Contour Recognition Module (DCRM) to accurately identify the boundaries of lesion tissue. The DCRM makes two main contributions. First, it improves the network's capability to identify the boundaries of diseased tissue by examining the intricate spatial relationships between row and column elements on the feature map. Second, by integrating local modeling with global modeling, the network not only captures the detailed local structural information of the lesion area but also takes into account the tissue's overall structure, thereby delineating the boundaries of the lesion tissue more effectively. Furthermore, to further augment the network's capacity to discern lesion information, this paper introduces a Channel Feature Enhancement Module (CFEM), which highlights the importance of elements that benefit foreground feature discrimination. Results demonstrate that the proposed network architecture effectively identifies and segments the boundaries of pathological tissues.
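The CFEM itself is not public in this listing; for orientation, the generic squeeze-and-excitation-style channel reweighting that channel-enhancement modules typically build on looks like this (all sizes illustrative):

```python
import torch
import torch.nn as nn

class ChannelWeighting(nn.Module):
    """Squeeze-and-excitation-style channel reweighting
    (generic pattern, not the paper's CFEM)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))  # global average pool -> (B, C)
        return x * w.unsqueeze(-1).unsqueeze(-1)  # emphasize useful channels

x = torch.randn(2, 64, 32, 32)
print(ChannelWeighting(64)(x).shape)
```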
Affiliation(s)
- Lianghui Xu
  - Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830017, Xinjiang, China
  - Institute of Medical Engineering Interdisciplinary Research, Xinjiang Medical University, Urumqi, 830017, Xinjiang, China
- Ayiguli Halike
  - Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830017, Xinjiang, China
  - Institute of Medical Engineering Interdisciplinary Research, Xinjiang Medical University, Urumqi, 830017, Xinjiang, China
- Gan Sen
  - Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830017, Xinjiang, China
  - Institute of Medical Engineering Interdisciplinary Research, Xinjiang Medical University, Urumqi, 830017, Xinjiang, China
- Mo Sha
  - Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830017, Xinjiang, China
  - Institute of Medical Engineering Interdisciplinary Research, Xinjiang Medical University, Urumqi, 830017, Xinjiang, China
12. Pishghadam N, Esmaeilyfard R, Paknahad M. Explainable deep learning for age and gender estimation in dental CBCT scans using attention mechanisms and multi task learning. Sci Rep 2025;15:18070. [PMID: 40413203] [PMCID: PMC12103566] [DOI: 10.1038/s41598-025-03305-z]
Abstract
Accurate and interpretable age estimation and gender classification are essential in forensic and clinical diagnostics, particularly when using high-dimensional medical imaging data such as Cone Beam Computed Tomography (CBCT). Traditional CBCT-based approaches often suffer from high computational costs and limited interpretability, reducing their applicability in forensic investigations. This study aims to develop a multi-task deep learning framework that enhances both accuracy and explainability in CBCT-based age estimation and gender classification using attention mechanisms. We propose a multi-task learning (MTL) model that simultaneously estimates age and classifies gender using panoramic slices extracted from CBCT scans. To improve interpretability, we integrate the Convolutional Block Attention Module (CBAM) and Grad-CAM visualization, highlighting relevant craniofacial regions. The dataset includes 2,426 CBCT images from individuals aged 7 to 23 years, and performance is assessed using Mean Absolute Error (MAE) for age estimation and accuracy for gender classification. The proposed model achieves an MAE of 1.08 years for age estimation and 95.3% accuracy in gender classification, significantly outperforming conventional CBCT-based methods. CBAM enhances the model's ability to focus on clinically relevant anatomical features, while Grad-CAM provides visual explanations, improving interpretability. Additionally, using panoramic slices instead of full 3D CBCT volumes reduces computational costs without sacrificing accuracy. Our framework improves both accuracy and interpretability in forensic age estimation and gender classification from CBCT images. By incorporating explainable AI techniques, this model provides a computationally efficient and clinically interpretable tool for forensic and medical applications.
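CBAM is a published, well-specified module (Woo et al., 2018): channel attention from pooled descriptors followed by a convolutional spatial gate. A compact sketch, with the usual reduction ratio and kernel size assumed as defaults:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module sketch:
    channel attention, then spatial attention."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                              # x: (B, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))             # avg-pooled descriptor
        mx = self.mlp(x.amax(dim=(2, 3)))              # max-pooled descriptor
        x = x * torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(s))         # spatial gate

print(CBAM(32)(torch.randn(1, 32, 16, 16)).shape)
```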
Affiliation(s)
- Najmeh Pishghadam
  - Computer Engineering and Information Technology Department, Shiraz University of Technology, Shiraz, Iran
- Rasool Esmaeilyfard
  - Computer Engineering and Information Technology Department, Shiraz University of Technology, Shiraz, Iran
- Maryam Paknahad
  - Oral and Dental Disease Research Center, Oral and Maxillofacial Radiology Department, Dental School, Shiraz University of Medical Sciences, Shiraz, Iran
13. Matsumoto K, Suzuki M, Ishihara K, Tokunaga K, Matsuda K, Chen J, Yamashiro S, Soejima H, Nakashima N, Kamouchi M. Performance of multimodal prediction models for intracerebral hemorrhage outcomes using real-world data. Int J Med Inform 2025;202:105989. [PMID: 40412140] [DOI: 10.1016/j.ijmedinf.2025.105989]
Abstract
BACKGROUND We aimed to develop and validate multimodal models integrating computed tomography (CT) images, text and tabular clinical data to predict poor functional outcomes and in-hospital mortality in patients with intracerebral hemorrhage (ICH). These models were designed to assist non-specialists in emergency settings with limited access to stroke specialists. METHODS A retrospective analysis of 527 patients with ICH admitted to a Japanese tertiary hospital between April 2019 and February 2022 was conducted. Deep learning techniques were used to extract features from three-dimensional CT images and unstructured data, which were then combined with tabular data to develop an L1-regularized logistic regression model to predict poor functional outcomes (modified Rankin scale score 3-6) and in-hospital mortality. The model's performance was evaluated by assessing discrimination metrics, calibration plots, and decision curve analysis (DCA) using temporal validation data. RESULTS The multimodal model utilizing both imaging and text data, such as medical interviews, exhibited the highest performance in predicting poor functional outcomes. In contrast, the model that combined imaging with tabular data, including physiological and laboratory results, demonstrated the best predictive performance for in-hospital mortality. These models exhibited high discriminative performance, with areas under the receiver operating characteristic curve (AUROCs) of 0.86 (95% CI: 0.79-0.92) and 0.91 (95% CI: 0.84-0.96) for poor functional outcomes and in-hospital mortality, respectively. Calibration was satisfactory for predicting poor functional outcomes, but requires refinement for mortality prediction. The models performed similarly to or better than conventional risk scores, and DCA curves supported their clinical utility. CONCLUSION Multimodal prediction models have the potential to aid non-specialists in making informed decisions regarding ICH cases in emergency departments as part of clinical decision support systems. Enhancing real-world data infrastructure and improving model calibration are essential for successful implementation in clinical practice.
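The final predictor described, an L1-regularized logistic regression over fused deep and tabular features, can be sketched with scikit-learn; the feature table and regularization strength below are synthetic stand-ins, not the study's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the fused feature table: deep image/text features
# concatenated with tabular clinical variables (all synthetic here).
rng = np.random.default_rng(0)
X = rng.normal(size=(527, 120))          # 527 patients, 120 fused features
y = rng.integers(0, 2, size=527)         # poor outcome yes/no (synthetic)

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
clf.fit(X, y)
print((clf.coef_ != 0).sum(), "features kept by L1 sparsity")
```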
Affiliation(s)
- Koutarou Matsumoto
  - Department of Health Care Administration and Management, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan; Institute for Medical Information Research and Analysis, Saiseikai Kumamoto Hospital, Kumamoto, Japan
- Masahiro Suzuki
  - Graduate Degree Program of Applied Data Sciences, Sophia University, Tokyo, Japan
- Kazuaki Ishihara
  - Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan
- Koki Tokunaga
  - Department of Pharmacy, Saiseikai Kumamoto Hospital, Kumamoto, Japan
- Katsuhiko Matsuda
  - Department of Radiology, Saiseikai Kumamoto Hospital, Kumamoto, Japan
- Jenhui Chen
  - Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan
- Shigeo Yamashiro
  - Division of Neurosurgery, Saiseikai Kumamoto Hospital, Kumamoto, Japan
- Hidehisa Soejima
  - Institute for Medical Information Research and Analysis, Saiseikai Kumamoto Hospital, Kumamoto, Japan
- Naoki Nakashima
  - Medical Information Center, Kyushu University Hospital, Fukuoka, Japan; Department of Medical Informatics, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Masahiro Kamouchi
  - Department of Health Care Administration and Management, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan; Center for Cohort Studies, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
14. Kousaka J, Iwane AH, Togashi Y. Automated cell structure extraction for 3D electron microscopy by deep learning. Sci Rep 2025;15:17481. [PMID: 40394179] [DOI: 10.1038/s41598-025-01763-z]
Abstract
Modeling the 3D structures of cells and tissues is crucial in biology. Sequential cross-sectional images from electron microscopy provide high-resolution intracellular structure information. The segmentation of complex cell structures remains a laborious manual task for experts, demanding time and effort. This bottleneck in analyzing biological images requires efficient and automated solutions. In this study, deep learning-based automated segmentation of biological images was explored to enable accurate reconstruction of the 3D structures of cells and organelles. An analysis system for the cell images of Cyanidioschyzon merolae, a primitive unicellular red alga, was constructed. This system utilizes sequential cross-sectional images captured by a focused ion beam scanning electron microscope (FIB-SEM). A U-Net was adopted and trained to identify and segment cell organelles from single-cell images. In addition, the Segment Anything Model (SAM) and a 3D watershed algorithm were employed to extract individual 3D images of each cell from large-scale microscope images containing numerous cells. Finally, the trained U-Net was applied to segment each structure within these 3D images. Through this procedure, the creation of 3D cell models could be fully automated. The adoption of other deep learning techniques and combinations of image processing methods will also be explored to enhance the segmentation accuracy further.
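Marker-based watershed of the kind mentioned is commonly driven by a distance transform; a minimal dimension-agnostic sketch (the min_distance value is an illustrative assumption, and this is a generic recipe, not the paper's exact SAM + 3D watershed pipeline):

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def separate_instances(mask: np.ndarray) -> np.ndarray:
    """Split touching objects in a binary (2D or 3D) mask with a
    distance-transform-driven watershed."""
    distance = ndi.distance_transform_edt(mask)
    coords = peak_local_max(distance, min_distance=10, labels=mask.astype(int))
    markers = np.zeros(distance.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)  # one seed per peak
    return watershed(-distance, markers, mask=mask)           # flood from seeds

mask = np.zeros((80, 80), dtype=bool)
mask[10:40, 10:40] = True   # two overlapping squares
mask[30:70, 30:70] = True
labels = separate_instances(mask)
print(labels.max(), "instances found")
```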
Affiliation(s)
- Jin Kousaka
  - Graduate School of Life Sciences, Ritsumeikan University, 1-1-1 Noji-higashi, 525-8577, Kusatsu, Shiga, Japan
- Atsuko H Iwane
  - Laboratory for Cell Field Structure, RIKEN Center for Biosystems Dynamics Research, 3-10-23 Kagamiyama, 739-0046, Higashi-Hiroshima, Hiroshima, Japan
  - Laboratory for Comprehensive Bioimaging, RIKEN Center for Biosystems Dynamics Research, 2-2-3 Minatojima-minamimachi, 650-0047, Kobe, Hyogo, Japan
  - Research Institute for Cell Design Medical Science, Yamaguchi University, 1-1-1 Minami-Kogushi, 755-8505, Ube, Yamaguchi, Japan
- Yuichi Togashi
  - Graduate School of Life Sciences, Ritsumeikan University, 1-1-1 Noji-higashi, 525-8577, Kusatsu, Shiga, Japan
  - Laboratory for Cell Field Structure, RIKEN Center for Biosystems Dynamics Research, 3-10-23 Kagamiyama, 739-0046, Higashi-Hiroshima, Hiroshima, Japan
  - Laboratory for Comprehensive Bioimaging, RIKEN Center for Biosystems Dynamics Research, 2-2-3 Minatojima-minamimachi, 650-0047, Kobe, Hyogo, Japan
15. Lee DK, Shin JS, Choi JS, Choi MH, Hong M. Exhale-Focused Thermal Image Segmentation Using Optical Flow-Based Frame Filtering and Transformer-Aided Deep Networks. Bioengineering (Basel) 2025;12:542. [PMID: 40428161] [PMCID: PMC12108674] [DOI: 10.3390/bioengineering12050542]
Abstract
Since the COVID-19 pandemic, interest in non-contact diagnostic technologies has grown, leading to increased research into remote biosignal monitoring. The respiratory rate, widely used in previous studies, offers limited insight into pulmonary volume. To address this, we propose a thermal imaging-based framework for respiratory segmentation aimed at non-invasive estimation of pulmonary function. The proposed method uses optical flow magnitude-based thresholding to automatically extract exhalation frames and segment them into frame sequences. A TransUNet-based network, combining a Convolutional Neural Network (CNN) encoder-decoder architecture with a Transformer module in the bottleneck, is trained on these sequences. The model's accuracy, precision, recall, IoU, Dice, and F1-score were 0.9832, 0.9833, 0.9830, 0.9651, 0.9822, and 0.9831, respectively, demonstrating high segmentation performance. The method enables respiratory volume to be estimated by detecting exhalation behavior, suggesting its potential as a non-contact tool for monitoring pulmonary function and estimating lung volume. Research on thermal imaging-based respiratory volume analysis remains limited; this study expands upon conventional respiratory rate-based approaches to provide a new direction for respiratory analysis using vision-based techniques.
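Optical-flow-magnitude frame filtering can be sketched with OpenCV's Farneback flow; the mean-magnitude statistic and threshold below are illustrative choices, not the paper's exact criterion.

```python
import cv2
import numpy as np

def exhale_frame_indices(frames, thresh=1.0):
    """Select frames whose mean Farneback flow magnitude exceeds a threshold.

    frames: list of single-channel uint8 thermal images.
    """
    keep = []
    for i in range(1, len(frames)):
        flow = cv2.calcOpticalFlowFarneback(frames[i - 1], frames[i], None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=-1)     # per-pixel motion magnitude
        if mag.mean() > thresh:                 # strong motion ~ exhalation
            keep.append(i)
    return keep
```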
Affiliation(s)
- Do-Kyeong Lee
  - Department of Software Convergence, Soonchunhyang University, Asan 31538, Republic of Korea
- Jae-Sung Shin
  - Department of Software Convergence, Soonchunhyang University, Asan 31538, Republic of Korea
- Jae-Sung Choi
  - Department of Internal Medicine, Cheonan Hospital, College of Medicine, Soonchunhyang University, Cheonan 31151, Republic of Korea
- Min-Hyung Choi
  - Department of Computer Science, Saint Louis University, St. Louis, MO 63103, USA
- Min Hong
  - Department of Computer Software Engineering, Soonchunhyang University, Asan 31538, Republic of Korea
16. Westfechtel SD, Kußmann K, Aßmann C, Huppertz MS, Siepmann RM, Lemainque T, Winter VR, Barabasch A, Kuhl CK, Truhn D, Nebelung S. AI in motion: the impact of data augmentation strategies on mitigating MRI motion artifacts. Eur Radiol 2025. [PMID: 40381000] [DOI: 10.1007/s00330-025-11670-6]
Abstract
OBJECTIVES Artifacts in clinical MRI can compromise the performance of AI models. This study evaluates how different data augmentation strategies affect an AI model's segmentation performance under variable artifact severity. MATERIALS AND METHODS We used an AI model based on the nnU-Net architecture to automatically quantify lower limb alignment using axial T2-weighted MR images. Three versions of the AI model were trained with different augmentation strategies: (1) no augmentation ("baseline"), (2) standard nnU-Net augmentations ("default"), and (3) "default" plus augmentations that emulate MR artifacts ("MRI-specific"). Model performance was tested on 600 MR image stacks (right and left; hip, knee, and ankle) from 20 healthy participants (mean age, 23 ± 3 years, 17 men), each imaged five times under standardized motion to induce artifacts. Two radiologists graded each stack's artifact severity as none, mild, moderate, or severe, and manually measured torsional angles. Segmentation quality was assessed using the Dice similarity coefficient (DSC), while torsional angles were compared between manual and automatic measurements using mean absolute deviation (MAD), intraclass correlation coefficient (ICC), and Pearson's correlation coefficient (r). Statistical analysis included parametric tests and a linear mixed-effects model. RESULTS MRI-specific augmentation resulted in slightly (yet not significantly) better performance than the default strategy. Segmentation quality decreased with increasing artifact severity, which was partially mitigated by default and MRI-specific augmentations (e.g., severe artifacts, proximal femur: DSC_baseline = 0.58 ± 0.22; DSC_default = 0.72 ± 0.22; DSC_MRI-specific = 0.79 ± 0.14 [p < 0.001]). These augmentations also maintained precise torsional angle measurements (e.g., severe artifacts, femoral torsion: MAD_baseline = 20.6 ± 23.5°; MAD_default = 7.0 ± 13.0°; MAD_MRI-specific = 5.7 ± 9.5° [p < 0.001]; ICC_baseline = -0.10 [p = 0.63; 95% CI: -0.61 to 0.47]; ICC_default = 0.38 [p = 0.08; -0.17 to 0.76]; ICC_MRI-specific = 0.86 [p < 0.001; 0.62 to 0.95]; r_baseline = 0.58 [p < 0.001; 0.44 to 0.69]; r_default = 0.68 [p < 0.001; 0.56 to 0.77]; r_MRI-specific = 0.86 [p < 0.001; 0.81 to 0.9]). CONCLUSION Motion artifacts negatively impact AI models, but general-purpose augmentations enhance robustness effectively; MRI-specific augmentations offer minimal additional benefit. KEY POINTS Question Motion artifacts negatively impact the performance of diagnostic AI models for MRI, but mitigation methods remain largely unexplored. Findings Domain-specific augmentation during training can improve the robustness and performance of a model for quantifying lower limb alignment in the presence of severe artifacts. Clinical relevance Excellent robustness and accuracy are crucial for deploying diagnostic AI models in clinical practice. Including domain knowledge in model training can benefit clinical adoption.
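One common way to emulate motion artifacts for augmentation is to corrupt random k-space phase-encode lines with phase ramps so that ghosting appears after the inverse FFT; this is a generic illustration, not the study's exact MRI-specific transform set.

```python
import numpy as np

def add_motion_artifact(img: np.ndarray, n_lines: int = 12,
                        max_shift: float = 3.0, seed: int = 0) -> np.ndarray:
    """Emulate motion: inconsistent phase ramps on random k-space lines
    produce the characteristic ghosting (illustrative recipe)."""
    rng = np.random.default_rng(seed)
    k = np.fft.fftshift(np.fft.fft2(img))
    rows = rng.choice(img.shape[0], size=n_lines, replace=False)
    freq = np.fft.fftshift(np.fft.fftfreq(img.shape[1]))
    for r in rows:
        shift = rng.uniform(-max_shift, max_shift)      # pixels of motion
        k[r, :] *= np.exp(-2j * np.pi * freq * shift)   # phase ramp = shift
    return np.abs(np.fft.ifft2(np.fft.ifftshift(k)))

img = np.random.rand(128, 128)
ghosted = add_motion_artifact(img)
```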
Affiliation(s)
- Simon D Westfechtel
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Kristoffer Kußmann
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Cederic Aßmann
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Marc S Huppertz
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Robert M Siepmann
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Teresa Lemainque
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Vera R Winter
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Alexandra Barabasch
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Christiane K Kuhl
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Daniel Truhn
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Sven Nebelung
  - Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
17. Pant D, Nytrø Ø, Leventhal BL, Clausen C, Koochakpour K, Stien L, Westbye OS, Koposov R, Røst TB, Frodl T, Skokauskas N. Secondary use of health records for prediction, detection, and treatment planning in the clinical decision support system: a systematic review. BMC Med Inform Decis Mak 2025;25:190. [PMID: 40380138] [DOI: 10.1186/s12911-025-03021-8]
Abstract
BACKGROUND This study aims to understand how secondary use of health records can support prediction, detection, treatment recommendations, and related tasks in clinical decision support systems. METHODS Articles mentioning the secondary use of EHRs for clinical utility, specifically in prediction, detection, treatment recommendations, and related decision support tasks, were reviewed. We extracted study details, methods, tools, technologies, utility, and performance. RESULTS We found that secondary uses of EHRs are primarily retrospective, mostly conducted using records from hospital EHRs, EHR data networks, and warehouses. EHRs vary in type and quality, making it critical to ensure their completeness and quality for clinical utility. Widely used methods include machine learning, statistics, simulation, and analytics. Secondary use of health records can be applied in any area of medicine; the selection of data, cohorts, tools, technology, and methods depends on the specific clinical utility. CONCLUSION The process for secondary use of health records should include three key steps: (1) validation of the quality of EHRs; (2) use of methods, tools, and technologies with proactive training; and (3) multidimensional assessment of the results and their usefulness. TRIAL REGISTRATION PROSPERO registration number CRD42023409582.
Collapse
Affiliation(s)
- Dipendra Pant
- Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway.
- Department of Child and Adolescent Psychiatry, Clinic of Mental Health Care, St. Olav University Hospital, Trondheim, Norway.
| | - Øystein Nytrø
- Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
- Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway
| | | | - Carolyn Clausen
- Regional Centre for Child and Youth Mental Health and Child Welfare (RKBU Central Norway), Department of Mental Health, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
| | - Kaban Koochakpour
- Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
| | - Line Stien
- Regional Centre for Child and Youth Mental Health and Child Welfare (RKBU Central Norway), Department of Mental Health, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
| | - Odd Sverre Westbye
- Regional Centre for Child and Youth Mental Health and Child Welfare (RKBU Central Norway), Department of Mental Health, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Department of Child and Adolescent Psychiatry, Clinic of Mental Health Care, St. Olav University Hospital, Trondheim, Norway
| | - Roman Koposov
- Regional Centre for Child and Youth Mental Health and Child Welfare (RKBU North), UiT The Arctic University of Norway, Tromsø, Norway
| | - Thomas Brox Røst
- Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
- Vivit AS, Trondheim, Norway
| | | | - Norbert Skokauskas
- Regional Centre for Child and Youth Mental Health and Child Welfare (RKBU Central Norway), Department of Mental Health, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
| |
Collapse
|
18
|
Lv L, Han X, Sun Z, Li Z, Wang X, Jiang T, Liu Y, Li T, Xu J, You L, Yao G, Sun FR, Xing J. SSL-DA: Semi-and Self-Supervised Learning with Dual Attention for Echocardiogram Segmentation. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2025:10.1007/s10278-025-01532-4. [PMID: 40355692 DOI: 10.1007/s10278-025-01532-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/10/2024] [Revised: 04/13/2025] [Accepted: 04/29/2025] [Indexed: 05/14/2025]
Abstract
Echocardiogram analysis plays a crucial role in assessing and diagnosing cardiac function, providing essential data to support medical diagnoses of heart disease. A key task, accurately identifying and segmenting the left ventricle (LV) in echocardiograms, remains challenging and labor-intensive. Current automated cardiac segmentation methods often lack the necessary accuracy and reproducibility, while semi-automated or manual annotations are excessively time-consuming. To address these limitations, we propose a novel segmentation framework, semi- and self-supervised learning with dual attention (SSL-DA), for echocardiogram segmentation. We start with a temporal masking network for pre-training, which captures valuable information such as echocardiogram periodicity and provides optimized initialization parameters for LV segmentation. We then employ a semi-supervised network to automatically segment the left ventricle, enhancing the model's learning with channel and spatial attention mechanisms to capture global channel dependencies and spatial dependencies across annotations. We evaluated SSL-DA on the publicly available EchoNet-Dynamic dataset, achieving a Dice similarity coefficient of 93.34% (95% CI, 93.23-93.46%), outperforming most prior CNN-based models. To further assess the generalization ability of SSL-DA, we conducted ablation experiments on the CAMUS dataset. Experimental results confirm that SSL-DA can quickly and accurately segment the left ventricle in echocardiograms, showing its potential for robust clinical application.
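A sketch of the channel-plus-spatial attention idea described above, in the spirit of CBAM-style dual attention; this is a generic PyTorch illustration under stated assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Channel attention followed by spatial attention for 2D feature maps."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Channel attention: squeeze spatial dims, excite channels
        avg = self.channel_mlp(x.mean(dim=(2, 3)))
        mx = self.channel_mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # Spatial attention: pool across channels, weight locations
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(s))
```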
Collapse
Affiliation(s)
- Lin Lv
- School of Integrated Circuits, Shandong University, 1500 Shunhua Road, Jinan, 250101, Shandong, China
| | - Xing Han
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Zhengxiang Sun
- Faculty of Science, The University of Sydney, Sydney, NSW, Australia
| | - Zhaoguang Li
- School of Integrated Circuits, Shandong University, 1500 Shunhua Road, Jinan, 250101, Shandong, China
| | - Xiuying Wang
- Faculty of Engineering, The University of Sydney, Sydney, NSW, Australia
| | - Tong Jiang
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Beijing University of Chinese Medicine, Beijing, China
| | - Yiren Liu
- School of Integrated Circuits, Shandong University, 1500 Shunhua Road, Jinan, 250101, Shandong, China
| | - Tianshu Li
- School of Integrated Circuits, Shandong University, 1500 Shunhua Road, Jinan, 250101, Shandong, China
| | - Jingjing Xu
- School of Integrated Circuits, Shandong University, 1500 Shunhua Road, Jinan, 250101, Shandong, China
| | - Liangzhen You
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Beijing University of Chinese Medicine, Beijing, China
| | - Guihua Yao
- Cardiology Department, Qilu Hospital of Shandong University (Qingdao), Qingdao, China
| | - Feng-Rong Sun
- School of Integrated Circuits, Shandong University, 1500 Shunhua Road, Jinan, 250101, Shandong, China.
| | - Jianping Xing
- School of Integrated Circuits, Shandong University, 1500 Shunhua Road, Jinan, 250101, Shandong, China.
| |
Collapse
|
19
|
Hossain KF, Kamran SA, Ong J, Tavakkoli A. Enhancing efficient deep learning models with multimodal, multi-teacher insights for medical image segmentation. Sci Rep 2025; 15:15948. [PMID: 40335579 PMCID: PMC12059033 DOI: 10.1038/s41598-025-91430-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2024] [Accepted: 02/20/2025] [Indexed: 05/09/2025] Open
Abstract
The rapid evolution of deep learning has dramatically enhanced the field of medical image segmentation, leading to the development of models with unprecedented accuracy in analyzing complex medical images. Deep learning-based segmentation holds significant promise for advancing clinical care and enhancing the precision of medical interventions. However, these models' high computational demand and complexity present significant barriers to their application in resource-constrained clinical settings. To address this challenge, we introduce Teach-Former, a novel knowledge distillation (KD) framework that leverages a Transformer backbone to effectively condense the knowledge of multiple teacher models into a single, streamlined student model. Moreover, it excels in the contextual and spatial interpretation of relationships across multimodal images for more accurate and precise segmentation. Teach-Former stands out by harnessing multimodal inputs (CT, PET, MRI) and distilling the final predictions and the intermediate attention maps, ensuring a richer spatial and contextual knowledge transfer. Through this technique, the student model inherits the capacity for fine segmentation while operating with a significantly reduced parameter set and computational footprint. Additionally, a novel training strategy optimizes knowledge transfer, ensuring the student model captures the intricate mapping of features essential for high-fidelity segmentation. The efficacy of Teach-Former has been tested on two extensive multimodal datasets, HECKTOR21 and PI-CAI22, encompassing various image types. The results demonstrate that our KD strategy reduces model complexity and surpasses existing state-of-the-art methods to achieve superior performance. The findings of this study indicate that the proposed methodology could facilitate efficient segmentation of complex multimodal medical images, supporting clinicians in achieving more precise diagnoses and comprehensive monitoring of pathological conditions ( https://github.com/FarihaHossain/TeachFormer ).
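A hedged sketch of how multiple teachers' predictions and attention maps might be distilled into one student, as the abstract describes; the function name, temperature, and weighting scheme are assumptions, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, student_attn,
                          teacher_attn_list, T: float = 2.0, alpha: float = 0.5):
    """Distill averaged teacher predictions plus intermediate attention maps."""
    # Soft-label term: match the student's softened distribution to the
    # average of the teachers' softened distributions.
    teacher_probs = torch.stack(
        [F.softmax(t / T, dim=1) for t in teacher_logits_list]).mean(dim=0)
    pred_term = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                         teacher_probs, reduction="batchmean") * T * T
    # Attention term: L2 distance between normalized attention maps.
    attn_term = sum(F.mse_loss(F.normalize(student_attn.flatten(1), dim=1),
                               F.normalize(t.flatten(1), dim=1))
                    for t in teacher_attn_list) / len(teacher_attn_list)
    return alpha * pred_term + (1 - alpha) * attn_term
```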
Collapse
Affiliation(s)
- Khondker Fariha Hossain
- Department of Computer Science and Engineering, University of Nevada, Reno, Reno, NV, 89557, USA.
| | - Sharif Amit Kamran
- Department of Computer Science and Engineering, University of Nevada, Reno, Reno, NV, 89557, USA
| | - Joshua Ong
- Department of Ophthalmology and Visual Sciences, University of Michigan Kellogg Eye Center, Ann Arbor, MI, USA
| | - Alireza Tavakkoli
- Department of Computer Science and Engineering, University of Nevada, Reno, Reno, NV, 89557, USA
| |
Collapse
|
20
|
Verma P, Kumar H, Shukla DK, Satpathy S, Alsekait DM, Khalaf OI, Alzoubi A, Alqadi BS, AbdElminaam DS, Kushwaha A, Singh J. V3DQutrit a volumetric medical image segmentation based on 3D qutrit optimized modified tensor ring model. Sci Rep 2025; 15:15785. [PMID: 40328837 PMCID: PMC12056032 DOI: 10.1038/s41598-025-00537-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2024] [Accepted: 04/29/2025] [Indexed: 05/08/2025] Open
Abstract
This paper introduces 3D-QTRNet, a novel quantum-inspired neural network for volumetric medical image segmentation. Unlike conventional CNNs, which suffer from slow convergence and high complexity, and quantum-inspired neural networks (QINNs), which are limited to grayscale segmentation, our approach leverages qutrit encoding and tensor ring decomposition. These techniques improve segmentation accuracy, optimize memory usage, and accelerate model convergence. The proposed model demonstrates superior performance on the BRATS19 and Spleen datasets, outperforming state-of-the-art CNN and quantum models in terms of Dice similarity and segmentation precision. This work bridges the gap between quantum computing and medical imaging, offering a scalable solution for real-world applications.
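To make the tensor ring idea concrete, here is a minimal NumPy sketch of reconstructing a full tensor from tensor-ring cores; the core shapes and ranks are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def tensor_ring_reconstruct(cores):
    """Rebuild a full tensor from tensor-ring cores.

    Each core G_k has shape (r_k, n_k, r_{k+1}) and the ring closes,
    i.e. the last bond rank equals the first; the full tensor entry is a
    trace of the product of the matrix slices G_k[:, i_k, :].
    """
    result = cores[0]                       # (r1, n1, r2)
    for core in cores[1:]:
        # Contract the trailing bond index with the next core's leading one.
        result = np.einsum('a...b,bnc->a...nc', result, core)
    # Close the ring: trace over the first and last bond dimensions.
    return np.einsum('a...a->...', result)

# Toy check: three cores with bond ranks (2, 3, 4) rebuild a 5x6x7 tensor
cores = [np.random.rand(2, 5, 3), np.random.rand(3, 6, 4), np.random.rand(4, 7, 2)]
print(tensor_ring_reconstruct(cores).shape)  # (5, 6, 7)
```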
Collapse
Affiliation(s)
| | - Harish Kumar
- CSE Department, NIT Kurukshetra, Kurukshetra, Haryana, India
| | | | - Sambit Satpathy
- CSE Department, Galgotias College of Engineering and Technology, Greater Noida, Uttar Pradesh, India
| | - Deema Mohammed Alsekait
- Department of Information Technology, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, 11671, Riyadh, Saudi Arabia.
| | - Osamah Ibrahim Khalaf
- Department of Solar, Al-Nahrain Research Center for Renewable Energy, Al-Nahrain University, Jadriya, Baghdad, Iraq
| | - Ala Alzoubi
- Faculty of Information Technology, Applied Science Private University, Amman, 11931, Jordan
| | - Basma S Alqadi
- Computer Science Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia
| | - Diaa Salama AbdElminaam
- Jadara Research Center, Jadara University, Irbid, 21110, Jordan
- Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
| | - Arvinda Kushwaha
- CSE Department, Galgotias College of Engineering and Technology, Greater Noida, Uttar Pradesh, India
| | | |
Collapse
|
21
|
Li Y, Deng J, Zhang Y. Universal mapping and patient-specific prior implicit neural representation for enhanced high-resolution MRI in MRI-guided radiotherapy. Med Phys 2025. [PMID: 40317743 DOI: 10.1002/mp.17863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2024] [Revised: 03/20/2025] [Accepted: 04/11/2025] [Indexed: 05/07/2025] Open
Abstract
BACKGROUND Magnetic resonance imaging (MRI), known for its superior soft tissue contrast, plays a crucial role in radiation therapy (RT). The introduction of MR-LINAC systems enables the use of on-board MRI for adaptive radiotherapy (ART) on the day of treatment to maximize treatment accuracy. PURPOSE Due to patient comfort considerations and the time constraints associated with ART, reducing the resolution of on-board MRI to accelerate image acquisition can improve efficiency, especially when acquiring multiple MRIs with different contrast weightings. However, the low-resolution imaging makes it challenging to identify key anatomical structures, potentially limiting treatment precision. To address this challenge, super-resolution of on-board MRI has emerged as a viable solution. METHODS To achieve super-resolution for on-board MRI, this study proposed a universal anatomical mapping and patient-specific prior implicit neural representation (USINR) framework. Unlike traditional methods that interpolate solely based on individual on-board MR images, USINR can fully utilize the patient-specific anatomical information from a high-resolution prior MRI. In addition, USINR leverages knowledge about universal mapping between population-based prior MRIs and on-board MRIs, elevating the upper bound of super-resolution performance and enabling faster on-board fine-tuning. RESULTS USINR was evaluated on three datasets, including IXI, BraTS, and an in-house abdominal dataset. It achieved state-of-the-art performance on all of them. For example, on the BraTS dataset, USINR was trained on 1151 paired training samples (for universal anatomical mapping) and tested on 50 patients. It achieved average SSIM, PSNR, and LPIPS scores of 0.9656, 37.12, and 0.0214, respectively, significantly outperforming the published state-of-the-art method SuperFormer, whose corresponding scores were 0.9488, 35.83, and 0.0388. Furthermore, USINR can complete patient-specific training in less than one minute, rendering it a favorable solution in time-constrained ART workflows. In addition to large-scale dataset evaluations, a case study was conducted on an in-house patient at UT Southwestern Medical Center. This case study included two MRI scans (a prior scan for plan simulation and a new one for on-board imaging) from a single patient with a long interval between the two scans, during which the tumor size underwent a significant change. Despite these substantial anatomical changes between prior and on-board imaging, USINR was able to accurately capture the change in tumor size, highlighting its robustness for clinical applications. CONCLUSIONS By combining knowledge of universal anatomical mapping with patient-specific prior implicit neural representation, USINR offers a novel and reliable approach for MRI super-resolution. This method enhances the spatial resolution of MR images with minimal processing time, thereby balancing the need for image quality and the efficiency of MRI-guided adaptive radiotherapy.
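A toy sketch of the patient-specific implicit-neural-representation idea: a coordinate MLP is fitted to a prior high-resolution volume and can then be queried at arbitrarily dense coordinates. Network size, sampling, and loop length are arbitrary assumptions; USINR's actual architecture and universal-mapping stage are not reproduced here:

```python
import torch
import torch.nn as nn

class CoordinateMLP(nn.Module):
    """Implicit neural representation: maps normalized (x, y, z) to intensity."""
    def __init__(self, hidden: int = 256, layers: int = 4):
        super().__init__()
        blocks, dim = [], 3
        for _ in range(layers):
            blocks += [nn.Linear(dim, hidden), nn.ReLU(inplace=True)]
            dim = hidden
        blocks.append(nn.Linear(dim, 1))
        self.net = nn.Sequential(*blocks)

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        return self.net(coords)

# Patient-specific fitting: regress intensities of the prior high-res MRI,
# then query dense coordinates to obtain a super-resolved volume.
model = CoordinateMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
coords = torch.rand(4096, 3) * 2 - 1          # sampled voxel centers in [-1, 1]^3
intensities = torch.rand(4096, 1)             # placeholder prior-MRI intensities
for _ in range(100):                          # short fine-tuning loop
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(coords), intensities)
    loss.backward()
    opt.step()
```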
Collapse
Affiliation(s)
- Yunxiang Li
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
| | - Jie Deng
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
| | - You Zhang
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, Texas, USA
| |
Collapse
|
22
|
Zhang Z, Chen Y, Yu H, Wang Z, Wang S, Fan F, Shan H, Zhang Y. UniAda: Domain Unifying and Adapting Network for Generalizable Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2025; 44:1988-2001. [PMID: 40030769 DOI: 10.1109/tmi.2024.3523319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/05/2025]
Abstract
Learning a generalizable medical image segmentation model is an important but challenging task since the unseen (testing) domains may have significant discrepancies from seen (training) domains due to different vendors and scanning protocols. Existing segmentation methods, typically built upon domain generalization (DG), aim to learn multi-source domain-invariant features through data or feature augmentation techniques, but the resulting models either fail to characterize global domains during training or cannot sense unseen domain information during testing. To tackle these challenges, we propose a domain Unifying and Adapting network (UniAda) for generalizable medical image segmentation, a novel "unifying while training, adapting while testing" paradigm that can learn a domain-aware base model during training and dynamically adapt it to unseen target domains during testing. First, we propose to unify the multi-source domains into a global inter-source domain via a novel feature statistics update mechanism, which can sample new features for the unseen domains, facilitating the training of a domain base model. Second, we leverage the uncertainty map to guide the adaptation of the trained model for each testing sample, considering the specific target domain may be outside the global inter-source domain. Extensive experimental results on two public cross-domain medical datasets and one in-house cross-domain dataset demonstrate the strong generalization capacity of the proposed UniAda over state-of-the-art DG methods. The source code of our UniAda is available at https://github.com/ZhouZhang233/UniAda.
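One plausible reading of the feature-statistics update described above is to perturb per-instance feature means and standard deviations so that new, unseen-domain features can be sampled during training. This is a sketch under that assumption; the paper's exact mechanism may differ:

```python
import torch

def sample_new_feature_stats(feat: torch.Tensor, eps: float = 1e-6):
    """Perturb per-instance feature statistics to emulate unseen domains.

    feat: (B, C, H, W) feature maps from a batch mixing several source domains.
    """
    mu = feat.mean(dim=(2, 3), keepdim=True)
    sig = feat.std(dim=(2, 3), keepdim=True) + eps
    normalized = (feat - mu) / sig
    # Draw new statistics around the batch-level spread of (mu, sig).
    mu_new = mu + torch.randn_like(mu) * mu.std(dim=0, keepdim=True)
    sig_new = sig + torch.randn_like(sig) * sig.std(dim=0, keepdim=True)
    return normalized * sig_new + mu_new
```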
Collapse
|
23
|
Stas D, De Kerf G, Claessens M, Karlhede A, Söderberg J, Dirix P, Ost P. Incorporating indirect MRI information in a CT-based deep learning model for prostate auto-segmentation. Radiother Oncol 2025; 206:110806. [PMID: 39988305 DOI: 10.1016/j.radonc.2025.110806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2024] [Revised: 02/02/2025] [Accepted: 02/19/2025] [Indexed: 02/25/2025]
Abstract
BACKGROUND AND PURPOSE Computed tomography (CT) imaging poses challenges for delineation of soft tissue structures for prostate cancer external beam radiotherapy. Guidelines require the input of magnetic resonance imaging (MRI) information. We developed a deep learning (DL) prostate and organ-at-risk contouring model designed to find the MRI-truth in CT imaging. MATERIAL AND METHODS The study utilized CT scan data from 165 prostate cancer patients, with 136 scans for training and 29 for testing. The research focused on contouring five regions of interest (ROIs): clinical target volume of the prostate including the venous plexus (VP) (CTV-iVP) and excluding the VP (CTV-eVP), bladder, anorectum, and the whole seminal vesicles (SV), according to the European Society for Radiotherapy and Oncology (ESTRO) and Advisory Committee on Radiation Oncology Practice (ACROP) contouring guidelines. Human delineation included fusion of MRI images with the planning CT scans, but the model itself was never shown MRI images during its development. Model training involved a three-dimensional U-Net architecture. A qualitative review was independently performed by two clinicians scoring the model on time-based criteria, and the DL segmentation results were compared to manual adaptations using the Dice similarity coefficient (DSC) and the 95th percentile Hausdorff distance (HD95). RESULTS The qualitative review of DL segmentations for CTV-iVP and CTV-eVP showed 2 or 3 out of 3 in 96 % of cases, indicating minimal manual adjustments were needed by clinicians. The DL model demonstrated comparable quantitative performance in delineating CTV-iVP and CTV-eVP, with a DSC of 89 % (standard deviation 3.3 %). The HD95 was 4.0 mm for CTV-iVP and 4.1 mm for CTV-eVP, with a standard deviation of 2.1 mm for both contours. Anorectum, bladder, and SV scored 3 out of 3 in the qualitative analysis in 62 %, 72 %, and 55 % of cases, respectively. DSC and HD95 were 90 % and 5.5 mm for anorectum, 96 % and 2.9 mm for bladder, and 81 % and 4.6 mm for the seminal vesicles. CONCLUSION To our knowledge, this is the first DL model designed to implement MRI contouring guidelines in CT imaging and the first model trained according to ESTRO-ACROP contouring guidelines. This CT-based DL model presents a valuable tool for aiding prostate delineation without requiring the actual MRI information.
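A minimal NumPy/SciPy sketch of the HD95 metric used above, computed between the boundary voxels of two binary masks; the voxel spacing default is an assumption:

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def hd95(mask_a, mask_b, spacing=(1.0, 1.0, 1.0)) -> float:
    """95th percentile Hausdorff distance between two binary masks, in mm."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    surf_a = a & ~binary_erosion(a)          # boundary voxels of each mask
    surf_b = b & ~binary_erosion(b)
    # Distance from every voxel to the nearest boundary voxel of each mask.
    dist_to_a = distance_transform_edt(~surf_a, sampling=spacing)
    dist_to_b = distance_transform_edt(~surf_b, sampling=spacing)
    d = np.hstack([dist_to_b[surf_a], dist_to_a[surf_b]])
    return float(np.percentile(d, 95))
```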
Collapse
Affiliation(s)
- Daan Stas
- Department of Radiation Oncology, Iridium Network, Antwerp, Belgium; Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium.
| | - Geert De Kerf
- Department of Radiation Oncology, Iridium Network, Antwerp, Belgium; Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
| | | | | | | | - Piet Dirix
- Department of Radiation Oncology, Iridium Network, Antwerp, Belgium
| | - Piet Ost
- Department of Radiation Oncology, Iridium Network, Antwerp, Belgium
| |
Collapse
|
24
|
Kim N, Park H, Jung YH, Hwang JJ. Enhancing panoramic dental imaging with AI-driven arch surface fitting: achieving improved clarity and accuracy through an optimal reconstruction zone. Dentomaxillofac Radiol 2025; 54:256-267. [PMID: 39832267 DOI: 10.1093/dmfr/twaf006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2024] [Revised: 10/30/2024] [Accepted: 12/19/2024] [Indexed: 01/22/2025] Open
Abstract
OBJECTIVES This study aimed to develop an automated method for generating clearer, well-aligned panoramic views by creating an optimized 3-dimensional (3D) reconstruction zone centred on the teeth. The approach focused on achieving high contrast and clarity in key dental features, including tooth roots, morphology, and periapical lesions, by applying a 3D U-Net deep learning model to generate an arch surface and align the panoramic view. METHODS This retrospective study analysed anonymized cone-beam CT (CBCT) scans from 312 patients (mean age 40 years; range 10-78; 41.3% male, 58.7% female). A 3D U-Net deep learning model segmented the jaw and dentition, facilitating panoramic view generation. During preprocessing, CBCT scans were binarized, and a cylindrical reconstruction method aligned the arch along a straight coordinate system, reducing data size for efficient processing. The 3D U-Net segmented the jaw and dentition in 2 steps, after which the panoramic view was reconstructed using 3D spline curves fitted to the arch, defining the optimal 3D reconstruction zone. This ensured the panoramic view captured essential anatomical details with high contrast and clarity. To evaluate performance, we compared contrast between tooth roots and alveolar bone and assessed intersection over union (IoU) values for tooth shapes and periapical lesions (#42, #44, #46) relative to the conventional method, demonstrating enhanced clarity and improved visualization of critical dental structures. RESULTS The proposed method outperformed the conventional approach, showing significant improvements in the contrast between tooth roots and alveolar bone, particularly for tooth #42. It also demonstrated higher IoU values in tooth morphology comparisons, indicating superior shape alignment. Additionally, when evaluating periapical lesions, our method achieved higher performance with thinner layers, resulting in several statistically significant outcomes. Specifically, average pixel values within lesions were higher for certain layer thicknesses, demonstrating enhanced visibility of lesion boundaries and better visualization. CONCLUSIONS The fully automated AI-based panoramic view generation method successfully created a 3D reconstruction zone centred on the teeth, enabling consistent observation of dental and surrounding tissue structures with high contrast across reconstruction widths. By accurately segmenting the dental arch and defining the optimal reconstruction zone, this method shows significant advantages in detecting pathological changes, potentially reducing clinician fatigue during interpretation while enhancing clinical decision-making accuracy. Future research will focus on further developing and testing this approach to ensure robust performance across diverse patient cases with varied dental and maxillofacial structures, thereby increasing the model's utility in clinical settings. ADVANCES IN KNOWLEDGE This study introduces a novel method for achieving clearer, well-aligned panoramic views focused on the dentition, providing significant improvements over conventional methods.
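The arch-fitting step above can be illustrated with SciPy's parametric spline routines; the arch points below are hypothetical, standing in for coordinates extracted from the segmented dentition:

```python
import numpy as np
from scipy.interpolate import splev, splprep

# Hypothetical arch points (x, y, z) taken from the segmented dentition.
arch_points = np.array([[10, 40, 5], [20, 55, 5], [35, 62, 6],
                        [50, 55, 6], [60, 40, 5]], dtype=float)

# Fit a smoothing spline through the arch and resample it densely; the
# resampled curve defines the centre of the panoramic reconstruction zone.
tck, _ = splprep(arch_points.T, s=2.0)
u_dense = np.linspace(0, 1, 200)
x, y, z = splev(u_dense, tck)
curve = np.stack([x, y, z], axis=1)          # (200, 3) arch centreline
```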
Collapse
Affiliation(s)
- Nayeon Kim
- Department of Oral and Maxillofacial Radiology, School of Dentistry, Pusan National University, Yangsan 50612, Korea
| | - Hyeonju Park
- Department of Oral and Maxillofacial Radiology, School of Dentistry, Pusan National University, Yangsan 50612, Korea
| | - Yun-Hoa Jung
- Department of Oral and Maxillofacial Radiology, School of Dentistry, Pusan National University, Yangsan 50612, Korea
- Dental and Life Science Institute and Dental Research Institute, School of Dentistry, Pusan National University, Yangsan 50612, Korea
| | - Jae Joon Hwang
- Department of Oral and Maxillofacial Radiology, School of Dentistry, Pusan National University, Yangsan 50612, Korea
- Dental and Life Science Institute and Dental Research Institute, School of Dentistry, Pusan National University, Yangsan 50612, Korea
| |
Collapse
|
25
|
Szkalisity Á, Vanharanta L, Saito H, Vörös C, Li S, Isomäki A, Tomberg T, Strachan C, Belevich I, Jokitalo E, Ikonen E. Nuclear envelope-associated lipid droplets are enriched in cholesteryl esters and increase during inflammatory signaling. EMBO J 2025; 44:2774-2802. [PMID: 40195500 DOI: 10.1038/s44318-025-00423-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2024] [Revised: 03/04/2025] [Accepted: 03/11/2025] [Indexed: 04/09/2025] Open
Abstract
Cholesteryl esters (CEs) and triacylglycerols (TAGs) are stored in lipid droplets (LDs), but their compartmentalisation is not well understood. Here, we established a hyperspectral stimulated Raman scattering microscopy system to identify and quantitatively assess CEs and TAGs in individual LDs of human cells. We found that nuclear envelope-associated lipid droplets (NE-LDs) were enriched in cholesteryl esters compared to lipid droplets in the cytoplasm. Correlative light-volume-electron microscopy revealed that NE-LDs projected towards the cytoplasm and associated with type II nuclear envelope (NE) invaginations. The nuclear envelope localization of sterol O-acyltransferase 1 (SOAT1) contributed to NE-LD generation, as trapping of SOAT1 to the NE further increased their number. Upon stimulation by the pro-inflammatory cytokine TNFα, the number of NE-LDs moderately increased. Moreover, TNFα-induced NF-κB nuclear translocation was fine-tuned by SOAT1: increased SOAT1 activity and NE-LDs associated with faster NF-κB translocation, whereas reduced SOAT1 activity and NE-LDs associated with slower NF-κB translocation. Our findings suggest that the NE is enriched in CEs and that cholesterol esterification can modulate nuclear translocation.
Collapse
Affiliation(s)
- Ábel Szkalisity
- Department of Anatomy and Stem Cells and Metabolism Research Program, Faculty of Medicine, University of Helsinki, 00014, Helsinki, Finland
- Minerva Foundation Institute for Medical Research, 00290, Helsinki, Finland
| | - Lauri Vanharanta
- Department of Anatomy and Stem Cells and Metabolism Research Program, Faculty of Medicine, University of Helsinki, 00014, Helsinki, Finland
- Minerva Foundation Institute for Medical Research, 00290, Helsinki, Finland
| | - Hodaka Saito
- Department of Anatomy and Stem Cells and Metabolism Research Program, Faculty of Medicine, University of Helsinki, 00014, Helsinki, Finland
- Minerva Foundation Institute for Medical Research, 00290, Helsinki, Finland
| | - Csaba Vörös
- Department of Anatomy and Stem Cells and Metabolism Research Program, Faculty of Medicine, University of Helsinki, 00014, Helsinki, Finland
- Minerva Foundation Institute for Medical Research, 00290, Helsinki, Finland
- Synthetic and Systems Biology Unit, Biological Research Centre (BRC), Hungarian Research Network (HUN-REN), 6726, Szeged, Hungary
| | - Shiqian Li
- Department of Anatomy and Stem Cells and Metabolism Research Program, Faculty of Medicine, University of Helsinki, 00014, Helsinki, Finland
- Minerva Foundation Institute for Medical Research, 00290, Helsinki, Finland
| | - Antti Isomäki
- Biomedicum Imaging Unit, Department of Anatomy, Faculty of Medicine, University of Helsinki, 00290, Helsinki, Finland
| | - Teemu Tomberg
- Division of Pharmaceutical Chemistry and Technology, Faculty of Pharmacy, University of Helsinki, 00014, Helsinki, Finland
| | - Clare Strachan
- Division of Pharmaceutical Chemistry and Technology, Faculty of Pharmacy, University of Helsinki, 00014, Helsinki, Finland
| | - Ilya Belevich
- Electron Microscopy Unit, Institute of Biotechnology, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
| | - Eija Jokitalo
- Electron Microscopy Unit, Institute of Biotechnology, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
| | - Elina Ikonen
- Department of Anatomy and Stem Cells and Metabolism Research Program, Faculty of Medicine, University of Helsinki, 00014, Helsinki, Finland.
- Minerva Foundation Institute for Medical Research, 00290, Helsinki, Finland.
| |
Collapse
|
26
|
Arberet S, Ghesu FC, Gao R, Kraus M, Sackett J, Kuusela E, Kamen A. Beam's eye view to fluence maps 3D network for ultra fast VMAT radiotherapy planning. Med Phys 2025; 52:3183-3190. [PMID: 39935217 DOI: 10.1002/mp.17673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2024] [Revised: 01/14/2025] [Accepted: 01/14/2025] [Indexed: 02/13/2025] Open
Abstract
BACKGROUND Volumetric modulated arc therapy (VMAT) revolutionizes cancer treatment by precisely delivering radiation while sparing healthy tissues. Fluence map generation, crucial in VMAT planning, traditionally involves complex, iterative, and thus time-consuming processes. These fluence maps are subsequently leveraged for leaf sequencing. The deep-learning approach presented in this article aims to expedite this by directly predicting fluence maps from patient data. PURPOSE To accelerate VMAT treatment planning by quickly predicting fluence maps from a 3D dose map. The predicted fluence maps can be quickly leaf-sequenced because the network was trained to take the machine constraints into account. METHODS We developed a 3D network which we trained in a supervised way using a combination of $L_1$ and $L_2$ losses, and radiation therapy (RT) plans generated by Eclipse and from the REQUITE dataset, taking the RT dose map as input and the fluence maps computed from the corresponding RT plans as target. Our network jointly predicts the 180 fluence maps corresponding to the 180 control points (CPs) of single-arc VMAT plans. To help the network, we preprocess the input dose by computing projections of the 3D dose map to the beam's eye view (BEV) of the 180 CPs, in the same coordinate system as the fluence maps. We generated over 2000 VMAT plans using Eclipse to scale up the dataset size. Additionally, we evaluated various network architectures and analyzed the impact of increasing the dataset size. RESULTS We measured performance in the 2D fluence map domain using image metrics (PSNR and SSIM), as well as in the 3D dose domain using the dose-volume histogram (DVH) on a test set. Network inference, which does not include data loading and processing, takes less than 20 ms. Using our proposed 3D network architecture and increasing the dataset size using Eclipse improved fluence map reconstruction performance by approximately 8 dB in PSNR compared to a U-Net architecture trained on the original REQUITE dataset. The resulting DVHs are very close to those of the input target dose. CONCLUSIONS We developed a novel deep learning approach for ultrafast VMAT planning by predicting all the fluence maps of a VMAT arc in a single network inference. The small DVH differences validate this approach for ultrafast VMAT planning.
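A minimal sketch of the combined $L_1$ + $L_2$ supervision named in the abstract, assuming PyTorch and one fluence map per control point; the weights and tensor layout are assumptions:

```python
import torch
import torch.nn.functional as F

def fluence_loss(pred: torch.Tensor, target: torch.Tensor,
                 l1_weight: float = 1.0, l2_weight: float = 1.0) -> torch.Tensor:
    """Combined L1 + L2 supervision for predicted fluence maps.

    pred/target: (B, 180, H, W), one fluence map per VMAT control point.
    """
    return l1_weight * F.l1_loss(pred, target) + l2_weight * F.mse_loss(pred, target)
```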
Collapse
Affiliation(s)
- Simon Arberet
- Digital Technology and Innovation, Siemens Healthineers, Princeton, New Jersey, USA
| | - Florin C Ghesu
- Digital Technology and Innovation, Siemens Healthineers, Erlangen, Germany
| | - Riqiang Gao
- Digital Technology and Innovation, Siemens Healthineers, Princeton, New Jersey, USA
| | - Martin Kraus
- Digital Technology and Innovation, Siemens Healthineers, Erlangen, Germany
| | - Jonathan Sackett
- Varian Medical Systems, a Siemens Healthineers Company, Helsinki, Finland
| | - Esa Kuusela
- Varian Medical Systems, a Siemens Healthineers Company, Helsinki, Finland
| | - Ali Kamen
- Digital Technology and Innovation, Siemens Healthineers, Princeton, New Jersey, USA
| |
Collapse
|
27
|
Zhang X, Ou N, Liu C, Zhuo Z, Matthews PM, Liu Y, Ye C, Bai W. Unsupervised brain MRI tumour segmentation via two-stage image synthesis. Med Image Anal 2025; 102:103568. [PMID: 40199108 DOI: 10.1016/j.media.2025.103568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2024] [Revised: 03/24/2025] [Accepted: 03/25/2025] [Indexed: 04/10/2025]
Abstract
Deep learning shows promise in automated brain tumour segmentation, but it depends on costly expert annotations. Recent advances in unsupervised learning offer an alternative by using synthetic data for training. However, the discrepancy between real and synthetic data limits the accuracy of the unsupervised approaches. In this paper, we propose an approach for unsupervised brain tumour segmentation on magnetic resonance (MR) images via a two-stage image synthesis strategy. This approach accounts for the domain gap between real and synthetic data and aims to generate realistic synthetic data for model training. In the first stage, we train a junior segmentation model using synthetic brain tumour images generated by hand-crafted tumour shape and intensity models, and employ a validation set with distribution shift for model selection. The trained junior model is applied to segment unlabelled real tumour images, generating pseudo labels that capture realistic tumour shape, intensity, and texture. In the second stage, realistic synthetic tumour images are generated by mixing brain images with tumour pseudo labels, closing the domain gap between real and synthetic images. The generated synthetic data is then used to train a senior model for final segmentation. In experiments on five brain imaging datasets, the proposed approach, named SynthTumour, surpasses existing unsupervised methods and demonstrates high performance for both brain tumour segmentation and ischemic stroke lesion segmentation tasks.
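A hedged sketch of the second-stage mixing step: blending a pseudo-labelled tumour region into a host brain volume. The blending weight and alpha-blend formulation are assumptions for illustration, not the paper's exact synthesis procedure:

```python
import numpy as np

def paste_tumour(brain: np.ndarray, tumour_img: np.ndarray,
                 tumour_mask: np.ndarray, blend: float = 0.9) -> np.ndarray:
    """Blend a pseudo-labelled tumour into a brain scan (second-stage synthesis).

    brain, tumour_img: intensity volumes of equal shape; tumour_mask: binary
    pseudo label produced by the junior model on a real tumour image.
    """
    m = tumour_mask.astype(bool)
    out = brain.copy()
    # Alpha-blend inside the mask so texture from the real tumour is kept
    # while intensities stay consistent with the host image.
    out[m] = blend * tumour_img[m] + (1 - blend) * brain[m]
    return out
```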
Collapse
Affiliation(s)
- Xinru Zhang
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing, China; Department of Brain Sciences, Imperial College London, London, United Kingdom
| | - Ni Ou
- School of Automation, Beijing Institute of Technology, Beijing, China
| | - Chenghao Liu
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing, China
| | - Zhizheng Zhuo
- Department of Radiology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
| | - Paul M Matthews
- Department of Brain Sciences, Imperial College London, London, United Kingdom; UK Dementia Research Institute, Imperial College London, London, United Kingdom
| | - Yaou Liu
- Department of Radiology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China.
| | - Chuyang Ye
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing, China.
| | - Wenjia Bai
- Department of Brain Sciences, Imperial College London, London, United Kingdom; Department of Computing, Imperial College London, London, United Kingdom.
| |
Collapse
|
28
|
Pandey RK, Rathore YK. Deep learning in 3D cardiac reconstruction: a systematic review of methodologies and dataset. Med Biol Eng Comput 2025; 63:1271-1287. [PMID: 39753994 DOI: 10.1007/s11517-024-03273-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2024] [Accepted: 12/18/2024] [Indexed: 05/10/2025]
Abstract
This study presents an advanced methodology for 3D heart reconstruction using a combination of deep learning models and computational techniques, addressing critical challenges in cardiac modeling and segmentation. A multi-dataset approach was employed, including data from the UK Biobank, MICCAI Multi-Modality Whole Heart Segmentation (MM-WHS) challenge, and clinical datasets of congenital heart disease. Preprocessing steps involved segmentation, intensity normalization, and mesh generation, while the reconstruction was performed using a blend of statistical shape modeling (SSM), graph convolutional networks (GCNs), and progressive GANs. The statistical shape models were utilized to capture anatomical variations through principal component analysis (PCA), while GCNs refined the meshes derived from segmented slices. Synthetic data generated by progressive GANs enabled augmentation, particularly useful for congenital heart conditions. Evaluation of the reconstruction accuracy was performed using metrics such as Dice similarity coefficient (DSC), Chamfer distance, and Hausdorff distance, with the proposed framework demonstrating superior anatomical precision and functional relevance compared to traditional methods. This approach highlights the potential for automated, high-resolution 3D heart reconstruction applicable in both clinical and research settings. The results emphasize the critical role of deep learning in enhancing anatomical accuracy, particularly for rare and complex cardiac conditions. This paper is particularly important for researchers wanting to utilize deep learning in cardiac imaging and 3D heart reconstruction, bringing insights into the integration of modern computational methods.
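As a minimal illustration of the PCA-based statistical shape modeling the review describes, assuming meshes are already aligned and flattened to vectors; variable names and mode counts are illustrative:

```python
import numpy as np

def build_ssm(shapes: np.ndarray, n_modes: int = 5):
    """Statistical shape model via PCA over aligned cardiac meshes.

    shapes: (N, 3V) array, each row a mesh with V vertices flattened to 3V.
    Returns the mean shape, the first n_modes variation modes, and their
    standard deviations.
    """
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # SVD of the centered data matrix yields the principal modes.
    _, svals, vt = np.linalg.svd(centered, full_matrices=False)
    modes = vt[:n_modes]                           # (n_modes, 3V)
    stddevs = svals[:n_modes] / np.sqrt(len(shapes) - 1)
    return mean, modes, stddevs

def synthesize(mean, modes, stddevs, coeffs):
    """New plausible shape: mean plus a weighted sum of modes (coeffs in SDs)."""
    return mean + (np.asarray(coeffs) * stddevs) @ modes
```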
Collapse
Affiliation(s)
- Rajendra Kumar Pandey
- Department of Computer Science and Engineering, Shri Shankaracharya Institute of Professional Management and Technology, Raipur, (C.G.), India.
| | - Yogesh Kumar Rathore
- Department of Computer Science and Engineering, Shri Shankaracharya Institute of Professional Management and Technology, Raipur, (C.G.), India
| |
Collapse
|
29
|
Lteif D, Appapogu D, Bargal SA, Plummer BA, Kolachalama VB. Anatomy-guided, modality-agnostic segmentation of neuroimaging abnormalities. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2025:2025.04.29.25326682. [PMID: 40343040 PMCID: PMC12060938 DOI: 10.1101/2025.04.29.25326682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 05/11/2025]
Abstract
Magnetic resonance imaging (MRI) offers multiple sequences that provide complementary views of brain anatomy and pathology. However, real-world datasets often exhibit variability in sequence availability due to clinical and logistical constraints. This variability complicates radiological interpretation and limits the generalizability of machine learning models that depend on consistent multimodal input. In this work, we propose an anatomy-guided and modality-agnostic framework for assessing disease-related abnormalities in brain MRI, leveraging structural context to enhance robustness across diverse input configurations. We introduce a novel augmentation strategy, Region ModalMix, which integrates anatomical priors during training to improve model performance when some modalities are absent or variable. We conducted extensive experiments on brain tumor segmentation using the Multimodal Brain Tumor Segmentation Challenge (BraTS) 2020 dataset (n=369). The results demonstrate that our proposed framework outperforms state-of-the-art methods under various missing-modality conditions, most notably an average 9.68 mm reduction in 95th-percentile Hausdorff distance and a 1.36% improvement in Dice similarity coefficient over baseline models with only one available modality. Our method is model-agnostic, training-compatible, and broadly applicable to multi-modal neuroimaging pipelines, enabling more reliable abnormality detection in settings with heterogeneous data availability.
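One way such a region-guided modality mix might look, sketched under the assumption that co-registered modalities and an anatomical region map are available; the actual Region ModalMix operation may differ in detail:

```python
import numpy as np

def region_modal_mix(volumes: np.ndarray, region_labels: np.ndarray,
                     rng: np.random.Generator) -> np.ndarray:
    """Region-wise modality mixing guided by anatomical priors (a sketch).

    volumes: (M, D, H, W) co-registered MRI modalities of one subject.
    region_labels: (D, H, W) integer map of anatomical regions.
    For each region, voxels are swapped to one randomly chosen modality,
    emulating heterogeneous modality availability during training.
    """
    mixed = volumes[0].copy()                    # start from the first modality
    for region in np.unique(region_labels):
        src = rng.integers(len(volumes))         # pick a source modality
        mask = region_labels == region
        mixed[mask] = volumes[src][mask]
    return mixed
```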
Collapse
Affiliation(s)
- Diala Lteif
- Department of Computer Science, Boston University, Boston, MA, USA
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
| | - Divya Appapogu
- Department of Computer Science, Boston University, Boston, MA, USA
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
| | - Sarah A Bargal
- Department of Computer Science, Georgetown University, Washington, DC, USA
| | - Bryan A Plummer
- Department of Computer Science, Boston University, Boston, MA, USA
| | - Vijaya B Kolachalama
- Department of Computer Science, Boston University, Boston, MA, USA
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Faculty of Computing & Data Sciences, Boston University, Boston, MA, USA
| |
Collapse
|
30
|
Baraboo J, DiCarlo A, Berhane H, Shen D, Passman R, Lee DC, McCarthy PM, Arora R, Kim D, Markl M. Deep learning based automated left atrial segmentation and flow quantification of real time phase contrast MRI in patients with atrial fibrillation. THE INTERNATIONAL JOURNAL OF CARDIOVASCULAR IMAGING 2025:10.1007/s10554-025-03407-9. [PMID: 40301204 DOI: 10.1007/s10554-025-03407-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/07/2025] [Accepted: 04/21/2025] [Indexed: 05/01/2025]
Abstract
Real-time 2D phase contrast (RTPC) MRI is useful for flow quantification in atrial fibrillation (AF) patients, but data analysis requires time-consuming anatomical contouring for many cardiac time frames. Our goal was to develop a convolutional neural network (CNN) for fully automated left atrial (LA) flow quantification. Forty-four AF patients underwent cardiac MRI including LA RTPC, collecting a median of 358 timeframes per scan. In total, 15,307 semi-manually derived RTPC LA contours comprised the ground truth for CNN training, validation, and testing. CNN vs. human performance was assessed using Dice scores (DSC), Hausdorff distance (HD), and flow measures (stasis, velocities, flow). LA contour DSCs across all patients were similar to the human inter-observer DSC (0.90 vs. 0.93), with a median HD of 4.6 mm [3.5-5.9 mm]. There was no impact of heart rate variability on contouring quality (low vs. high variability DSC: 0.92 ± 0.05 vs. 0.91 ± 0.03, p = 0.95). CNN-based LA flow quantification showed good to excellent agreement with semi-manual analysis (r > 0.90) and small bias in Bland-Altman analysis for mean velocity (-0.10 cm/s), stasis (1%), and net flow (-2.4 ml/s). This study demonstrated the feasibility of CNN-based LA flow analysis, with good agreement in LA contours and flow measures and resilience to heartbeat variability in AF.
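A sketch of how the reported flow measures could be derived from a through-plane velocity map and the CNN contours; the stasis threshold and units are assumptions for illustration:

```python
import numpy as np

def la_flow_measures(velocity: np.ndarray, mask: np.ndarray,
                     pixel_area_cm2: float, stasis_thresh: float = 10.0):
    """Per-timeframe LA flow measures from a through-plane velocity map.

    velocity: (T, H, W) in cm/s; mask: (T, H, W) CNN-derived LA contours.
    """
    mean_vel, stasis, net_flow = [], [], []
    for v, m in zip(velocity, mask.astype(bool)):
        vals = v[m]
        mean_vel.append(np.abs(vals).mean())                  # cm/s
        stasis.append((np.abs(vals) < stasis_thresh).mean())  # fraction below threshold
        net_flow.append(vals.sum() * pixel_area_cm2)          # cm^3/s == ml/s
    return np.array(mean_vel), np.array(stasis), np.array(net_flow)
```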
Collapse
Affiliation(s)
- Justin Baraboo
- Northwestern Biomedical Engineering, Chicago, Illinois, USA.
- Northwestern Radiology, Chicago, Illinois, USA.
- 737 N. Michigan Avenue, Suite 1600, Chicago, Illinois, USA.
| | | | - Haben Berhane
- Northwestern Biomedical Engineering, Chicago, Illinois, USA
- Northwestern Radiology, Chicago, Illinois, USA
| | - Daming Shen
- Northwestern Radiology, Chicago, Illinois, USA
| | - Rod Passman
- Northwestern Medicine, Cardiology, Chicago, Illinois, USA
| | - Daniel C Lee
- Northwestern Medicine, Cardiology, Chicago, Illinois, USA
| | | | - Rishi Arora
- Northwestern Medicine, Cardiology, Chicago, Illinois, USA
| | - Dan Kim
- Northwestern Biomedical Engineering, Chicago, Illinois, USA
- Northwestern Radiology, Chicago, Illinois, USA
| | - Michael Markl
- Northwestern Biomedical Engineering, Chicago, Illinois, USA
- Northwestern Radiology, Chicago, Illinois, USA
| |
Collapse
|
31
|
Liu S, Su R, Su J, van Zwam WH, van Doormaal PJ, van der Lugt A, Niessen WJ, van Walsum T. Segmentation-assisted vessel centerline extraction from cerebral CT Angiography. Med Phys 2025. [PMID: 40296200 DOI: 10.1002/mp.17855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2024] [Revised: 03/07/2025] [Accepted: 04/08/2025] [Indexed: 04/30/2025] Open
Abstract
BACKGROUND The accurate automated extraction of brain vessel centerlines from computed tomographic angiography (CTA) images plays an important role in diagnosing and treating cerebrovascular diseases such as stroke. Despite its significance, this task is complicated by the complex cerebrovascular structure and heterogeneous imaging quality. PURPOSE This study aims to develop and validate a segmentation-assisted framework designed to improve the accuracy and efficiency of brain vessel centerline extraction from CTA images. We streamline the process of lumen segmentation generation without additional annotation effort from physicians, enhancing the effectiveness of centerline extraction. METHODS The framework integrates four modules: (1) pre-processing techniques that register CTA images with a CT atlas and divide these images into input patches, (2) lumen segmentation generation from annotated vessel centerlines using graph cuts and robust kernel regression, (3) a dual-branch topology-aware UNet (DTUNet) that optimizes the use of the annotated vessel centerlines and the generated lumen segmentation via a topology-aware loss (TAL) and its dual-branch structure, and (4) post-processing methods that skeletonize and refine the lumen segmentation predicted by the DTUNet. RESULTS An in-house dataset derived from a subset of the MR CLEAN Registry is used to evaluate the proposed framework. The dataset comprises 10 intracranial CTA images and 40 cube CTA sub-images with a resolution of $128 \times 128 \times 128$ voxels. Via five-fold cross-validation on this dataset, we demonstrate that the proposed framework consistently outperforms state-of-the-art methods in terms of average symmetric centerline distance (ASCD) and overlap (OV). Specifically, it achieves an ASCD of 0.84, an $\mathrm{OV}_{1.0}$ of 0.839, and an $\mathrm{OV}_{1.5}$ of 0.885 for intracranial CTA images, and obtains an ASCD of 1.26, an $\mathrm{OV}_{1.0}$ of 0.779, and an $\mathrm{OV}_{1.5}$ of 0.824 for cube CTA sub-images. Subgroup analyses further suggest that the proposed framework holds promise in clinical applications for stroke diagnosis and treatment. CONCLUSIONS By automating the process of lumen segmentation generation and optimizing the network design of vessel centerline extraction, DTUNet achieves high performance without introducing additional annotation demands. This solution promises to be beneficial in various clinical applications in cerebrovascular disease management.
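A minimal sketch of the ASCD metric used in this evaluation, computed between two centerline point sets with SciPy's KD-tree; point sets are assumed to be in physical coordinates:

```python
import numpy as np
from scipy.spatial import cKDTree

def ascd(centerline_a: np.ndarray, centerline_b: np.ndarray) -> float:
    """Average symmetric centerline distance between two point sets (mm).

    centerline_a/b: (N, 3) and (M, 3) arrays of centerline points.
    """
    d_ab, _ = cKDTree(centerline_b).query(centerline_a)  # a -> nearest b
    d_ba, _ = cKDTree(centerline_a).query(centerline_b)  # b -> nearest a
    return float((d_ab.sum() + d_ba.sum()) / (len(d_ab) + len(d_ba)))
```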
Collapse
Affiliation(s)
- Sijie Liu
- Institute of Applied Electronics, China Academy of Engineering Physics, Mianyang, China
- Department of Radiology & Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- National Key Laboratory of Science and Technology on Advanced Laser and High Power Microwave, China Academy of Engineering Physics, Mianyang, China
| | - Ruisheng Su
- Department of Radiology & Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Medical Image Analysis Group, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
| | - Jianghang Su
- Department of Radiology & Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Wim H van Zwam
- Department of Radiology & Nuclear Medicine, Maastricht UMC, Cardiovascular Research Institute Maastricht, Maastricht, The Netherlands
| | - Pieter Jan van Doormaal
- Department of Radiology & Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Aad van der Lugt
- Department of Radiology & Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Wiro J Niessen
- Department of Radiology & Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Imaging Physics, Department of Applied Sciences, Delft University of Technology, Delft, The Netherlands
| | - Theo van Walsum
- Department of Radiology & Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
| |
Collapse
|
32
|
Henzen NA, Abdulkadir A, Reinhardt J, Blatow M, Kressig RW, Krumm S. Automated segmentation for cortical thickness of the medial perirhinal cortex. Sci Rep 2025; 15:14903. [PMID: 40295570 PMCID: PMC12037834 DOI: 10.1038/s41598-025-98399-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2024] [Accepted: 04/11/2025] [Indexed: 04/30/2025] Open
Abstract
Alzheimer's disease (AD) is characterized by a progressive spread of neurofibrillary tangles (NFT), beginning in the medial perirhinal cortex (mPRC), advancing to the entorhinal cortex (ERC), and subsequently involving the hippocampus, lateral perirhinal cortex (lPRC), and the rest of the brain. Given the close relationship between NFT accumulation and neuronal loss, the mPRC reflects a promising structural marker for early diagnosis of AD. However, only limited tools that automatically measure the cortical thickness of the mPRC are currently available. Utilizing the nnU-Net framework, we trained models on structural MRI of 126 adults, with manually segmented labels as ground truth. These models were then applied to an independent dataset of 103 adults (comprising patients with Alzheimer's dementia, amnestic mild cognitive impairment (aMCI), and healthy controls). High agreement was observed between manual and automated measurements of cortical thickness. Furthermore, we found significant atrophy in the Alzheimer's dementia group in the mPRC, ERC, and lPRC compared to healthy controls. Comparison of the aMCI group and healthy controls revealed significant differences in the ERC only. The results underscore the utility of our automated segmentation tool in advancing Alzheimer's research.
Collapse
Affiliation(s)
- Nicolas A Henzen
- University Department of Geriatric Medicine FELIX PLATTER, Burgfelderstrasse 101, 4055, Basel, Switzerland.
- Faculty of Psychology, University of Basel, Basel, Switzerland.
| | - Ahmed Abdulkadir
- Department of Clinical Neurosciences, Laboratory for Research in Neuroimaging LREN, Centre for Research in Neurosciences, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
- Center for Artificial Intelligence, Zürich University of Applied Sciences, Winterthur, Switzerland
- University Hospital of Old Age Psychiatry and Psychotherapy, University of Bern, Bern, Switzerland
| | - Julia Reinhardt
- Division of Diagnostic and Interventional Neuroradiology, Department of Radiology, University Hospital Basel, University of Basel, Basel, Switzerland
- Department of Cardiology and Cardiovascular Research Institute Basel (CRIB), University Hospital Basel, University of Basel, Basel, Switzerland
- Department of Orthopedic Surgery and Traumatology, University Hospital of Basel, University of Basel, Basel, Switzerland
| | - Maria Blatow
- Section of Neuroradiology, Department of Radiology and Nuclear Medicine, Neurocenter, Cantonal Hospital Lucerne, University of Lucerne, Lucerne, Switzerland
| | - Reto W Kressig
- University Department of Geriatric Medicine FELIX PLATTER, Burgfelderstrasse 101, 4055, Basel, Switzerland
- Faculty of Medicine, University of Basel, Basel, Switzerland
| | - Sabine Krumm
- University Department of Geriatric Medicine FELIX PLATTER, Burgfelderstrasse 101, 4055, Basel, Switzerland
- Faculty of Medicine, University of Basel, Basel, Switzerland
| |
Collapse
|
33
|
Wu W, Laville A, Deutsch E, Sun R. Deep learning for malignant lymph node segmentation and detection: a review. Front Immunol 2025; 16:1526518. [PMID: 40356919 PMCID: PMC12066500 DOI: 10.3389/fimmu.2025.1526518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2024] [Accepted: 03/17/2025] [Indexed: 05/15/2025] Open
Abstract
Radiation therapy remains a cornerstone in the treatment of cancer, with the delineation of Organs at Risk (OARs), tumors, and malignant lymph nodes playing a critical role in the planning process. However, the manual segmentation of these anatomical structures is both time-consuming and costly, with inter-observer and intra-observer variability often leading to delineation errors. In recent years, deep learning-based automatic segmentation has gained increasing attention, leading to a proliferation of scholarly works on OAR and tumor segmentation algorithms utilizing deep learning techniques. Nevertheless, similar comprehensive reviews focusing solely on malignant lymph nodes are scarce. This paper provides an in-depth review of the advancements in deep learning for malignant lymph node segmentation and detection. After a brief overview of deep learning methodologies, the review examines specific models and their outcomes for malignant lymph node segmentation and detection across five clinical sites: head and neck, upper extremity, chest, abdomen, and pelvis. The discussion section extensively covers the current challenges and future trends in this field, analyzing how they might impact clinical applications. This review aims to bridge the gap in literature by providing a focused overview on deep learning applications in the context of malignant lymph node challenges, offering insights into their potential to enhance the precision and efficiency of cancer treatment planning.
Collapse
Affiliation(s)
| | | | - Eric Deutsch
- Unité Mixte de Recherche (UMR) 1030, Gustave Roussy, Department of Radiation Oncology, Université Paris-Saclay, Villejuif, France
| | - Roger Sun
- Unité Mixte de Recherche (UMR) 1030, Gustave Roussy, Department of Radiation Oncology, Université Paris-Saclay, Villejuif, France
| |
Collapse
|
34
|
Sun H, Qin J, Liu Z, Jia X, Yan K, Wang L, Liu Z, Gong S. Generation driven understanding of localized 3D scenes with 3D diffusion model. Sci Rep 2025; 15:14385. [PMID: 40274914 PMCID: PMC12022287 DOI: 10.1038/s41598-025-98705-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2025] [Accepted: 04/14/2025] [Indexed: 04/26/2025] Open
Abstract
In recent years, diffusion models have been widely used in 3D scene-related work. However, existing diffusion models primarily focus on global structure and are constrained by predefined dataset categories, so they cannot accurately resolve the detailed structure of complex 3D scenes. This study therefore fuses Denoising Diffusion Probabilistic Models (DDPM) with the 3D U-Net architecture (learning dense volumetric segmentation from sparse annotation) and proposes a novel generation-driven approach to understanding localized 3D scenes: a customized 3D diffusion model (3D-UDDPM) for local cubes. In contrast to conventional global or local single-structure analysis techniques, the 3D-UDDPM framework prioritizes the capture and recovery of local details during the generation of localized 3D scenes. In addition to accurately predicting the distribution of the noise tensor, the framework significantly enhances the understanding of localized scenes by effectively integrating spatial context information. Specifically, 3D-UDDPM combines Markov chain Monte Carlo (MCMC) sampling and variational inference to reconstruct clear structural details in a stepwise backward-inference manner, driving the generation and understanding of local 3D scenes by internalizing geometric features as a priori knowledge. The diffusion process enables the model to recover fine local details while maintaining global structural coherence during gradual denoising. Combined with the spatial convolutional properties of the 3D U-Net architecture, the modelling accuracy and generation quality of complex 3D shapes are further enhanced, ensuring excellent performance in complex environments. The results demonstrate superior performance over existing methodologies on two benchmark datasets.
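To ground the stepwise backward inference this abstract describes, here is a standard DDPM reverse (ancestral sampling) step in PyTorch; the fixed variance choice and variable names are generic assumptions, not the paper's exact formulation:

```python
import torch

@torch.no_grad()
def ddpm_reverse_step(x_t, t, eps_pred, betas):
    """One ancestral sampling step x_t -> x_{t-1} of a DDPM.

    x_t: noisy local cube at step t; eps_pred: the denoiser's noise estimate;
    betas: 1-D tensor of the forward-process noise schedule.
    """
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t]
    # Posterior mean of x_{t-1} given x_t and the predicted noise.
    mean = (x_t - beta_t / torch.sqrt(1.0 - alpha_bar) * eps_pred) / torch.sqrt(alpha_t)
    if t == 0:
        return mean
    return mean + torch.sqrt(beta_t) * torch.randn_like(x_t)
```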
Collapse
Affiliation(s)
- Hao Sun
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Junping Qin
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China.
| | - Zheng Liu
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Xinglong Jia
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Kai Yan
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Lei Wang
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Zhiqiang Liu
- College of Information Engineering, Inner Mongolia University of Technology, Hohhot, 010080, China
| | - Shaofei Gong
- Inner Mongolia Smart Animal Husbandry Information Technology Group, Hohhot, 010013, China
| |
Collapse
|
35
|
Jiang L, Hu J, Huang T. Improved SwinUNet with fusion transformer and large kernel convolutional attention for liver and tumor segmentation in CT images. Sci Rep 2025; 15:14286. [PMID: 40274913 PMCID: PMC12022277 DOI: 10.1038/s41598-025-98938-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2024] [Accepted: 04/15/2025] [Indexed: 04/26/2025] Open
Abstract
Segmentation of both the liver and liver tumors is a critical step in radiation therapy of hepatocellular carcinoma. Although numerous algorithms have been proposed for organ and tumor delineation, automatic segmentation of livers and liver tumors remains a challenge due to their blurred boundaries and low tissue contrast compared with surrounding organs in CT images. U-Net-based methods have achieved significant success in this task. However, they often suffer from the limitation that feature extraction lacks relationships, i.e., context, among adjacent areas, leading to uncertainty in segmentation results. To address this challenge, we incorporate both global-local context and attention into the Swin-UNet. Firstly, we introduce a Swin-neighborhood Fusion Transformer Block (SFTB) to capture both global and local context in an image, enabling us to distinguish instances and their boundaries effectively. Secondly, we design a Large-kernel Convolutional Attention Block (LCAB) with two types of attention to highlight crucial features. Experiments on the LiTS and 3D-IRCADb datasets demonstrate the effectiveness of the proposed method, with Dice scores of 0.9559 and 0.9610 for liver segmentation, and 0.7614 and 0.7138 for liver tumor segmentation. The code is available at https://github.com/JennieHJN/image-segmentation/tree/master .
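The LCAB is not specified beyond the abstract, so the sketch below shows one plausible large-kernel convolutional attention pattern (depthwise, dilated-depthwise, and pointwise convolutions whose output reweights the input), borrowed from common large-kernel attention designs; the kernel sizes are assumptions.

```python
# A minimal sketch of large-kernel convolutional attention in the spirit
# of the paper's LCAB; the decomposition and kernel sizes are assumptions
# based on common large-kernel attention designs, not the authors' code.
import torch
import torch.nn as nn

class LargeKernelAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Decompose a large receptive field into cheap depthwise convs.
        self.dw = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        self.dw_dilated = nn.Conv2d(channels, channels, 7, padding=9,
                                    dilation=3, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        attn = self.pw(self.dw_dilated(self.dw(x)))  # attention map
        return x * attn                              # reweight input features

x = torch.randn(1, 64, 56, 56)
print(LargeKernelAttention(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```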
Collapse
Affiliation(s)
- Linfeng Jiang
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing, China.
| | - Jiani Hu
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing, China
| | - Tongyuan Huang
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing, China
| |
Collapse
|
36
|
Chen C, Liu J, Yin H, Huang B. A Vision-Based Method for Detecting the Position of Stacked Goods in Automated Storage and Retrieval Systems. SENSORS (BASEL, SWITZERLAND) 2025; 25:2623. [PMID: 40285312 PMCID: PMC12031210 DOI: 10.3390/s25082623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/10/2025] [Revised: 04/19/2025] [Accepted: 04/19/2025] [Indexed: 04/29/2025]
Abstract
Automated storage and retrieval systems (AS/RS) play a crucial role in modern logistics, yet effectively monitoring cargo stacking patterns remains challenging. While computer vision and deep learning offer promising solutions, existing methods struggle to balance detection accuracy, computational efficiency, and environmental adaptability. This paper proposes a novel machine vision-based detection algorithm that integrates a pallet surface object detection network (STEGNet) with a box edge detection algorithm. STEGNet's core innovation is the Efficient Gated Pyramid Feature Network (EG-FPN), which integrates a Gated Feature Fusion module and a Lightweight Attention Mechanism to optimize feature extraction and fusion. In addition, we introduce a geometric constraint method for box edge detection and employ a Perspective-n-Point (PnP)-based 2D-to-3D transformation approach for precise pose estimation. Experimental results show that STEGNet achieves 93.49% mAP on our proposed GY Warehouse Box View 4-Dimension (GY-WSBW-4D) dataset and 83.2% mAP on the WSGID-B dataset, surpassing existing benchmarks. The lightweight variant maintains competitive accuracy while reducing the model size by 34% and increasing the inference speed by 68%. In practical applications, the system achieves pose estimation with a Mean Absolute Error within 4 cm and a Rotation Angle Error below 2°, demonstrating robust performance in complex warehouse environments. This research provides a reliable solution for automated cargo stack monitoring in modern logistics systems.
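The PnP-based 2D-to-3D step mentioned above maps detected 2D box corners to a 3D pose given known box dimensions and camera intrinsics. The sketch below uses OpenCV's solvePnP; the corner coordinates, box size, and intrinsics are made-up placeholders, not values from the paper.

```python
# Hedged sketch of PnP-based 2D-to-3D pose estimation with OpenCV's
# solvePnP; all numbers are illustrative placeholders.
import numpy as np
import cv2

# 3D corners of a box of known size (metres), in the box's own frame.
W, H, D = 0.60, 0.40, 0.40
object_pts = np.array([[0, 0, 0], [W, 0, 0], [W, H, 0], [0, H, 0]],
                      dtype=np.float32)
# Matching 2D corners detected on the pallet image (pixels, illustrative).
image_pts = np.array([[320, 410], [705, 400], [712, 652], [318, 660]],
                     dtype=np.float32)
K = np.array([[1000, 0, 640], [0, 1000, 360], [0, 0, 1]], dtype=np.float32)
dist = np.zeros(5)  # assume an undistorted image

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)
if ok:
    R, _ = cv2.Rodrigues(rvec)       # rotation matrix of the box pose
    print("translation (m):", tvec.ravel())
```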
Collapse
Affiliation(s)
- Chuanjun Chen
- Department of Automation, Tsinghua University, Beijing 100084, China
- BZS (Beijing) Technology Development Co., Ltd., No.1 Jiaochangkou, Deshengmenwai, Beijing 100120, China
| | - Junjie Liu
- BZS (Beijing) Technology Development Co., Ltd., No.1 Jiaochangkou, Deshengmenwai, Beijing 100120, China
| | - Haonan Yin
- Department of Automation, Tsinghua University, Beijing 100084, China
| | - Biqing Huang
- Department of Automation, Tsinghua University, Beijing 100084, China
| |
Collapse
|
37
|
Liu G, Huang W, Li Y, Zhang Q, Fu J, Tang H, Huang J, Zhang Z, Zhang L, Wang Y, Hu J. A weakly-supervised follicle segmentation method in ultrasound images. Sci Rep 2025; 15:13771. [PMID: 40258856 PMCID: PMC12012036 DOI: 10.1038/s41598-025-95957-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2024] [Accepted: 03/25/2025] [Indexed: 04/23/2025] Open
Abstract
Accurate follicle segmentation in ultrasound images is crucial for monitoring follicle development, a key factor in fertility treatments. However, obtaining pixel-level annotations for fully supervised instance segmentation is often impractical due to time and workload constraints. This paper presents a weakly supervised instance segmentation method that leverages bounding boxes as approximate annotations, aiming to provide clinicians with automated tools for monitoring follicle development. We propose the Weakly Supervised Follicle Segmentation (WSFS) method, a novel one-stage weakly supervised segmentation model designed for ultrasound images of follicles. It incorporates a Convolutional Neural Network (CNN) backbone augmented with a Feature Pyramid Network (FPN) module for multi-scale feature representation, which is critical for capturing the diverse sizes and shapes of follicles. By leveraging Multiple Instance Learning (MIL), we formulate the learning process in a weakly supervised manner and develop an end-to-end trainable model that efficiently addresses annotation scarcity. Furthermore, WSFS can serve as a prompt proposal to enhance the performance of the Segment Anything Model (SAM), a well-known pre-trained segmentation model, using few-shot learning strategies. In addition, this study introduces the Follicle Ultrasound Image Dataset (FUID), addressing the scarcity of reproductive health data and aiding future research in computer-aided diagnosis. Experimental results on both the public USOVA3D dataset and the private FUID dataset show that our method achieves an mAP of 0.957, an IoU of 0.714, and a Dice score of 0.83, competitive with fully supervised methods that rely on pixel-level labeled masks, despite operating with less detailed annotations.
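To make the MIL idea concrete: with box supervision, every image row and column crossing a ground-truth box forms a positive "bag", so the maximum of the predicted mask along it should be 1 (and 0 outside the box). The sketch below shows a common box-supervised MIL projection loss of this kind, offered as an illustration rather than the paper's exact formulation.

```python
# Illustrative box-supervised MIL projection loss: bag-level predictions
# are max-projections of the predicted mask onto each image axis.
import torch
import torch.nn.functional as F

def mil_projection_loss(mask_logits, box):
    """mask_logits: (H, W) raw scores; box: (x0, y0, x1, y1) in pixels."""
    prob = torch.sigmoid(mask_logits)
    x0, y0, x1, y1 = box
    target = torch.zeros_like(prob)
    target[y0:y1, x0:x1] = 1.0
    # Bag-level predictions: max-projection onto rows and columns.
    pred_rows, target_rows = prob.max(dim=1).values, target.max(dim=1).values
    pred_cols, target_cols = prob.max(dim=0).values, target.max(dim=0).values
    return (F.binary_cross_entropy(pred_rows, target_rows)
            + F.binary_cross_entropy(pred_cols, target_cols))

loss = mil_projection_loss(torch.randn(128, 128), (30, 40, 90, 100))
print(float(loss))
```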
Collapse
Affiliation(s)
- Guanyu Liu
- Big Data Institute, Central South University, Changsha, 410083, China
| | - Weihong Huang
- Big Data Institute, Central South University, Changsha, 410083, China
- Mobile Health Ministry of Education - China Mobile Joint Laboratory, Xiangya Hospital, Central South University, Changsha, 410000, China
- Xiangjiang Laboratory, Changsha, 410205, China
| | - Yanping Li
- Department of Reproductive Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, 410000, China
- Clinical Research Center for Women's Reproductive Health in Hunan Province, Changsha, Hunan, 410000, China
| | - Qiong Zhang
- Department of Reproductive Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, 410000, China
- Clinical Research Center for Women's Reproductive Health in Hunan Province, Changsha, Hunan, 410000, China
| | - Jing Fu
- Department of Reproductive Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, 410000, China
- Clinical Research Center for Women's Reproductive Health in Hunan Province, Changsha, Hunan, 410000, China
| | - Hongying Tang
- Department of Reproductive Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, 410000, China
- Clinical Research Center for Women's Reproductive Health in Hunan Province, Changsha, Hunan, 410000, China
| | - Jia Huang
- School of Life Science, Central South University, Changsha, 410083, China
| | - Zhongteng Zhang
- School of Computer Sciences and Engineering, Central South University, Changsha, 410083, China
| | - Lei Zhang
- Laboratory of Vision Engineering (LoVE), School of computer science, University of Lincoln, Lincoln, LN6 7TS, UK
| | - Yu Wang
- Department of Reproductive Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, 410000, China.
- Clinical Research Center for Women's Reproductive Health in Hunan Province, Changsha, Hunan, 410000, China.
| | - Jianzhong Hu
- Big Data Institute, Central South University, Changsha, 410083, China.
- Mobile Health Ministry of Education - China Mobile Joint Laboratory, Xiangya Hospital, Central South University, Changsha, 410000, China.
| |
Collapse
|
38
|
Polattimur R, Yıldırım MS, Dandıl E. Fractal-Based Architectures with Skip Connections and Attention Mechanism for Improved Segmentation of MS Lesions in Cervical Spinal Cord. Diagnostics (Basel) 2025; 15:1041. [PMID: 40310404 PMCID: PMC12025551 DOI: 10.3390/diagnostics15081041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2025] [Revised: 04/15/2025] [Accepted: 04/16/2025] [Indexed: 05/02/2025] Open
Abstract
Background/Objectives: Multiple sclerosis (MS) is an autoimmune disease that damages the myelin sheath of the central nervous system, which includes the brain and spinal cord. Although MS lesions in the brain are more frequently investigated, MS lesions in the cervical spinal cord (CSC) can be much more specific for the diagnosis of the disease. Furthermore, as lesion burden in the CSC is directly related to disease progression, the presence of lesions in the CSC may help to differentiate MS from other neurological diseases. Methods: In this study, two novel deep learning models based on fractal architectures are proposed for the automatic detection and segmentation of MS lesions in the CSC by improving the convolutional and connection structures used in the layers of the U-Net architecture. In our previous study, we introduced the FractalSpiNet architecture by incorporating fractal convolutional block structures into the U-Net framework to develop a deeper network for segmenting MS lesions in the CSC. In this study, to improve the detection of smaller structures and finer details in the images, an attention mechanism is integrated into FractalSpiNet, resulting in the Att-FractalSpiNet model. In the second hybrid model, a fractal convolutional block is incorporated into the skip connection structure of the U-Net architecture, resulting in the Con-FractalU-Net model. Results: Experimental studies were conducted using the U-Net, FractalSpiNet, Con-FractalU-Net, and Att-FractalSpiNet architectures to detect the CSC region and the MS lesions within its boundaries. In segmenting the CSC region, the proposed Con-FractalU-Net architecture achieved the highest Dice Similarity Coefficient (DSC) score of 98.89%. Similarly, in detecting MS lesions within the CSC region, the Con-FractalU-Net model again achieved the best performance with a DSC score of 91.48%. Conclusions: For segmentation of the CSC region and detection of MS lesions, the proposed fractal-based Con-FractalU-Net and Att-FractalSpiNet architectures achieved higher scores than the baseline U-Net architecture, particularly in segmenting small and complex structures.
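The attention mechanism added to FractalSpiNet is not detailed in the abstract; a widely used option for U-Net-style skip connections is the additive attention gate sketched below. Layer sizes here are illustrative, not the paper's configuration.

```python
# Sketch of an additive attention gate on a U-Net skip connection, the
# kind of mechanism the Att-FractalSpiNet description points to.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate(nn.Module):
    def __init__(self, skip_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_skip = nn.Conv2d(skip_ch, inter_ch, 1)
        self.w_gate = nn.Conv2d(gate_ch, inter_ch, 1)
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, 1), nn.Sigmoid())

    def forward(self, skip, gate):
        # gate comes from the coarser decoder level, upsampled to match skip.
        gate = F.interpolate(gate, size=skip.shape[2:],
                             mode="bilinear", align_corners=False)
        alpha = self.psi(torch.relu(self.w_skip(skip) + self.w_gate(gate)))
        return skip * alpha  # suppress irrelevant skip features

skip = torch.randn(1, 64, 64, 64)
gate = torch.randn(1, 128, 32, 32)
print(AttentionGate(64, 128, 32)(skip, gate).shape)
```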
Collapse
Affiliation(s)
- Rukiye Polattimur
- Department of Electronics and Computer Engineering, Institute of Graduate, Bilecik Seyh Edebali University, 11230 Bilecik, Türkiye;
| | - Mehmet Süleyman Yıldırım
- Department of Computer Technology, Söğüt Vocational School, Bilecik Şeyh Edebali University, Sögüt, 11600 Bilecik, Türkiye;
| | - Emre Dandıl
- Department of Computer Engineering, Faculty of Engineering, Bilecik Seyh Edebali University, 11230 Bilecik, Türkiye
| |
Collapse
|
39
|
Ryu WS, Schellingerhout D, Park J, Chung J, Jeong SW, Gwak DS, Kim BJ, Kim JT, Hong KS, Lee KB, Park TH, Park SS, Park JM, Kang K, Cho YJ, Park HK, Lee BC, Yu KH, Oh MS, Lee SJ, Kim JG, Cha JK, Kim DH, Lee J, Park MS, Kim D, Bang OY, Kim EY, Sohn CH, Kim H, Bae HJ, Kim DE. Deep learning-based automatic segmentation of cerebral infarcts on diffusion MRI. Sci Rep 2025; 15:13214. [PMID: 40240396 PMCID: PMC12003832 DOI: 10.1038/s41598-025-91032-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2024] [Accepted: 02/18/2025] [Indexed: 04/18/2025] Open
Abstract
We explored the effects of (1) training with various sample sizes of multi-site vs. single-site training data, (2) cross-site domain adaptation, and (3) data sources and features on the performance of algorithms segmenting cerebral infarcts on Magnetic Resonance Imaging (MRI). We used 10,820 annotated diffusion-weighted images (DWIs) from 10 university hospitals. Algorithms based on 3D U-Net were trained using progressively larger subsamples (ranging from 217 to 8661), while internal testing employed a distinct set of 2159 DWIs. External validation was conducted using three unrelated datasets (n = 2777, 50, and 250). For domain adaptation, we utilized 50 to 1000 subsamples from the 2777-image external target dataset. As the size of the multi-site training data increased from 217 to 1732, the Dice similarity coefficient (DSC) and average Hausdorff distance (AHD) improved from 0.58 to 0.65 and from 16.1 to 3.75 mm, respectively. Further increases in sample size to 4330 and 8661 led to marginal gains in DSC (to 0.68 and 0.70, respectively) and in AHD (to 2.92 and 1.73 mm). Similar outcomes were observed in external testing. Notably, performance was relatively poor for segmenting brainstem or hyperacute (< 3 h) infarcts. Domain adaptation, even with a small subsample (n = 50) of external data, conditioned the algorithm trained with 217 images to perform comparably to an algorithm trained with 8661 images. In conclusion, the use of multi-site data (approximately 2000 DWIs) and domain adaptation significantly enhances the performance and generalizability of deep learning algorithms for infarct segmentation.
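For reference, the study's headline metric is the Dice similarity coefficient between predicted and annotated infarct masks; a minimal NumPy sketch follows (variable names are illustrative).

```python
# Minimal Dice similarity coefficient between two binary masks.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)

pred = np.zeros((8, 8), dtype=bool); pred[2:6, 2:6] = True
truth = np.zeros((8, 8), dtype=bool); truth[3:7, 3:7] = True
print(round(dice(pred, truth), 3))  # 0.562 for this toy overlap
```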
Collapse
Affiliation(s)
- Wi-Sun Ryu
- Artificial Intelligence Research Center, JLK Inc., Seoul, South Korea
- National Priority Research Center for Stroke and Department of Neurology, Dongguk University Ilsan Hospital, 27, Dongguk-ro, Ilsandong-gu, Goyang, South Korea
| | - Dawid Schellingerhout
- Department of Neuroradiology and Imaging Physics, The University of Texas M.D. Anderson Cancer Center, Houston, USA
| | - Jonghyeok Park
- Artificial Intelligence Research Center, JLK Inc., Seoul, South Korea
| | - Jinyong Chung
- National Priority Research Center for Stroke and Department of Neurology, Dongguk University Ilsan Hospital, 27, Dongguk-ro, Ilsandong-gu, Goyang, South Korea
- Bioimaging Data Curation Center, Seoul, South Korea
| | - Sang-Wuk Jeong
- National Priority Research Center for Stroke and Department of Neurology, Dongguk University Ilsan Hospital, 27, Dongguk-ro, Ilsandong-gu, Goyang, South Korea
| | - Dong-Seok Gwak
- National Priority Research Center for Stroke and Department of Neurology, Dongguk University Ilsan Hospital, 27, Dongguk-ro, Ilsandong-gu, Goyang, South Korea
- Bioimaging Data Curation Center, Seoul, South Korea
| | - Beom Joon Kim
- Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, South Korea
| | - Joon-Tae Kim
- Department of Neurology, Chonnam National University Hospital, Chonnam National University Medical School, Gwangju, South Korea
| | - Keun-Sik Hong
- Department of Neurology, Inje University Ilsan Paik Hospital, Inje University College of Medicine, Goyang, South Korea
| | - Kyung Bok Lee
- Department of Neurology, Soonchunhyang University Hospital, College of Medical Science, Soon Chun Hyang University, Seoul, South Korea
| | - Tai Hwan Park
- Department of Neurology, Seoul Medical Center, Seoul, South Korea
| | - Sang-Soon Park
- Department of Neurology, Seoul Medical Center, Seoul, South Korea
| | - Jong-Moo Park
- Department of Neurology, Uijeongbu Eulji Medical Center, Eulji University School of Medicine, Uijeongbu, South Korea
| | - Kyusik Kang
- Department of Neurology, Nowon Eulji Medical Center, Eulji University School of Medicine, Seoul, South Korea
| | - Yong-Jin Cho
- Department of Neurology, Inje University Ilsan Paik Hospital, Inje University College of Medicine, Goyang, South Korea
| | - Hong-Kyun Park
- Department of Neurology, Inje University Ilsan Paik Hospital, Inje University College of Medicine, Goyang, South Korea
| | - Byung-Chul Lee
- Department of Neurology, Hallym University Sacred Heart Hospital, College of Medicine, Hallym University, Anyang, South Korea
| | - Kyung-Ho Yu
- Department of Neurology, Hallym University Sacred Heart Hospital, College of Medicine, Hallym University, Anyang, South Korea
| | - Mi Sun Oh
- Department of Neurology, Hallym University Sacred Heart Hospital, College of Medicine, Hallym University, Anyang, South Korea
| | - Soo Joo Lee
- Department of Neurology, Eulji University Hospital, Eulji University School of Medicine, Daejeon, South Korea
| | - Jae Guk Kim
- Department of Neurology, Eulji University Hospital, Eulji University School of Medicine, Daejeon, South Korea
| | - Jae-Kwan Cha
- Department of Neurology, Dong-A University Hospital, Dong-A University College of Medicine, Busan, South Korea
| | - Dae-Hyun Kim
- Department of Neurology, Dong-A University Hospital, Dong-A University College of Medicine, Busan, South Korea
| | - Jun Lee
- Department of Neurology, Yeungnam University Hospital, Daegu, South Korea
| | - Man Seok Park
- Department of Neurology, Chonnam National University Hospital, Chonnam National University Medical School, Gwangju, South Korea
| | - Dongmin Kim
- Artificial Intelligence Research Center, JLK Inc., Seoul, South Korea
| | - Oh Young Bang
- Department of Neurology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
| | - Eung Yeop Kim
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
| | - Chul-Ho Sohn
- Department of Radiology, College of Medicine, Seoul National University, Seoul, South Korea
| | - Hosung Kim
- USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA, USA
| | - Hee-Joon Bae
- Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, South Korea
| | - Dong-Eog Kim
- National Priority Research Center for Stroke and Department of Neurology, Dongguk University Ilsan Hospital, 27, Dongguk-ro, Ilsandong-gu, Goyang, South Korea.
- Bioimaging Data Curation Center, Seoul, South Korea.
| |
Collapse
|
40
|
Luo R, Guo S, Hniopek J, Bocklitz T. 3D Hyperspectral Data Analysis with Spatially Aware Deep Learning for Diagnostic Applications. Anal Chem 2025; 97:7729-7737. [PMID: 40179245 PMCID: PMC12004353 DOI: 10.1021/acs.analchem.4c05549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2024] [Revised: 03/24/2025] [Accepted: 03/26/2025] [Indexed: 04/05/2025]
Abstract
Nowadays, with the rise of artificial intelligence (AI), deep learning algorithms play an increasingly important role in many traditional fields of research, and they have recently spread into data analysis for Raman spectroscopy. However, most current methods classify only 1-dimensional (1D) spectra and do not consider any neighboring information in space. Despite some successes, such methods waste the 3-dimensional (3D) structure of Raman hyperspectral scans. Therefore, to investigate the feasibility of preserving spatial information in Raman spectroscopic data analysis, spatially aware deep learning algorithms were applied to a colorectal tissue dataset of 3D Raman hyperspectral scans. This dataset contains Raman spectra from normal, hyperplasia, adenoma, and carcinoma tissues as well as artifacts. First, a modified version of 3D U-Net was used for segmentation; second, a convolutional neural network (CNN) operating on 3D Raman patches was used for pixel-wise classification. Both methods were compared with a conventional 1D CNN, which served as the baseline. The results for both epithelial tissue detection and colorectal cancer detection show that using spatially neighboring information in 3D Raman scans can increase the performance of deep learning models, although it may also increase the complexity of network training. Beyond the colorectal tissue dataset, experiments were also conducted on a cholangiocarcinoma dataset to verify generalizability. These findings can potentially be applied to future spectroscopic data analysis tasks, especially for improving model performance in a spatially aware way.
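The data handling implied above amounts to cutting spatially aware 3D patches (x, y, spectrum) from a hyperspectral cube so the classifier sees each pixel's spectral neighbourhood rather than a lone 1D spectrum. The sketch below assumes illustrative shapes and patch radius.

```python
# Sketch: extract (2r+1, 2r+1, B) patches around every pixel of a
# hyperspectral Raman scan; shapes and radius are assumptions.
import numpy as np

def extract_patches(cube: np.ndarray, radius: int = 2):
    """cube: (H, W, B) hyperspectral scan -> (N, 2r+1, 2r+1, B) patches."""
    H, W, B = cube.shape
    padded = np.pad(cube, ((radius, radius), (radius, radius), (0, 0)),
                    mode="reflect")
    patches = [padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1, :]
               for i in range(H) for j in range(W)]
    return np.stack(patches)

cube = np.random.rand(16, 16, 700)   # 16x16 scan, 700 Raman shifts
print(extract_patches(cube).shape)   # (256, 5, 5, 700)
```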
Collapse
Affiliation(s)
- Ruihao Luo
- Institute of Physical Chemistry (IPC) and Abbe School of Photonics (ASP), Friedrich-Schiller-Universität Jena, Helmholtzweg 4, 07743 Jena, Germany
- Leibniz Institute of Photonic Technology (IPHT), Albert-Einstein-Straße 9, 07745 Jena, Germany
| | - Shuxia Guo
- Institute of Physical Chemistry (IPC) and Abbe School of Photonics (ASP), Friedrich-Schiller-Universität Jena, Helmholtzweg 4, 07743 Jena, Germany
- Leibniz Institute of Photonic Technology (IPHT), Albert-Einstein-Straße 9, 07745 Jena, Germany
| | - Julian Hniopek
- Institute of Physical Chemistry (IPC) and Abbe School of Photonics (ASP), Friedrich-Schiller-Universität Jena, Helmholtzweg 4, 07743 Jena, Germany
- Leibniz Institute of Photonic Technology (IPHT), Albert-Einstein-Straße 9, 07745 Jena, Germany
| | - Thomas Bocklitz
- Institute of Physical Chemistry (IPC) and Abbe School of Photonics (ASP), Friedrich-Schiller-Universität Jena, Helmholtzweg 4, 07743 Jena, Germany
- Leibniz Institute of Photonic Technology (IPHT), Albert-Einstein-Straße 9, 07745 Jena, Germany
| |
Collapse
|
41
|
Ndzimbong W, Fourniol C, Themyr L, Thome N, Keeza Y, Sauer B, Piéchaud PT, Méjean A, Marescaux J, George D, Mutter D, Hostettler A, Collins T. TRUSTED: The Paired 3D Transabdominal Ultrasound and CT Human Data for Kidney Segmentation and Registration Research. Sci Data 2025; 12:615. [PMID: 40221416 PMCID: PMC11993632 DOI: 10.1038/s41597-025-04467-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2024] [Accepted: 01/14/2025] [Indexed: 04/14/2025] Open
Abstract
Inter-modal image registration (IMIR) and image segmentation with abdominal ultrasound (US) data have many important clinical applications, including image-guided surgery, automatic organ measurement, and robotic navigation. However, research is severely limited by the lack of public datasets. We propose TRUSTED (the Tridimensional Renal Ultra Sound TomodEnsitometrie Dataset), comprising paired transabdominal 3DUS and CT kidney images from 48 human patients (96 kidneys), including segmentations and anatomical landmark annotations by two experienced radiographers. Inter-rater segmentation agreement was over 93% (Dice score), and gold-standard segmentations were generated using the STAPLE algorithm. Seven anatomical landmarks were annotated for the development and evaluation of IMIR systems. To validate the dataset's utility, four competitive deep-learning models for kidney segmentation were benchmarked, yielding average Dice scores from 79.63% to 90.09% for CT and from 70.51% to 80.70% for US images. Four IMIR methods were benchmarked, and Coherent Point Drift performed best with an average Target Registration Error of 4.47 mm and a Dice score of 84.10%. The TRUSTED dataset may be used freely to develop and validate segmentation and IMIR methods.
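The Target Registration Error used to benchmark the IMIR methods is the mean distance between corresponding anatomical landmarks after applying the estimated US-to-CT transform; a minimal sketch follows, with a placeholder rigid transform as the assumption.

```python
# Sketch of Target Registration Error (TRE) for landmark-based evaluation
# of a rigid US-to-CT registration; the transform here is a placeholder.
import numpy as np

def target_registration_error(landmarks_us, landmarks_ct, R, t):
    """Both landmark arrays are (N, 3) in mm; R (3x3), t (3,) map US->CT."""
    mapped = landmarks_us @ R.T + t
    return np.linalg.norm(mapped - landmarks_ct, axis=1).mean()

us = np.random.rand(7, 3) * 100             # seven landmarks, as in TRUSTED
ct = us + np.random.normal(0, 2, us.shape)  # simulated annotation noise
print(target_registration_error(us, ct, np.eye(3), np.zeros(3)))
```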
Collapse
Affiliation(s)
- William Ndzimbong
- University of Strasbourg, ICUBE, Strasbourg, France.
- Research Institute against Digestive Cancer (IRCAD), Strasbourg, France.
| | | | - Loic Themyr
- Conservatoire National des Arts et Métiers (CNAM), CEDRIC, Paris, France
| | | | - Yvonne Keeza
- Research Institute against Digestive Cancer (IRCAD), Kigali, Rwanda
| | - Benoît Sauer
- Department of Radiology, Clinique Sainte-Anne, Groupe MIM, Strasbourg, France
| | | | | | - Jacques Marescaux
- Research Institute against Digestive Cancer (IRCAD), Strasbourg, France
| | - Daniel George
- University of Strasbourg, CNRS, ICUBE, Strasbourg, France
| | - Didier Mutter
- Institute of Image-Guided Surgery (IHU), Strasbourg, France
- Hepato-digestive Unit, University Hospital of Strasbourg (HUS), Strasbourg, France
- Research Institute against Digestive Cancer (IRCAD), Strasbourg, France
| | | | - Toby Collins
- Research Institute against Digestive Cancer (IRCAD), Strasbourg, France.
| |
Collapse
|
42
|
Li Y, Hui L, Wang X, Zou L, Chua S. Lung nodule detection using a multi-scale convolutional neural network and global channel spatial attention mechanisms. Sci Rep 2025; 15:12313. [PMID: 40210738 PMCID: PMC11986029 DOI: 10.1038/s41598-025-97187-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2024] [Accepted: 04/02/2025] [Indexed: 04/12/2025] Open
Abstract
Early detection of lung nodules is crucial for the prevention and treatment of lung cancer. However, current methods face challenges such as missing small nodules, variations in nodule size, and high false positive rates. To address these challenges, we propose a Global Channel Spatial Attention Mechanism (GCSAM). Building upon it, we develop a Candidate Nodule Detection Network (CNDNet) and a False Positive Reduction Network (FPRNet). CNDNet employs Res2Net as its backbone network to capture multi-scale features of lung nodules, utilizing GCSAM to fuse global contextual information, adaptively adjust feature weights, and refine processing along the spatial dimension. Additionally, we design a Hierarchical Progressive Feature Fusion (HPFF) module to effectively combine deep semantic information with shallow positional information, enabling high-sensitivity detection of nodules of varying sizes. FPRNet significantly reduces the false positive rate by accurately distinguishing true nodules from similar structures. Experimental results on the LUNA16 dataset demonstrate that our method achieves a competitive performance metric (CPM) value of 0.929 and a sensitivity of 0.977 under 2 false positives per scan. Compared to existing methods, our proposed method effectively reduces false positives while maintaining high sensitivity, achieving competitive results.
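The GCSAM itself is not specified in the abstract; the sketch below shows a compact channel-then-spatial attention module in the same spirit, following the widely used CBAM pattern. It is an illustrative stand-in, not the authors' exact design.

```python
# Compact channel-then-spatial attention (CBAM-style), illustrating the
# kind of global channel spatial attention the abstract describes.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention from global average- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # Spatial attention from channel-wise statistics.
        stats = torch.cat([x.mean(1, keepdim=True),
                           x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(stats))

print(ChannelSpatialAttention(32)(torch.randn(2, 32, 48, 48)).shape)
```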
Collapse
Affiliation(s)
- Yongbin Li
- Faculty of Medical Information Engineering, Zunyi Medical University, 563000, Zunyi, Guizhou, China
- Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, 94300, Kota Samarahan, Sarawak, Malaysia
| | - Linhu Hui
- Faculty of Medical Information Engineering, Zunyi Medical University, 563000, Zunyi, Guizhou, China
| | - Xiaohua Wang
- Faculty of Medical Information Engineering, Zunyi Medical University, 563000, Zunyi, Guizhou, China
| | - Liping Zou
- Faculty of Medical Information Engineering, Zunyi Medical University, 563000, Zunyi, Guizhou, China
| | - Stephanie Chua
- Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, 94300, Kota Samarahan, Sarawak, Malaysia.
| |
Collapse
|
43
|
Ahmad I, Anwar SJ, Hussain B, Ur Rehman A, Bermak A. Anatomy guided modality fusion for cancer segmentation in PET CT volumes and images. Sci Rep 2025; 15:12153. [PMID: 40204866 PMCID: PMC11982402 DOI: 10.1038/s41598-025-95757-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2024] [Accepted: 03/24/2025] [Indexed: 04/11/2025] Open
Abstract
Segmentation in computed tomography (CT) provides detailed anatomical information, while positron emission tomography (PET) provides the metabolic activity of cancer. Existing segmentation models for CT and PET either rely on early fusion, which struggles to effectively capture independent features from each modality, or late fusion, which is computationally expensive and fails to leverage the complementary nature of the two modalities. This research addresses the gap by proposing an intermediate fusion approach that optimally balances the strengths of both modalities. Our method leverages anatomical features to guide the fusion process while preserving spatial representation quality. We achieve this through separate encoding of anatomical and metabolic features followed by an attentive fusion decoder. Unlike traditional fixed normalization techniques, we introduce novel "zero layers" with learnable normalization. The proposed intermediate fusion reduces the number of filters, resulting in a lightweight model. Our approach demonstrates superior performance, achieving a Dice score of 0.8184 and an [Formula: see text] score of 2.31. The implications of this study include more precise tumor delineation, leading to enhanced cancer diagnosis and more effective treatment planning.
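The paper's "zero layers" are not specified here, so the sketch below shows only the generic underlying idea: replacing fixed normalization statistics with a learnable scale and shift applied per modality before fusion. Treat it as a simplified illustration, not the authors' layer.

```python
# Generic learnable normalization over a 3D feature volume; a simplified
# stand-in for the paper's unspecified "zero layers".
import torch
import torch.nn as nn

class LearnableNorm(nn.Module):
    def __init__(self, channels: int, eps: float = 1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1, 1))
        self.eps = eps

    def forward(self, x):  # x: (B, C, D, H, W) CT or PET feature volume
        mu = x.mean(dim=(2, 3, 4), keepdim=True)
        var = x.var(dim=(2, 3, 4), keepdim=True, unbiased=False)
        return self.gamma * (x - mu) / torch.sqrt(var + self.eps) + self.beta

ct_feat = torch.randn(1, 16, 8, 32, 32)
print(LearnableNorm(16)(ct_feat).shape)
```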
Collapse
Affiliation(s)
- Ibtihaj Ahmad
- Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, People's Republic of China
- School of Public Health, Shandong University, Jinan, Shandong, People's Republic of China
| | - Sadia Jabbar Anwar
- Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, People's Republic of China
| | - Bagh Hussain
- Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, People's Republic of China
| | - Atiq Ur Rehman
- Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar.
| | - Amine Bermak
- Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
| |
Collapse
|
44
|
Pan P, Zhang C, Sun J, Guo L. Multi-scale conv-attention U-Net for medical image segmentation. Sci Rep 2025; 15:12041. [PMID: 40199917 PMCID: PMC11978844 DOI: 10.1038/s41598-025-96101-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2024] [Accepted: 03/26/2025] [Indexed: 04/10/2025] Open
Abstract
U-Net-based network structures are widely used in medical image segmentation. However, effectively capturing multi-scale features and spatial context information of complex anatomical structures remains a challenge. To address this, we propose a novel network structure based on the U-Net backbone. This model integrates the Adaptive Convolution (AC) module, Multi-Scale Learning (MSL) module, and Conv-Attention module to enhance feature expression ability and segmentation performance. The AC module dynamically adjusts the convolutional kernel through an adaptive convolutional layer, enabling the model to adaptively extract features of different shapes and scales and further improving its performance in complex scenarios. The MSL module is designed for multi-scale information fusion. It effectively aggregates fine-grained and high-level semantic features from different resolutions, creating rich multi-scale connections between the encoding and decoding processes. The Conv-Attention module incorporates an efficient attention mechanism into the skip connections. It captures global context information using a low-dimensional proxy for high-dimensional data, reducing computational complexity while maintaining effective spatial and channel information extraction. Experimental validation on the CVC-ClinicDB, MICCAI 2023 Tooth, and ISIC2017 datasets demonstrates that our proposed MSCA-UNet significantly improves segmentation accuracy and model robustness while remaining lightweight, outperforming existing segmentation methods.
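The multi-scale fusion idea behind the MSL module can be pictured as resampling encoder features from several resolutions to a common size and merging them with a 1x1 convolution, as sketched below; channel counts are assumptions for the example, not the paper's configuration.

```python
# Illustrative multi-scale feature fusion: resize features from several
# scales to a target resolution, concatenate, and fuse with a 1x1 conv.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusion(nn.Module):
    def __init__(self, in_channels: list[int], out_channels: int):
        super().__init__()
        self.fuse = nn.Conv2d(sum(in_channels), out_channels, kernel_size=1)

    def forward(self, feats, target_hw):
        resized = [F.interpolate(f, size=target_hw, mode="bilinear",
                                 align_corners=False) for f in feats]
        return self.fuse(torch.cat(resized, dim=1))

feats = [torch.randn(1, c, s, s) for c, s in [(32, 64), (64, 32), (128, 16)]]
print(MultiScaleFusion([32, 64, 128], 64)(feats, (64, 64)).shape)
```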
Collapse
Affiliation(s)
- Peng Pan
- College of Technology and Data, Yantai Nanshan University, Yantai, 265713, China
| | - Chengxue Zhang
- College of Technology and Data, Yantai Nanshan University, Yantai, 265713, China
| | - Jingbo Sun
- College of Technology and Data, Yantai Nanshan University, Yantai, 265713, China.
| | - Lina Guo
- College of Technology and Data, Yantai Nanshan University, Yantai, 265713, China
| |
Collapse
|
45
|
Zhou Y, Su H, Wang T, Hu Q. Onet: Twin U-Net Architecture for Unsupervised Binary Semantic Segmentation in Radar and Remote Sensing Images. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2025; 34:2161-2172. [PMID: 40031275 DOI: 10.1109/tip.2025.3530816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/05/2025]
Abstract
Segmenting objects from cluttered backgrounds in single-channel images, such as marine radar echoes, medical images, and remote sensing images, poses significant challenges due to limited texture, color information, and diverse target types. This paper proposes a novel solution: the Onet, an O-shaped assembly of twin U-Net deep neural networks, designed for unsupervised binary semantic segmentation. The Onet, trained with an intensity-complementary image pair and without the need for annotated labels, maximizes the Jensen-Shannon divergence (JSD) between the densely localized features and the class probability maps. By leveraging the symmetry of U-Net, Onet subtly strengthens the dependence between dense local features, global features, and class probability maps during the training process. The design of the complementary input pair aligns with the theoretical requirement that optimizing JSD needs the class probability of negative samples to accurately estimate the marginal distribution. Compared to the current leading unsupervised segmentation methods, the Onet demonstrates superior performance in target segmentation in marine radar frames and cloud segmentation in remote sensing images. Notably, we found that Onet's foreground prediction significantly enhances the signal-to-noise ratio (SNR) of targets amidst marine radar clutter. Onet's source code is publicly accessible at https://github.com/joeyee/Onet.
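For reference, the Jensen-Shannon divergence that Onet maximizes is sketched below for two discrete class-probability vectors; this is only the standard JSD formula, not the paper's full training objective.

```python
# Jensen-Shannon divergence between two class-probability distributions.
import torch

def jsd(p: torch.Tensor, q: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """p, q: (..., K) class probabilities along the last dimension."""
    p, q = p.clamp_min(eps), q.clamp_min(eps)
    m = 0.5 * (p + q)
    kl = lambda a, b: (a * (a / b).log()).sum(dim=-1)  # KL(a || b)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = torch.tensor([0.9, 0.1])
q = torch.tensor([0.2, 0.8])
print(float(jsd(p, q)))  # larger when the two distributions disagree
```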
Collapse
|
46
|
Qiu J, Karageorgos GM, Peng X, Ghose S, Yang Z, Dentinger A, Xu Z, Jo J, Ragupathi S, Xu G, Abdulaziz N, Gandikota G, Wang X, Mills D. SwinDAF3D: Pyramid Swin Transformers with Deep Attentive Features for Automated Finger Joint Segmentation in 3D Ultrasound Images for Rheumatoid Arthritis Assessment. Bioengineering (Basel) 2025; 12:390. [PMID: 40281750 PMCID: PMC12025309 DOI: 10.3390/bioengineering12040390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2025] [Revised: 04/01/2025] [Accepted: 04/03/2025] [Indexed: 04/29/2025] Open
Abstract
Rheumatoid arthritis (RA) is a chronic autoimmune disease that can cause severe joint damage and functional impairment. Ultrasound imaging has shown promise in providing real-time assessment of synovium inflammation associated with the early stages of RA. Accurate segmentation of the synovium region and quantification of inflammation-specific imaging biomarkers are crucial for assessing and grading RA. However, automatic segmentation of the synovium in 3D ultrasound is challenging due to ambiguous boundaries, variability in synovium shape, and inhomogeneous intensity distribution. In this work, we introduce a novel network architecture, Swin Transformers with Deep Attentive Features for 3D segmentation (SwinDAF3D), which integrates Swin Transformers into a Deep Attentive Features framework. The developed architecture leverages the hierarchical structure and shifted windows of Swin Transformers to capture rich, multi-scale and attentive contextual information, improving the modeling of long-range dependencies and spatial hierarchies in 3D ultrasound images. In a six-fold cross-validation study with 3D ultrasound images of RA patients' finger joints (n = 72), our SwinDAF3D model achieved the highest performance with a Dice Score (DSC) of 0.838 ± 0.013, an Intersection over Union (IoU) of 0.719 ± 0.019, and Surface Dice Score (SDSC) of 0.852 ± 0.020, compared to 3D UNet (DSC: 0.742 ± 0.025; IoU: 0.589 ± 0.031; SDSC: 0.661 ± 0.029), DAF3D (DSC: 0.813 ± 0.017; IoU: 0.689 ± 0.022; SDSC: 0.817 ± 0.013), Swin UNETR (DSC: 0.808 ± 0.025; IoU: 0.678 ± 0.032; SDSC: 0.822 ± 0.039), UNETR++ (DSC: 0.810 ± 0.014; IoU: 0.684 ± 0.018; SDSC: 0.829 ± 0.027) and TransUNet (DSC: 0.818 ± 0.013; IoU: 0.692 ± 0.017; SDSC: 0.815 ± 0.016) models. This ablation study demonstrates the effectiveness of combining a Swin Transformers feature pyramid with a deep attention mechanism, improving the segmentation accuracy of the synovium in 3D ultrasound. This advancement shows great promise in enabling more efficient and standardized RA screening using ultrasound imaging.
Collapse
Affiliation(s)
- Jianwei Qiu
- GE HealthCare Technology & Innovation Center, Niskayuna, NY 12309, USA
| | - Grigorios M. Karageorgos
- GE HealthCare Technology & Innovation Center, Niskayuna, NY 12309, USA
| | - Xiaorui Peng
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - Soumya Ghose
- GE HealthCare Technology & Innovation Center, Niskayuna, NY 12309, USA
| | | | - Aaron Dentinger
- GE HealthCare Technology & Innovation Center, Niskayuna, NY 12309, USA
| | - Zhanpeng Xu
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - Janggun Jo
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - Siddarth Ragupathi
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - Guan Xu
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - Nada Abdulaziz
- Division of Rheumatology, Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA
| | - Girish Gandikota
- Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Xueding Wang
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - David Mills
- GE HealthCare Technology & Innovation Center, Niskayuna, NY 12309, USA
| |
Collapse
|
47
|
Xu X, Sun C, Yu H, Yan G, Zhu Q, Kong X, Pan Y, Xu H, Zheng T, Zhou C, Wang Y, Xiao J, Chen R, Li M, Zhang S, Hu H, Zou Y, Wang J, Wang G, Wu D. Site effects in multisite fetal brain MRI: morphological insights into early brain development. Eur Radiol 2025; 35:1830-1842. [PMID: 39299951 DOI: 10.1007/s00330-024-11084-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2024] [Revised: 06/06/2024] [Accepted: 08/26/2024] [Indexed: 09/22/2024]
Abstract
OBJECTIVE To evaluate multisite effects on fetal brain MRI. Specifically, to identify crucial acquisition factors affecting fetal brain structural measurements and developmental patterns, while assessing the effectiveness of existing harmonization methods in mitigating site effects. MATERIALS AND METHODS Between May 2017 and March 2022, in-utero MRI with T2-weighted fast spin-echo sequences was performed on healthy fetuses of retrospectively recruited pregnant volunteers on four different scanners at four sites. A generalized additive model (GAM) was used to quantitatively assess site effects, including field strength (FS), manufacturer (M), in-plane resolution (R), and slice thickness (ST), on subcortical volume and cortical morphological measurements, including cortical thickness, curvature, and sulcal depth. Growth models were selected to elucidate the developmental trajectories of these morphological measurements. Welch's test was performed to evaluate the influence of site effects on developmental trajectories. The ComBat-GAM harmonization method was applied to mitigate site-related biases. RESULTS The final analytic sample consisted of 340 MRI scans from 218 fetuses (mean GA, 30.1 weeks ± 4.4 [range, 21.7-40 weeks]). GAM results showed that lower FS and lower spatial resolution led to overestimation of subcortical volumes and cortical morphological measurements in selected brain regions. Only the peak cortical thickness in developmental trajectories was significantly influenced by the effects of FS and R. Notably, ComBat-GAM harmonization effectively removed site effects while preserving developmental patterns. CONCLUSION Our findings pinpoint the key acquisition factors in in-utero fetal brain MRI and underscore the necessity of data harmonization when pooling multisite data for fetal brain morphology investigations. KEY POINTS Question How do site-specific MRI acquisition factors affect fetal brain imaging? Finding Lower FS and spatial resolution overestimated subcortical volumes and cortical measurements; cortical thickness in developmental trajectories was influenced by FS and in-plane resolution. Clinical relevance This study provides important guidelines for the fetal MRI community when scanning fetal brains and underscores the necessity of data harmonization in cross-center fetal studies.
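ComBat-GAM itself fits covariate effects with a GAM and pools site parameters with empirical Bayes; the toy sketch below shows only the core location-scale idea of site harmonization (aligning each site's mean and variance to a pooled reference), as a simplified illustration with synthetic numbers.

```python
# Toy location-scale site harmonization (the core idea behind ComBat-style
# methods, heavily simplified); all data below are synthetic.
import numpy as np

def harmonize_location_scale(values: np.ndarray, sites: np.ndarray):
    """values: (N,) one morphological measure; sites: (N,) site labels."""
    grand_mean, grand_std = values.mean(), values.std()
    out = np.empty_like(values, dtype=float)
    for s in np.unique(sites):
        idx = sites == s
        z = (values[idx] - values[idx].mean()) / values[idx].std()
        out[idx] = z * grand_std + grand_mean  # re-anchor to pooled stats
    return out

rng = np.random.default_rng(0)
vals = np.concatenate([rng.normal(2.6, 0.2, 50),   # site A cortical thickness
                       rng.normal(2.9, 0.3, 50)])  # site B, scanner offset
sites = np.array(["A"] * 50 + ["B"] * 50)
print(harmonize_location_scale(vals, sites).mean())
```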
Collapse
Affiliation(s)
- Xinyi Xu
- Department of Biomedical Engineering, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China
| | - Cong Sun
- Department of Radiology, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
| | - Hong Yu
- Dalian Municipal Women and Children's Medical Center (Group), Dalian, China
| | - Guohui Yan
- Department of Radiology, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Qingqing Zhu
- Department of Radiology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Xianglei Kong
- Department of Radiology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yibin Pan
- Department of Obstetrics and Gynecology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Key Laboratory of Reproductive Dysfunction Management of Zhejiang Province, Zhejiang Provincial Clinical Research Center for Obstetrics and Gynecology, Hangzhou, China
| | - Haoan Xu
- Department of Biomedical Engineering, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China
| | - Tianshu Zheng
- Department of Biomedical Engineering, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China
| | - Chi Zhou
- Department of Biomedical Engineering, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China
| | - Yutian Wang
- Department of Biomedical Engineering, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China
| | - Jiaxin Xiao
- Department of Biomedical Engineering, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China
- School of Biomedical Engineering & Imaging Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK
| | - Ruike Chen
- Department of Biomedical Engineering, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China
| | - Mingyang Li
- Department of Biomedical Engineering, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China
| | - Songying Zhang
- Department of Obstetrics and Gynecology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Key Laboratory of Reproductive Dysfunction Management of Zhejiang Province, Zhejiang Provincial Clinical Research Center for Obstetrics and Gynecology, Hangzhou, China
| | - Hongjie Hu
- Department of Radiology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China.
| | - Yu Zou
- Department of Radiology, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China.
| | - Jingshi Wang
- Dalian Municipal Women and Children's Medical Center (Group), Dalian, China.
| | - Guangbin Wang
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China.
| | - Dan Wu
- Department of Biomedical Engineering, College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China.
| |
Collapse
|
48
|
Zhang B, Huang H, Shen Y, Sun M. MM-UKAN++: A Novel Kolmogorov-Arnold Network-Based U-Shaped Network for Ultrasound Image Segmentation. IEEE TRANSACTIONS ON ULTRASONICS, FERROELECTRICS, AND FREQUENCY CONTROL 2025; 72:498-514. [PMID: 40031744 DOI: 10.1109/tuffc.2025.3539262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/05/2025]
Abstract
Ultrasound (US) imaging is an important and commonly used medical imaging modality. Accurate and fast automatic segmentation of regions of interest (ROIs) in US images is essential for enhancing the efficiency of clinical and robot-assisted diagnosis. However, US images suffer from low contrast, fuzzy boundaries, and significant scale variations in ROIs. Existing convolutional neural network (CNN)-based and transformer-based methods struggle with model efficiency and explainability. To address these challenges, we introduce MM-UKAN++, a novel U-shaped network based on Kolmogorov-Arnold networks (KANs). MM-UKAN++ leverages multilevel KAN layers as the encoder and decoder within the U-network architecture and incorporates an innovative multidimensional attention mechanism to refine skip connections by weighting features from frequency-channel and spatial perspectives. In addition, the network effectively integrates multiscale information, fusing outputs from decoders at different scales to generate precise segmentation predictions. MM-UKAN++ achieves higher segmentation accuracy with lower computational cost and outperforms other mainstream methods on several open-source datasets for US image segmentation tasks, achieving 69.42% IoU, 81.30% Dice, and 3.31 mm HD on the BUSI dataset with 3.17 G floating-point operations (FLOPs) and 9.90 M parameters. The excellent performance on our automatic carotid artery US scanning and diagnostic system further proves the speed and accuracy of MM-UKAN++. In addition, its good performance in other medical image segmentation tasks reveals promising applications of MM-UKAN++. The code is available on GitHub.
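KAN layers replace fixed activations with learnable univariate functions on each edge. The heavily simplified single-layer sketch below uses a Gaussian radial basis for those functions; real KAN implementations typically use B-splines plus a base term, so treat this only as an illustration of the idea.

```python
# Heavily simplified Kolmogorov-Arnold layer: each output is a sum of
# learnable univariate functions of the inputs, here parameterized by a
# fixed Gaussian basis with learnable per-edge coefficients.
import torch
import torch.nn as nn

class SimpleKANLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, n_basis: int = 8):
        super().__init__()
        self.centers = nn.Parameter(torch.linspace(-2, 2, n_basis),
                                    requires_grad=False)
        # One coefficient vector per (input, output) edge function.
        self.coef = nn.Parameter(torch.randn(in_dim, out_dim, n_basis) * 0.1)

    def forward(self, x):                        # x: (batch, in_dim)
        # Gaussian basis expansion of every input coordinate.
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) ** 2))
        # y_j = sum_i phi_ij(x_i), phi_ij a learned basis combination.
        return torch.einsum("bik,iok->bo", basis, self.coef)

print(SimpleKANLayer(16, 4)(torch.randn(2, 16)).shape)  # torch.Size([2, 4])
```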
Collapse
|
49
|
Kim MS, Amm E, Parsi G, ElShebiny T, Motro M. Automated dentition segmentation: 3D UNet-based approach with MIScnn framework. J World Fed Orthod 2025; 14:84-90. [PMID: 39489636 DOI: 10.1016/j.ejwf.2024.09.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2024] [Revised: 09/18/2024] [Accepted: 09/18/2024] [Indexed: 11/05/2024]
Abstract
INTRODUCTION Advancements in technology have led to the adoption of digital workflows in dentistry, which require the segmentation of regions of interest from cone-beam computed tomography (CBCT) scans. These segmentations assist in diagnosis, treatment planning, and research. However, manual segmentation is an expensive and labor-intensive process. Therefore, automated methods, such as convolutional neural networks (CNNs), provide a more efficient way to generate segmentations from CBCT scans. METHODS A three-dimensional UNet-based CNN model, utilizing the Medical Image Segmentation CNN framework, was used for training and generating predictions from CBCT scans. A dataset of 351 CBCT scans, with ground-truth labels created through manual segmentation using AI-assisted segmentation software, was prepared. Data preprocessing, augmentation, and model training were performed, and the performance of the proposed CNN model was analyzed. RESULTS The CNN model achieved high accuracy in segmenting maxillary and mandibular teeth from CBCT scans, with average Dice Similarity Coefficient values of 91.83% and 91.35% for maxillary and mandibular teeth, respectively. Performance metrics, including Intersection over Union, precision, and recall, further confirmed the model's effectiveness. CONCLUSIONS The study demonstrates the efficacy of the three-dimensional UNet-based CNN model within the Medical Image Segmentation CNN framework for automated segmentation of maxillary and mandibular dentition from CBCT scans. Automated segmentation using CNNs has the potential to deliver accurate and efficient results, offering a significant advantage over traditional segmentation methods.
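For orientation, the sketch below shows the basic 3D U-Net building block such a pipeline rests on (two 3D convolutions with normalization and ReLU) in plain PyTorch; this is a generic illustration, not the MIScnn framework's own API.

```python
# Generic 3D U-Net "double convolution" block in PyTorch; input shape is
# an illustrative single-channel CBCT patch.
import torch
import torch.nn as nn

class DoubleConv3D(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_ch), nn.ReLU(inplace=True),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        return self.block(x)

cbct_patch = torch.randn(1, 1, 32, 64, 64)    # one grayscale CBCT patch
print(DoubleConv3D(1, 16)(cbct_patch).shape)  # torch.Size([1, 16, 32, 64, 64])
```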
Collapse
Affiliation(s)
- Min Seok Kim
- Department of Orthodontics and Dentofacial Orthopedics, Boston University Goldman School of Dentistry, Boston, Massachusetts.
| | - Elie Amm
- Department of Orthodontics and Dentofacial Orthopedics, Boston University Goldman School of Dentistry, Boston, Massachusetts
| | - Goli Parsi
- Department of Orthodontics and Dentofacial Orthopedics, Boston University Goldman School of Dentistry, Boston, Massachusetts
| | - Tarek ElShebiny
- Department of Orthodontics, Case Western Reserve University School of Dental Medicine, Cleveland, Ohio
| | - Melih Motro
- Department of Orthodontics and Dentofacial Orthopedics, Boston University Goldman School of Dentistry, Boston, Massachusetts
| |
Collapse
|
50
|
Singh DP, Banerjee T, Kour P, Swain D, Narayan Y. CICADA (UCX): A novel approach for automated breast cancer classification through aggressiveness delineation. Comput Biol Chem 2025; 115:108368. [PMID: 39914074 DOI: 10.1016/j.compbiolchem.2025.108368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2024] [Revised: 01/15/2025] [Accepted: 01/26/2025] [Indexed: 02/26/2025]
Abstract
Breast cancer remains one of the leading causes of mortality worldwide, and current classification and segmentation techniques often fall short in accurately distinguishing between benign and malignant cases. This study introduces CICADA (UCX), a Cheetah-Inspired Convex Adaptive Discriminator Algorithm with a U-Net ConvNeXt backbone, designed for breast tumor segmentation with a focus on delineating aggressiveness, thereby enhancing diagnostic precision in classifying aggressive tumor characteristics. Breast cancer segmentation refers to the delineation of malignant tissue borders in medical imaging: the goal is to separate the malignant area from healthy tissue precisely, enabling reliable evaluation of tumor attributes such as location, size, and shape. Historically, manual segmentation by radiologists has been the benchmark, but it is labor-intensive and subject to both inter- and intra-observer variability. With the advancement of medical imaging technologies, there is increasing demand for automated or semi-automated systems that perform segmentation efficiently and precisely, minimizing human error, improving reproducibility, and expediting diagnosis to enable prompt treatment. A significant problem in breast cancer segmentation is the variability of tumor morphology across patients and imaging techniques: neoplasms vary considerably in size, shape, and density, complicating the design of a universal approach, and factors such as dense breast tissue, which can obscure tumors in mammograms, complicate segmentation further. A further barrier is the need for large, meticulously annotated datasets to train and test machine learning models, as medical image annotation is labor-intensive and demands specialized expertise. Notwithstanding these obstacles, automated breast cancer segmentation has demonstrated significant clinical potential: it helps radiologists identify suspicious areas quickly and precisely, leading to earlier diagnosis and better patient outcomes, and it supports treatment planning by providing accurate measurements of tumor size and location, which are essential for choosing suitable surgical or radiation strategies. The proposed model performs strongly, with a mean IoU of 96.34%, a Dice coefficient/F1-score of 99.6461%, and an AUC of 99.88%, outperforming state-of-the-art methods in a thorough comparative analysis. The study also incorporates several feature selection techniques, including Particle Swarm Optimisation, Dragonfly, and Grey Wolf optimizers, alongside the proposed CICADA (UCX). The results highlight the potential of CICADA (UCX) to substantially improve medical image analysis and enable more precise and efficient diagnosis, with implications for advancing medical image segmentation techniques and computer-aided diagnosis.
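For readers comparing the reported metrics: for binary masks, Dice and IoU are tied by the identity Dice = 2·IoU/(1 + IoU). The toy check below verifies this on a synthetic overlap; the masks are illustrative, not study data.

```python
# Toy check of the IoU-Dice identity (Dice = 2*IoU / (1 + IoU)) on
# synthetic binary masks.
import numpy as np

pred = np.zeros((10, 10), bool); pred[2:8, 2:8] = True
truth = np.zeros((10, 10), bool); truth[4:10, 4:10] = True
inter = np.logical_and(pred, truth).sum()
union = np.logical_or(pred, truth).sum()
iou = inter / union
dice = 2 * inter / (pred.sum() + truth.sum())
print(round(iou, 3), round(dice, 3), round(2 * iou / (1 + iou), 3))
# 0.286 0.444 0.444 -> dice matches 2*IoU/(1+IoU)
```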
Collapse
Affiliation(s)
- Davinder Paul Singh
- Department of Computer Science and Engineering, School of Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat, India.
| | | | - Pawandeep Kour
- Department of Chemistry, University of Kashmir, Srinagar, Jammu and Kashmir, India.
| | - Debabrata Swain
- Department of Computer Science and Engineering, School of Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat, India
| | | |
Collapse
|