1
Tian F, Zhai J, Gong J, Lei W, Chang S, Ju F, Qian S, Zou X. SAM-MedUS: a foundational model for universal ultrasound image segmentation. J Med Imaging (Bellingham) 2025; 12:027001. [PMID: 40028655 PMCID: PMC11865838 DOI: 10.1117/1.jmi.12.2.027001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2024] [Revised: 01/19/2025] [Accepted: 02/04/2025] [Indexed: 03/05/2025] Open
Abstract
Purpose Segmentation of ultrasound images is crucial for medical diagnosis, monitoring, and research. Although existing methods perform well, they are limited to specific organs, tumors, and imaging devices. Applications of the Segment Anything Model (SAM), such as SAM-med2d, use a large number of medical datasets that contain only a small fraction of ultrasound images. Approach In this work, we propose SAM-MedUS, a model for generic ultrasound image segmentation, and use the latest publicly available ultrasound image datasets to build a diverse dataset containing eight site categories for training and testing. We integrate ConvNext V2 and CM blocks in the encoder for better global context extraction. In addition, a boundary loss function is used to improve the segmentation of fuzzy boundaries and low-contrast ultrasound images. Results Experimental results show that SAM-MedUS outperforms recent methods on multiple ultrasound datasets. For easier datasets such as the adult kidney, it achieves 87.93% IoU and 93.58% Dice, whereas for more complex ones such as the infant vein, IoU and Dice reach 62.31% and 78.93%, respectively. Conclusions We collected and collated an ultrasound dataset covering multiple site types to achieve uniform segmentation of ultrasound images. In addition, the auxiliary branches, ConvNext V2 and the CM block, enhance the model's ability to extract global information, and the boundary loss allows the model to exhibit robust performance and excellent generalization ability.
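The IoU and Dice figures reported in this abstract are overlap scores between a predicted mask and a ground-truth mask. As a minimal sketch of how such scores are commonly computed for a single image (the function name `iou_and_dice` and the toy masks are hypothetical, and the paper's exact averaging scheme across images is not stated in the abstract):

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two binary masks given as nested lists of 0/1."""
    inter = union = pred_sum = truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0      # pixel is foreground in both masks
            union += 1 if (p or t) else 0       # pixel is foreground in either mask
            pred_sum += 1 if p else 0
            truth_sum += 1 if t else 0
    # Convention assumed here: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0
    iou = inter / union
    dice = 2 * inter / (pred_sum + truth_sum)
    return iou, dice


# Hypothetical 4x4 masks, for illustration only (not data from the paper).
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
iou, dice = iou_and_dice(pred, truth)
print(f"IoU={iou:.4f} Dice={dice:.4f}")
```

For a single pair of masks the two scores are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Scores averaged over many images need not satisfy this identity exactly.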
Affiliation(s)
- Feng Tian
  - Hunan Normal University, The School of Physics and Electronics, Changsha, China
- Jintao Zhai
  - Hunan University, College of Computer Science and Electronic Engineering, Changsha, China
- Jinru Gong
  - Hunan Normal University, The School of Physics and Electronics, Changsha, China
- Weirui Lei
  - Hunan Normal University, The School of Physics and Electronics, Changsha, China
- Shuai Chang
  - Hunan Normal University, The School of Physics and Electronics, Changsha, China
- Fangfang Ju
  - Hunan Normal University, The School of Physics and Electronics, Changsha, China
- Shengyou Qian
  - Hunan Normal University, The School of Physics and Electronics, Changsha, China
- Xiao Zou
  - Hunan Normal University, The School of Physics and Electronics, Changsha, China
2
Dong P, Zhang R, Li J, Liu C, Liu W, Hu J, Yang Y, Li X. An ultrasound image segmentation method for thyroid nodules based on dual-path attention mechanism-enhanced UNet++. BMC Med Imaging 2024; 24:341. [PMID: 39695984 DOI: 10.1186/s12880-024-01521-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Accepted: 12/02/2024] [Indexed: 12/20/2024] Open
Abstract
PURPOSE This study aims to design an auxiliary segmentation model for thyroid nodules to increase diagnostic accuracy and efficiency, thereby reducing the workload of medical personnel. METHODS This study proposes a Dual-Path Attention Mechanism (DPAM)-UNet++ model, which can automatically segment thyroid nodules in ultrasound images. Specifically, the model incorporates dual-path attention modules into the skip connections of the UNet++ network to capture global contextual information in feature maps. The model's performance was evaluated using Intersection over Union (IoU), F1_score, accuracy, etc. Additionally, a new integrated loss function was designed for the DPAM-UNet++ network. RESULTS Comparative experiments with classical segmentation models revealed that the DPAM-UNet++ model achieved an IoU of 0.7451, an F1_score of 0.8310, an accuracy of 0.9718, a precision of 0.8443, a recall of 0.8702, an Area Under Curve (AUC) of 0.9213, and an HD95 of 35.31. Except for the precision metric, this model outperformed the other models on all the indicators and achieved a segmentation effect that was more similar to that of the ground truth labels. Additionally, ablation experiments verified the effectiveness and necessity of the dual-path attention mechanism and the integrated loss function. CONCLUSION The segmentation model proposed in this study can effectively capture global contextual information in ultrasound images and accurately identify the locations of nodule areas. The model yields excellent segmentation results, especially for small and multiple nodules. Additionally, the integrated loss function improves the segmentation of nodule edges, enhancing the model's accuracy in segmenting edge details.
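The accuracy, precision, recall, and F1 values listed in this abstract are typically derived from pixel-wise confusion counts. The sketch below shows that derivation under the assumption of binary masks. The helper `pixel_metrics` and the toy masks are hypothetical and are not taken from the paper.

```python
def pixel_metrics(pred, truth):
    """Pixel-wise precision, recall, F1 and accuracy for binary masks (nested lists of 0/1)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # nodule pixel correctly predicted
            elif p:
                fp += 1          # background predicted as nodule
            elif t:
                fn += 1          # nodule pixel missed
            else:
                tn += 1          # background correctly predicted
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical 4x5 masks: a small nodule on a large background.
truth = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
metrics = pixel_metrics(pred, truth)
print(metrics)
```

In this toy case accuracy is 0.9 even though only half of the nodule pixels are recovered, because background pixels dominate the count. This is one reason overlap-based scores such as IoU are usually reported alongside accuracy for small targets.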
Affiliation(s)
- Peizhen Dong
  - College of Information Science and Technology, Shihezi University, Shihezi, 832003, Xinjiang, China
- Ronghua Zhang
  - College of Information Science and Technology, Shihezi University, Shihezi, 832003, Xinjiang, China
- Jun Li
  - Department of Medical Ultrasound, The First Affiliated Hospital of Medical College, Shihezi University, Shihezi, 832003, Xinjiang, China
- Changzheng Liu
  - College of Information Science and Technology, Shihezi University, Shihezi, 832003, Xinjiang, China
- Wen Liu
  - Department of Medical Ultrasound, The First Affiliated Hospital of Medical College, Shihezi University, Shihezi, 832003, Xinjiang, China
- Jiale Hu
  - College of Information Science and Technology, Shihezi University, Shihezi, 832003, Xinjiang, China
- Yongqiang Yang
  - College of Information Science and Technology, Shihezi University, Shihezi, 832003, Xinjiang, China
- Xiang Li
  - College of Information Science and Technology, Shihezi University, Shihezi, 832003, Xinjiang, China
3
Hu M, Zhang Y, Xue H, Lv H, Han S. Mamba- and ResNet-Based Dual-Branch Network for Ultrasound Thyroid Nodule Segmentation. Bioengineering (Basel) 2024; 11:1047. [PMID: 39451422 PMCID: PMC11504408 DOI: 10.3390/bioengineering11101047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Revised: 10/15/2024] [Accepted: 10/18/2024] [Indexed: 10/26/2024] Open
Abstract
Accurate segmentation of thyroid nodules in ultrasound images is crucial for the diagnosis of thyroid cancer and preoperative planning. However, the segmentation of thyroid nodules is challenging due to their irregular shape, blurred boundary, and uneven echo texture. To address these challenges, a novel Mamba- and ResNet-based dual-branch network (MRDB) is proposed. Specifically, the visual state space block (VSSB) from Mamba and ResNet-34 are utilized to construct a dual encoder for extracting global semantics and local details, and establishing multi-dimensional feature connections. Meanwhile, an upsampling-convolution strategy is employed in the left decoder focusing on image size and detail reconstruction. A convolution-upsampling strategy is used in the right decoder to emphasize gradual feature refinement and recovery. To facilitate the interaction between local details and global context within the encoder and decoder, cross-skip connection is introduced. Additionally, a novel hybrid loss function is proposed to improve the boundary segmentation performance of thyroid nodules. Experimental results show that MRDB outperforms the state-of-the-art approaches with DSC of 90.02% and 80.6% on two public thyroid nodule datasets, TN3K and TNUI-2021, respectively. Furthermore, experiments on a third external dataset, DDTI, demonstrate that our method improves the DSC by 10.8% compared to baseline and exhibits good generalization to clinical small-scale thyroid nodule datasets. The proposed MRDB can effectively improve thyroid nodule segmentation accuracy and has great potential for clinical applications.
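The abstract does not say which terms make up its hybrid loss. Purely as an illustration of what a hybrid segmentation loss can look like, the sketch below combines binary cross-entropy with a soft Dice term. This is a common generic pairing, not the authors' formulation, and the names `bce_loss`, `soft_dice_loss`, `hybrid_loss` and the weight `alpha` are all assumptions.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened per-pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def soft_dice_loss(probs, targets, smooth=1.0):
    """1 - soft Dice; `smooth` keeps the ratio defined for empty masks."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1 - (2 * inter + smooth) / (denom + smooth)


def hybrid_loss(probs, targets, alpha=0.5):
    """Weighted sum of the two terms; alpha is a hypothetical mixing weight."""
    return alpha * bce_loss(probs, targets) + (1 - alpha) * soft_dice_loss(probs, targets)


# Hypothetical flattened predictions for four pixels.
targets = [1, 0, 1, 0]
good = hybrid_loss([0.9, 0.1, 0.8, 0.2], targets)
bad = hybrid_loss([0.4, 0.6, 0.3, 0.7], targets)
print(good, bad)
```

The cross-entropy term scores each pixel independently, while the Dice term scores region overlap. Combining them is one way to keep gradients informative when the foreground is small, which may be relevant for nodules with blurred boundaries.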
Affiliation(s)
- Min Hu
  - Department of Medical Electronics, School of Biomedical Engineering, Air Force Medical University, Xi’an 710032, China
- Yaorong Zhang
  - Department of Medical Electronics, School of Biomedical Engineering, Air Force Medical University, Xi’an 710032, China
  - School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
- Huijun Xue
  - Department of Medical Electronics, School of Biomedical Engineering, Air Force Medical University, Xi’an 710032, China
- Hao Lv
  - Department of Medical Electronics, School of Biomedical Engineering, Air Force Medical University, Xi’an 710032, China
- Shipeng Han
  - Department of Medical Electronics, School of Biomedical Engineering, Air Force Medical University, Xi’an 710032, China
4
Liang B, Peng F, Luo D, Zeng Q, Wen H, Zheng B, Zou Z, An L, Wen H, Wen X, Liao Y, Yuan Y, Li S. Automatic segmentation of 15 critical anatomical labels and measurements of cardiac axis and cardiothoracic ratio in fetal four chambers using nnU-NetV2. BMC Med Inform Decis Mak 2024; 24:128. [PMID: 38773456 PMCID: PMC11106923 DOI: 10.1186/s12911-024-02527-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Accepted: 05/02/2024] [Indexed: 05/23/2024] Open
Abstract
BACKGROUND Accurate segmentation of critical anatomical structures in fetal four-chamber view images is essential for the early detection of congenital heart defects. Current prenatal screening methods rely on manual measurements, which are time-consuming and prone to inter-observer variability. This study develops an AI-based model using the state-of-the-art nnU-NetV2 architecture for automatic segmentation and measurement of key anatomical structures in fetal four-chamber view images. METHODS A dataset consisting of 1,083 high-quality fetal four-chamber view images was annotated with 15 critical anatomical labels and divided into training/validation (867 images) and test (216 images) sets. An AI-based model using the nnU-NetV2 architecture was trained on the annotated images and evaluated using the mean Dice coefficient (mDice) and mean intersection over union (mIoU) metrics. The model's performance in automatically computing the cardiac axis (CAx) and cardiothoracic ratio (CTR) was compared with measurements from sonographers with varying levels of experience. RESULTS The AI-based model achieved an mDice of 87.11% and an mIoU of 77.68% for the segmentation of critical anatomical structures. The model's automated CAx and CTR measurements showed strong agreement with those of experienced sonographers, with respective intraclass correlation coefficients (ICCs) of 0.83 and 0.81. Bland-Altman analysis further confirmed the high agreement between the model and experienced sonographers. CONCLUSION We developed an AI-based model using the nnU-NetV2 architecture for accurate segmentation and automated measurement of critical anatomical structures in fetal four-chamber view images. Our model demonstrated high segmentation accuracy and strong agreement with experienced sonographers in computing clinically relevant parameters. This approach has the potential to improve the efficiency and reliability of prenatal cardiac screening, ultimately contributing to the early detection of congenital heart defects.
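The Bland-Altman analysis mentioned in this abstract summarizes agreement between two measurement methods by the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the paired differences. A minimal sketch follows. The function name `bland_altman` and the CTR values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical cardiothoracic ratios for five cases, for illustration only.
model_ctr = [0.28, 0.31, 0.25, 0.33, 0.30]
sonographer_ctr = [0.27, 0.32, 0.26, 0.31, 0.30]

bias, lower, upper = bland_altman(model_ctr, sonographer_ctr)
print(f"bias={bias:.4f}, limits of agreement=({lower:.4f}, {upper:.4f})")
```

A bias close to zero with narrow limits suggests that the two methods could be used interchangeably for that measurement. What counts as narrow enough is a clinical judgment rather than a statistical one.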
Affiliation(s)
- Bocheng Liang
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Fengfeng Peng
  - Department of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Dandan Luo
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Qing Zeng
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Huaxuan Wen
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Bowen Zheng
  - Department of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Zhiying Zou
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Liting An
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Huiying Wen
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Xin Wen
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Yimei Liao
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Ying Yuan
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
- Shengli Li
  - Department of Ultrasound, Shenzhen Maternity&Child Healthcare Hospital, Shenzhen, 518028, China
5
Xu X, Liu D, Huang G, Wang M, Lei M, Jia Y. Computer aided diagnosis of diabetic retinopathy based on multi-view joint learning. Comput Biol Med 2024; 174:108428. [PMID: 38631117 DOI: 10.1016/j.compbiomed.2024.108428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 04/02/2024] [Accepted: 04/04/2024] [Indexed: 04/19/2024]
Abstract
Diabetic retinopathy (DR) is an ocular complication of diabetes, and its severity grade is an essential basis for early diagnosis. Manual diagnosis is a long and expensive process that carries a risk of misdiagnosis. Computer-aided diagnosis can provide more accurate and practical treatment recommendations. In this paper, we propose a multi-view joint learning DR diagnostic model called RT2Net, which integrates the global features of fundus images and the local detailed features of vascular images to reduce the limitations of learning from a single fundus image. First, the original image is preprocessed using operations such as contrast-limited adaptive histogram equalization, and the vascular structure of the DR image is segmented. Then, the vascular image and the fundus image are fed into the two branch networks of RT2Net for feature extraction, and the feature fusion module adaptively fuses the feature vectors output by the branch networks. Finally, the optimized classification model is used to identify the five categories of DR. Extensive experiments on the public datasets EyePACS and APTOS 2019 demonstrate the method's effectiveness. The accuracy of RT2Net on the two datasets reaches 88.2% and 85.4%, and the area under the receiver operating characteristic curve (AUC) is 0.98 and 0.96, respectively. The strong classification ability of RT2Net for DR can help patients detect and treat lesions early and provide doctors with a more reliable basis for diagnosis, which has significant clinical value for diagnosing DR.
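The AUC values in this abstract measure how well predicted scores rank positive cases above negative ones. For a binary problem, AUC equals the fraction of positive-negative pairs that are ordered correctly, with ties counted as half. The sketch below implements that pairwise definition. The function name `roc_auc` and the scores are hypothetical, and for a five-class task such as DR grading a one-vs-rest or similar averaging scheme would be needed, which the abstract does not specify.

```python
def roc_auc(scores, labels):
    """Binary AUC as the probability that a positive outranks a negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # pair ranked correctly
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores, for illustration only.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC={auc:.4f}")
```

The pairwise form is quadratic in the number of samples, so it is suited to small checks like this one. Rank-based implementations give the same value more efficiently on large datasets.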
Affiliation(s)
- Xuebin Xu
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China
- Dehua Liu
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China
- Guohua Huang
  - Weinan Central Hospital, Xi'an 714099, Shaanxi, China
- Muyu Wang
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China
- Meng Lei
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China
- Yang Jia
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China