1. Vidyarthi A. Probabilistic hierarchical clustering based identification and segmentation of brain tumors in magnetic resonance imaging. BIOMED ENG-BIOMED TE 2024; 69:181-192. [PMID: 37871189] [DOI: 10.1515/bmt-2021-0313]
Abstract
The automatic segmentation of the abnormality region from head MRI is a challenging task in the medical science domain. The abnormality, in the form of a tumor, comprises the uncontrolled growth of cells. Automatic identification of the affected cells using computerized software systems has been in demand for the past several years as a way to provide a second opinion to radiologists. In this paper, a new machine-learning-based clustering approach is introduced that clusters the tumor region from the input MRI using disjoint tree generation followed by tree merging. The proposed algorithm is then improved by introducing the theory of joint probabilities and nearest neighbors, and is further automated to find the number of clusters required, with their nearest neighbors, to perform semantic segmentation of the tumor cells. The proposed algorithm provides good semantic segmentation results, with a DB index of 0.11 and a Dunn index of 13.18 on the SMS dataset, while experimentation with the BRATS 2015 dataset yields Dice complete = 80.5%, Dice core = 73.2%, and Dice enhanced = 62.8%. Comparative analysis of the proposed approach against benchmark models and algorithms demonstrates the model's significance and its applicability to semantic segmentation of tumor cells, with an average accuracy increment of around 2.5% over machine learning algorithms.
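The abstract evaluates clustering quality with the Davies-Bouldin (DB) and Dunn indices. A minimal numpy sketch of both metrics (illustrative only, not the paper's implementation; lower DB and higher Dunn indicate compact, well-separated clusters):

```python
import numpy as np

def davies_bouldin(clusters):
    """DB index: average over clusters of the worst-case
    (scatter_i + scatter_j) / centroid_distance_ij ratio. Lower is better."""
    centroids = [c.mean(axis=0) for c in clusters]
    scatters = [np.mean(np.linalg.norm(c - mu, axis=1))
                for c, mu in zip(clusters, centroids)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max((scatters[i] + scatters[j]) /
                     np.linalg.norm(centroids[i] - centroids[j])
                     for j in range(k) if j != i)
    return total / k

def dunn(clusters):
    """Dunn index: minimum inter-cluster distance divided by the
    maximum intra-cluster diameter. Higher is better."""
    def min_sep(a, b):
        return min(np.linalg.norm(p - q) for p in a for q in b)
    def diameter(c):
        return max(np.linalg.norm(p - q) for p in c for q in c)
    k = len(clusters)
    sep = min(min_sep(clusters[i], clusters[j])
              for i in range(k) for j in range(i + 1, k))
    diam = max(diameter(c) for c in clusters)
    return sep / diam
```

With two well-separated clusters, the DB index approaches zero and the Dunn index grows large, matching the intuition behind the reported DB of 0.11 and Dunn of 13.18.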
Affiliation(s)
- Ankit Vidyarthi
- Department of CSE & IT, Jaypee Institute of Technology, Noida, India
2. He S, Li Q, Li X, Zhang M. A Lightweight Convolutional Neural Network Based on Dynamic Level-Set Loss Function for Spine MR Image Segmentation. J Magn Reson Imaging 2024; 59:1438-1453. [PMID: 37382232] [DOI: 10.1002/jmri.28877]
Abstract
BACKGROUND Spine MR image segmentation is an important foundation for computer-aided diagnosis (CAD) algorithms for spine disorders. Convolutional neural networks segment effectively but require high computational costs. PURPOSE To design a lightweight model based on a dynamic level-set loss function that retains high segmentation performance. STUDY TYPE Retrospective. POPULATION Four hundred forty-eight subjects (3163 images) from two separate datasets. Dataset-1: 276 subjects/994 images (53.26% female, mean age 49.02 ± 14.09), all screened for disc degeneration; 188 had disc degeneration, 67 had herniated discs. Dataset-2: a public dataset with 172 subjects/2169 images; 142 patients with vertebral degeneration, 163 patients with disc degeneration. FIELD STRENGTH/SEQUENCE T2-weighted turbo spin echo sequences at 3T. ASSESSMENT Dynamic Level-set Net (DLS-Net) was compared with four mainstream models (including U-net++) and four lightweight models; manual labels made by five radiologists (vertebrae, discs, spinal fluid) served as the segmentation evaluation standard. Five-fold cross-validation was used for all experiments. Building on the segmentation, a CAD algorithm for the lumbar disc was designed to assess DLS-Net's practicality, with text annotations (normal, bulging, or herniated) from medical history data used as the evaluation standard. STATISTICAL TESTS All segmentation models were evaluated with DSC, accuracy, precision, and AUC. The pixel counts of segmented results were compared with manual labels using paired t-tests, with P < 0.05 indicating significance. The CAD algorithm was evaluated by the accuracy of lumbar disc diagnosis. RESULTS With only 1.48% of U-net++'s parameters, DLS-Net achieved similar accuracy on both datasets (Dataset-1: DSC 0.88 vs. 0.89, AUC 0.94 vs. 0.94; Dataset-2: DSC 0.86 vs. 0.86, AUC 0.93 vs. 0.93). DLS-Net's segmentation results showed no significant differences from manual labels in pixel counts for discs (Dataset-1: 1603.30 vs. 1588.77, P = 0.22; Dataset-2: 863.61 vs. 886.4, P = 0.14) or vertebrae (Dataset-1: 3984.28 vs. 3961.94, P = 0.38; Dataset-2: 4806.91 vs. 4732.85, P = 0.21). Based on DLS-Net's segmentation results, the CAD algorithm achieved higher accuracy than with non-cropped MR images (87.47% vs. 61.82%). DATA CONCLUSION The proposed DLS-Net has fewer parameters but achieves accuracy similar to U-net++ and helps the CAD algorithm achieve higher accuracy, facilitating wider application. EVIDENCE LEVEL 2 TECHNICAL EFFICACY Stage 1.
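Several entries in this list report Dice similarity coefficients (DSC). A minimal numpy sketch of the volumetric DSC between two binary masks (illustrative, not any paper's code):

```python
import numpy as np

def dice(pred, target, eps=1e-8):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|).
    Ranges from 0 (no overlap) to 1 (identical masks)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum() + eps)
```

Per-case DSC values (or pixel counts, as in this study) can then be fed into a paired t-test, e.g. `scipy.stats.ttest_rel`, to compare a model against manual labels.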
Affiliation(s)
- Siyuan He
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
- Qi Li
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
- Zhongshan Institute of Changchun University of Science and Technology, Zhongshan, China
- Xianda Li
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
- Mengchao Zhang
- Department of Radiology, China-Japan Union Hospital of Jilin University, Changchun, China
3. Zhang Y, Yang G, Gong C, Zhang J, Wang S, Wang Y. Polyp segmentation with interference filtering and dynamic uncertainty mining. Phys Med Biol 2024; 69:075016. [PMID: 38382099] [DOI: 10.1088/1361-6560/ad2b94]
Abstract
Objective. Accurate polyp segmentation from colonoscopy images plays a crucial role in the early diagnosis and treatment of colorectal cancer. However, existing polyp segmentation methods are inevitably affected by various image noises, such as reflections, motion blur, and feces, which significantly affect the performance and generalization of the model. Coupled with the ambiguous boundaries between polyps and surrounding tissue, i.e., small inter-class differences, accurate polyp segmentation remains a challenging problem. Approach. To address these issues, we propose a novel two-stage polyp segmentation method that leverages a preprocessing sub-network (Pre-Net) and a dynamic uncertainty mining network (DUMNet) to improve the accuracy of polyp segmentation. Pre-Net identifies and filters out interference regions before feeding the colonoscopy images to the polyp segmentation network DUMNet. To handle confusing polyp boundaries, DUMNet employs an uncertainty mining module (UMM) to dynamically focus on foreground, background, and uncertain regions based on different pixel confidences. UMM helps mine and enhance more detailed context, leading to coarse-to-fine polyp segmentation and precise localization of polyp regions. Main results. We conduct experiments on five popular polyp segmentation benchmarks: ETIS, CVC-ClinicDB, CVC-ColonDB, EndoScene, and Kvasir. Our method achieves state-of-the-art performance. Furthermore, the proposed Pre-Net has strong portability and can improve the accuracy of existing polyp segmentation models. Significance. The proposed method improves polyp segmentation performance by eliminating interference and mining uncertain regions, aiding doctors in making precise diagnoses and reducing the risk of colorectal cancer. Our code will be released at https://github.com/zyh5119232/DUMNet.
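The "dynamic focus on foreground, background, and uncertain regions based on pixel confidences" can be sketched as a simple partition of a probability map; the thresholds below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def partition_by_confidence(prob, lo=0.3, hi=0.7):
    """Split a per-pixel probability map into confident-foreground,
    confident-background, and uncertain masks (thresholds illustrative)."""
    prob = np.asarray(prob)
    fg = prob >= hi            # high confidence: polyp
    bg = prob <= lo            # high confidence: background
    uncertain = ~(fg | bg)     # ambiguous boundary pixels to mine further
    return fg, bg, uncertain
```

A module like UMM would then apply extra processing (e.g., attention or refinement) only to the uncertain mask.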
Affiliation(s)
- Yunhua Zhang
- Northeastern University, Shenyang 110819, People's Republic of China
- DUT Artificial Intelligence Institute, Dalian 116024, People's Republic of China
- Gang Yang
- Northeastern University, Shenyang 110819, People's Republic of China
- Congjin Gong
- Northeastern University, Shenyang 110819, People's Republic of China
- Jianhao Zhang
- Northeastern University, Shenyang 110819, People's Republic of China
- Shuo Wang
- Northeastern University, Shenyang 110819, People's Republic of China
- Yutao Wang
- Northeastern University, Shenyang 110819, People's Republic of China
4. Long J, Ren Y, Yang C, Ren P, Zeng Z. MDT: semi-supervised medical image segmentation with mixup-decoupling training. Phys Med Biol 2024; 69:065012. [PMID: 38324897] [DOI: 10.1088/1361-6560/ad2715]
Abstract
Objective. In the field of medicine, semi-supervised segmentation algorithms hold crucial research significance while also facing substantial challenges, primarily due to the extreme scarcity of expert-level annotated medical image data. Many existing semi-supervised methods still process labeled and unlabeled data in inconsistent ways, which can lead to knowledge learned from labeled data being partially discarded. This not only lacks the variety of perturbations needed to explore potentially robust information in unlabeled data but also ignores the confirmation bias and class imbalance issues of pseudo-labeling methods. Approach. To solve these problems, this paper proposes a semi-supervised medical image segmentation method, mixup-decoupling training (MDT), that combines the ideas of consistency and pseudo-labeling. First, MDT introduces a new perturbation strategy, mixup-decoupling, to fully regularize the training data. It not only mixes labeled and unlabeled data at the data level but also performs decoupling operations between the output predictions of the mixed target data and the labeled data at the feature level to obtain strong-version predictions for the unlabeled data, and then establishes a dual learning paradigm based on consistency and pseudo-labeling. Second, MDT employs a novel categorical entropy filtering approach to pick high-confidence pseudo-labels for unlabeled data, facilitating more refined supervision. Main results. This paper compares MDT with other advanced semi-supervised methods on 2D and 3D datasets separately. Extensive experimental results show that MDT achieves competitive segmentation performance and outperforms other state-of-the-art semi-supervised segmentation methods. Significance. The proposed MDT greatly reduces the demand for manually labeled data and eases the difficulty of data annotation. It not only outperforms many advanced semi-supervised image segmentation methods in quantitative and qualitative experiments but also offers a new, extensible direction for research in semi-supervised learning and computer-aided diagnosis.
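The data-level mixing step MDT builds on is standard mixup: a convex combination of a labeled and an unlabeled image with a Beta-distributed coefficient. A minimal sketch under that assumption (the paper's exact decoupling step is not reproduced here):

```python
import numpy as np

def mixup(x_labeled, x_unlabeled, alpha=0.5, rng=None):
    """Data-level mixup: convexly combine a labeled and an unlabeled image.
    Returns the mixed image and the coefficient lam ~ Beta(alpha, alpha)."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x_mix = lam * x_labeled + (1.0 - lam) * x_unlabeled
    return x_mix, lam
```

In a mixup-based semi-supervised loss, the same `lam` would weight the supervision signals attributed to each source image.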
Affiliation(s)
- Jianwu Long
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, People's Republic of China
- Yan Ren
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, People's Republic of China
- Chengxin Yang
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, People's Republic of China
- Pengcheng Ren
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, People's Republic of China
- Ziqin Zeng
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, People's Republic of China
5. Qu G, Lu B, Shi J, Wang Z, Yuan Y, Xia Y, Pan Z, Lin Y. Motion-artifact-augmented pseudo-label network for semi-supervised brain tumor segmentation. Phys Med Biol 2024; 69:055023. [PMID: 38406849] [DOI: 10.1088/1361-6560/ad2634]
Abstract
MRI image segmentation is widely used in clinical practice as a prerequisite and key step in diagnosing brain tumors. The quest for an accurate automated segmentation method for brain tumor images, aimed at easing clinical doctors' workload, has become a research focal point. Despite the success of fully supervised methods in brain tumor segmentation, challenges remain: the high cost of annotating medical images severely limits the datasets available for training fully supervised methods, and medical images are prone to noise and motion artifacts that degrade quality. In this work, we propose MAPSS, a motion-artifact-augmented pseudo-label network for semi-supervised segmentation. Our method combines motion-artifact data augmentation with a pseudo-label semi-supervised training framework. We conduct several experiments under different semi-supervised settings on the publicly available BraTS2020 brain tumor segmentation dataset. The experimental results show that MAPSS achieves accurate brain tumor segmentation with only a small amount of labeled data and remains robust on motion-artifact-influenced images. We also assess the generalization performance of MAPSS on the Left Atrium dataset. Our algorithm is of great significance for assisting doctors in formulating treatment plans and improving treatment quality.
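A common way to simulate MRI motion artifacts for augmentation, hedged here because the abstract does not specify MAPSS's exact scheme, is to apply random phase ramps (equivalent to spatial shifts) to a subset of k-space lines and invert the FFT:

```python
import numpy as np

def add_motion_artifact(image, line_fraction=0.1, max_shift=5.0, rng=None):
    """Simulate patient motion: randomly phase-shift a fraction of
    k-space rows (each phase ramp corresponds to a spatial translation),
    then reconstruct with the inverse FFT. Parameters are illustrative."""
    rng = rng or np.random.default_rng()
    k = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    n_lines = max(1, int(line_fraction * h))
    rows = rng.choice(h, size=n_lines, replace=False)
    freqs = np.fft.fftshift(np.fft.fftfreq(w))
    for r in rows:
        shift = rng.uniform(-max_shift, max_shift)
        k[r] *= np.exp(-2j * np.pi * freqs * shift)  # phase ramp = shift
    return np.abs(np.fft.ifft2(np.fft.ifftshift(k)))
```

Applied on the fly during training, such corruption teaches the network to stay robust on motion-degraded scans.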
Affiliation(s)
- Guangcan Qu
- School of the 1st Clinical Medical Sciences (School of Information and Engineering), Wenzhou Medical University, Wenzhou 325000, People's Republic of China
- Beichen Lu
- School of the 1st Clinical Medical Sciences (School of Information and Engineering), Wenzhou Medical University, Wenzhou 325000, People's Republic of China
- Jialin Shi
- School of the 1st Clinical Medical Sciences (School of Information and Engineering), Wenzhou Medical University, Wenzhou 325000, People's Republic of China
- Ziyi Wang
- School of the 1st Clinical Medical Sciences (School of Information and Engineering), Wenzhou Medical University, Wenzhou 325000, People's Republic of China
- Yaping Yuan
- School of the 1st Clinical Medical Sciences (School of Information and Engineering), Wenzhou Medical University, Wenzhou 325000, People's Republic of China
- Yifan Xia
- School of the 1st Clinical Medical Sciences (School of Information and Engineering), Wenzhou Medical University, Wenzhou 325000, People's Republic of China
- Zhifang Pan
- School of the 1st Clinical Medical Sciences (School of Information and Engineering), Wenzhou Medical University, Wenzhou 325000, People's Republic of China
- Yezhi Lin
- School of the 1st Clinical Medical Sciences (School of Information and Engineering), Wenzhou Medical University, Wenzhou 325000, People's Republic of China
6. Jiang L, Ma LY, Zeng TY, Ying SH. UFPS: A unified framework for partially annotated federated segmentation in heterogeneous data distribution. Patterns (N Y) 2024; 5:100917. [PMID: 38370123] [PMCID: PMC10873159] [DOI: 10.1016/j.patter.2024.100917]
Abstract
Partially supervised segmentation is a label-saving approach built on datasets in which only a fraction of the classes are labeled and the label sets intersect across datasets. Its practical application in real-world medical scenarios is, however, hindered by privacy concerns and data heterogeneity. To address these issues without compromising privacy, federated partially supervised segmentation (FPSS) is formulated in this work. The primary challenges for FPSS are class heterogeneity and client drift. We propose a unified federated partially labeled segmentation (UFPS) framework to segment pixels of all classes in partially annotated datasets by training a comprehensive global model that avoids class collision. Our framework includes unified label learning (ULL) and sparse unified sharpness-aware minimization (sUSAM) for class and feature space unification, respectively. Through empirical studies, we find that traditional methods for partially supervised segmentation and federated learning often struggle with class collision when combined. Our extensive experiments on real medical datasets demonstrate the better deconflicting and generalization capabilities of UFPS.
Affiliation(s)
- Le Jiang
- School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Li Yan Ma
- School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Tie Yong Zeng
- Department of Mathematics, Chinese University of Hong Kong, Hong Kong, China
- Shi Hui Ying
- Department of Mathematics, Shanghai University, Shanghai, China
7. Li G, Jin D, Yu Q, Zheng Y, Qi M. MultiIB-TransUNet: Transformer with multiple information bottleneck blocks for CT and ultrasound image segmentation. Med Phys 2024; 51:1178-1189. [PMID: 37528654] [DOI: 10.1002/mp.16662]
Abstract
BACKGROUND Accurate medical image segmentation is crucial for disease diagnosis and surgical planning. Transformer networks offer a promising alternative for medical image segmentation as they can learn global features through self-attention mechanisms. To further enhance performance, many researchers incorporate additional Transformer layers into their models; however, this often increases the number of model parameters significantly, raising complexity. Moreover, medical image segmentation datasets usually have few samples, which increases the risk of overfitting. PURPOSE This paper aims to design a medical image segmentation model that has fewer parameters and can effectively alleviate overfitting. METHODS We design a MultiIB-Transformer structure consisting of a single Transformer layer and multiple information bottleneck (IB) blocks. The Transformer layer captures long-distance spatial relationships to extract global feature information, while the IB blocks compress noise and improve model robustness. The advantage of this structure is that only one Transformer layer is needed to achieve state-of-the-art (SOTA) performance, significantly reducing the number of model parameters. In addition, we design a new skip-connection structure that needs only two 1×1 convolutions, allowing the high-resolution feature map to carry both semantic and spatial information and thereby alleviating the semantic gap. RESULTS On the Breast UltraSound Images (BUSI) dataset, the proposed model achieves IoU and F1 scores of 67.75 and 87.78. On the Synapse multi-organ segmentation dataset, its parameter count, Hausdorff Distance (HD), and Dice Similarity Coefficient (DSC) are 22.30, 20.04, and 81.83, respectively. CONCLUSIONS Our proposed model (MultiIB-TransUNet) achieves superior results with fewer parameters compared to other models.
Affiliation(s)
- Guangju Li
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Dehu Jin
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Qi Yu
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Yuanjie Zheng
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Meng Qi
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
8. Ma F, Li S, Wang S, Guo Y, Wu F, Meng J, Dai C. Deep-learning segmentation method for optical coherence tomography angiography in ophthalmology. J Biophotonics 2024; 17:e202300321. [PMID: 37801660] [DOI: 10.1002/jbio.202300321]
Abstract
PURPOSE The optic disc and the macula are two major anatomical structures in the human eye. The optic disc is associated with the optic nerve, while macular disease mainly involves degeneration and impaired function of the macular region. Reliable optic disc and macula segmentation is necessary for the automated screening of retinal diseases. METHODS A swept-source OCTA system was designed to capture OCTA images of human eyes. To address these segmentation tasks, we first constructed a new Optic Disc and Macula in fundus Image with optical coherence tomography angiography (OCTA) dataset (ODMI), and second, proposed a Coarse and Fine Attention-Based Network (CFANet). RESULTS The five metrics of our method on ODMI are 98.91%, 98.47%, 89.77%, 98.49%, and 89.77%, respectively. CONCLUSIONS Experimental results show that our CFANet achieves good segmentation performance for the optic disc and macula in OCTA.
Affiliation(s)
- Fei Ma
- School of Computer Science, Qufu Normal University, Shandong, China
- Sien Li
- School of Computer Science, Qufu Normal University, Shandong, China
- Shengbo Wang
- School of Computer Science, Qufu Normal University, Shandong, China
- Yanfei Guo
- School of Computer Science, Qufu Normal University, Shandong, China
- Fei Wu
- School of Automation, Nanjing University of Posts and Telecommunications, Jiangsu, China
- Jing Meng
- School of Computer Science, Qufu Normal University, Shandong, China
- Cuixia Dai
- College of Science, Shanghai Institute of Technology, Shanghai, China
9. Zhu C, Chai X, Xiao Y, Liu X, Zhang R, Yang Z, Wang Z. Swin-Net: A Swin-Transformer-Based Network Combing with Multi-Scale Features for Segmentation of Breast Tumor Ultrasound Images. Diagnostics (Basel) 2024; 14:269. [PMID: 38337784] [PMCID: PMC10854866] [DOI: 10.3390/diagnostics14030269]
Abstract
Breast cancer is one of the most common cancers in the world, especially among women. Breast tumor segmentation is a key step in the identification and localization of the breast tumor region, with important clinical significance. Inspired by the Swin Transformer model's powerful global modeling ability, we propose a semantic segmentation framework named Swin-Net for breast ultrasound images, which combines Transformers and convolutional neural networks (CNNs) to effectively improve the accuracy of breast ultrasound segmentation. First, our model uses a Swin Transformer encoder with stronger learning ability, which extracts image features more precisely. In addition, two new modules are introduced: a feature refinement and enhancement module (RLM) and a hierarchical multi-scale feature fusion module (HFM), since the effects of ultrasound acquisition methods and the characteristics of tumor lesions are difficult to capture. The RLM further refines and enhances the feature maps learned by the Transformer encoder, while the HFM processes multi-scale high-level semantic features and low-level details to achieve effective cross-layer feature fusion, suppress noise, and improve segmentation performance. Experimental results show that Swin-Net performs significantly better than the most advanced methods on two public benchmark datasets. In particular, it achieves an absolute improvement of 1.4-1.8% on Dice. Additionally, we provide a new dataset of breast ultrasound images on which we test our model, further demonstrating the validity of our method. In summary, the proposed Swin-Net framework makes significant advances in breast ultrasound image segmentation, providing valuable exploration for research and applications in this domain.
Affiliation(s)
- Chengzhang Zhu
- School of Humanities, Central South University, Changsha 410012, China
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Xian Chai
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yalong Xiao
- School of Humanities, Central South University, Changsha 410012, China
- Xu Liu
- Department of Medical Ultrasound, Hunan Cancer Hospital/The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410031, China
- Renmao Zhang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Zhangzheng Yang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Zhiyuan Wang
- Department of Medical Ultrasound, Hunan Cancer Hospital/The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410031, China
|
10
|
Wang K, Jin K, Cheng Z, Liu X, Wang C, Guan X, Xu X, Ye J, Wang W, Wang S. Multi-scale consistent self-training network for semi-supervised orbital tumor segmentation. Med Phys 2024. [PMID: 38277474 DOI: 10.1002/mp.16945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2022] [Revised: 11/20/2023] [Accepted: 12/10/2023] [Indexed: 01/28/2024] Open
Abstract
PURPOSE Segmentation of orbital tumors in CT images is of great significance for the diagnosis of orbital tumors, which are among the most prevalent diseases of the eye. However, the large variety of tumor sizes and shapes makes the segmentation task very challenging, especially when the available annotation data are limited. METHODS To this end, we propose a multi-scale consistent self-training network (MSCINet) for semi-supervised orbital tumor segmentation. Specifically, we exploit semantic-invariant features by enforcing consistency between the predictions of different scales of the same image, making the model more robust to size variation. Moreover, we incorporate a new self-training strategy that adopts iterative training with an uncertainty filtering mechanism to filter the pseudo-labels generated by the model, eliminating the accumulation of pseudo-label errors and increasing the model's generalization. RESULTS For evaluation, we built two datasets, an orbital tumor binary segmentation dataset (Orbtum-B) and an orbital multi-organ segmentation dataset (Orbtum-M), together comprising 55 patients and 602 2D images. Experimental results show that our proposed method achieves state-of-the-art performance on both datasets. CONCLUSION We develop a new semi-supervised segmentation method for orbital tumors, designed around the characteristics of orbital tumors, that exhibits excellent performance compared to previous semi-supervised algorithms.
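The multi-scale consistency idea, enforcing agreement between predictions of the same image at different scales, can be sketched as follows. The 2× average-pool downscaling and MSE penalty are assumptions for illustration, not MSCINet's exact formulation:

```python
import numpy as np

def downscale2x(p):
    """2x average pooling of a 2D probability map (H and W must be even)."""
    h, w = p.shape
    return p.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def consistency_loss(pred_full, pred_half):
    """MSE between the downscaled full-resolution prediction and the
    prediction produced from the half-resolution input. Minimizing this
    pushes the model toward scale-invariant (semantic-invariant) features."""
    return float(np.mean((downscale2x(pred_full) - pred_half) ** 2))
```

During training this term would be added to the supervised loss on labeled images and applied on its own to unlabeled ones.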
Affiliation(s)
- Keyi Wang
- School of Mechanical, Electrical and Information Engineering at Shandong University, Weihai, China
- Kai Jin
- Department of Ophthalmology, the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Zhiming Cheng
- School of Automation, Hangzhou Dianzi University, Hangzhou, China
- Xindi Liu
- Department of Ophthalmology, the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Changjun Wang
- Department of Ophthalmology, the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Xiaojun Guan
- Department of Radiology, the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Xiaojun Xu
- Department of Radiology, the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Juan Ye
- Department of Ophthalmology, the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Wenyu Wang
- School of Mechanical, Electrical and Information Engineering at Shandong University, Weihai, China
- Shuai Wang
- School of Mechanical, Electrical and Information Engineering at Shandong University, Weihai, China
- Suzhou Research Institute of Shandong University, Suzhou, China
11. Xiao H, Li L, Liu Q, Zhang Q, Liu J, Liu Z. Context-aware and local-aware fusion with transformer for medical image segmentation. Phys Med Biol 2024; 69:025011. [PMID: 38086076] [DOI: 10.1088/1361-6560/ad14c6]
Abstract
Objective. Convolutional neural networks (CNNs) have made significant progress in medical image segmentation tasks. However, for complex segmentation tasks, CNNs lack the ability to establish long-distance relationships, resulting in poor segmentation performance. The characteristics of intra-class diversity and inter-class similarity in images increase the difficulty of segmentation, and some focus areas exhibit a scattered distribution, making segmentation even more challenging. Approach. This work therefore proposes a new Transformer model, FTransConv, to address the issues of inter-class similarity, intra-class diversity, and scattered distribution in medical image segmentation tasks. To achieve this, three Transformer-CNN modules were designed to extract global and local information, and a full-scale squeeze-excitation module was proposed in the decoder using the idea of full-scale connections. Main results. Without any pre-training, this work verified the effectiveness of FTransConv on three public COVID-19 CT datasets and MoNuSeg. Experiments show that FTransConv, with only 26.98M parameters, outperformed other state-of-the-art models such as Swin-Unet, TransAttUnet, UCTransNet, LeViT-UNet, TransUNet, UTNet, and SAUNet++, achieving the best segmentation performance with a DSC of 83.22% on the COVID-19 datasets and 79.47% on MoNuSeg. Significance. This work demonstrates that our method provides a promising solution for regions with high inter-class similarity, intra-class diversity, and scattered distribution in image segmentation.
Affiliation(s)
- Hanguang Xiao
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Li Li
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Qiyuan Liu
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Qihang Zhang
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Junqi Liu
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Zhi Liu
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
12
Morton Colbert Z, Arrington D, Foote M, Gårding J, Fay D, Huo M, Pinkham M, Ramachandran P. Repurposing traditional U-Net predictions for sparse SAM prompting in medical image segmentation. Biomed Phys Eng Express 2024; 10:025004. [PMID: 38118182 DOI: 10.1088/2057-1976/ad17a7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2023] [Accepted: 12/20/2023] [Indexed: 12/22/2023]
Abstract
Objective: Automated medical image segmentation (MIS) using deep learning has traditionally relied on models built and trained from scratch, or at least fine-tuned on a target dataset. The Segment Anything Model (SAM) by Meta challenges this paradigm by providing zero-shot generalisation capabilities. This study aims to develop and compare methods for refining traditional U-Net segmentations by repurposing them for automated SAM prompting. Approach: A 2D U-Net with EfficientNet-B4 encoder was trained using 4-fold cross-validation on an in-house brain metastases dataset. Segmentation predictions from each validation set were used for automatic sparse prompt generation via a bounding box prompting method (BBPM) and novel implementations of the point prompting method (PPM). The PPMs frequently produced poor slice predictions (PSPs) that required identification and substitution. A slice was identified as a PSP if it (1) contained multiple predicted regions per lesion or (2) possessed outlier foreground pixel counts relative to the patient's other slices. Each PSP was substituted with a corresponding initial U-Net or SAM BBPM prediction. The patients' mean volumetric dice similarity coefficient (DSC) was used to evaluate and compare the methods' performances. Main results: Relative to the initial U-Net segmentations, the BBPM improved mean patient DSC by 3.93 ± 1.48% to 0.847 ± 0.008 DSC. PSPs constituted 20.01-21.63% of PPMs' predictions and without substitution performance dropped by 82.94 ± 3.17% to 0.139 ± 0.023 DSC. Pairing the two PSP identification techniques yielded a sensitivity to PSPs of 92.95 ± 1.20%. By combining this approach with BBPM prediction substitution, the PPMs achieved segmentation accuracies on par with the BBPM, improving mean patient DSC by up to 4.17 ± 1.40% and reaching 0.849 ± 0.007 DSC. Significance: The proposed PSP identification and substitution techniques bridge the gap between PPM and BBPM performance for MIS. Additionally, the uniformity observed in our experiments' results demonstrates the robustness of SAM to variations in prompting style. These findings can assist in the design of both automatically and manually prompted pipelines.
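The volumetric Dice similarity coefficient (DSC) reported above is a standard overlap metric between a predicted and a reference mask. As a minimal illustrative sketch (not the authors' implementation), treating each binary mask as a set of foreground voxel coordinates:

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks given as
    collections of foreground voxel coordinates."""
    pred, target = set(pred), set(target)
    if not pred and not target:
        return 1.0  # two empty masks agree perfectly
    # DSC = 2|A ∩ B| / (|A| + |B|)
    return 2 * len(pred & target) / (len(pred) + len(target))

# Toy example: 4 predicted voxels, 4 reference voxels, 2 shared
pred = [(1, 1), (1, 2), (2, 1), (2, 2)]
target = [(1, 2), (1, 3), (2, 2), (2, 3)]
score = dice_coefficient(pred, target)  # 2*2 / (4+4) = 0.5
```

A patient-level mean volumetric DSC, as used in this study, would then average this score over patients.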
Affiliation(s)
- Dominik Fay
- Elekta Instrument AB, Sweden
- KTH Royal Institute of Technology, Sweden
- Michael Huo
- Princess Alexandra Hospital, Brisbane, Australia
- Mark Pinkham
- Princess Alexandra Hospital, Brisbane, Australia
13
Wang B, Yang J, Zhou Y, Yang Y, Tian X, Zhang G, Zhang X. LEACS: a learnable and efficient active contour model with space-frequency pooling for medical image segmentation. Phys Med Biol 2024; 69:015026. [PMID: 38048633 DOI: 10.1088/1361-6560/ad1212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2023] [Accepted: 12/04/2023] [Indexed: 12/06/2023]
Abstract
Diseases can be diagnosed and monitored by extracting regions of interest (ROIs) from medical images. However, accurate and efficient delineation and segmentation of ROIs in medical images remain challenging due to unrefined boundaries, inhomogeneous intensity and limited image acquisition. To overcome these problems, we propose an end-to-end learnable and efficient active contour segmentation model, which integrates a global convex segmentation (GCS) module into a lightweight encoder-decoder convolutional segmentation network with a multiscale attention module (ED-MSA). The GCS automatically obtains the initialization and corresponding parameters of the curve deformation according to the prediction map generated by the ED-MSA, while providing the refined object boundary prediction for ED-MSA optimization. To provide a precise and reliable initial contour for the GCS, we design space-frequency pooling layers in the encoder stage of ED-MSA, which can effectively reduce the number of iterations of the GCS. Besides, we construct ED-MSA using depth-wise separable convolutional residual modules to mitigate overfitting of the model. The effectiveness of our method is validated on four challenging medical image datasets. Code is available at https://github.com/Yang-fashion/ED-MSA_GCS.
Affiliation(s)
- Bing Wang
- College of Mathematics and Information Science, Hebei University, Baoding, 071000, Hebei, People's Republic of China
- Hebei Key Laboratory of Machine Learning and Computational Intelligence, Hebei University, Baoding, 071000, Hebei, People's Republic of China
- Jie Yang
- College of Mathematics and Information Science, Hebei University, Baoding, 071000, Hebei, People's Republic of China
- Yunlai Zhou
- College of Mathematics and Information Science, Hebei University, Baoding, 071000, Hebei, People's Republic of China
- Ying Yang
- Hebei University Affiliated Hospital, Baoding, 071000, Hebei, People's Republic of China
- Xuedong Tian
- College of Cyber Security and Computer, Hebei University, Baoding, 071000, Hebei, People's Republic of China
- Guochun Zhang
- Hebei Key Laboratory of Machine Learning and Computational Intelligence, Hebei University, Baoding, 071000, Hebei, People's Republic of China
- Xin Zhang
- College of Electronic Information Engineering, Hebei University, Baoding, 071000, Hebei, People's Republic of China
14
Xi H, Dong H, Sheng Y, Cui H, Huang C, Li J, Zhu J. MSCT-UNET: multi-scale contrastive transformer within U-shaped network for medical image segmentation. Phys Med Biol 2023; 69:015022. [PMID: 38061069 DOI: 10.1088/1361-6560/ad135d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2023] [Accepted: 12/07/2023] [Indexed: 12/30/2023]
Abstract
Objective. Automatic multi-organ segmentation from anatomical images is essential in disease diagnosis and treatment planning. The U-shaped neural network with encoder-decoder has achieved great success in various segmentation tasks. However, a pure convolutional neural network (CNN) is not suitable for modeling long-range relations due to limited receptive fields, and a pure transformer is not good at capturing pixel-level features. Approach. We propose a new hybrid network named MSCT-UNET which fuses CNN features with transformer features at multiple scales and introduces multi-task contrastive learning to improve the segmentation performance. Specifically, the multi-scale low-level features extracted from the CNN are further encoded through several transformers to build hierarchical global contexts. Then the cross fusion block fuses the low-level and high-level features in different directions. The deep-fused features flow back to the CNN and transformer branches for the next scale of fusion. We introduce multi-task contrastive learning, including self-supervised global contrastive learning and supervised local contrastive learning, into MSCT-UNET. We also strengthen the decoder by using a transformer to better restore the segmentation map. Results. Evaluation results on the ACDC, Synapse and BraTS datasets demonstrate improved performance over the other methods compared. Ablation study results prove the effectiveness of our major innovations. Significance. The hybrid encoder of MSCT-UNET can capture multi-scale long-range dependencies and fine-grained detail features at the same time. The cross fusion block can fuse these features deeply. The multi-task contrastive learning of MSCT-UNET can strengthen the representation ability of the encoder and jointly optimize the networks. The source code is publicly available at https://github.com/msctunet/MSCT_UNET.git.
Affiliation(s)
- Heran Xi
- School of Electronic Engineering, Heilongjiang University, Harbin, 150001, People's Republic of China
- Haoji Dong
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
- Yue Sheng
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
- Hui Cui
- Department of Computer Science and Information Technology, La Trobe University, Melbourne, 3000, Australia
- Chengying Huang
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
- Jinbao Li
- Qilu University of Technology (Shandong Academy of Science), Shandong Artificial Intelligence Institute, Jinan, 250014, People's Republic of China
- Jinghua Zhu
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
15
Ding W, Li Z. Curriculum Consistency Learning and Multi-Scale Contrastive Constraint in Semi-Supervised Medical Image Segmentation. Bioengineering (Basel) 2023; 11:10. [PMID: 38247886 PMCID: PMC10812906 DOI: 10.3390/bioengineering11010010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2023] [Revised: 07/24/2023] [Accepted: 07/27/2023] [Indexed: 01/23/2024] Open
Abstract
Data scarcity poses a significant challenge in medical image segmentation, highlighting the importance of leveraging sparse annotation data. To address this issue, semi-supervised learning has emerged as an effective approach for training neural networks with limited labeled data. In this study, we introduced a curriculum consistency constraint for semi-supervised medical image segmentation, drawing inspiration from the human learning process. By dynamically comparing patch features with full-image features, we enhanced the network's ability to learn. Unlike existing methods, our approach adapts the patch size to simulate a human curriculum, progressing from easy to hard tasks. This adjustment guides the model toward better convergence optima and generalization. Furthermore, we employed multi-scale contrastive learning to enhance the feature representations. Our method capitalizes on features extracted from multiple layers to explore additional semantic information and point-wise representations. To evaluate the effectiveness of the proposed approach, we conducted experiments on the Kvasir-SEG polyp dataset and the ISIC 2018 skin lesion dataset. The experimental results demonstrated that our method surpassed state-of-the-art semi-supervised methods, achieving a 9.2% increase in mean intersection over union (mIoU) on the Kvasir-SEG dataset. This improvement substantiates the efficacy of the proposed curriculum consistency constraint and multi-scale contrastive loss.
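The mean intersection over union (mIoU) reported here is the per-class IoU averaged over classes. As a minimal illustrative sketch (generic metric, not the authors' evaluation code), with masks represented as sets of foreground pixel coordinates:

```python
def iou(pred, target):
    """Intersection over union for a single class, with masks given as
    collections of foreground pixel coordinates."""
    pred, target = set(pred), set(target)
    union = pred | target
    return len(pred & target) / len(union) if union else 1.0

def mean_iou(per_class_pairs):
    """Mean IoU over (prediction, reference) mask pairs, one pair per class."""
    scores = [iou(p, t) for p, t in per_class_pairs]
    return sum(scores) / len(scores)

# Toy example with two classes
pairs = [
    ([(0, 0), (0, 1)], [(0, 1), (1, 1)]),  # IoU = 1/3
    ([(2, 2)], [(2, 2)]),                  # IoU = 1
]
m = mean_iou(pairs)  # (1/3 + 1) / 2 ≈ 0.667
```

Reported gains such as "+9.2% mIoU" compare this averaged score between methods on the same test set.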
Affiliation(s)
- Zhen Li
- Department of Computer and Information Engineering, School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen 518000, China
16
Li H, Ding J, Shi X, Zhang Q, Yu P, Li H. D-SAT: dual semantic aggregation transformer with dual attention for medical image segmentation. Phys Med Biol 2023; 69:015013. [PMID: 37607559 DOI: 10.1088/1361-6560/acf2e5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2023] [Accepted: 08/22/2023] [Indexed: 08/24/2023]
Abstract
Objective. Medical image segmentation is essential to assist clinicians in making quick and accurate diagnoses. However, most existing methods are still challenged by the loss of semantic information, blurred boundaries and the large semantic gap between the encoder and decoder. Approach. To tackle these issues, a dual semantic aggregation transformer with dual attention is proposed for medical image segmentation. Firstly, the dual-semantic feature aggregation module is designed to build a bridge between the convolutional neural network (CNN) and the Transformer, effectively aggregating the CNN's local feature detail ability and the Transformer's long-range modeling ability to mitigate semantic information loss. Thereafter, a strip spatial attention mechanism is put forward to alleviate blurred boundaries during encoding by constructing pixel-level feature relations across CSWin Transformer blocks from different spatial dimensions. Finally, a feature distribution gated attention module is constructed in the skip connection between the encoder and decoder to decrease the large semantic gap by filtering out the noise in low-level semantic information when fusing low-level and high-level semantic features during decoding. Main results. Comprehensive experiments conducted on abdominal multi-organ segmentation, cardiac diagnosis, polyp segmentation and skin lesion segmentation validate the generalization and effectiveness of the proposed dual semantic aggregation transformer with dual attention (D-SAT). The superiority of D-SAT over current state-of-the-art methods is substantiated by both subjective and objective evaluations, revealing its remarkable performance in terms of segmentation accuracy and quality. Significance. The proposed method subtly preserves local feature details and global context information in medical image segmentation, providing valuable support to improve diagnostic efficiency for clinicians and early disease control for patients. Code is available at https://github.com/Dxkm/D-SAT.
Affiliation(s)
- Haiyan Li
- School of Information Science and Engineering, Yunnan University, Kunming 650504, People's Republic of China
- Jiayu Ding
- School of Information Science and Engineering, Yunnan University, Kunming 650504, People's Republic of China
- Xin Shi
- Department of Urology Surgery, The Second Hospital Affiliated to the Medical University of Kunming, Kunming, People's Republic of China
- Faculty of Life Science and Technology, Kunming University of Science and Technology, Kunming, People's Republic of China
- Qi Zhang
- School of Environmental and Chemical Engineering, Kunming Metallurgy College, Kunming, People's Republic of China
- Pengfei Yu
- School of Information Science and Engineering, Yunnan University, Kunming 650504, People's Republic of China
- Hongsong Li
- School of Information Science and Engineering, Yunnan University, Kunming 650504, People's Republic of China
17
Gao C, Cheng J, Yang Z, Chen Y, Zhu M. SCA-Former: transformer-like network based on stream-cross attention for medical image segmentation. Phys Med Biol 2023; 68:245008. [PMID: 37802056 DOI: 10.1088/1361-6560/ad00fe] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2023] [Accepted: 10/06/2023] [Indexed: 10/08/2023]
Abstract
Objective. Deep convolutional neural networks (CNNs) have been widely applied in medical image analysis and have achieved satisfactory performance. While most CNN-based methods exhibit strong feature representation capabilities, they face challenges in encoding long-range interaction information due to their limited receptive fields. Recently, the Transformer has been proposed to alleviate this issue, but at the cost of a greatly enlarged model size, which may hinder its adoption. Approach. To account for strong long-range interaction modeling and a small model size simultaneously, we propose a Transformer-like block-based U-shaped network for medical image segmentation, dubbed SCA-Former. Furthermore, we propose a novel stream-cross attention (SCA) module that encourages the network to balance local and global representations by extracting multi-scale and interactive features along the spatial and channel dimensions. SCA can effectively extract channel, multi-scale spatial, and long-range information for a more comprehensive feature representation. Main results. Experimental results demonstrate that SCA-Former outperforms current state-of-the-art (SOTA) methods on three public datasets: GLAS, ISIC 2017 and LUNG. Significance. This work exhibits a promising way to enhance the feature representation of convolutional neural networks and improve segmentation performance.
Affiliation(s)
- Chengrui Gao
- School of Computer Science, Sichuan University, Chengdu, People's Republic of China
- Vision Computing Lab, Sichuan University, Chengdu, People's Republic of China
- Junlong Cheng
- School of Computer Science, Sichuan University, Chengdu, People's Republic of China
- Vision Computing Lab, Sichuan University, Chengdu, People's Republic of China
- Ziyuan Yang
- School of Computer Science, Sichuan University, Chengdu, People's Republic of China
- Yingyu Chen
- School of Computer Science, Sichuan University, Chengdu, People's Republic of China
- Min Zhu
- School of Computer Science, Sichuan University, Chengdu, People's Republic of China
- Vision Computing Lab, Sichuan University, Chengdu, People's Republic of China
18
Li J, Ye J, Zhang R, Wu Y, Berhane GS, Deng H, Shi H. CPFTransformer: transformer fusion context pyramid medical image segmentation network. Front Neurosci 2023; 17:1288366. [PMID: 38130692 PMCID: PMC10733526 DOI: 10.3389/fnins.2023.1288366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2023] [Accepted: 11/22/2023] [Indexed: 12/23/2023] Open
Abstract
Introduction The application of U-shaped convolutional neural network (CNN) methods to medical image segmentation tasks has yielded impressive results. However, this structure's single-level context extraction can lead to problems such as boundary blurring, so it needs to be improved. Additionally, the inherent locality of the convolution operation restricts its ability to capture global and long-distance semantic interactions effectively. Conversely, the transformer model excels at capturing global information. Methods Given these considerations, this paper presents a transformer fusion context pyramid medical image segmentation network (CPFTransformer). The CPFTransformer utilizes the Swin Transformer to integrate edge perception for segmentation edges. To effectively fuse global and multi-scale context information, we introduce an Edge-Aware module based on a context pyramid, which specifically emphasizes local features such as edges and corners. Our approach employs a layered Swin Transformer with a shifted window mechanism as an encoder to extract contextual features. A decoder based on a symmetric Swin Transformer is employed for upsampling operations, thereby restoring the resolution of the feature maps. The encoder and decoder are connected by an Edge-Aware module for the extraction of local features such as edges and corners. Results Experimental evaluations demonstrate the effectiveness of our method, yielding a DSC of 79.87% and an HD of 20.83 on the Synapse multi-organ segmentation task, along with results on the ACDC dataset. Discussion The proposed method, which combines the context pyramid mechanism and the Transformer, enables fast and accurate automatic segmentation of medical images, thereby significantly enhancing the precision and reliability of medical diagnosis. Furthermore, the approach presented in this study can potentially be extended to image segmentation of other organs in the future.
Affiliation(s)
- Jiao Li
- College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan, China
- Jinyu Ye
- College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan, China
- Ruixin Zhang
- College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan, China
- Yue Wu
- College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan, China
- Hongxia Deng
- College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan, China
- Hong Shi
- School of Artificial Intelligence, Shenzhen Polytechnic, Shenzhen, China
19
Jiang X, Zhu Y, Liu Y, Wang N, Yi L. MC-DC: An MLP-CNN Based Dual-path Complementary Network for Medical Image Segmentation. Comput Methods Programs Biomed 2023; 242:107846. [PMID: 37806121 DOI: 10.1016/j.cmpb.2023.107846] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/27/2023] [Revised: 10/03/2023] [Accepted: 10/04/2023] [Indexed: 10/10/2023]
Abstract
BACKGROUND Fusing the CNN and Transformer in the encoder has recently achieved outstanding performance in medical image segmentation. However, two obvious limitations require addressing: (1) the Transformer introduces heavy parameters, and its intricate structure demands ample data and resources for training; and (2) most previous research has predominantly focused on enhancing the performance of the feature encoder, with little emphasis placed on the design of the feature decoder. METHODS To this end, we propose a novel MLP-CNN based dual-path complementary (MC-DC) network for medical image segmentation, which replaces the complex Transformer with a cost-effective multi-layer perceptron (MLP). Specifically, a dual-path complementary (DPC) module is designed to effectively fuse multi-level features from the MLP and CNN. To reconstruct global and local information respectively, a dual-path decoder is proposed, mainly composed of a cross-scale global feature fusion (CS-GF) module and a cross-scale local feature fusion (CS-LF) module. Moreover, we leverage a simple and efficient segmentation mask feature fusion (SMFF) module to merge the segmentation outcomes generated by the dual-path decoder. RESULTS Comprehensive experiments were performed on three typical medical image segmentation tasks. For skin lesion segmentation, our MC-DC network achieved 91.69% Dice and 9.52 mm ASSD on the ISIC2018 dataset. In addition, Dice scores of 91.6% and 94.4% were obtained on the Kvasir-SEG and CVC-ClinicDB datasets, respectively, for polyp segmentation. Moreover, we also conducted experiments on the private COVID-DS36 dataset for lung lesion segmentation, where MC-DC achieved 87.6% [87.1%, 88.1%] and 92.3% [91.8%, 92.7%] on ground-glass opacity, interstitial infiltration, and lung consolidation, respectively. CONCLUSIONS The experimental results indicate that the proposed MC-DC network exhibits exceptional generalization capability and surpasses other state-of-the-art methods with higher accuracy and lower computational complexity.
Affiliation(s)
- Xiaoben Jiang
- School of Information Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Yu Zhu
- School of Information Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Yatong Liu
- School of Information Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Nan Wang
- School of Information Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Lei Yi
- Department of Burn, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
20
He S, Li Q, Li X, Zhang M. LSW-Net: Lightweight Deep Neural Network Based on Small-World properties for Spine MR Image Segmentation. J Magn Reson Imaging 2023; 58:1762-1776. [PMID: 37118994 DOI: 10.1002/jmri.28735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2023] [Revised: 03/30/2023] [Accepted: 03/30/2023] [Indexed: 04/30/2023] Open
Abstract
BACKGROUND Segmenting spinal tissues from MR images is important for automatic image analysis. Deep neural network-based segmentation methods are efficient, yet have high computational costs. PURPOSE To design a lightweight model based on small-world properties (LSW-Net) to segment spinal MR images, suitable for low-computing-power embedded devices. STUDY TYPE Retrospective. POPULATION A total of 386 subjects (2948 images) from two independent sources. Dataset I: 214 subjects/779 images, all for disk degeneration screening; 147 had disk degeneration, 52 had a herniated disc. Dataset II: 172 subjects/2169 images; 142 patients with vertebral degeneration, 163 patients with disc degeneration. In each dataset, 70% of images were used for training, 20% for validation, and 10% for testing. FIELD STRENGTH/SEQUENCE T1- and T2-weighted turbo spin echo sequences at 3 T. ASSESSMENT Segmentation performance of LSW-Net was compared with four mainstream models (including U-net and U-net++) and five lightweight models, using five radiologists' manual segmentations (vertebrae, disks, spinal fluid) as the reference standard. LSW-Net was also deployed on an NVIDIA Jetson Nano to compare the number of pixels in segmented vertebrae and disks. STATISTICAL TESTS All models were evaluated with accuracy, precision, Dice similarity coefficient (DSC), and area under the receiver operating characteristic curve (AUC). Pixel counts segmented by LSW-Net on the embedded device were compared with manual segmentation using paired t-tests, with P < 0.05 indicating significance. RESULTS LSW-Net had 98.5% fewer parameters than U-net but achieved similar accuracy in both datasets (dataset I: DSC 0.84 vs. 0.87, AUC 0.92 vs. 0.94; dataset II: DSC 0.82 vs. 0.82, AUC 0.88 vs. 0.88). On the embedded device, LSW-Net showed no significant differences from manual segmentation in pixel counts for vertebrae (dataset I: 5893.49 vs. 5752.61, P = 0.21; dataset II: 5073.42 vs. 5137.12, P = 0.56) or disks (dataset I: 1513.07 vs. 1535.69, P = 0.42; dataset II: 1049.74 vs. 1087.88, P = 0.24). DATA CONCLUSION The proposed LSW-Net achieves high accuracy with fewer parameters than U-net and can be deployed on an embedded device, facilitating wider application. EVIDENCE LEVEL 2. TECHNICAL EFFICACY Stage 1.
Affiliation(s)
- Siyuan He
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
- Qi Li
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
- Zhongshan Institute of Changchun University of Science and Technology, Zhongshan, China
- Xianda Li
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
- Mengchao Zhang
- Department of Radiology, China-Japan Union Hospital of Jilin University, Changchun, China
21
Hu J, Yu C, Yi Z, Zhang H. Enhancing Robustness of Medical Image Segmentation Model with Neural Memory Ordinary Differential Equation. Int J Neural Syst 2023; 33:2350060. [PMID: 37743765 DOI: 10.1142/s0129065723500600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/26/2023]
Abstract
Deep neural networks (DNNs) have emerged as a prominent model in medical image segmentation, achieving remarkable advancements in clinical practice. Despite the promising results reported in the literature, the effectiveness of DNNs necessitates substantial quantities of high-quality annotated training data. During experiments, we observe a significant decline in the performance of DNNs on the test set when the labels of the training dataset are disrupted, revealing inherent limitations in the robustness of DNNs. In this paper, we find that the neural memory ordinary differential equation (nmODE), a recently proposed model based on ordinary differential equations (ODEs), not only addresses this robustness limitation but also enhances performance when trained on a clean training dataset. However, ODE-based models tend to be less computationally efficient than conventional discrete models due to the multiple function evaluations required by the ODE solver. Recognizing this efficiency limitation, we propose a novel approach called nmODE-based knowledge distillation (nmODE-KD). The proposed method aims to transfer knowledge from the continuous nmODE to a discrete layer, simultaneously enhancing the model's robustness and efficiency. The core concept of nmODE-KD is to force the discrete layer to mimic the continuous nmODE by minimizing the KL divergence between them. Experimental results on 18 organ-at-risk segmentation tasks demonstrate that nmODE-KD exhibits improved robustness compared to ODE-based models while also mitigating the efficiency limitation.
Affiliation(s)
- Junjie Hu
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
- Chengrong Yu
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
- Zhang Yi
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
- Haixian Zhang
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610065, P. R. China
22
Wang D, Wang Z, Chen L, Xiao H, Yang B. Cross-Parallel Transformer: Parallel ViT for Medical Image Segmentation. Sensors (Basel) 2023; 23:9488. [PMID: 38067861 PMCID: PMC10708613 DOI: 10.3390/s23239488] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/31/2023] [Revised: 11/22/2023] [Accepted: 11/23/2023] [Indexed: 01/24/2024]
Abstract
Medical image segmentation primarily utilizes a hybrid model consisting of a Convolutional Neural Network and sequential Transformers. The latter leverage multi-head self-attention mechanisms to achieve comprehensive global context modelling. However, despite their success in semantic segmentation, the feature extraction process is inefficient and demands more computational resources, which hinders the network's robustness. To address this issue, this study presents two innovative methods: PTransUNet (PT model) and C-PTransUNet (C-PT model). The C-PT module refines the Vision Transformer by substituting a sequential design with a parallel one. This boosts the feature extraction capabilities of Multi-Head Self-Attention via self-correlated feature attention and channel feature interaction, while also streamlining the Feed-Forward Network to lower computational demands. On the Synapse public dataset, the PT and C-PT models demonstrate improvements in DSC accuracy by 0.87% and 3.25%, respectively, in comparison with the baseline model. As for the parameter count and FLOPs, the PT model aligns with the baseline model. In contrast, the C-PT model shows a decrease in parameter count by 29% and FLOPs by 21.4% relative to the baseline model. The proposed segmentation models in this study exhibit benefits in both accuracy and efficiency.
Affiliation(s)
- Bo Yang
- College of Engineering and Design, Hunan Normal University, Changsha 410081, China; (D.W.); (Z.W.); (L.C.); (H.X.)
23
Wang Y, Wang J, Zhou W, Liu Z, Yang C. MAUNext: a lightweight segmentation network for medical images. Phys Med Biol 2023; 68:235003. [PMID: 37931318 DOI: 10.1088/1361-6560/ad0a1f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2023] [Accepted: 11/06/2023] [Indexed: 11/08/2023]
Abstract
Objective. The primary objective of this study is to enhance medical image segmentation techniques for clinical research by prioritizing accuracy and the number of parameters. Approach. To achieve this objective, a novel codec-based MAUNext approach is devised, focusing on lightweight backbone design and the integration of skip connections utilizing multiscale features, attention mechanisms, and other strategic components. The approach is composed of three core modules: a multi-scale attentional convolution module for improved accuracy and parameter reduction, a collaborative neighbourhood-attention MLP encoding module to enhance segmentation performance, and a tiny skip-connected cross-layer semantic fusion module to bridge the semantic gap between the encoder and decoder. Main results. The study extensively evaluates the MAUNext approach alongside eight state-of-the-art methods on three renowned datasets: Kagglelung, ISIC, and Brain. The experimental outcomes robustly demonstrate that the proposed approach surpasses other methods in terms of both parameter count and accuracy. This achievement holds promise for effectively addressing medical image segmentation tasks. Significance. Automated medical image segmentation, particularly in organ and lesion identification, plays a pivotal role in clinical diagnosis and treatment. Manual segmentation is resource-intensive, so automated methods are highly valuable. The study underscores the clinical significance of automated segmentation by providing an advanced solution through the innovative MAUNext approach, which offers substantial improvements in accuracy and efficiency that can significantly aid clinical decision-making and patient treatment.
Affiliation(s)
- Yuhang Wang
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, People's Republic of China
- Jihong Wang
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, People's Republic of China
- Wen Zhou
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, People's Republic of China
- Zijie Liu
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, People's Republic of China
- Chen Yang
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, People's Republic of China
24
Leng P, Xu Z, Zhu Z, Pan Z. Blend U-Net: Redesigning Skip Connections to Obtain Multiscale Features for Lung CT Images Segmentation. Curr Med Imaging 2023; 20:CMIR-EPUB-135937. [PMID: 37936446 DOI: 10.2174/0115734056268487231029154123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2023] [Revised: 08/16/2023] [Accepted: 09/27/2023] [Indexed: 11/09/2023]
Abstract
BACKGROUND Lung cancer is a pervasive and persistent issue worldwide, with the highest morbidity and mortality among all cancers for many years. In the medical field, computed tomography (CT) images of the lungs are currently recognized as the best way to help doctors detect lung nodules and thus diagnose lung cancer. U-Net is a deep learning network with an encoder-decoder structure that is extensively employed for medical image segmentation and from which many improved versions have been derived. However, these advancements do not exploit feature information from all scales, and there is still room for enhancement. METHODS In this study, we proposed a new model called Blend U-Net, which incorporates nested structures, redesigned long and short skip connections, and deep supervision. The nested structures and the long and short skip connections combine characteristic information of different levels from feature maps at all scales, while deep supervision learns hierarchical representations from all-scale concatenated feature maps. Additionally, we employed a mixed loss function to obtain more accurate results. RESULTS We evaluated the performance of Blend U-Net against other architectures on the publicly available Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) dataset, and the accuracy of the segmentation was verified using the Dice coefficient. Blend U-Net produced the best outcome among a number of baselines, with a boost of 0.83 points. CONCLUSION Based on these results, our method achieves superior performance in terms of the Dice coefficient compared with other methods and demonstrates greater proficiency in segmenting lung nodules of varying sizes.
Affiliation(s)
- Pengfei Leng
- School of Public Health, Hangzhou Normal University, Hangzhou, China
- Institute of VR and Intelligent System, Hangzhou Normal University, Hangzhou, China
- Zhifei Xu
- School of Public Health, Hangzhou Normal University, Hangzhou, China
- Institute of VR and Intelligent System, Hangzhou Normal University, Hangzhou, China
- Zhaohui Zhu
- School of Public Health, Hangzhou Normal University, Hangzhou, China
- Institute of VR and Intelligent System, Hangzhou Normal University, Hangzhou, China
- Zhigeng Pan
- School of Artificial Intelligence (School of Future Technology), Nanjing University of Information Science and Technology, Nanjing, China
- Institute of VR and Intelligent System, Hangzhou Normal University, Hangzhou, China
25
Wang X, Li X, Du R, Zhong Y, Lu Y, Song T. Anatomical Prior-Based Automatic Segmentation for Cardiac Substructures from Computed Tomography Images. Bioengineering (Basel) 2023; 10:1267. [PMID: 38002391 PMCID: PMC10669053 DOI: 10.3390/bioengineering10111267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2023] [Revised: 10/12/2023] [Accepted: 10/24/2023] [Indexed: 11/26/2023] Open
Abstract
Cardiac substructure segmentation is a prerequisite for cardiac diagnosis and treatment, providing a basis for accurate calculation, modeling, and analysis of the entire cardiac structure. CT (computed tomography) imaging can be used for a noninvasive qualitative and quantitative evaluation of the cardiac anatomy and function. Cardiac substructures have diverse grayscales, fuzzy boundaries, irregular shapes, and variable locations. We designed a deep learning-based framework to improve the accuracy of the automatic segmentation of cardiac substructures. This framework integrates cardiac anatomical knowledge; it uses prior knowledge of the location, shape, and scale of cardiac substructures and separately processes the structures of different scales. Through two successive segmentation steps with a coarse-to-fine cascaded network, the more easily segmented substructures were coarsely segmented first; then, the more difficult substructures were finely segmented. The coarse segmentation result was used as prior information and combined with the original image as the input for the model. Anatomical knowledge of the large-scale substructures was embedded into the fine segmentation network to guide and train the small-scale substructures, achieving efficient and accurate segmentation of ten cardiac substructures. Sixty cardiac CT images and ten substructures manually delineated by experienced radiologists were retrospectively collected; the model was evaluated using the DSC (Dice similarity coefficient), Recall, Precision, and the Hausdorff distance. Compared with current mainstream segmentation models, our approach demonstrated significantly higher segmentation accuracy, with accurate segmentation of ten substructures of different shapes and sizes, indicating that the segmentation framework fused with prior anatomical knowledge has superior segmentation performance and can better segment small targets in multi-target segmentation tasks.
Grants
- Grant 12126610, Grant 81971691, Grant 81801809, Grant 81830052, Grant 81827802, Grant U1811461, Grant 201804020053, Grant 2018B030312002, Grant 20190302108GX, Grant 18DZ2260400, Grant 2020B1212060032, and Grant 2021B0101190003. Yao Lu
Affiliation(s)
- Xuefang Wang
- Shien-Ming Wu School of Intelligent Engineering, South China University of Technology, Guangzhou 511400, China;
- Xinyi Li
- Department of Radiology, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou 510150, China;
- Ruxu Du
- Guangzhou Janus Biotechnology Co., Ltd., Guangzhou 511400, China;
- Yong Zhong
- Shien-Ming Wu School of Intelligent Engineering, South China University of Technology, Guangzhou 511400, China;
- Yao Lu
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510275, China
- Guangdong Province Key Laboratory of Computational Science, Sun Yat-sen University, Guangzhou 510275, China
- State Key Laboratory of Oncology in South China, Guangzhou 510060, China
- Ting Song
- Department of Radiology, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou 510150, China;
26
Kalejahi BK, Meshgini S, Danishvar S. Segmentation of Brain Tumor Using a 3D Generative Adversarial Network. Diagnostics (Basel) 2023; 13:3344. [PMID: 37958240 PMCID: PMC10649332 DOI: 10.3390/diagnostics13213344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2023] [Revised: 10/15/2023] [Accepted: 10/16/2023] [Indexed: 11/15/2023] Open
Abstract
Images of brain tumors may only show up in a small subset of scans, so important details may be missed. Further, because labeling is typically a labor-intensive and time-consuming task, only a small number of medical imaging datasets are available for analysis. The focus of this research is on MRI images of the human brain, and we propose a method for the accurate segmentation of these images to identify the correct location of tumors. In this study, a GAN is utilized as a classification network to detect and segment tumors in 3D MRI images. The 3D GAN network model provides dense connectivity, followed by rapid network convergence and improved information extraction. Mutual training in a generative adversarial network can bring the segmentation results closer to the labeled data to improve image segmentation. The BraTS 2021 dataset of 3D images was used to compare two experimental models.
Affiliation(s)
- Behnam Kiani Kalejahi
- Department of Biomedical Engineering, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 385Q+246, Iran;
- Saeed Meshgini
- Department of Biomedical Engineering, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 385Q+246, Iran;
- Sebelan Danishvar
- Department of Electronic and Computer Engineering, Brunel University, London UB8 3PH, UK
27
Cao R, Ning L, Zhou C, Wei P, Ding Y, Tan D, Zheng C. CFANet: Context Feature Fusion and Attention Mechanism Based Network for Small Target Segmentation in Medical Images. Sensors (Basel) 2023; 23:8739. [PMID: 37960438 PMCID: PMC10650041 DOI: 10.3390/s23218739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/20/2023] [Revised: 10/21/2023] [Accepted: 10/24/2023] [Indexed: 11/15/2023]
Abstract
Medical image segmentation plays a crucial role in clinical diagnosis, treatment planning, and disease monitoring. Automatic segmentation methods based on deep learning have developed rapidly, with segmentation results comparable to those of clinical experts for large objects, but segmentation accuracy for small objects is still unsatisfactory. Current deep learning-based segmentation methods find it difficult to extract multi-scale features from medical images, leading to insufficient detection capability for smaller objects. In this paper, we propose a context feature fusion and attention mechanism based network for small target segmentation in medical images, called CFANet. CFANet is based on the U-Net structure, comprising an encoder and a decoder, and incorporates two key modules, context feature fusion (CFF) and effective channel spatial attention (ECSA), to improve segmentation performance. The CFF module utilizes contextual information from different scales to enhance the representation of small targets. By fusing multi-scale features, the network captures local and global contextual cues, which are critical for accurate segmentation. The ECSA module further enhances the network's ability to capture long-range dependencies by incorporating attention mechanisms at the spatial and channel levels, which allows the network to focus on information-rich regions while suppressing irrelevant or noisy features. Extensive experiments are conducted on four challenging medical image datasets, namely ADAM, LUNA16, Thoracic OAR, and WORD. Experimental results show that CFANet outperforms state-of-the-art methods in terms of segmentation accuracy and robustness. The proposed method achieves excellent performance in segmenting small targets in medical images, demonstrating its potential in various clinical applications.
Affiliation(s)
- Ruifen Cao
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, School of Computer Science and Technology, Anhui University, Hefei 230601, China; (R.C.); (L.N.)
- Long Ning
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, School of Computer Science and Technology, Anhui University, Hefei 230601, China; (R.C.); (L.N.)
- Chao Zhou
- Institute of Energy, Hefei Comprehensive National Science Center, Hefei 230031, China;
- Pijing Wei
- Institutes of Physical Science and Information Technology, Anhui University, Hefei 230601, China;
- Yun Ding
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei 230601, China;
- Dayu Tan
- Institutes of Physical Science and Information Technology, Anhui University, Hefei 230601, China;
- Chunhou Zheng
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei 230601, China;
28
Zou L, Cai Z, Qiu Y, Gui L, Mao L, Yang X. CTG-Net: an efficient cascaded framework driven by terminal guidance mechanism for dilated pancreatic duct segmentation. Phys Med Biol 2023; 68:215006. [PMID: 37586389 DOI: 10.1088/1361-6560/acf110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2023] [Accepted: 08/16/2023] [Indexed: 08/18/2023]
Abstract
Pancreatic duct dilation indicates a high risk of various pancreatic diseases. Segmentation of the dilated pancreatic duct (DPD) on computed tomography (CT) images shows potential to assist early diagnosis, surgical planning, and prognosis. Because of the DPD's tiny size, slender tubular structure, and surrounding distractions, most current studies on DPD segmentation achieve low accuracy and often produce segmentation errors in the terminal DPD regions. To address these problems, we propose a cascaded terminal guidance network to efficiently improve DPD segmentation performance. Firstly, a basic cascaded segmentation architecture is established to obtain the pancreas and a coarse DPD segmentation, and a DPD graph structure is built on the coarse DPD segmentation to locate the terminal DPD regions. Then, a terminal anatomy attention module is introduced for jointly learning the local intensity from the CT images, feature cues from the coarse DPD segmentation, and global anatomy information from the designed pancreas anatomy-aware maps. Finally, a terminal distraction attention module, which explicitly learns the distribution of the terminal distraction regions, is proposed to reduce false positive and false negative predictions. We also propose a new metric called tDice to measure terminal segmentation accuracy for targets with tubular structures, along with two other metrics for segmentation error evaluation. We collected a dilated pancreatic duct segmentation dataset of 150 CT scans from patients with five types of pancreatic tumors. Experimental results on our dataset show that the proposed approach boosts DPD segmentation accuracy by nearly 20% compared with existing results, and achieves more than a 9% improvement in terminal segmentation accuracy compared with state-of-the-art methods.
Affiliation(s)
- Liwen Zou
- Department of Mathematics, Nanjing University, Nanjing, 210093, People's Republic of China
- Zhenghua Cai
- Medical School, Nanjing University, Nanjing, 210007, People's Republic of China
- Yudong Qiu
- Department of General Surgery, Nanjing Drum Tower Hospital, Nanjing, 210008, People's Republic of China
- Luying Gui
- School of Mathematics and Statistics, Nanjing University of Science and Technology, Nanjing, 210094, People's Republic of China
- Liang Mao
- Department of General Surgery, Nanjing Drum Tower Hospital, Nanjing, 210008, People's Republic of China
- Xiaoping Yang
- Department of Mathematics, Nanjing University, Nanjing, 210093, People's Republic of China
29
Masse‐Gignac N, Flórez‐Jiménez S, Mac‐Thiong J, Duong L. Attention-gated U-Net networks for simultaneous axial/sagittal planes segmentation of injured spinal cords. J Appl Clin Med Phys 2023; 24:e14123. [PMID: 37735825 PMCID: PMC10562020 DOI: 10.1002/acm2.14123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2023] [Revised: 07/13/2023] [Accepted: 07/14/2023] [Indexed: 09/23/2023] Open
Abstract
Magnetic resonance imaging is currently the gold standard for the evaluation of spinal cord injuries. Automatic analysis of these injuries is however challenging, as MRI resolutions vary for different planes of analysis and physiological features are often distorted around these injuries. This study proposes a new CNN-based segmentation method in which information is exchanged between two networks analyzing the scans from different planes. Our aim was to develop a robust method for automatic segmentation of the spinal cord in patients having suffered traumatic injuries. The database consisted of 106 sagittal MRI scans from 94 patients with traumatic spinal cord injuries. Our method used an innovative approach where the scans were analyzed in series under the axial and sagittal plane by two different convolutional networks. The results were compared with those of Deepseg 2D from the Spinal Cord Toolbox (SCT), which was taken as state-of-the-art. Comparisons were evaluated using K-Fold cross-validation combined with statistical t-test results on separate test data. Our method achieved significantly better results than Deepseg 2D, with an average Dice coefficient of 0.95 against 0.88 for Deepseg 2D (p <0.001). Other metrics were also used to compare the segmentations, all of which showed significantly better results for our approach. In this study, we introduce a robust method for spinal cord segmentation which is capable of adequately segmenting spinal cords affected by traumatic injuries, improving upon the methods contained in SCT.
Affiliation(s)
- Nicolas Masse‐Gignac
- Department of Software and IT Engineering, École de technologie supérieure, Montréal, Canada
- Department of Orthopedic Surgery, Hopital Sacré‐Coeur, Montréal, Canada
- Salomón Flórez‐Jiménez
- Department of Software and IT Engineering, École de technologie supérieure, Montréal, Canada
- Department of Orthopedic Surgery, Hopital Sacré‐Coeur, Montréal, Canada
- Jean‐Marc Mac‐Thiong
- Department of Software and IT Engineering, École de technologie supérieure, Montréal, Canada
- Department of Orthopedic Surgery, Hopital Sacré‐Coeur, Montréal, Canada
- Luc Duong
- Department of Software and IT Engineering, École de technologie supérieure, Montréal, Canada
- Department of Orthopedic Surgery, Hopital Sacré‐Coeur, Montréal, Canada
30
Xu X, Deng HH, Gateno J, Yan P. Federated Multi-Organ Segmentation With Inconsistent Labels. IEEE Trans Med Imaging 2023; 42:2948-2960. [PMID: 37097793 PMCID: PMC10592562 DOI: 10.1109/tmi.2023.3270140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/21/2023]
Abstract
Federated learning is an emerging paradigm allowing large-scale decentralized learning without sharing data across different data owners, which helps address the concern of data privacy in medical image analysis. However, the requirement for label consistency across clients by the existing methods largely narrows its application scope. In practice, each clinical site may only annotate certain organs of interest with partial or no overlap with other sites. Incorporating such partially labeled data into a unified federation is an unexplored problem with clinical significance and urgency. This work tackles the challenge by using a novel federated multi-encoding U-Net (Fed-MENU) method for multi-organ segmentation. In our method, a multi-encoding U-Net (MENU-Net) is proposed to extract organ-specific features through different encoding sub-networks. Each sub-network can be seen as an expert of a specific organ and trained for that client. Moreover, to encourage the organ-specific features extracted by different sub-networks to be informative and distinctive, we regularize the training of the MENU-Net by designing an auxiliary generic decoder (AGD). Extensive experiments on six public abdominal CT datasets show that our Fed-MENU method can effectively obtain a federated learning model using the partially labeled datasets with superior performance to other models trained by either localized or centralized learning methods. Source code is publicly available at https://github.com/DIAL-RPI/Fed-MENU.
31
Szentimrey Z, Ameri G, Hong CX, Cheung RYK, Ukwatta E, Eltahawi A. Automated segmentation and measurement of the female pelvic floor from the mid-sagittal plane of 3D ultrasound volumes. Med Phys 2023; 50:6215-6227. [PMID: 36964964 DOI: 10.1002/mp.16389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2022] [Revised: 03/17/2023] [Accepted: 03/17/2023] [Indexed: 03/27/2023] Open
Abstract
BACKGROUND Transperineal ultrasound (TPUS) is a valuable imaging tool for evaluating patients with pelvic floor disorders, including pelvic organ prolapse (POP). Currently, measurements of anatomical structures in the mid-sagittal plane of 2D and 3D US volumes are obtained manually, which is time-consuming, has high intra-rater variability, and requires an expert in pelvic floor US interpretation. Manual segmentation and biometric measurement can take 15 min per 2D mid-sagittal image by an expert operator. An automated segmentation method would provide quantitative data relevant to pelvic floor disorders and improve the efficiency and reproducibility of segmentation-based biometric methods. PURPOSE Develop a fast, reproducible, and automated method of acquiring biometric measurements and organ segmentations from the mid-sagittal plane of female 3D TPUS volumes. METHODS Our method used a nnU-Net segmentation model to segment the pubis symphysis, urethra, bladder, rectum, rectal ampulla, and anorectal angle in the mid-sagittal plane of female 3D TPUS volumes. We developed an algorithm to extract relevant biometrics from the segmentations. Our dataset included 248 3D TPUS volumes, 126/122 rest/Valsalva split, from 135 patients. System performance was assessed by comparing the automated results with manual ground truth data using the Dice similarity coefficient (DSC) and average absolute difference (AD). Intra-class correlation coefficient (ICC) and time difference were used to compare reproducibility and efficiency between manual and automated methods respectively. High ICC, low AD and reduction in time indicated an accurate and reliable automated system, making TPUS an efficient alternative for POP assessment. Paired t-test and non-parametric Wilcoxon signed-rank test were conducted, with p < 0.05 determining significance. 
RESULTS The nnU-Net segmentation model reported average DSC and p values (in brackets), compared to the next best tested model, of 87.4% (<0.0001), 68.5% (<0.0001), 61.0% (0.1), 54.6% (0.04), 49.2% (<0.0001) and 33.7% (0.02) for bladder, rectum, urethra, pubic symphysis, anorectal angle, and rectal ampulla respectively. The average ADs for the bladder neck position, bladder descent, rectal ampulla descent and retrovesical angle were 3.2 mm, 4.5 mm, 5.3 mm and 27.3°, respectively. The biometric algorithm had an ICC > 0.80 for the bladder neck position, bladder descent and rectal ampulla descent when compared to manual measurements, indicating high reproducibility. The proposed algorithms required approximately 1.27 s to analyze one image. The manual ground truths were performed by a single expert operator. In addition, due to high operator dependency for TPUS image collection, we would need to pursue further studies with images collected from multiple operators. CONCLUSIONS Based on our search in scientific databases (i.e., Web of Science, IEEE Xplore Digital Library, Elsevier ScienceDirect and PubMed), this is the first reported work of an automated segmentation and biometric measurement system for the mid-sagittal plane of 3D TPUS volumes. The proposed algorithm pipeline can improve the efficiency (1.27 s compared to 15 min manually) and has high reproducibility (high ICC values) compared to manual TPUS analysis for pelvic floor disorder diagnosis. Further studies are needed to verify this system's viability using multiple TPUS operators and multiple experts for performing manual segmentation and extracting biometrics from the images.
Affiliation(s)
- Christopher X Hong
- Department of Obstetrics & Gynaecology, University of Michigan, Ann Arbor, Michigan, USA
- Rachel Y K Cheung
- Department of Obstetrics & Gynaecology, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong
- Eranga Ukwatta
- School of Engineering, University of Guelph, Guelph, Ontario, Canada
- Ahmed Eltahawi
- Cosm Medical, Toronto, Ontario, Canada
- Information System Department, Faculty of Computers and Informatics, Suez Canal University, Ismailia, Egypt
32
Shen L, Wang Q, Zhang Y, Qin F, Jin H, Zhao W. DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation. Medicine (Baltimore) 2023; 102:e35328. [PMID: 37773842 PMCID: PMC10545043 DOI: 10.1097/md.0000000000035328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/06/2023] [Accepted: 08/31/2023] [Indexed: 10/01/2023] Open
Abstract
U-Net has attained immense popularity owing to its performance in medical image segmentation. However, it cannot be modeled explicitly over remote dependencies. By contrast, the transformer can effectively capture remote dependencies by leveraging the self-attention (SA) of the encoder. Although SA, an important characteristic of the transformer, can find correlations between them based on the original data, secondary computational complexity might retard the processing rate of high-dimensional data (such as medical images). Furthermore, SA is limited because the correlation between samples is overlooked; thus, there is considerable scope for improvement. To this end, based on Swin-UNet, we introduce a dynamic selective attention mechanism for the convolution kernels. The weight of each convolution kernel is calculated to fuse the results dynamically. This attention mechanism permits each neuron to adaptively modify its receptive field size in response to multiscale input information. A local cross-channel interaction strategy without dimensionality reduction was introduced, which effectively eliminated the influence of downscaling on learning channel attention. Through suitable cross-channel interactions, model complexity can be significantly reduced while maintaining its performance. Subsequently, the global interaction between the encoder features is used to extract more fine-grained features. Simultaneously, the mixed loss function of the weighted cross-entropy loss and Dice loss is used to alleviate category imbalances and achieve better results when the sample number is unbalanced. We evaluated our proposed method on abdominal multiorgan segmentation and cardiac segmentation datasets, achieving Dice similarity coefficient and 95% Hausdorff distance metrics of 80.30 and 14.55%, respectively, on the Synapse dataset and Dice similarity coefficient metrics of 90.80 on the ACDC dataset. 
The experimental results show that our proposed method has good generalization ability and robustness, and it is a powerful tool for medical image segmentation.
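The mixed loss described above (weighted cross-entropy plus Dice) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the class weights, the mixing coefficient `lam`, and the toy masks are assumptions made here for demonstration.

```python
import numpy as np

def soft_dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss for one class: 1 - 2|P∩T| / (|P| + |T|)."""
    inter = np.sum(prob * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(prob) + np.sum(target) + eps)

def weighted_ce_loss(prob, target, w_pos=2.0, w_neg=1.0, eps=1e-6):
    """Pixel-wise cross-entropy with a larger weight on the rare (positive) class."""
    p = np.clip(prob, eps, 1.0 - eps)
    ce = -(w_pos * target * np.log(p) + w_neg * (1 - target) * np.log(1 - p))
    return float(np.mean(ce))

def mixed_loss(prob, target, lam=0.5):
    """Convex combination of weighted cross-entropy and Dice loss."""
    return lam * weighted_ce_loss(prob, target) + (1 - lam) * soft_dice_loss(prob, target)

# Toy 4x4 foreground mask with an imbalanced class ratio (4 of 16 pixels positive).
target = np.zeros((4, 4)); target[1:3, 1:3] = 1.0
good = np.where(target == 1, 0.9, 0.1)   # confident, mostly correct prediction
bad = np.full((4, 4), 0.5)               # uninformative prediction
```

The Dice term rewards overlap regardless of class frequency, while the weighted cross-entropy term lets rare classes contribute more per pixel, which is why such combinations are popular for imbalanced segmentation.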
Affiliation(s)
- Longfeng Shen
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, China
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Anhui Big-Data Research Center on University Management, Huaibei, China
- Qiong Wang
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, China
- Anhui Big-Data Research Center on University Management, Huaibei, China
- Yingjie Zhang
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, China
- Anhui Big-Data Research Center on University Management, Huaibei, China
- Fenglan Qin
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, China
- Anhui Big-Data Research Center on University Management, Huaibei, China
- Hengjun Jin
- People’s Hospital of Huaibei City, Huaibei, China
- Wei Zhao
- People’s Hospital of Huaibei City, Huaibei, China
33
Ma J, Yuan G, Guo C, Gang X, Zheng M. SW-UNet: a U-Net fusing sliding window transformer block with CNN for segmentation of lung nodules. Front Med (Lausanne) 2023; 10:1273441. [PMID: 37841008 PMCID: PMC10569032 DOI: 10.3389/fmed.2023.1273441] [Received: 08/06/2023] [Accepted: 09/12/2023] [Indexed: 10/17/2023] Open
Abstract
Medical images are information carriers that visually reflect and record the anatomical structure of the human body, and they play an important role in clinical diagnosis, teaching, and research. Modern medicine has become increasingly inseparable from the intelligent processing of medical images. In recent years, there have been growing efforts to apply deep learning theory to medical image segmentation tasks, and it is imperative to explore simple and efficient deep learning algorithms for this purpose. In this paper, we investigate the segmentation of lung nodule images. We address the above-mentioned problems of medical image segmentation algorithms by studying a medical image fusion algorithm based on a hybrid channel-space attention mechanism and a segmentation algorithm with a hybrid architecture of convolutional neural networks (CNN) and vision transformers. To address the difficulty that medical image segmentation algorithms have in capturing long-range feature dependencies, this paper proposes SW-UNet, a medical image segmentation model based on a hybrid CNN and Vision Transformer (ViT) framework. The self-attention mechanism and sliding-window design of the ViT are used to capture global feature associations and break the receptive-field limitation that convolutional operations incur through their inductive bias. At the same time, a widened self-attention vector is used to streamline the number of modules and compress the model size to suit the small amount of medical data available, which would otherwise make the model prone to overfitting. Experiments on the LUNA16 lung nodule image dataset validate the algorithm and show that the proposed network can achieve efficient medical image segmentation at a lightweight scale. In addition, to validate the transferability of the model, we performed additional validation on other tumor datasets with desirable results.
Our research addresses the crucial need for improved medical image segmentation algorithms. By introducing the SW-UNet model, which combines CNN and ViT, we successfully capture long-range feature dependencies and break the receptive-field limitations of traditional convolutional operations. This approach not only enhances the efficiency of medical image segmentation but also maintains model scalability and adaptability to small medical datasets. The positive outcomes on various tumor datasets emphasize the transferability and broad applicability of our proposed model in the field of medical image analysis.
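The window-based attention idea above restricts self-attention to local windows, reducing the cost from quadratic in the total number of tokens to quadratic only in the (small, fixed) window size. A minimal NumPy sketch of plain, non-shifted, single-head windowed attention follows; it is an illustration of the general technique, not the authors' SW-UNet code, and the window size and feature dimensions are arbitrary assumptions.

```python
import numpy as np

def window_partition(x, win):
    """Split an (H, W, C) feature map into non-overlapping (win*win, C) windows."""
    H, W, C = x.shape
    assert H % win == 0 and W % win == 0
    x = x.reshape(H // win, win, W // win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, C)

def window_self_attention(windows):
    """Plain single-head self-attention applied independently inside each window."""
    out = np.empty_like(windows)
    for i, w in enumerate(windows):                 # w: (N, C) tokens of one window
        scores = w @ w.T / np.sqrt(w.shape[-1])     # scaled dot-product logits
        scores -= scores.max(axis=-1, keepdims=True)
        attn = np.exp(scores)
        attn /= attn.sum(axis=-1, keepdims=True)    # softmax over window tokens
        out[i] = attn @ w
    return out

x = np.random.default_rng(0).normal(size=(8, 8, 4))
wins = window_partition(x, win=4)                   # 4 windows of 16 tokens each
y = window_self_attention(wins)
```

Shifting the windows between successive blocks (as in Swin-style designs) then lets information propagate across window boundaries.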
Affiliation(s)
- Jiajun Ma
- Shenhua Hollysys Information Technology Co., Ltd., Beijing, China
- Gang Yuan
- The First Affiliated Hospital of Dalian Medical University, Dalian, China
- Chenhua Guo
- School of Software, North University of China, Taiyuan, China
- Minting Zheng
- The First Affiliated Hospital of Dalian Medical University, Dalian, China
34
Xing C, Dong H, Xi H, Ma J, Zhu J. Multi-task contrastive learning for semi-supervised medical image segmentation with multi-scale uncertainty estimation. Phys Med Biol 2023; 68:185006. [PMID: 37586383 DOI: 10.1088/1361-6560/acf10f] [Received: 05/09/2023] [Accepted: 08/16/2023] [Indexed: 08/18/2023]
Abstract
Objective. Automated medical image segmentation is vital for the prevention and treatment of disease. However, medical data commonly exhibit class imbalance in practical applications, which may lead to unclear boundaries of specific classes and make it difficult to effectively segment certain tail classes in the results of semi-supervised medical image segmentation. Approach. We propose a novel multi-task contrastive learning framework for semi-supervised medical image segmentation with multi-scale uncertainty estimation. Specifically, the framework includes a student-teacher model. We introduce global image-level contrastive learning in the encoder to address the class imbalance and local pixel-level contrastive learning in the decoder to achieve intra-class aggregation and inter-class separation. Furthermore, we propose a multi-scale uncertainty-aware consistency loss to reduce noise caused by pseudo-label bias. Main results. Experiments on three public datasets (ACDC, LA, and LiTS) show that our method achieves higher segmentation performance than state-of-the-art semi-supervised segmentation methods. Significance. The multi-task contrastive learning in our method mitigates the negative impact of class imbalance and achieves better classification results. The multi-scale uncertainty estimation encourages consistent predictions for the same input under different perturbations, motivating the teacher model to generate high-quality pseudo-labels. Code is available at https://github.com/msctransu/MCSSMU.git.
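The core of an uncertainty-aware consistency loss like the one described above can be sketched at a single scale: average the teacher's predictions over several perturbed forward passes, use predictive entropy as the uncertainty, and penalize student-teacher disagreement only at low-uncertainty pixels. This is a hedged, single-scale NumPy sketch under those assumptions; the paper's multi-scale formulation and threshold schedule are not reproduced.

```python
import numpy as np

def entropy(p, eps=1e-8):
    """Per-pixel predictive entropy of a class-probability map (..., C)."""
    return -np.sum(p * np.log(p + eps), axis=-1)

def uncertainty_weighted_consistency(student, teacher_samples, thresh):
    """Mean-squared consistency between the student prediction and the teacher's
    prediction averaged over perturbed forward passes, keeping only pixels whose
    predictive entropy (uncertainty) falls below `thresh`."""
    teacher_mean = teacher_samples.mean(axis=0)          # average over perturbations
    certain = (entropy(teacher_mean) < thresh).astype(float)
    sq_err = np.sum((student - teacher_mean) ** 2, axis=-1)
    return float(np.sum(certain * sq_err) / (np.sum(certain) + 1e-8))

# Teacher is confident and stable under three perturbations -> all pixels kept.
teacher = np.tile(np.array([0.99, 0.01]), (3, 2, 2, 1))  # (runs, H, W, C)
loss = uncertainty_weighted_consistency(teacher[0], teacher, thresh=0.3)
```

Masking by certainty keeps noisy pseudo-labels from dominating the consistency term early in training.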
Affiliation(s)
- Chengcheng Xing
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
- Haoji Dong
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
- Heran Xi
- School of Electronic Engineering, Heilongjiang University, Harbin, 150001, People's Republic of China
- Jiquan Ma
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
- Jinghua Zhu
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
35
Lee HH, Tang Y, Yang Q, Yu X, Cai LY, Remedios LW, Bao S, Landman BA, Huo Y. Semantic-Aware Contrastive Learning for Multi-Object Medical Image Segmentation. IEEE J Biomed Health Inform 2023; 27:4444-4453. [PMID: 37310834 PMCID: PMC10524443 DOI: 10.1109/jbhi.2023.3285230] [Indexed: 06/15/2023]
Abstract
Medical image segmentation, or computing voxel-wise semantic masks, is a fundamental yet challenging task in the medical imaging domain. To increase the ability of encoder-decoder neural networks to perform this task across large clinical cohorts, contrastive learning provides an opportunity to stabilize model initialization and enhance downstream task performance without ground-truth voxel-wise labels. However, multiple target objects with different semantic meanings and contrast levels may exist in a single image, which poses a problem for adapting traditional contrastive learning methods from the prevalent "image-level classification" to "pixel-level segmentation". In this article, we propose a simple semantic-aware contrastive learning approach that leverages attention masks and image-wise labels to advance multi-object semantic segmentation. Briefly, we embed different semantic objects into different clusters rather than using the traditional image-level embeddings. We evaluate our proposed method on a multi-organ medical image segmentation task with both in-house data and the MICCAI Challenge 2015 BTCV dataset. Compared with current state-of-the-art training strategies, our proposed pipeline yields a substantial improvement of 5.53% and 6.09% in Dice score on the two medical image segmentation cohorts, respectively (p-value 0.01). The performance of the proposed method is further assessed on an external medical image cohort via the MICCAI Challenge FLARE 2021 dataset, and achieves a substantial improvement from Dice 0.922 to 0.933 (p-value 0.01).
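The "one cluster per semantic object" idea above can be illustrated by pooling pixel features under each object mask into a per-class embedding and applying an InfoNCE-style loss between two views. This NumPy sketch is an assumption-laden stand-in for the paper's method: the mask-pooling, temperature `tau`, and toy features are all illustrative choices, not the authors' attention-mask formulation.

```python
import numpy as np

def masked_class_embeddings(features, label_mask, n_classes):
    """Average the (H, W, C) pixel features under each class mask, giving one
    L2-normalised embedding per semantic object instead of one per image."""
    embs = np.zeros((n_classes, features.shape[-1]))
    for k in range(n_classes):
        m = (label_mask == k)
        if m.any():
            embs[k] = features[m].mean(axis=0)
    return embs / (np.linalg.norm(embs, axis=1, keepdims=True) + 1e-8)

def class_contrastive_loss(e1, e2, tau=0.1):
    """InfoNCE over class embeddings of two views: class k in view 1 should match
    class k in view 2 and repel the other classes."""
    logits = e1 @ e2.T / tau
    logits -= logits.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(logp)))

# Toy example: class-0 pixels carry feature [1, 0], class-1 pixels carry [0, 1].
features = np.zeros((2, 2, 2)); features[0, :, 0] = 1.0; features[1, :, 1] = 1.0
labels = np.array([[0, 0], [1, 1]])
embs = masked_class_embeddings(features, labels, n_classes=2)
```

When the class correspondence between views is correct, the loss is near zero; mismatching the classes drives it up, which is the behaviour that pulls same-class objects into the same cluster.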
36
Qinhong D, Yue H, Wendong B, Yukun D, Huan Y, Yongming X. MAS-Net: Multi-modal Assistant Segmentation Network for Lumbar Intervertebral Disc. Phys Med Biol 2023; 68:175044. [PMID: 37567228 DOI: 10.1088/1361-6560/acef9f] [Received: 04/21/2023] [Accepted: 08/11/2023] [Indexed: 08/13/2023]
Abstract
Objective. Despite advancements in medical imaging technology, the diagnosis and positioning of lumbar disc diseases still heavily rely on the expertise and experience of medical professionals. This process is often time-consuming, labor-intensive, and susceptible to subjective factors. Achieving automatic positioning and segmentation of the lumbar intervertebral disc (LID) is the first and critical step in the intelligent diagnosis of lumbar disc diseases. However, due to the complexity of the vertebral body and the ambiguity of the soft-tissue boundaries of the LID, accurate and intelligent segmentation of LIDs remains challenging. The study aims to accurately and intelligently segment and locate LIDs by fully utilizing multi-modal lumbar magnetic resonance images (MRIs). Approach. A novel multi-modal assistant segmentation network (MAS-Net) is proposed in this paper. The architecture consists of four key components: the multi-branch fusion encoder (MBFE), the cross-modality correlation evaluation (CMCE) module, the channel fusion transformer (CFT), and the selective kernel (SK)-based decoder. The MBFE module captures and integrates various modal features, while the CMCE module facilitates the fusion process between the MBFE and decoder. The CFT module selectively guides the flow of information between the MBFE and decoder and effectively utilizes skip connections from multiple layers. The SK module computes the significance of each channel using global pooling operations and applies weights to the input feature maps to improve the model's recognition of important features. Main results. The proposed MAS-Net achieved a Dice coefficient of 93.08% on the IVD3Seg dataset and 93.22% on the DualModalDisc dataset, outperforming the current state-of-the-art network, accurately segmenting the LIDs, and generating a 3D model that can precisely display them. Significance. MAS-Net automates the diagnostic process and addresses challenges faced by doctors.
By simplifying and enhancing the clarity of the visual representation, multi-modal MRI allows for better information complementation and LID segmentation. By successfully integrating data from various modalities, the accuracy of LID segmentation is improved.
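The channel-significance step attributed to the SK module above (global pooling, then per-channel weights applied to the feature map) can be sketched as follows. This is a simplified squeeze-and-excite-style stand-in under stated assumptions: the single linear gate and its parameters `w`, `b` are hypothetical, and the paper's actual selective-kernel design is richer than this.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_reweight(fmap, w, b):
    """Channel-significance weighting: global average pooling squeezes each
    channel of an (H, W, C) map to a scalar, a tiny gate maps the descriptor to
    weights in (0, 1), and the weights rescale the input channel-wise."""
    squeeze = fmap.mean(axis=(0, 1))      # global average pool -> (C,)
    gate = sigmoid(squeeze @ w + b)       # per-channel significance in (0, 1)
    return fmap * gate, gate

rng = np.random.default_rng(1)
fmap = rng.normal(size=(4, 4, 3))        # toy H x W x C feature map
w, b = np.eye(3), np.zeros(3)            # hypothetical learned gate parameters
out, gate = channel_reweight(fmap, w, b)
```

Channels whose pooled response the gate deems unimportant are attenuated, which is the mechanism the abstract credits with highlighting important features.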
Affiliation(s)
- Du Qinhong
- Department of Computer Science and Technology, Qingdao University, QingDao, People's Republic of China
- He Yue
- Department of Computer Science and Technology, Qingdao University, Qingdao, People's Republic of China
- Bu Wendong
- Department of Computer Science and Technology, Qingdao University, Qingdao, People's Republic of China
- Du Yukun
- Department of Spinal Surgery, The Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- Yang Huan
- Department of Computer Science and Technology, Qingdao University, Qingdao, People's Republic of China
- Xi Yongming
- Department of Spinal Surgery, The Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
37
Khouy M, Jabrane Y, Ameur M, Hajjam El Hassani A. Medical Image Segmentation Using Automatic Optimized U-Net Architecture Based on Genetic Algorithm. J Pers Med 2023; 13:1298. [PMID: 37763066 PMCID: PMC10533074 DOI: 10.3390/jpm13091298] [Received: 07/14/2023] [Revised: 07/29/2023] [Accepted: 08/07/2023] [Indexed: 09/29/2023] Open
Abstract
Image segmentation is a crucial aspect of clinical decision making in medicine, and as such, it has greatly enhanced the sustainability of medical care. Consequently, biomedical image segmentation has become a prominent research area in the field of computer vision. With the advent of deep learning, many manually designed methods have been proposed and have shown promising results in achieving state-of-the-art performance in biomedical image segmentation. However, these methods often require significant expert knowledge and have an enormous number of parameters, necessitating substantial computational resources. Thus, this paper proposes a new approach called GA-UNet, which employs genetic algorithms to automatically design a U-shaped convolutional neural network with good performance while minimizing its architectural complexity and number of parameters, thereby addressing the above challenges. The proposed GA-UNet is evaluated on three datasets: lung image segmentation, cell nuclei segmentation in microscope images (DSB 2018), and liver image segmentation. Interestingly, our experimental results demonstrate that the proposed method achieves competitive performance with a smaller architecture and fewer parameters than the original U-Net model. It achieves an accuracy of 98.78% for lung image segmentation, 95.96% for cell nuclei segmentation in microscope images (DSB 2018), and 98.58% for liver image segmentation by using merely 0.24%, 0.48%, and 0.67% of the number of parameters in the original U-Net architecture for the lung image segmentation dataset, the DSB 2018 dataset, and the liver image segmentation dataset, respectively. This reduction in complexity makes our proposed approach, GA-UNet, a more viable option for deployment in resource-limited environments or real-world implementations that demand more efficient and faster inference times.
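The genetic-algorithm loop behind an approach like GA-UNet can be sketched with a toy genome (network depth, base filter count), elitist selection, crossover, and mutation. Everything here is an illustrative assumption: the genome encoding, the parameter-count proxy, and especially the `fitness` function, which stands in for validation accuracy minus a size penalty, none of which are the paper's actual choices.

```python
import random

random.seed(0)
DEPTHS = [2, 3, 4, 5]        # candidate U-Net depths
FILTERS = [8, 16, 32, 64]    # candidate base filter counts

def param_count(depth, base):
    """Very rough proxy for U-Net size: 3x3 kernels, filters doubling per level."""
    return sum((base * 2 ** d) ** 2 * 9 * 2 for d in range(depth))

def fitness(g):
    depth, base = g
    capacity = depth * base                    # toy stand-in for validation accuracy
    return capacity - 1e-5 * param_count(depth, base)   # reward accuracy, penalize size

def evolve(pop_size=12, gens=20):
    pop = [(random.choice(DEPTHS), random.choice(FILTERS)) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]           # elitist selection keeps the best half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            child = (a[0], b[1])               # one-point crossover of the genome
            if random.random() < 0.3:          # mutation: resample the depth gene
                child = (random.choice(DEPTHS), child[1])
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

best = evolve()
```

In the real setting, evaluating `fitness` means training each candidate network briefly, which is the expensive part a GA-based search must budget for.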
Affiliation(s)
- Mohammed Khouy
- MSC Laboratory, Cadi Ayyad University, Marrakech 40000, Morocco; (M.K.); (Y.J.); (M.A.)
- Younes Jabrane
- MSC Laboratory, Cadi Ayyad University, Marrakech 40000, Morocco; (M.K.); (Y.J.); (M.A.)
- Mustapha Ameur
- MSC Laboratory, Cadi Ayyad University, Marrakech 40000, Morocco; (M.K.); (Y.J.); (M.A.)
- Amir Hajjam El Hassani
- Nanomedicine Imagery & Therapeutics Laboratory, EA4662—Bourgogne-Franche-Comté University, University of Technologie of Belfort Montbéliard, CEDEX, 90010 Belfort, France
38
Chen Y, Wang T, Tang H, Zhao L, Zhang X, Tan T, Gao Q, Du M, Tong T. CoTrFuse: a novel framework by fusing CNN and transformer for medical image segmentation. Phys Med Biol 2023; 68:175027. [PMID: 37605997 DOI: 10.1088/1361-6560/acede8] [Received: 06/03/2023] [Accepted: 08/07/2023] [Indexed: 08/23/2023]
Abstract
Medical image segmentation is a crucial and intricate process in medical image processing and analysis. With the advancements in artificial intelligence, deep learning techniques have been widely used in recent years for medical image segmentation. One such technique is the U-Net framework based on U-shaped convolutional neural networks (CNN) and its variants. However, these methods have limitations in simultaneously capturing global and long-range semantic information because of the restricted receptive field intrinsic to the convolution operation. Transformers are attention-based models with excellent global modeling capabilities, but their ability to acquire local information is limited. To address this, we propose a network that combines the strengths of both CNN and transformer, called CoTrFuse. The proposed CoTrFuse network uses EfficientNet and Swin Transformer as dual encoders. A Swin Transformer and CNN fusion module fuses the features of both branches before the skip-connection structure. We evaluated the proposed network on two datasets: the ISIC-2017 challenge dataset and the COVID-QU-Ex dataset. Our experimental results demonstrate that the proposed CoTrFuse outperforms several state-of-the-art segmentation methods, indicating its superiority in medical image segmentation. The code is available at https://github.com/BinYCn/CoTrFuse.
Affiliation(s)
- Yuanbin Chen
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China
- Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Tao Wang
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China
- Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Hui Tang
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China
- Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Longxuan Zhao
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China
- Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Xinlin Zhang
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China
- Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Tao Tan
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, People's Republic of China
- Qinquan Gao
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China
- Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Min Du
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China
- Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
- Tong Tong
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, People's Republic of China
- Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fuzhou University, Fuzhou 350116, People's Republic of China
39
Peng T, Wu Y, Gu Y, Xu D, Wang C, Li Q, Cai J. Intelligent contour extraction approach for accurate segmentation of medical ultrasound images. Front Physiol 2023; 14:1177351. [PMID: 37675280 PMCID: PMC10479019 DOI: 10.3389/fphys.2023.1177351] [Received: 03/03/2023] [Accepted: 07/28/2023] [Indexed: 09/08/2023] Open
Abstract
Introduction: Accurate contour extraction in ultrasound images is of great interest for image-guided organ interventions and disease diagnosis. Nevertheless, it remains a problematic issue owing to the missing or ambiguous outlines between organs (e.g., prostate and kidney) and surrounding tissues, the appearance of shadow artifacts, and the large variability in the shape of organs. Methods: To address these issues, we devised a method that includes four stages. In the first stage, the data sequence is acquired using an improved adaptive-selection principal curve method, in which a limited number of radiologist-defined data points are adopted as the prior. The second stage then uses an enhanced quantum evolution network to help acquire the optimal neural network. The third stage involves increasing the precision of the experimental outcomes after training the neural network, while using the data sequence as the input. In the final stage, the contour is smoothed using an explicable mathematical formula explained by the model parameters of the neural network. Results: Our experiments showed that our approach outperformed other current methods, including hybrid and transformer-based deep-learning methods, achieving an average Dice similarity coefficient, Jaccard similarity coefficient, and accuracy of 95.7 ± 2.4%, 94.6 ± 2.6%, and 95.3 ± 2.6%, respectively. Discussion: This work develops an intelligent contour extraction approach for ultrasound images. Our approach obtained more satisfactory outcomes than recent state-of-the-art approaches. Knowledge of the precise boundaries of the organ is significant for the preservation of at-risk structures. Our developed approach has the potential to enhance disease diagnosis and therapeutic outcomes.
Affiliation(s)
- Tao Peng
- School of Future Science and Engineering, Soochow University, Suzhou, China
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, TX, United States
- Yiyun Wu
- Department of Ultrasound, Jiangsu Province Hospital of Chinese Medicine, Nanjing, Jiangsu, China
- Yidong Gu
- Department of Medical Ultrasound, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Suzhou, Jiangsu, China
- Daqiang Xu
- Department of Radiology, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Suzhou, Jiangsu, China
- Caishan Wang
- Department of Ultrasound, The Second Affiliated Hospital of Soochow University, Suzhou, China
- Quan Li
- Center of Stomatology, The Second Affiliated Hospital of Soochow University, Suzhou, China
- Jing Cai
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China
40
Pan H, Gao B, Bai W, Li B, Li Y, Zhang M, Wang H, Zhao X, Chen M, Yin C, Kong W. WA-ResUNet: A Focused Tail Class MRI Medical Image Segmentation Algorithm. Bioengineering (Basel) 2023; 10:945. [PMID: 37627829 PMCID: PMC10451191 DOI: 10.3390/bioengineering10080945] [Received: 06/16/2023] [Revised: 07/28/2023] [Accepted: 08/04/2023] [Indexed: 08/27/2023] Open
Abstract
Medical image segmentation can effectively identify lesions in medicine, but some small and rare lesions cannot be identified well. Existing studies do not take into account the uncertainty of the occurrence of diseased tissue or the long-tailed distribution of medical data. Meanwhile, the grayscale images obtained from magnetic resonance imaging (MRI) pose problems such as features that are difficult to extract and invalid features that are difficult to distinguish. To solve these problems, we propose a new weighted attention ResUNet (WA-ResUNet) and a class-weight formula based on the number of images contained in each class, which improves the performance of the model on the low-frequency class and the overall effect of the model by adjusting the degree of attention paid to valid and invalid features and rebalancing the learning efficiency among the classes. We evaluated our method on a uterine MRI dataset and compared it with ResUNet. WA-ResUNet increased Intersection over Union (IoU) in the low-frequency class (Nabothian cysts) by 21.87%, and the overall mIoU increased by more than 6.5%.
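A class-weight scheme based on per-class image counts, together with the IoU metric reported above, can be sketched as follows. The inverse-frequency formula here is a common variant chosen for illustration, not necessarily the paper's exact formula, and the image counts are made-up numbers.

```python
import numpy as np

def class_weights(images_per_class):
    """Inverse-frequency class weights, scaled so the mean weight is 1:
    tail classes (few images) receive weights well above 1."""
    n = np.asarray(images_per_class, dtype=float)
    inv = 1.0 / n
    return inv / inv.sum() * len(n)

def iou(pred, target, cls):
    """Intersection over Union of one class between two integer label maps."""
    p, t = (pred == cls), (target == cls)
    union = np.logical_or(p, t).sum()
    return float(np.logical_and(p, t).sum() / union) if union else float("nan")

w = class_weights([900, 90, 10])   # toy counts; the rare class gets the largest weight
pred = np.array([[0, 1], [2, 2]])
ref = np.array([[0, 1], [2, 0]])
```

Multiplying each class's loss term by its weight rebalances learning so that a rare class such as Nabothian cysts is not drowned out by the frequent classes.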
Affiliation(s)
- Haixia Pan
- College of Software, Beihang University, Beijing 100191, China
- Bo Gao
- College of Software, Beihang University, Beijing 100191, China
- Wenpei Bai
- Department of Obstetrics and Gynecology, Beijing Shijitan Hospital, Capital Medical University, Beijing 100038, China
- Bin Li
- Department of MRI, Beijing Shijitan Hospital, Capital Medical University/Ninth Clinical Medical College, Peking University, Beijing 100038, China
- Yanan Li
- College of Software, Beihang University, Beijing 100191, China
- Meng Zhang
- College of Software, Beihang University, Beijing 100191, China
- Hongqiang Wang
- College of Software, Beihang University, Beijing 100191, China
- Xiaoran Zhao
- College of Software, Beihang University, Beijing 100191, China
- Minghuang Chen
- Department of Obstetrics and Gynecology, Beijing Shijitan Hospital, Capital Medical University, Beijing 100038, China
- Cong Yin
- Department of Obstetrics and Gynecology, Beijing Shijitan Hospital, Capital Medical University, Beijing 100038, China
- Weiya Kong
- Department of Obstetrics and Gynecology, Beijing Shijitan Hospital, Capital Medical University, Beijing 100038, China
41
Feng Y, Cong Y, Xing S, Wang H, Zhao C, Zhang X, Yao Q. Distance Matters: A Distance-Aware Medical Image Segmentation Algorithm. Entropy (Basel) 2023; 25:1169. [PMID: 37628199 PMCID: PMC10453236 DOI: 10.3390/e25081169] [Received: 07/05/2023] [Revised: 08/01/2023] [Accepted: 08/03/2023] [Indexed: 08/27/2023]
Abstract
The transformer-based U-Net network structure has gained popularity in the field of medical image segmentation. However, most networks overlook the impact of the distance between patches on the encoding process. This paper proposes a novel GC-TransUnet for medical image segmentation. The key innovation is that it takes into account the relationships between patch blocks based on their distances, optimizing the encoding process of traditional transformer networks. This optimization results in improved encoding efficiency and reduced computational cost. Moreover, the proposed GC-TransUnet is combined with U-Net to accomplish the segmentation task. In the encoder part, the traditional vision transformer is replaced by the global context vision transformer (GC-ViT), eliminating the need for a CNN while retaining skip connections for the subsequent decoder. Experimental results demonstrate that the proposed algorithm achieves superior segmentation results compared with other algorithms when applied to medical images.
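One simple way to make attention distance-aware, in the spirit of the abstract above, is to penalize attention logits by the Euclidean distance between patch centres. This NumPy sketch illustrates that generic idea only; the bias strength `alpha` and the formulation are illustrative assumptions, not GC-ViT's actual mechanism.

```python
import numpy as np

def patch_centers(grid):
    """(row, col) centre coordinates of each patch in a grid x grid layout."""
    ys, xs = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")
    return np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)

def distance_biased_attention(tokens, grid, alpha=0.5):
    """Self-attention whose logits are penalised by the Euclidean distance
    between patch centres, so nearby patches attend to each other more."""
    c = patch_centers(grid)
    d = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=-1)   # pairwise distances
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[-1]) - alpha * d
    scores -= scores.max(axis=-1, keepdims=True)
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)                     # softmax rows
    return attn @ tokens, attn

tokens = np.ones((4, 8))          # 2x2 grid of identical patch embeddings
out, attn = distance_biased_attention(tokens, grid=2, alpha=0.5)
```

With identical token contents, the content term is constant, so the attention pattern is driven purely by distance: each patch attends most strongly to itself, then to its nearest neighbours.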
Affiliation(s)
- Yuncong Feng
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
- Artificial Intelligence Research Institute, Changchun University of Technology, Changchun 130012, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- Yeming Cong
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
- Shuaijie Xing
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
- Hairui Wang
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
- Cuixing Zhao
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
- Xiaoli Zhang
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- Qingan Yao
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
42
Zhou H, Sun C, Huang H, Fan M, Yang X, Zhou L. Feature-guided attention network for medical image segmentation. Med Phys 2023; 50:4871-4886. [PMID: 36746870 DOI: 10.1002/mp.16253] [Received: 05/30/2022] [Revised: 01/03/2023] [Accepted: 01/06/2023] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND U-Net and its variations have achieved remarkable performances in medical image segmentation. However, they have two limitations. First, the shallow layer feature of the encoder always contains background noise. Second, semantic gaps exist between the features of the encoder and the decoder. Skip-connections directly connect the encoder to the decoder, which will lead to the fusion of semantically dissimilar feature maps. PURPOSE To overcome these two limitations, this paper proposes a novel medical image segmentation algorithm, called feature-guided attention network, which consists of U-Net, the cross-level attention filtering module (CAFM), and the attention-guided upsampling module (AUM). METHODS In the proposed method, the AUM and the CAFM were introduced into the U-Net, where the AUM learns to filter the background noise in the low-level feature map of the encoder and the CAFM tries to eliminate the semantic gap between the encoder and the decoder. Specifically, the AUM adopts a top-down pathway to use the high-level feature map so as to filter the background noise in the low-level feature map of the encoder. The AUM uses the encoder features to guide the upsampling of the corresponding decoder features, thus eliminating the semantic gap between them. Four medical image segmentation tasks, including coronary atherosclerotic plaque segmentation (Dataset A), retinal vessel segmentation (Dataset B), skin lesion segmentation (Dataset C), and multiclass retinal edema lesions segmentation (Dataset D), were used to validate the proposed method. RESULTS For Dataset A, the proposed method achieved higher Intersection over Union (IoU) (67.91 ± 3.82 % $67.91\pm 3.82\%$ ), dice (79.39 ± 3.37 % $79.39\pm 3.37\%$ ), accuracy (98.39 ± 0.34 % $98.39\pm 0.34\%$ ), and sensitivity (85.10 ± 3.74 % $85.10\pm 3.74\%$ ) than the previous best method: CA-Net. 
For Dataset B, the proposed method achieved higher sensitivity (83.50%) and accuracy (97.55%) than the previous best method, SCS-Net. For Dataset C, the proposed method had a higher IoU (83.47 ± 0.41%) and Dice (90.81 ± 0.34%) than all compared previous methods. For Dataset D, the proposed method had the highest Dice (average: 81.53%; retinal edema area [REA]: 83.78%; pigment epithelial detachment [PED]: 77.13%), sensitivity (REA: 89.01%; SRF: 85.50%), specificity (REA: 99.35%; PED: 100.00%), and accuracy (98.73%) among all compared previous networks. In addition, the proposed method has 2.43 M parameters, fewer than CA-Net (3.21 M) and CPF-Net (3.07 M). CONCLUSIONS The proposed method demonstrated state-of-the-art performance, outperforming other top medical image segmentation algorithms. The CAFM filtered the background noise in the low-level feature map of the encoder, while the AUM eliminated the semantic gap between the encoder and the decoder. Furthermore, the proposed method is computationally efficient.
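To make the filtering idea above concrete: a sigmoid gate computed from high-level (semantic) evidence can scale low-level features so that background positions are suppressed. The following is a minimal one-dimensional sketch of that gating pattern with toy values of our own; it is not the authors' implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_filter(low_feat, high_evidence):
    """Scale each low-level feature by a sigmoid gate derived from
    high-level evidence: background positions (negative evidence)
    are suppressed, foreground positions pass through."""
    return [lo * sigmoid(e) for lo, e in zip(low_feat, high_evidence)]

low  = [0.8, 0.5, 0.9, 0.2]    # low-level features, background noise included
gate = [4.0, -4.0, 5.0, -5.0]  # high-level evidence: >0 foreground, <0 background
filtered = attention_filter(low, gate)
```

Positions with negative high-level evidence are driven toward zero, which is the noise-filtering effect the CAFM aims for, here reduced to a single multiplicative gate.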
Affiliation(s)
- Hao Zhou
- National Key Laboratory of Science and Technology of Underwater Vehicle, Harbin Engineering University, Harbin, China
- Chaoyu Sun
- Fourth Affiliated Hospital, Harbin Medical University, Harbin, China
- Hai Huang
- National Key Laboratory of Science and Technology of Underwater Vehicle, Harbin Engineering University, Harbin, China
- Mingyu Fan
- College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, China
- Xu Yang
- State Key Laboratory of Management and Control for Complex System, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Linxiao Zhou
- Fourth Affiliated Hospital, Harbin Medical University, Harbin, China
43
Sui G, Zhang Z, Liu S, Chen S, Liu X. Pulmonary nodules segmentation based on domain adaptation. Phys Med Biol 2023; 68:155015. [PMID: 37406634] [DOI: 10.1088/1361-6560/ace498] [Received: 03/24/2023] [Accepted: 07/05/2023]
Abstract
With the development of deep learning, methods based on transfer learning have advanced medical image segmentation. However, domain shift and the complex background information of medical images limit further improvement of segmentation accuracy. Domain adaptation can compensate for sample shortage by learning important information from a similar source dataset. Therefore, a segmentation method based on adversarial domain adaptation with background masks (ADAB) is proposed in this paper. First, two ADAB networks are built for source and target data segmentation, respectively. Next, to extract the foreground features that form the input of the discriminators, background masks are generated with a region-growing algorithm. Then, to update the parameters of the target network without the conflict between the discriminator's drive to distinguish the domains and the adversarial objective of reducing domain shift, a gradient reversal layer is embedded in the ADAB model for the target data. Finally, an enhanced boundary loss is derived to make the target network sensitive to the edges of the regions to be segmented. The performance of the proposed method is evaluated on the segmentation of pulmonary nodules in computed tomography images. Experimental results show that the proposed approach is promising for medical image processing.
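The background masks above come from region growing. Below is a minimal 4-connected region-growing sketch on a toy intensity grid; this is our own illustration, and the seed position and `tol` threshold are assumptions, not the paper's parameters.

```python
from collections import deque

def region_grow(img, seed, tol):
    """4-connected region growing: include neighbours whose intensity
    differs from the seed intensity by at most `tol`. The complement of
    the returned boolean mask can serve as a background mask."""
    h, w = len(img), len(img[0])
    sy, sx = seed
    base = img[sy][sx]
    mask = [[False] * w for _ in range(h)]
    mask[sy][sx] = True
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny][nx] \
                    and abs(img[ny][nx] - base) <= tol:
                mask[ny][nx] = True
                frontier.append((ny, nx))
    return mask

img = [[9, 9, 1, 1],
       [9, 9, 1, 1],
       [9, 1, 1, 1]]
foreground = region_grow(img, (0, 0), tol=1)
background = [[not v for v in row] for row in foreground]
```

The breadth-first frontier guarantees each pixel is visited at most once, so the sketch runs in linear time in the image size.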
Affiliation(s)
- Guozheng Sui
- College of Automation and Electronic Engineering, Qingdao University of Science and Technology, People's Republic of China
- Zaixian Zhang
- Radiology Department, The Affiliated Hospital of Qingdao University, People's Republic of China
- Shunli Liu
- Radiology Department, The Affiliated Hospital of Qingdao University, People's Republic of China
- Shuang Chen
- College of Automation and Electronic Engineering, Qingdao University of Science and Technology, People's Republic of China
- Xuefeng Liu
- College of Automation and Electronic Engineering, Qingdao University of Science and Technology, People's Republic of China
44
Saeed N, Ridzuan M, Majzoub RA, Yaqub M. Prompt-Based Tuning of Transformer Models for Multi-Center Medical Image Segmentation of Head and Neck Cancer. Bioengineering (Basel) 2023; 10:879. [PMID: 37508906] [PMCID: PMC10376048] [DOI: 10.3390/bioengineering10070879] [Received: 06/14/2023] [Revised: 07/07/2023] [Accepted: 07/13/2023]
Abstract
Medical image segmentation is a vital healthcare endeavor requiring precise and efficient models for appropriate diagnosis and treatment. Vision transformer (ViT)-based segmentation models have shown great performance in this task. However, to build a powerful backbone, the self-attention block of a ViT requires large-scale pre-training data. The usual way of adapting pre-trained models entails updating all or some of the backbone parameters. This paper proposes a novel fine-tuning strategy for adapting a pre-trained transformer-based segmentation model to data from a new medical center. The method introduces a small number of learnable parameters, termed prompts, into the input space (less than 1% of model parameters) while keeping the rest of the model parameters frozen. Extensive studies employing data from new, unseen medical centers show that prompt-based fine-tuning of medical segmentation models yields excellent performance on the new-center data with a negligible drop on the old centers. Additionally, the strategy delivers high accuracy with minimal re-training on new-center data, significantly decreasing the computational and time costs of fine-tuning pre-trained models. Our source code will be made publicly available.
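The "less than 1% of model parameters" figure follows directly from the size of the prompt tokens relative to a frozen backbone. A back-of-the-envelope sketch with illustrative numbers of our own (the paper's actual prompt length and backbone size may differ):

```python
def prompt_param_fraction(n_prompts, d_model, backbone_params):
    """Fraction of trainable parameters when only the prompt tokens
    (n_prompts vectors of width d_model, prepended to the input
    sequence) are updated and the backbone stays frozen."""
    prompt_params = n_prompts * d_model
    return prompt_params / (prompt_params + backbone_params)

# Hypothetical numbers: 20 prompt tokens of width 768 against a
# ~90 M-parameter transformer backbone.
frac = prompt_param_fraction(20, 768, 90_000_000)
print(f"trainable fraction: {frac:.4%}")
```

Even generous prompt lengths stay orders of magnitude below the backbone's parameter count, which is why re-training cost drops so sharply.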
Affiliation(s)
- Numan Saeed
- Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi 7909, United Arab Emirates
- Muhammad Ridzuan
- Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi 7909, United Arab Emirates
- Roba Al Majzoub
- Department of Computer Vision, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi 7909, United Arab Emirates
- Mohammad Yaqub
- Department of Computer Vision, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi 7909, United Arab Emirates
45
Wang T, Huang Z, Wu J, Cai Y, Li Z. Semi-Supervised Medical Image Segmentation with Co-Distribution Alignment. Bioengineering (Basel) 2023; 10:869. [PMID: 37508896] [PMCID: PMC10376634] [DOI: 10.3390/bioengineering10070869] [Received: 06/13/2023] [Revised: 07/13/2023] [Accepted: 07/14/2023]
Abstract
Medical image segmentation has made significant progress when a large amount of labeled data is available. However, annotating medical image segmentation datasets is expensive because professional skills are required. Additionally, classes are often unevenly distributed in medical images, which severely degrades classification performance on minority classes. To address these problems, this paper proposes Co-Distribution Alignment (Co-DA) for semi-supervised medical image segmentation. Specifically, Co-DA aligns marginal predictions on unlabeled data to marginal predictions on labeled data in a class-wise manner with two differently initialized models, before using the pseudo-labels generated by one model to supervise the other. In addition, we design an over-expectation cross-entropy loss that filters unlabeled pixels to reduce the noise in their pseudo-labels. Quantitative and qualitative experiments on three public datasets demonstrate that the proposed approach outperforms existing state-of-the-art semi-supervised medical image segmentation methods on the 2D CaDIS dataset and the 3D LGE-MRI and ACDC datasets, achieving an mIoU of 0.8515 with only 24% labeled data on CaDIS, and Dice scores of 0.8824 and 0.8773 with only 20% labeled data on LGE-MRI and ACDC, respectively.
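Class-wise marginal alignment can be sketched as re-weighting each pixel's predicted class probabilities so the unlabeled-batch marginal moves toward the labeled marginal, then renormalizing per pixel. This is a generic distribution-alignment step of our own construction, not Co-DA's exact formulation.

```python
def align_marginals(probs_unlabeled, marginal_labeled, eps=1e-8):
    """Re-weight each pixel's class probabilities so the batch-level
    marginal on unlabeled data moves toward the class marginal observed
    on labeled data, then renormalize per pixel."""
    n_cls = len(marginal_labeled)
    n = len(probs_unlabeled)
    # current class marginal on the unlabeled batch
    marg_u = [sum(p[c] for p in probs_unlabeled) / n for c in range(n_cls)]
    ratio = [marginal_labeled[c] / max(marg_u[c], eps) for c in range(n_cls)]
    aligned = []
    for p in probs_unlabeled:
        w = [p[c] * ratio[c] for c in range(n_cls)]
        z = sum(w)
        aligned.append([v / z for v in w])
    return aligned

# Unlabeled predictions biased toward class 0; the labeled set is balanced.
probs = [[0.9, 0.1], [0.8, 0.2]]
aligned = align_marginals(probs, [0.5, 0.5])
```

After one such step the unlabeled marginal sits much closer to the labeled one, which counteracts the minority-class bias the abstract highlights.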
Affiliation(s)
- Tao Wang
- Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, College of Computer and Control Engineering, Minjiang University, Fuzhou 350108, China
- College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
- The Key Laboratory of Cognitive Computing and Intelligent Information Processing of Fujian Education Institutions, Wuyi University, Wuyishan 354300, China
- Zhongzheng Huang
- College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
- Jiawei Wu
- School of Electrical and Mechanical Engineering, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Yuanzheng Cai
- Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, College of Computer and Control Engineering, Minjiang University, Fuzhou 350108, China
- Zuoyong Li
- Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, College of Computer and Control Engineering, Minjiang University, Fuzhou 350108, China
46
Zhang F, Wang Q, Lu N, Chen D, Jiang H, Yang A, Yu Y, Wang Y. Applying a novel two-step deep learning network to improve the automatic delineation of esophagus in non-small cell lung cancer radiotherapy. Front Oncol 2023; 13:1174530. [PMID: 37534258] [PMCID: PMC10391539] [DOI: 10.3389/fonc.2023.1174530] [Received: 03/02/2023] [Accepted: 05/22/2023]
Abstract
Purpose To introduce a model for automatic segmentation of thoracic organs at risk (OARs), especially the esophagus, in non-small cell lung cancer radiotherapy, using a novel two-step deep learning network. Materials and methods CT images from 59 lung cancer patients were enrolled, of which 39 patients were randomly selected as the training set, 8 patients as the validation set, and 12 patients as the testing set. Automatic segmentation of six OARs, including the esophagus, was carried out. In addition, two sets of treatment plans were made: one based on the manually delineated tumor and OARs (Plan1) and one based on the manually delineated tumor and the automatically delineated OARs (Plan2). The Dice similarity coefficient (DSC), 95% Hausdorff distance (HD95), and average surface distance (ASD) of the proposed model were compared with those of U-Net as a benchmark. The two groups of plans were also compared according to dose-volume histogram parameters. Results The DSC, HD95, and ASD of the proposed model were better than those of U-Net, while the two groups of plans were almost identical. The highest mean DSC of the proposed method was 0.94 for the left lung, and the lowest HD95 and ASD were 3.78 mm and 1.16 mm, respectively, for the trachea. Moreover, the DSC reached 0.73 for the esophagus. Conclusions The two-step segmentation method can accurately segment the OARs of lung cancer patients. The mean DSC of the esophagus reached preliminary clinical significance (>0.70). Choosing different deep learning networks based on the characteristics of different organs offers a new option for automatic segmentation in radiotherapy.
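The DSC reported above is the standard overlap measure 2|A∩B|/(|A|+|B|). A minimal sketch on flattened binary masks (our own helper, not the study's evaluation code):

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two flattened binary masks:
    2 * |A intersect B| / (|A| + |B|)."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2.0 * inter / total if total else 1.0

pred = [1, 1, 0, 1, 0]   # predicted esophagus voxels (toy example)
gt   = [1, 0, 0, 1, 1]   # manually delineated voxels
score = dice(pred, gt)   # 2*2 / (3+3) = 0.666...
```

A DSC of 1.0 means perfect overlap and 0.0 means none, which is why the paper treats 0.70 as a preliminary clinical threshold for the esophagus.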
Affiliation(s)
- Fuli Zhang
- Radiation Oncology Department, The Seventh Medical Center of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Qiusheng Wang
- School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
- Na Lu
- Radiation Oncology Department, The Seventh Medical Center of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Diandian Chen
- Radiation Oncology Department, The Seventh Medical Center of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Huayong Jiang
- Radiation Oncology Department, The Seventh Medical Center of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Anning Yang
- School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
- Yanjun Yu
- Radiation Oncology Department, The Seventh Medical Center of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Yadi Wang
- Radiation Oncology Department, The Seventh Medical Center of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
47
Costanzo A, Ertl-Wagner B, Sussman D. AFNet Algorithm for Automatic Amniotic Fluid Segmentation from Fetal MRI. Bioengineering (Basel) 2023; 10:783. [PMID: 37508809] [PMCID: PMC10376488] [DOI: 10.3390/bioengineering10070783] [Received: 05/24/2023] [Revised: 06/25/2023] [Accepted: 06/27/2023]
Abstract
Amniotic Fluid Volume (AFV) is a crucial fetal biomarker for diagnosing specific fetal abnormalities. This study proposes a novel Convolutional Neural Network (CNN) model, AFNet, for segmenting amniotic fluid (AF) to facilitate clinical AFV evaluation. AFNet was trained and tested on a manually segmented and radiologist-validated AF dataset. AFNet outperforms ResUNet++ by using efficient feature mapping in the attention block and transposed convolutions in the decoder. Our experimental results show that AFNet achieved a mean Intersection over Union (mIoU) of 93.38% on our dataset, outperforming other state-of-the-art models. While AFNet achieves performance scores similar to those of the UNet++ model, it does so with less than half the number of parameters. By creating a detailed AF dataset and an improved CNN architecture, we enable the quantification of AFV in clinical practice, which can aid in diagnosing AF disorders during gestation.
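The mIoU metric above averages per-class intersection-over-union. A small sketch for flattened integer label maps (our own helper, not AFNet's evaluation code):

```python
def iou(pred, gt):
    """IoU between two flattened binary masks: |A n B| / |A u B|."""
    inter = sum(p & g for p, g in zip(pred, gt))
    union = sum(p | g for p, g in zip(pred, gt))
    return inter / union if union else 1.0

def miou(pred_labels, gt_labels, n_classes):
    """Mean IoU over classes for flattened integer label maps."""
    per_class = []
    for c in range(n_classes):
        p = [int(x == c) for x in pred_labels]
        g = [int(x == c) for x in gt_labels]
        per_class.append(iou(p, g))
    return sum(per_class) / n_classes

# Toy 2-class label maps: class 0 = background, class 1 = amniotic fluid.
score = miou([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)
```

Averaging over classes, rather than pixels, prevents the large background class from dominating the score.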
Affiliation(s)
- Alejo Costanzo
- Department of Electrical, Computer and Biomedical Engineering, Faculty of Engineering and Architectural Sciences, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- Institute for Biomedical Engineering, Science and Technology (iBEST), Toronto Metropolitan University and St. Michael's Hospital, Toronto, ON M5B 1T8, Canada
- Birgit Ertl-Wagner
- Department of Diagnostic Imaging, The Hospital for Sick Children, Toronto, ON M5G 1X8, Canada
- Department of Medical Imaging, Faculty of Medicine, University of Toronto, Toronto, ON M5T 1W7, Canada
- Dafna Sussman
- Department of Electrical, Computer and Biomedical Engineering, Faculty of Engineering and Architectural Sciences, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- Institute for Biomedical Engineering, Science and Technology (iBEST), Toronto Metropolitan University and St. Michael's Hospital, Toronto, ON M5B 1T8, Canada
- Department of Obstetrics and Gynecology, Faculty of Medicine, University of Toronto, Toronto, ON M5G 1E2, Canada
48
Li X, Fang X, Yang G, Su S, Zhu L, Yu Z. TransU²-Net: An Effective Medical Image Segmentation Framework Based on Transformer and U²-Net. IEEE J Transl Eng Health Med 2023; 11:441-450. [PMID: 37817826] [PMCID: PMC10561737] [DOI: 10.1109/jtehm.2023.3289990] [Received: 11/07/2022] [Revised: 04/15/2023] [Accepted: 06/17/2023]
Abstract
BACKGROUND In the past few years, the U-Net-based U-shaped architecture and skip-connections have made incredible progress in the field of medical image segmentation. U2-Net achieves good performance in computer vision; however, in medical image segmentation tasks, the heavily nested U2-Net is prone to overfitting. PURPOSE A 2D network structure, TransU2-Net, combining a transformer with a lighter-weight U2-Net, is proposed for automatic segmentation of brain tumors in magnetic resonance images (MRI). METHODS The lightweight U2-Net architecture not only obtains multi-scale information but also reduces redundant feature extraction. Meanwhile, the transformer block embedded in the stacked convolutional layers captures more global information, and the transformer with skip-connections enhances the spatial-domain information representation. A new multi-scale feature map fusion strategy is proposed as a postprocessing method to better fuse high- and low-dimensional spatial information. RESULTS The proposed TransU2-Net achieves better segmentation results: on the BraTS2021 dataset, it achieves an average Dice coefficient of 88.17%; on the publicly available MSD dataset, it achieves a Dice coefficient of 74.69% for tumor evaluation. In addition, the TransU2-Net results are compared with previously proposed 2D segmentation methods. CONCLUSIONS We propose an automatic medical image segmentation method combining transformers and U2-Net, which performs well and is of clinical importance. The experimental results show that the proposed method outperforms other 2D medical image segmentation methods. Clinical Translation Statement: We use the BraTS2021 and MSD datasets, which are publicly available databases. All experiments in this paper are in accordance with medical ethics.
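Multi-scale feature fusion of the kind described above requires bringing maps to a common resolution before combining them. A toy sketch using nearest-neighbour upsampling and element-wise summation; this is our own simplification, and TransU2-Net's actual fusion strategy is more elaborate.

```python
def upsample_nn(feat, factor):
    """Nearest-neighbour upsampling of a 2-D feature map by `factor`."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(wide))
    return out

def fuse(maps):
    """Element-wise sum of equally sized feature maps."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(m[y][x] for m in maps) for x in range(w)] for y in range(h)]

high = [[1.0]]            # coarse, semantically rich map (1x1)
low  = [[0.1, 0.2],
        [0.3, 0.4]]       # fine, spatially detailed map (2x2)
fused = fuse([upsample_nn(high, 2), low])
```

The coarse map contributes global context at every position while the fine map retains the spatial detail, which is the intuition behind fusing high- and low-dimensional spatial information.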
Affiliation(s)
- Xiang Li
- School of Safety Science and Engineering, Anhui University of Science and Technology, Huainan 232000, China
- Xianjin Fang
- School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232000, China
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230009, China
- Gaoming Yang
- School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232000, China
- Shuzhi Su
- School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232000, China
- Li Zhu
- Shanghai Chest Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200030, China
- Zekuan Yu
- School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232000, China
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
49
Zhang S, Niu Y. LcmUNet: A Lightweight Network Combining CNN and MLP for Real-Time Medical Image Segmentation. Bioengineering (Basel) 2023; 10:712. [PMID: 37370643] [DOI: 10.3390/bioengineering10060712] [Received: 05/04/2023] [Revised: 05/26/2023] [Accepted: 06/06/2023]
Abstract
In recent years, UNet and its improved variants have become the main methods for medical image segmentation. Although these models achieve excellent segmentation accuracy, their large number of network parameters and high computational complexity make rapid segmentation during real-time therapy and diagnosis difficult. To address this problem, we introduce a lightweight medical image segmentation network (LcmUNet) based on a CNN and an MLP. We designed LcmUNet's structure with model performance, parameter count, and computational complexity in mind. The first three layers are convolutional layers, and the last two are MLP layers. In the convolutional part, we propose an LDA module that combines asymmetric convolution, depth-wise separable convolution, and an attention mechanism to reduce the number of network parameters while maintaining strong feature-extraction capability. In the MLP part, we propose an LMLP module that enhances contextual information while focusing on local information, improving segmentation accuracy while maintaining high inference speed. The network also includes skip connections between the encoder and decoder at various levels. In extensive experiments, our network achieved accurate real-time segmentation. With only 1.49 million parameters and no pre-training, LcmUNet demonstrated impressive performance on different datasets. On the ISIC2018 dataset, it achieved an IoU of 85.19%, 92.07% recall, and 92.99% precision. On the BUSI dataset, it achieved an IoU of 63.99%, 79.96% recall, and 76.69% precision. Lastly, on the Kvasir-SEG dataset, LcmUNet achieved an IoU of 81.89%, 88.93% recall, and 91.79% precision.
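The parameter savings from depth-wise separable convolution can be seen by counting weights: a depth-wise k×k convolution plus a 1×1 point-wise convolution replaces a dense k×k convolution. The channel sizes below are illustrative choices of our own, not LcmUNet's actual layer widths.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depth-wise k x k convolution (one filter per input channel)
    followed by a 1 x 1 point-wise convolution."""
    return c_in * k * k + c_in * c_out

std = conv_params(64, 128, 3)          # 64*128*9 = 73,728 weights
sep = dw_separable_params(64, 128, 3)  # 576 + 8,192 = 8,768 weights
print(f"reduction: {std / sep:.1f}x")
```

For these sizes the separable form needs roughly an eighth of the weights, which is how modules like LDA keep the total parameter budget near a million.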
Affiliation(s)
- Shuai Zhang
- School of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China
- Yanmin Niu
- School of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China
50
Shi P, Qiu J, Abaxi SMD, Wei H, Lo FPW, Yuan W. Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation. Diagnostics (Basel) 2023; 13:1947. [PMID: 37296799] [DOI: 10.3390/diagnostics13111947] [Received: 05/03/2023] [Revised: 05/26/2023] [Accepted: 05/31/2023]
Abstract
Medical image analysis plays an important role in clinical diagnosis. In this paper, we examine the recent Segment Anything Model (SAM) on medical images and report both quantitative and qualitative zero-shot segmentation results on nine medical image segmentation benchmarks, covering various imaging modalities such as optical coherence tomography (OCT), magnetic resonance imaging (MRI), and computed tomography (CT), as well as different applications including dermatology, ophthalmology, and radiology. These benchmarks are representative and commonly used in model development. Our experimental results indicate that while SAM shows remarkable segmentation performance on images from the general domain, its zero-shot segmentation ability remains restricted on out-of-distribution images, e.g., medical images. In addition, SAM exhibits inconsistent zero-shot segmentation performance across different unseen medical domains. For certain structured targets, e.g., blood vessels, the zero-shot segmentation of SAM failed completely. In contrast, simple fine-tuning with a small amount of data can lead to remarkable improvements in segmentation quality, showing the great potential and feasibility of using fine-tuned SAM to achieve accurate medical image segmentation for precision diagnostics. Our study indicates the versatility of generalist vision foundation models in medical imaging and their great potential to achieve the desired performance through fine-tuning, eventually addressing the challenges of accessing large and diverse medical datasets in support of clinical diagnostics.
Affiliation(s)
- Peilun Shi
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Jianing Qiu
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Department of Computing, Imperial College London, London SW7 2AZ, UK
- Sai Mu Dalike Abaxi
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Hao Wei
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Frank P-W Lo
- Hamlyn Centre, Department of Surgery and Cancer, Imperial College London, London SW7 2AZ, UK
- Wu Yuan
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China