1. Zhang X, Zhang S, Jiang Y, Tian L. MEF-Net: Multi-scale and edge feature fusion network for intracranial hemorrhage segmentation in CT images. Comput Biol Med 2025; 192:110245. PMID: 40286496. DOI: 10.1016/j.compbiomed.2025.110245.
Abstract
Intracranial Hemorrhage (ICH) refers to cerebral bleeding resulting from ruptured blood vessels within the brain. Delayed or inaccurate diagnosis and treatment of ICH can lead to death or disability, so early and precise diagnosis is crucial for protecting patients' lives. Automatic segmentation of hematomas in CT images can provide doctors with essential diagnostic support and improve diagnostic efficiency. Hematomas in CT images of intracranial hemorrhage vary in scale, often appear as multiple targets, and have blurred edges. This paper proposes a Multi-scale and Edge Feature Fusion Network (MEF-Net) to extract multi-scale and edge features effectively and fully fuse them through a fusion mechanism. The network first extracts the multi-scale features and edge features of the image through the encoder and the edge detection module respectively, then fuses the deep information, and employs a multi-kernel attention module to process the shallow features, enhancing multi-target recognition. Finally, the feature maps from each module are combined to produce the segmentation result. Experimental results show that this method achieved average Dice scores of 0.7508 and 0.7443 on two public datasets respectively, surpassing several current state-of-the-art medical image segmentation methods. The proposed MEF-Net significantly improves the accuracy of intracranial hemorrhage segmentation.
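The Dice scores reported in this abstract measure overlap between predicted and ground-truth masks. A minimal sketch of that metric for flattened binary masks (plain Python; this is a generic illustration, not the authors' code):

```python
def dice_score(pred, target, eps=1e-7):
    """Dice similarity coefficient for flat binary masks (1 = hematoma voxel).

    dice = 2 * |pred ∩ target| / (|pred| + |target|); eps avoids 0/0 on empty masks.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)

pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 0, 1]
print(round(dice_score(pred, target), 4))  # 2*2/(3+3) -> 0.6667
```

A score of 1.0 means perfect overlap; the paper's reported averages of 0.7508 and 0.7443 are per-dataset means of this quantity.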
Affiliation(s)
- Xiufeng Zhang
- Mechanical and Electrical Engineering, Dalian Minzu University, Liaohe West Road 18, Dalian, China
- Shichen Zhang
- Mechanical and Electrical Engineering, Dalian Minzu University, Liaohe West Road 18, Dalian, China
- Yunfei Jiang
- Mechanical and Electrical Engineering, Dalian Minzu University, Liaohe West Road 18, Dalian, China
- Lingzhuo Tian
- Mechanical and Electrical Engineering, Dalian Minzu University, Liaohe West Road 18, Dalian, China
2. Anari S, Sadeghi S, Sheikhi G, Ranjbarzadeh R, Bendechache M. Explainable attention based breast tumor segmentation using a combination of UNet, ResNet, DenseNet, and EfficientNet models. Sci Rep 2025; 15:1027. PMID: 39762417. PMCID: PMC11704294. DOI: 10.1038/s41598-024-84504-y.
Abstract
This study uses the Breast Ultrasound Image (BUSI) dataset to present a deep learning technique for breast tumor segmentation based on a modified UNet architecture. To improve segmentation accuracy, the model integrates attention mechanisms, such as the Convolutional Block Attention Module (CBAM) and Non-Local Attention, with advanced encoder architectures, including ResNet, DenseNet, and EfficientNet. These attention mechanisms let the model focus more effectively on relevant tumor areas, yielding significant performance improvements: models incorporating attention mechanisms outperformed those without, as reflected in superior evaluation metrics. The effects of Dice Loss and Binary Cross-Entropy (BCE) Loss on performance were also analyzed. Dice Loss maximized the overlap between predicted and ground-truth segmentation masks, giving more precise boundary delineation, while BCE Loss achieved higher recall, improving the detection of tumor areas. Grad-CAM visualizations further showed that attention-based models enhance interpretability by accurately highlighting tumor areas. The findings indicate that combining advanced encoder architectures, attention mechanisms, and the UNet framework can yield more reliable and accurate breast tumor segmentation. Future research will explore multi-modal imaging, real-time deployment for clinical applications, and more advanced attention mechanisms to further improve segmentation performance.
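The abstract contrasts Dice Loss (overlap-oriented) with BCE Loss (recall-oriented); in practice the two are often combined. A minimal plain-Python sketch over per-pixel probabilities, where the 50/50 weighting is a hypothetical choice, not taken from the paper:

```python
import math

def dice_loss(probs, targets, eps=1e-7):
    """Soft Dice loss: 1 - Dice overlap between probabilities and binary targets."""
    inter = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * inter + eps) / (sum(probs) + sum(targets) + eps)

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy; eps guards against log(0)."""
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(probs, targets)) / len(probs)

def combined_loss(probs, targets, w_dice=0.5, w_bce=0.5):
    # Equal weighting is illustrative only; papers tune these weights.
    return w_dice * dice_loss(probs, targets) + w_bce * bce_loss(probs, targets)
```

A near-perfect prediction drives both terms toward zero, while a confidently wrong prediction is penalized steeply by the BCE term.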
Affiliation(s)
- Shokofeh Anari
- Department of Accounting, Economic and Financial Sciences, Islamic Azad University, South Tehran Branch, Tehran, Iran
- Soroush Sadeghi
- School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
- Ghazaal Sheikhi
- Final International University, Kyrenia, Mersin 10, North Cyprus, Turkey
- Ramin Ranjbarzadeh
- School of Computing, Faculty of Engineering and Computing, Dublin City University, Dublin, Ireland
- Malika Bendechache
- ADAPT Research Centre, School of Computer Science, University of Galway, Galway, Ireland
3. Agarwal R, Chowdhury A, Chatterjee RK, Chel H, Murmu C, Murmu N, Nandi D. Deep Quasi-Recurrent Self-Attention With Dual Encoder-Decoder in Biomedical CT Image Segmentation. IEEE J Biomed Health Inform 2024; 28:7195-7205. PMID: 39172619. DOI: 10.1109/jbhi.2024.3447689.
Abstract
Developing deep learning models for accurate segmentation of biomedical CT images is challenging because of their complex structures, anatomical variations, noise, and the scarcity of labeled training data. Many models exist in the literature, but their performance on biomedical Computed Tomography (CT) images remains unsatisfactory. In this article, we introduce a deep quasi-recurrent self-attention structure that works with a dual encoder-decoder. The proposed architecture enables parameter reuse, which offers consistency in learning and quick convergence of the model. Furthermore, the quasi-recurrent structure leverages features acquired at previous time points to improve segmentation quality. The model also efficiently addresses long-range dependencies through selective focus on contextual information and hierarchical representation. Moreover, the dynamic, adaptive operation and incremental, efficient information processing of the deep quasi-recurrent self-attention structure improve generalization across scales and levels of abstraction. Along with the model, we introduce a new training strategy tailored to the proposed architecture. The model is evaluated on several publicly available CT scan datasets and compared with state-of-the-art models; the results show that it outperforms them in segmentation quality and training speed. The model can assist physicians in improving the accuracy of medical diagnoses.
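The quasi-recurrent self-attention structure described here builds on standard scaled dot-product self-attention. A tiny plain-Python sketch of that generic building block only (identity Q/K/V projections for brevity; this is not the authors' architecture):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(X):
    """Scaled dot-product self-attention over token vectors X (list of lists).

    Each output token is a convex combination of all input tokens, weighted
    by softmax(q·k / sqrt(d)) -- this is what captures long-range dependencies.
    """
    d = len(X[0])
    out = []
    for q in X:  # one query per token
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in X]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, X)) for j in range(d)])
    return out
```

The paper's contribution layers quasi-recurrence and parameter reuse on top of this mechanism; those details are not reproduced here.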
4. Agarwal R, Ghosal P, Sadhu AK, Murmu N, Nandi D. Multi-scale dual-channel feature embedding decoder for biomedical image segmentation. Comput Methods Programs Biomed 2024; 257:108464. PMID: 39447437. DOI: 10.1016/j.cmpb.2024.108464.
Abstract
BACKGROUND AND OBJECTIVE: Capturing global context together with local dependencies is of paramount importance for accurate object segmentation and is a persistent challenge in deep learning-based biomedical image segmentation. Several transformer-based models have been proposed to address this, yet segmentation accuracy remains limited because these models often fail to capture critical local and global contexts. Their quadratic computational complexity is a further limitation, and they require large datasets for training. METHODS: We propose a novel multi-scale dual-channel decoder to mitigate these issues. The complete segmentation model uses two parallel encoders and a dual-channel decoder. The encoders are convolutional networks that capture features of the input images at multiple levels and scales. The decoder comprises a hierarchy of Attention-gated Swin Transformers with a fine-tuning strategy. The hierarchical Attention-gated Swin Transformers implement a multi-scale, multi-level feature embedding strategy that captures short- and long-range dependencies and leverages the necessary features without increasing computational load. At the final stage of the decoder, a fine-tuning strategy refines the features to retain rich features and reduce the risk of over-segmentation. RESULTS: The proposed model is evaluated on the publicly available LiTS, 3DIRCADb, and spleen datasets from the Medical Segmentation Decathlon, as well as on a private dataset from Medical College Kolkata, India. The proposed model outperforms state-of-the-art models in liver tumor and spleen segmentation in terms of evaluation metrics at comparable computational cost.
CONCLUSION: The novel dual-channel decoder embeds multi-scale features and efficiently represents both short- and long-range contexts. It also refines the features at the final stage to select only the necessary ones. As a result, we achieve better segmentation performance than state-of-the-art models.
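The "Attention-gated" part of the decoder follows the general idea of additive attention gating on skip features. A toy plain-Python sketch of a scalar-valued gate in that spirit (the weights and the scalar simplification are illustrative assumptions, not the paper's implementation):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_gate(skip, gate, w_x=1.0, w_g=1.0, psi=1.0):
    """Additive attention gate over per-position features (scalars for brevity).

    alpha = sigmoid(psi * relu(w_x*x + w_g*g)); the skip feature x is scaled
    by alpha, so positions the gating signal g disagrees with are suppressed.
    """
    out = []
    for x, g in zip(skip, gate):
        a = max(0.0, w_x * x + w_g * g)  # ReLU of the joint projection
        alpha = sigmoid(psi * a)         # gating coefficient in (0, 1)
        out.append(alpha * x)
    return out
```

In a real network x and g are feature maps and w_x, w_g, psi are learned 1x1 convolutions; the scalar version above only shows the data flow.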
Affiliation(s)
- Rohit Agarwal
- Department of Computer Science and Engineering, National Institute of Technology, Durgapur 713209, West Bengal, India
- Palash Ghosal
- Department of Information Technology, Sikkim Manipal Institute of Technology, Sikkim Manipal University, India
- Anup K Sadhu
- EKO Diagnostic Center, Medical College Kolkata, India
- Narayan Murmu
- Department of Computer Science and Engineering, National Institute of Technology, Durgapur 713209, West Bengal, India
- Debashis Nandi
- Department of Computer Science and Engineering, National Institute of Technology, Durgapur 713209, West Bengal, India
5. Zhang K, Zhu Y, Li H, Zeng Z, Liu Y, Zhang Y. MDANet: Multimodal difference aware network for brain stroke segmentation. Biomed Signal Process Control 2024; 95:106383. DOI: 10.1016/j.bspc.2024.106383.
6. Kang Z, Xiao E, Li Z, Wang L. Deep Learning Based on ResNet-18 for Classification of Prostate Imaging-Reporting and Data System Category 3 Lesions. Acad Radiol 2024; 31:2412-2423. PMID: 38302387. DOI: 10.1016/j.acra.2023.12.042.
Abstract
RATIONALE AND OBJECTIVES: To explore the classification and prediction efficacy of a deep learning model for benign prostate lesions, non-clinically significant prostate cancer (non-csPCa), and clinically significant prostate cancer (csPCa) among Prostate Imaging-Reporting and Data System (PI-RADS) 3 lesions. MATERIALS AND METHODS: From January 2015 to December 2021, lesions diagnosed as PI-RADS 3 by multi-parametric or bi-parametric MRI were retrospectively included and classified as benign prostate lesions, non-csPCa, or csPCa according to the pathological results. T2-weighted images of the lesions were split into training and test sets in an 8:2 ratio. ResNet-18 was used for model training, and all statistical analyses were performed using open-source Python libraries. The receiver operating characteristic (ROC) curve was used to evaluate the predictive effectiveness of the model, t-SNE was used to visualize image semantic features, and class activation mapping was used to visualize the regions the model attended to. RESULTS: A total of 428 benign prostate lesion images, 158 non-csPCa images, and 273 csPCa images were included. The precision in predicting benign prostate disease, non-csPCa, and csPCa was 0.882, 0.681, and 0.851, and the areas under the ROC curves were 0.875, 0.89, and 0.929, respectively. Semantic feature analysis showed strong separability between csPCa and benign prostate lesions. Class activation maps showed that the deep learning model focuses on the prostate or the location of PI-RADS 3 lesions. CONCLUSION: A deep learning model with T2-weighted images based on ResNet-18 can accurately classify PI-RADS 3 lesions.
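The areas under the ROC curves reported here can be computed directly from per-lesion scores via the rank-sum (Mann-Whitney U) formulation. A minimal plain-Python sketch for one binary one-vs-rest comparison (illustrative only; tie handling is omitted for brevity):

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    AUC = P(score of a random positive > score of a random negative).
    scores: predicted probabilities; labels: 0/1 ground truth.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sum of 1-based ranks of the positive examples (ties ignored for brevity).
    rank_sum = sum(rank for rank, (_, lab) in enumerate(pairs, start=1) if lab == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # -> 0.75
```

For the study's three-class setting, one such AUC is computed per class against the rest, giving the three reported values.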
Affiliation(s)
- Zhen Kang
- Department of Radiology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei Province, China
- Enhua Xiao
- Department of Radiology, the Second Xiangya Hospital, Central South University, Changsha, Hunan Province, China
- Zhen Li
- Department of Radiology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei Province, China
- Liang Wang
- Department of Radiology, Beijing Friendship Hospital, Capital Medical University, Beijing, China
7. Kuang H, Wang Y, Liu J, Wang J, Cao Q, Hu B, Qiu W, Wang J. Hybrid CNN-Transformer Network With Circular Feature Interaction for Acute Ischemic Stroke Lesion Segmentation on Non-Contrast CT Scans. IEEE Trans Med Imaging 2024; 43:2303-2316. PMID: 38319756. DOI: 10.1109/tmi.2024.3362879.
Abstract
Lesion segmentation is a fundamental step in the diagnosis of acute ischemic stroke (AIS). Non-contrast CT (NCCT) is still a mainstream imaging modality for AIS lesion measurement, but AIS lesion segmentation on NCCT is challenging due to low contrast, noise, and artifacts. To achieve accurate AIS lesion segmentation on NCCT, this study proposes a hybrid convolutional neural network (CNN) and Transformer network with circular feature interaction and bilateral difference learning. It consists of parallel CNN and Transformer encoders, a circular feature interaction module, and a shared CNN decoder with a bilateral difference learning module. A new Transformer block is designed to address the weak inductive bias of the traditional Transformer. To effectively combine features from the CNN and Transformer encoders, we first design a multi-level feature aggregation module that combines multi-scale features within each encoder, and then propose a novel feature interaction module containing circular CNN-to-Transformer and Transformer-to-CNN interaction blocks. In addition, a bilateral difference learning module at the bottom level of the decoder learns the differences between the ischemic and contralateral sides of the brain. The proposed method is evaluated on three AIS datasets: the public AISD, a private dataset, and an external dataset. Experimental results show that it achieves Dice scores of 61.39% and 46.74% on the AISD and the private dataset, respectively, outperforming 17 state-of-the-art segmentation methods. Volumetric analysis of the segmented lesions and the external validation results further suggest that the proposed method has the potential to provide supporting information for AIS diagnosis.
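The bilateral difference learning idea exploits the rough left-right symmetry of the brain: an ischemic region stands out when a scan is contrasted with its mirrored counterpart. The paper applies this to deep feature maps; the toy plain-Python sketch below only illustrates the underlying mirror-contrast idea on a 1-D intensity profile:

```python
def bilateral_difference(row):
    """Absolute difference between a 1-D intensity profile and its left-right
    mirror -- a crude stand-in for contrasting ischemic vs. contralateral sides.

    For a perfectly symmetric profile the difference is zero everywhere;
    asymmetric (potentially ischemic) regions produce large values.
    """
    mirrored = row[::-1]
    return [abs(a - b) for a, b in zip(row, mirrored)]

print(bilateral_difference([1.0, 2.0, 2.0, 1.0]))  # symmetric -> [0.0, 0.0, 0.0, 0.0]
```

Real implementations must also handle imperfect head alignment (e.g. via registration to the midsagittal plane) before taking such differences.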
8. Ahmed R, Al Shehhi A, Hassan B, Werghi N, Seghier ML. An appraisal of the performance of AI tools for chronic stroke lesion segmentation. Comput Biol Med 2023; 164:107302. PMID: 37572443. DOI: 10.1016/j.compbiomed.2023.107302.
Abstract
Automated demarcation of stroke lesions from monospectral magnetic resonance imaging scans is extremely useful for diverse research and clinical applications, including lesion-symptom mapping to explain deficits and predict recovery. There is a significant surge of interest in the development of supervised artificial intelligence (AI) methods for that purpose, including deep learning, with performance comparable to trained experts. Such AI-based methods, however, require copious amounts of data. Thanks to the availability of large datasets, the development of AI-based methods for lesion segmentation has accelerated immensely in the last decade. One of these datasets is the Anatomical Tracings of Lesions After Stroke (ATLAS) dataset, which includes T1-weighted images from hundreds of chronic stroke survivors with manually traced lesions. This systematic review appraises the impact of the ATLAS dataset in promoting the development of AI-based segmentation of stroke lesions. An examination of all published studies that used the ATLAS dataset to both train and test their methods highlighted an overall moderate performance (median Dice index = 59.40%) and huge variability across studies in data preprocessing, data augmentation, AI architecture, and mode of operation (two-dimensional versus three-dimensional methods). Perhaps most importantly, almost all AI tools were borrowed from existing computer-vision architectures, as 90% of the selected studies relied on conventional convolutional neural network-based architectures. Overall, current research has not led to the development of robust AI architectures that can handle spatially heterogeneous lesion patterns. This review also highlights the difficulty of gauging the performance of AI tools in the presence of uncertainties in the definition of the ground truth.
Affiliation(s)
- Ramsha Ahmed
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Aamna Al Shehhi
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates; Healthcare Engineering Innovation Center (HEIC), Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Bilal Hassan
- Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Naoufel Werghi
- Healthcare Engineering Innovation Center (HEIC), Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates; Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Mohamed L Seghier
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates; Healthcare Engineering Innovation Center (HEIC), Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates