1
Wang T, Li X, Liu R, Wang M, Sun J. DECE-Net: a dual-path encoder network with contour enhancement for pneumonia lesion segmentation. J Med Imaging (Bellingham) 2025; 12:034503. [PMID: 40415864 PMCID: PMC12101900 DOI: 10.1117/1.jmi.12.3.034503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2024] [Revised: 04/22/2025] [Accepted: 04/29/2025] [Indexed: 05/27/2025]
Abstract
Purpose: Early-stage pneumonia is not easily detected, leading to many patients missing the optimal treatment window. This is because segmenting lesion areas from CT images presents several challenges, including low-intensity contrast between the lesion and normal areas, as well as variations in the shape and size of lesion areas. To overcome these challenges, we propose a segmentation network called DECE-Net to segment the pneumonia lesions from CT images automatically. Approach: The DECE-Net adds an extra encoder path to the U-Net, where one encoder path extracts the features of the original CT image with the attention multi-scale feature fusion module, and the other encoder path extracts the contour features in the CT contour image with the contour feature extraction module to compensate and enhance the boundary information that is lost in the downsampling process. The network further fuses the low-level features from both encoder paths through the feature fusion attention connection module and connects them to the upsampled high-level features to replace the skip connections in the U-Net. Finally, multi-point deep supervision is applied to the segmentation results at each scale to improve segmentation accuracy. Results: We evaluate the DECE-Net using four public COVID-19 segmentation datasets. The mIoU results for the four datasets are 80.76%, 84.59%, 84.41%, and 78.55%, respectively. Conclusions: The experimental results indicate that the proposed DECE-Net achieves state-of-the-art performance, especially in the precise segmentation of small lesion areas.
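As a hedged illustration of the mIoU metric reported in this abstract, the following minimal sketch computes mean intersection over union for a binary lesion mask by averaging the IoU of the background and lesion classes. The function name `mean_iou`, the two-class averaging convention, and the toy masks are assumptions made for illustration; the abstract does not state the exact evaluation protocol used in the paper.

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Average per-class IoU over the classes present in pred or target."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both masks, so it is skipped
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Toy masks (hypothetical data): 1 marks lesion pixels, 0 marks background.
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
pred = np.array([[0, 0, 0, 1],
                 [0, 0, 1, 1]])
score = mean_iou(pred, target)
print(f"mIoU = {score:.3f}")
```

Here the background IoU is 4/5 and the lesion IoU is 3/4, so the mean is 0.775. Some evaluations average only over foreground classes, which would give a different number for the same masks.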
Affiliation(s)
- Tianyang Wang
- Hangzhou Normal University, School of Information Science and Technology, Hangzhou, China
- Xiumei Li
- Hangzhou Normal University, School of Information Science and Technology, Hangzhou, China
- Ruyu Liu
- Hangzhou Normal University, School of Information Science and Technology, Hangzhou, China
- Meixi Wang
- Hangzhou Normal University, School of Information Science and Technology, Hangzhou, China
- Junmei Sun
- Hangzhou Normal University, School of Information Science and Technology, Hangzhou, China
2
Tian Y, Mao Q, Wang W, Zhang Y. Hierarchical agent transformer network for COVID-19 infection segmentation. Biomed Phys Eng Express 2025; 11:025055. [PMID: 40014880 DOI: 10.1088/2057-1976/adbafa] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2024] [Accepted: 02/27/2025] [Indexed: 03/01/2025]
Abstract
Accurate and timely segmentation of COVID-19 infection regions is critical for effective diagnosis and treatment. While convolutional neural networks (CNNs) exhibit strong performance in medical image segmentation, they face challenges in handling complex lesion morphologies with irregular boundaries. Transformer-based approaches, though demonstrating superior capability in capturing global context, suffer from high computational costs and suboptimal multi-scale feature integration. To address these limitations, we propose the Hierarchical Agent Transformer Network (HATNet), a hierarchical encoder-bridge-decoder architecture that balances segmentation accuracy with computational efficiency. The encoder employs novel agent Transformer blocks specifically designed to capture subtle features of small COVID-19 lesions through agent tokens with linear computational complexity. A diversity restoration module (DRM) is embedded within each agent Transformer block to counteract feature degradation. The hierarchical structure simultaneously extracts high-resolution shallow features and low-resolution fine features, ensuring comprehensive feature representation. The bridge stage incorporates an improved pyramid pooling module (IPPM) that establishes hierarchical global priors, significantly improving contextual understanding for the decoder. The decoder integrates a full-scale bidirectional feature pyramid network (FsBiFPN) with a dedicated border-refinement module (BRM), collectively enhancing edge precision. HATNet was evaluated on the COVID-19-CT-Seg and CC-CCII datasets. Experimental results yielded Dice scores of 84.14% and 81.22%, respectively, demonstrating superior segmentation performance compared to state-of-the-art models. Furthermore, it achieved notable advantages in model parameters and computational complexity, highlighting its clinical deployment potential.
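To make the phrase "agent tokens with linear computational complexity" more concrete, here is a minimal NumPy sketch of generic agent-token attention, assuming the commonly described two-step form: a small set of agent tokens first aggregates information from the keys and values, and the queries then read from the agents. This is not the HATNet block itself; the function `agent_attention`, the random inputs, and the sizes `N`, `n`, and `d` are hypothetical, and the diversity restoration module is not modeled.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def agent_attention(q, k, v, agents):
    """Two-step attention through n agent tokens.

    q, k, v: arrays of shape (N, d); agents: array of shape (n, d), n << N.
    Both matrix products cost O(N * n * d), which is linear in N for fixed n,
    instead of the O(N^2 * d) cost of full softmax attention.
    """
    d = q.shape[-1]
    # Step 1: each agent aggregates from all keys and values -> (n, d).
    agent_values = softmax(agents @ k.T / np.sqrt(d)) @ v
    # Step 2: each query reads from the n agents -> (N, d).
    return softmax(q @ agents.T / np.sqrt(d)) @ agent_values

rng = np.random.default_rng(0)
N, n, d = 64, 4, 8  # hypothetical token count, agent count, feature width
q, k, v = (rng.standard_normal((N, d)) for _ in range(3))
agents = rng.standard_normal((n, d))
out = agent_attention(q, k, v, agents)
print(out.shape)
```

Because no N-by-N attention matrix is ever formed, memory also scales with N times n rather than N squared. How the agent tokens are obtained (for example, by pooling the queries) is a design choice that this sketch leaves open.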
Affiliation(s)
- Yi Tian
- College of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
- Qi Mao
- College of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
- Wenfeng Wang
- College of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
- Yan Zhang
- College of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
3
Shafi SM, Chinnappan SK. Hybrid transformer-CNN and LSTM model for lung disease segmentation and classification. PeerJ Comput Sci 2024; 10:e2444. [PMID: 39896390 PMCID: PMC11784776 DOI: 10.7717/peerj-cs.2444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Accepted: 10/01/2024] [Indexed: 02/04/2025]
Abstract
According to the World Health Organization (WHO) report, lung disorders are the third leading cause of mortality worldwide. Approximately three million individuals are affected by various types of lung disorders annually. This calls for control measures such as early diagnosis and accurate treatment procedures. Precise identification through the assessment of medical images is crucial for pulmonary disease diagnosis. However, it remains a formidable challenge due to the diverse and unpredictable nature of pathological lung appearances and shapes. Therefore, an efficient lung disease segmentation and classification model is essential. To this end, a novel lung disease segmentation approach with a hybrid LinkNet-Modified LSTM (L-MLSTM) model is proposed in this research article. The proposed model comprises four fundamental steps. The first step is pre-processing, where the input lung images are pre-processed using median filtering. Next, an improved Transformer-based convolutional neural network (CNN) model (ITCNN) is proposed to segment the affected region in the segmentation process. After segmentation, essential features such as texture, shape, color, and deep features are retrieved. Specifically, texture features are extracted using modified Local Gradient Increasing Pattern (LGIP) and Multi-texton analysis. Then, the classification step utilizes a hybrid model, the L-MLSTM model. This work leverages two datasets: the COVID-19 normal pneumonia-CT images dataset (Dataset 1) and the Chest CT scan images dataset (Dataset 2). These datasets are crucial for training and evaluating the model, providing a comprehensive basis for robust and generalizable results.
The L-MLSTM model outperforms several existing models, including HDE-NN, DBN, LSTM, LINKNET, SVM, Bi-GRU, RNN, CNN, and VGG19 + CNN, with accuracies of 89% and 95% at learning percentages of 70 and 90, respectively, for datasets 1 and 2. The improved accuracy achieved by the L-MLSTM model highlights its capability to better handle the complexity and variability in lung images. This hybrid approach enhances the model's ability to distinguish between different types of lung diseases and reduces diagnostic errors compared to existing methods.
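The pre-processing step named in this abstract, median filtering, can be sketched in a few lines of NumPy. This is a generic illustration rather than the authors' pipeline: the 3x3 window, the edge-replicating padding, the function name `median_filter`, and the toy image are all assumptions, since the abstract does not give these details.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighborhood.

    `size` is assumed to be odd; borders are handled by replicating edge pixels.
    """
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    # windows has shape (H, W, size, size): one neighborhood per output pixel.
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))

# Toy image (hypothetical data): flat background with one impulse-noise pixel.
image = np.full((5, 5), 10.0)
image[2, 2] = 255.0
filtered = median_filter(image)
print(filtered[2, 2])
```

Unlike a mean filter, the median discards an isolated outlier entirely instead of smearing it over its neighbors, which is why it is a common choice for suppressing impulse noise before segmentation.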
4
Wang W, Mao Q, Tian Y, Zhang Y, Xiang Z, Ren L. FMD-UNet: fine-grained feature squeeze and multiscale cascade dilated semantic aggregation dual-decoder UNet for COVID-19 lung infection segmentation from CT images. Biomed Phys Eng Express 2024; 10:055031. [PMID: 39142295 DOI: 10.1088/2057-1976/ad6f12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 08/14/2024] [Indexed: 08/16/2024]
Abstract
With the advancement of computer-aided diagnosis, the automatic segmentation of COVID-19 infection areas holds great promise for assisting in the timely diagnosis and recovery of patients in clinical practice. Currently, methods relying on U-Net face challenges in effectively utilizing fine-grained semantic information from input images and bridging the semantic gap between the encoder and decoder. To address these issues, we propose FMD-UNet, a dual-decoder U-Net network for COVID-19 infection segmentation, which integrates a Fine-grained Feature Squeezing (FGFS) decoder and a Multi-scale Dilated Semantic Aggregation (MDSA) decoder. The FGFS decoder produces fine feature maps through the compression of fine-grained features and a weighted attention mechanism, guiding the model to capture detailed semantic information. The MDSA decoder consists of three hierarchical MDSA modules designed for different stages of input information. These modules progressively fuse different scales of dilated convolutions to process the shallow and deep semantic information from the encoder, and use the extracted feature information to bridge the semantic gaps at various stages. This design captures extensive contextual information while decoding and predicting segmentation, thereby suppressing the increase in model parameters. To better validate the robustness and generalizability of the FMD-UNet, we conducted comprehensive performance evaluations and ablation experiments on three public datasets, and achieved leading Dice Similarity Coefficient (DSC) scores of 84.76%, 78.56%, and 61.99% in COVID-19 infection segmentation, respectively. Compared to previous methods, the FMD-UNet has fewer parameters and shorter inference time, which also demonstrates its competitiveness.
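The idea of fusing dilated convolutions at several scales can be illustrated with a small NumPy sketch. It shows why dilation enlarges the receptive field without adding weights: a k x k kernel with dilation d covers k + (k - 1)(d - 1) pixels per side while still holding only k * k parameters. The functions `effective_kernel_size` and `dilated_conv2d`, the dilation rates (1, 2, 4), and the plain summation used for fusion are assumptions for illustration and do not reproduce the MDSA module.

```python
import numpy as np

def effective_kernel_size(kernel_size, dilation):
    """Side length of the area covered by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def dilated_conv2d(image, kernel, dilation=1):
    """Same-size 2D cross-correlation with zero padding and a dilated kernel.

    The kernel is assumed to be square with an odd side length.
    """
    k = kernel.shape[0]
    pad = dilation * (k // 2)
    padded = np.pad(image, pad, mode="constant")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            # Kernel tap (i, j) samples the input at a stride of `dilation`.
            rows = slice(i * dilation, i * dilation + h)
            cols = slice(j * dilation, j * dilation + w)
            out += kernel[i, j] * padded[rows, cols]
    return out

image = np.arange(36, dtype=float).reshape(6, 6)  # hypothetical feature map
identity = np.zeros((3, 3))
identity[1, 1] = 1.0  # center-only kernel, so every branch returns the input

# Naive multi-scale fusion: sum the responses of three dilation rates.
fused = sum(dilated_conv2d(image, identity, d) for d in (1, 2, 4))
print(effective_kernel_size(3, 4))
```

With a learned kernel in place of the identity, each branch would respond to context at a different scale (3, 5, and 9 pixels per side here) while the parameter count per branch stays at nine weights.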
Affiliation(s)
- Wenfeng Wang
- School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
- Qi Mao
- School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
- Yi Tian
- School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
- Yan Zhang
- School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
- Zhenwu Xiang
- School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
- Lijia Ren
- School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China
5
Zhou J, Xiong H, Liu Q. A novel Dual-Branch Asymmetric Encoder-Decoder Segmentation Network for accurate colonic crypt segmentation. Comput Biol Med 2024; 173:108354. [PMID: 38522251 DOI: 10.1016/j.compbiomed.2024.108354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 03/04/2024] [Accepted: 03/19/2024] [Indexed: 03/26/2024]
Abstract
Colorectal cancer (CRC) is a leading cause of cancer-related deaths, with colonic crypts (CC) being crucial in its development. Accurate segmentation of CC is essential for making decisions about CRC and developing diagnostic strategies. However, the blurred boundaries and morphological diversity of colonic crypts pose substantial challenges for automatic segmentation. To mitigate this problem, we propose the Dual-Branch Asymmetric Encoder-Decoder Segmentation Network (DAUNet), a novel and efficient model tailored for confocal laser endomicroscopy (CLE) CC images. In DAUNet, we design a dual-branch feature extraction module (DFEM), employing Focus operations and dense depth-wise separable convolution (DDSC) to extract multiscale features, boosting semantic understanding and coping with the morphological diversity of CC. We also introduce the feature fusion guided module (FFGM) to adaptively combine features from both branches using cross-group spatial and channel attention, improving the model's ability to focus on specific lesion features. These modules are integrated into the encoder for effective multiscale information extraction and fusion, and DDSC is further introduced in the decoder to provide rich representations for precise segmentation. Moreover, the local multi-layer perceptron (LMLP) module is designed to decouple and recalibrate features through a local linear transformation that filters out noise and refines features to provide an edge-enriched representation. Experimental evaluations on two datasets demonstrate that the proposed method achieves Intersection over Union (IoU) scores of 81.54% and 84.83%, respectively, which are on par with state-of-the-art methods, demonstrating its effectiveness for CC segmentation. The proposed method holds great potential in assisting physicians with precise lesion localization and region analysis, thereby improving the diagnostic accuracy of CRC.
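The efficiency argument behind depth-wise separable convolution can be checked with simple arithmetic. The sketch below compares the weight count of a standard convolution with that of a depth-wise convolution followed by a 1x1 point-wise convolution. The function names and the layer sizes (64 input channels, 128 output channels, 3x3 kernel) are hypothetical, biases are ignored, and the dense connectivity of the DDSC block is not modeled.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depth-wise conv plus a 1 x 1 point-wise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
std = standard_conv_params(64, 128, 3)
sep = depthwise_separable_params(64, 128, 3)
print(std, sep, round(std / sep, 2))
```

For this layer the standard convolution needs 73,728 weights and the separable version 8,768, a reduction of roughly 8.4 times. In general the ratio is 1 / (1/c_out + 1/k^2), so the saving approaches k^2 as the number of output channels grows.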
Affiliation(s)
- Jingjun Zhou
- School of Biomedical Engineering, Hainan University, Haikou, 570228, China.
- Hong Xiong
- School of Biomedical Engineering, Hainan University, Haikou, 570228, China.
- Qian Liu
- School of Biomedical Engineering, Hainan University, Haikou, 570228, China; Key Laboratory of Biomedical Engineering of Hainan Province, Hainan University, Haikou, 570228, China.