1
Hao H, Zhao Y, Leng S, Gu Y, Ma Y, Wang F, Dai Q, Zheng J, Liu Y, Zhang J. Local salient location-aware anomaly mask synthesis for pulmonary disease anomaly detection and lesion localization in CT images. Med Image Anal 2025; 102:103523. [PMID: 40086182] [DOI: 10.1016/j.media.2025.103523]
Abstract
Automated pulmonary anomaly detection using computed tomography (CT) examinations is important for the early warning of pulmonary diseases and can support clinical diagnosis and decision-making. Most existing pulmonary disease detection and lesion segmentation models require expert annotations for training, which is time-consuming and labour-intensive, and they struggle to generalize to atypical diseases. In contrast, unsupervised anomaly detection alleviates the demand for dataset annotation and is more generalizable than supervised methods in detecting rare pathologies. However, due to the large distribution differences of CT scans within a volume and the high similarity between lesion and normal tissues, existing anomaly detection methods struggle to accurately localize small lesions, leading to a low anomaly detection rate. To alleviate these challenges, we propose a local salient location-aware anomaly mask generation and reconstruction framework for pulmonary disease anomaly detection and lesion localization. The framework consists of four components: (1) a Vector Quantized Variational AutoEncoder (VQVAE)-based reconstruction network that generates a codebook storing high-dimensional features; (2) an unsupervised, feature-statistics-based anomaly feature synthesizer that produces features matching the realistic anomaly distribution by filtering salient features and interacting with the codebook; (3) a transformer-based feature classification network that identifies synthetic anomaly features; and (4) a residual neighbourhood aggregation feature classification loss that mitigates network overfitting by penalizing the classification loss of recoverable corrupted features. Our approach is based on two intuitions. First, generating synthetic anomalies in feature space is more effective because lesions have different morphologies in image space and may share little in common.
Second, regions with salient features or high reconstruction errors in CT images tend to resemble lesions and are more prone to synthesizing abnormal features. The performance of the proposed method is validated on one public COVID-19 dataset and one in-house dataset containing 63,610 CT images covering five lung diseases. Experimental results show that, compared to feature-based, synthesis-based, and reconstruction-based methods, the proposed method adapts to CT images with four pneumonia types (COVID-19, bacterial, fungal, and mycoplasma) and one non-pneumonia disease (cancer), and achieves state-of-the-art performance in image-level anomaly detection and lesion localization.
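The VQVAE codebook in component (1) stores high-dimensional features and maps each encoder feature to its nearest code vector. A minimal numpy sketch of that nearest-neighbour lookup (shapes and names are illustrative, not the authors' implementation):

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry (L2 distance).

    features: (N, D) array of encoder features.
    codebook: (K, D) array of learned code vectors.
    Returns (indices, quantized) where quantized[i] = codebook[indices[i]].
    """
    # Pairwise squared distances via ||f||^2 - 2 f.c + ||c||^2
    d2 = (
        (features ** 2).sum(axis=1, keepdims=True)
        - 2.0 * features @ codebook.T
        + (codebook ** 2).sum(axis=1)
    )
    indices = d2.argmin(axis=1)
    return indices, codebook[indices]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))
# Two features near codes 2 and 5 snap back to those codes.
features = codebook[[2, 5]] + 0.01 * rng.normal(size=(2, 4))
idx, q = quantize(features, codebook)
```

During training a VQVAE also passes gradients through this step (straight-through estimator) and updates the codebook; the synthesizer in component (2) would perturb features before such a lookup.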
Affiliation(s)
- Huaying Hao
- School of Optics and Photonics, Beijing Institute of Technology, China
- Yitian Zhao
- Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China; Ningbo Cixi Institute of Biomedical Engineering and Ningbo Key Laboratory of Biomedical Imaging Probe Materials and Technology, Cixi, China.
- Shaoyi Leng
- Department of Radiology, Ningbo No. 2 Hospital, Ningbo, China
- Yuanyuan Gu
- Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China; Ningbo Cixi Institute of Biomedical Engineering and Ningbo Key Laboratory of Biomedical Imaging Probe Materials and Technology, Cixi, China
- Yuhui Ma
- Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China; Ningbo Cixi Institute of Biomedical Engineering and Ningbo Key Laboratory of Biomedical Imaging Probe Materials and Technology, Cixi, China
- Feiming Wang
- Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Qi Dai
- Department of Radiology, Ningbo No. 2 Hospital, Ningbo, China
- Jianjun Zheng
- Department of Radiology, Ningbo No. 2 Hospital, Ningbo, China
- Yue Liu
- School of Optics and Photonics, Beijing Institute of Technology, China.
- Jingfeng Zhang
- Department of Radiology, Ningbo No. 2 Hospital, Ningbo, China.
2
Xu J, Wang H, Lu M, Bi H, Li D, Xue Z, Zhang Q. An accurate and trustworthy deep learning approach for bladder tumor segmentation with uncertainty estimation. Comput Methods Programs Biomed 2025; 263:108645. [PMID: 39954510] [DOI: 10.1016/j.cmpb.2025.108645]
Abstract
BACKGROUND AND OBJECTIVE Although deep learning-based intelligent diagnosis of bladder cancer has achieved excellent performance, the reliability of neural-network predictions often cannot be evaluated. This study aims to explore a trustworthy AI-based tumor segmentation model that not only outputs predictions but also provides confidence information about them. METHODS This paper proposes a novel model for bladder tumor segmentation with uncertainty estimation (BSU), which not only effectively segments the lesion area but also yields an uncertainty map showing the confidence of the segmentation results. In contrast to previous uncertainty estimation work, we utilize test-time augmentation (TTA) and test-time dropout (TTD) to estimate aleatoric and epistemic uncertainty on both internal and external datasets, exploring the effects of both uncertainties across datasets. RESULTS Our BSU model achieved Dice coefficients of 0.766 and 0.848 on internal and external cystoscopy datasets, respectively, along with accuracies of 0.950 and 0.954. Compared to state-of-the-art methods, our BSU model demonstrated superior performance, further validated by statistically significant t-tests at the conventional level. Clinical experiments verified the practical value of uncertainty estimation in real-world bladder cancer diagnostics. CONCLUSIONS The proposed BSU model visualizes the confidence of the segmentation results, serving as a valuable aid for urologists in enhancing both the precision and efficiency of bladder cancer diagnosis in clinical practice.
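The TTA/TTD recipe described here boils down to repeated stochastic forward passes: the mean map is the prediction and the per-pixel variance is the uncertainty. A toy numpy sketch, where the noisy predictor stands in for a real segmentation network with test-time dropout or augmented inputs:

```python
import numpy as np

def mc_uncertainty(predict, x, n_samples=20, rng=None):
    """Monte-Carlo uncertainty: run a stochastic predictor repeatedly
    (dropout kept active at test time, or randomly augmented inputs) and
    report the mean probability map plus the per-pixel variance."""
    rng = rng or np.random.default_rng(0)
    probs = np.stack([predict(x, rng) for _ in range(n_samples)])
    return probs.mean(axis=0), probs.var(axis=0)

# Toy stand-in for a network with test-time stochasticity:
def noisy_predict(x, rng):
    logits = x + rng.normal(scale=0.5, size=x.shape)  # dropout-like noise
    return 1.0 / (1.0 + np.exp(-logits))              # sigmoid

x = np.array([[4.0, 0.0], [-4.0, 0.2]])  # confident vs ambiguous "pixels"
mean_p, var_p = mc_uncertainty(noisy_predict, x, n_samples=200)
```

In the toy example the ambiguous pixels (logits near 0) show far higher variance than the confident ones, which is exactly the signal an uncertainty map surfaces to the clinician.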
Affiliation(s)
- Jie Xu
- School of Information Technology and Management, University of International Business and Economics, Beijing 100029, China
- Haixin Wang
- Cadre Medical Department, The 1st Medical Center, Chinese PLA General Hospital, Beijing 100853, China
- Min Lu
- Department of Pathology, School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
- Hai Bi
- Department of Urology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 201620, China
- Deng Li
- Department of Urology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 201620, China
- Zixuan Xue
- Department of Urology, Peking University Third Hospital, Beijing 100191, China
- Qi Zhang
- School of Information Technology and Management, University of International Business and Economics, Beijing 100029, China.
3
Dang Y, Ma W, Luo X, Wang H. CAD-Unet: A capsule network-enhanced Unet architecture for accurate segmentation of COVID-19 lung infections from CT images. Med Image Anal 2025; 103:103583. [PMID: 40306203] [DOI: 10.1016/j.media.2025.103583]
Abstract
Since the outbreak of the COVID-19 pandemic in 2019, medical imaging has emerged as a primary modality for diagnosing COVID-19 pneumonia. In clinical settings, the segmentation of lung infections from computed tomography images enables rapid and accurate quantification and diagnosis of COVID-19. Segmentation of COVID-19 infections in the lungs poses a formidable challenge, primarily due to the indistinct boundaries and limited contrast presented by ground glass opacity manifestations. Moreover, the confounding similarity among infiltrates, lung tissues, and lung walls further complicates this segmentation task. To address these challenges, this paper introduces a novel deep network architecture, called CAD-Unet, for segmenting COVID-19 lung infections. In this architecture, capsule networks are incorporated into the existing Unet framework. Capsule networks represent a novel type of network architecture that differs from traditional convolutional neural networks: they utilize vectors for information transfer among capsules, facilitating the extraction of intricate lesion spatial information. Additionally, we design a capsule encoder path and establish a coupling path between the Unet encoder and the capsule encoder. This design maximizes the complementary advantages of both network structures while achieving efficient information fusion. Finally, extensive experiments are conducted on four publicly available datasets, encompassing binary and multi-class segmentation tasks. The experimental results demonstrate the superior segmentation performance of the proposed model. The code has been released at: https://github.com/AmanoTooko-jie/CAD-Unet.
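The vector-based information transfer that distinguishes capsules from scalar CNN activations relies on the standard "squash" nonlinearity from the original capsule-network literature (not specific to CAD-Unet): it bounds each capsule's length in [0, 1) so that length can encode existence probability while orientation keeps the spatial information. A numpy sketch:

```python
import numpy as np

def squash(vectors, axis=-1, eps=1e-8):
    """Capsule 'squash' nonlinearity: shrinks each vector's length into
    [0, 1) while preserving its direction, so length can act as an
    existence probability and orientation encodes pose/spatial cues."""
    sq_norm = (vectors ** 2).sum(axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm) / np.sqrt(sq_norm + eps)
    return scale * vectors

caps = np.array([[3.0, 4.0], [0.01, 0.0]])   # one long, one short capsule
out = squash(caps)
lengths = np.linalg.norm(out, axis=-1)
```

Long capsules map to lengths close to 1 and short ones close to 0, while the direction of each vector is unchanged.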
Affiliation(s)
- Yijie Dang
- School of Information Engineering, Ningxia University, Yinchuan, 750021, Ningxia, China
- Weijun Ma
- School of Information Engineering, Ningxia University, Yinchuan, 750021, Ningxia, China; Ningxia Key Laboratory of Artificial Intelligence and Information Security for Channeling Computing Resources from the East to the West, Ningxia University, Yinchuan, 750021, Ningxia, China.
- Xiaohu Luo
- School of Mathematics and Computer Science, Ningxia Normal University, Guyuan, 756099, China
- Huaizhu Wang
- School of Advanced Interdisciplinary Studies, Ningxia University, Zhongwei, 755000, China
4
Yang Y, Sun G, Zhang T, Wang R, Su J. Semi-supervised medical image segmentation via weak-to-strong perturbation consistency and edge-aware contrastive representation. Med Image Anal 2025; 101:103450. [PMID: 39798528] [DOI: 10.1016/j.media.2024.103450]
Abstract
Although supervised learning has demonstrated impressive accuracy in medical image segmentation, its reliance on large labeled datasets poses a challenge because of the effort and expertise required for data acquisition. Semi-supervised learning has emerged as a potential solution; however, it tends to yield satisfactory segmentation performance in the central region of the foreground but struggles in the edge region. In this paper, we propose an innovative framework that effectively leverages unlabeled data to improve segmentation performance, especially in edge regions. Our framework includes two novel designs. First, we introduce a weak-to-strong perturbation strategy with a corresponding feature-perturbed consistency loss to efficiently utilize unlabeled data and guide the framework to learn reliable regions. Second, we propose an edge-aware contrastive loss that uses uncertainty to select positive pairs, thereby learning discriminative pixel-level features in edge regions from unlabeled data. In this way, the model minimizes the discrepancy among multiple predictions and improves its representation ability, ultimately achieving strong performance in both central and edge regions. We conducted a comparative analysis of segmentation results on the publicly available BraTS2020, LA, and 2017 ACDC datasets. Through extensive quantitative and visualization experiments under three standard semi-supervised settings, we demonstrate the effectiveness of our approach and set a new state of the art for semi-supervised medical image segmentation. Our code is released publicly at https://github.com/youngyzzZ/SSL-w2sPC.
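The weak-to-strong idea can be illustrated with a FixMatch-style pixel loss: confident pseudo-labels from the weakly perturbed view supervise the strongly perturbed view. This is a deliberately simplified stand-in; the paper's framework perturbs features and adds an edge-aware contrastive term on top. A hypothetical numpy sketch:

```python
import numpy as np

def w2s_consistency(weak_probs, strong_probs, threshold=0.9):
    """Weak-to-strong consistency: pseudo-labels from the weakly perturbed
    view supervise the strongly perturbed view, but only where the weak
    prediction is confident. Unconfident pixels contribute nothing."""
    pseudo = (weak_probs > 0.5).astype(float)
    confident = np.maximum(weak_probs, 1 - weak_probs) > threshold
    if not confident.any():
        return 0.0
    eps = 1e-8
    ce = -(pseudo * np.log(strong_probs + eps)
           + (1 - pseudo) * np.log(1 - strong_probs + eps))
    return float(ce[confident].mean())

weak = np.array([0.97, 0.55, 0.02])    # confident fg, uncertain, confident bg
strong = np.array([0.80, 0.40, 0.10])
loss = w2s_consistency(weak, strong)   # uncertain middle pixel is skipped
```

Only the first and third pixels pass the confidence gate, so the loss averages the cross-entropy of the strong view against their pseudo-labels.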
Affiliation(s)
- Yang Yang
- School of Computer Science and Technology, Harbin Institute of Technology at Shenzhen, Shenzhen, 518055, China
- Guoying Sun
- School of Computer Science and Technology, Harbin Institute of Technology at Shenzhen, Shenzhen, 518055, China
- Tong Zhang
- Department of Network Intelligence, Peng Cheng Laboratory, Shenzhen, 518055, China
- Ruixuan Wang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510006, China; Department of Network Intelligence, Peng Cheng Laboratory, Shenzhen, 518055, China.
- Jingyong Su
- School of Computer Science and Technology, Harbin Institute of Technology at Shenzhen, Shenzhen, 518055, China; National Key Laboratory of Smart Farm Technologies and Systems, Harbin, 150001, China.
5
Ding J, Chang J, Han R, Yang L. CDSE-UNet: Enhancing COVID-19 CT Image Segmentation With Canny Edge Detection and Dual-Path SENet Feature Fusion. Int J Biomed Imaging 2025; 2025:9175473. [PMID: 40124228] [PMCID: PMC11930385] [DOI: 10.1155/ijbi/9175473]
Abstract
Accurate segmentation of COVID-19 CT images is crucial for reducing the severity and mortality rates associated with COVID-19 infections. In response to the blurred boundaries and high variability characteristic of lesion areas in COVID-19 CT images, we introduce CDSE-UNet: a novel UNet-based segmentation model that integrates Canny edge detection and a Dual-Path SENet Feature Fusion Block (DSBlock). The model enhances the standard UNet architecture by employing the Canny operator for edge detection in sample images, paralleled by a similar network structure for semantic feature extraction. A key innovation is the DSBlock, applied across corresponding network layers to effectively combine features from both image paths. Moreover, we developed a Multiscale Convolution Block (MSCovBlock), replacing the standard convolution in UNet, to adapt to varied lesion sizes and shapes. This addition not only aids in accurately classifying lesion edge pixels but also significantly improves channel differentiation and expands the capacity of the model. Our evaluations on public datasets demonstrate CDSE-UNet's superior performance over other leading models. Specifically, CDSE-UNet achieved an accuracy of 0.9929, a recall of 0.9604, a DSC of 0.9063, and an IoU of 0.8286, outperforming UNet, Attention-UNet, Trans-Unet, Swin-Unet, and Dense-UNet on these metrics.
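The Canny path runs classical edge detection alongside the semantic encoder. As a self-contained illustration, here is only the gradient-magnitude stage (Sobel filtering) that Canny builds on; the full detector adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding, and in practice one would call an optimized library routine:

```python
import numpy as np

def sobel_edges(img):
    """Gradient-magnitude edge map via Sobel filters -- the first stage of
    the Canny detector (full Canny adds Gaussian smoothing, non-maximum
    suppression, and hysteresis thresholding)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    h, w = img.shape
    out = np.zeros((h, w))
    pad = np.pad(img, 1, mode="edge")       # replicate borders
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 3, j:j + 3]
            gx = (patch * kx).sum()         # horizontal gradient
            gy = (patch * ky).sum()         # vertical gradient
            out[i, j] = np.hypot(gx, gy)
    return out

img = np.zeros((6, 6))
img[:, 3:] = 1.0            # vertical step edge between columns 2 and 3
edges = sobel_edges(img)
```

The response peaks on the two columns flanking the step and is zero in flat regions, which is the boundary cue the edge path feeds to the fusion block.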
Affiliation(s)
- Jiao Ding
- School of Electrical and Electronic Engineering, Anhui Institute of Information Technology, Wuhu, China
- Jie Chang
- Department of Information, Wuhu Shengmeifu Technology Co. Ltd, Wuhu, China
- Renrui Han
- School of Medical Information, Wannan Medical College, Wuhu, China
- Li Yang
- School of Medical Information, Wannan Medical College, Wuhu, China
6
Jin J, Zhou S, Li Y, Zhu T, Fan C, Zhang H, Li P. Reinforced Collaborative-Competitive Representation for Biomedical Image Recognition. Interdiscip Sci 2025; 17:215-230. [PMID: 39841320] [DOI: 10.1007/s12539-024-00683-2]
Abstract
Artificial intelligence has demonstrated remarkable diagnostic efficacy in modern biomedical image analysis. However, its practical application is significantly limited by the presence of similar pathologies across different diseases and the diversity of pathologies within the same disease. To address this issue, this paper proposes a reinforced collaborative-competitive representation classification (RCCRC) method. RCCRC enhances the contribution of different classes by introducing dual competitive constraints into the objective function. The first constraint integrates the collaborative space representation akin to holistic data, promoting the representation contribution of similar classes. The second constraint introduces class-specific subspace representations to encourage competition among all classes, enhancing the discriminative nature of the representation vectors. By unifying these two constraints, RCCRC effectively explores both global and class-specific data features in the reconstruction space. Extensive experiments on various biomedical image databases demonstrate the advantage of the proposed method over several state-of-the-art classification algorithms.
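The "collaborative" half of the objective is easiest to see in the base collaborative representation classifier (CRC), which codes a query over all training samples with a ridge penalty and assigns the class whose samples best reconstruct it; RCCRC's contribution is the competitive constraints added on top of this objective. A numpy sketch of the base classifier (data and parameters are illustrative):

```python
import numpy as np

def crc_classify(X, labels, y, lam=0.1):
    """Base collaborative representation classifier (CRC): code the query
    over ALL training samples with a ridge penalty, then assign the class
    whose samples best reconstruct it.

    X: (D, N) training samples as columns; labels: (N,) ints; y: (D,) query.
    """
    N = X.shape[1]
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(N), X.T @ y)  # ridge code
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        recon = X[:, mask] @ alpha[mask]      # class-specific reconstruction
        residuals[c] = np.linalg.norm(y - recon)
    return min(residuals, key=residuals.get)

rng = np.random.default_rng(1)
mu0 = np.array([3.0, 0.0, 0.0, 0.0, 0.0])    # class means in 5-D
mu1 = np.array([0.0, 3.0, 0.0, 0.0, 0.0])
c0 = mu0[:, None] + 0.1 * rng.normal(size=(5, 10))
c1 = mu1[:, None] + 0.1 * rng.normal(size=(5, 10))
X = np.hstack([c0, c1])
labels = np.array([0] * 10 + [1] * 10)
```

Queries near either class mean land in the corresponding class, since that class's columns reconstruct them with the smallest residual.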
Affiliation(s)
- Junwei Jin
- The Key Laboratory of Grain Information Processing and Control (Henan University of Technology), Ministry of Education, Zhengzhou, 450001, China
- Henan Key Laboratory of Grain Storage Information Intelligent Perception and Decision Making, Henan University of Technology, Zhengzhou, 450001, China
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, China
- Institute for Complexity Science, Henan University of Technology, Zhengzhou, 450001, China
- Songbo Zhou
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, China
- Yanting Li
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450001, China.
- Tanxin Zhu
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450001, China
- Chao Fan
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, China
- Institute for Complexity Science, Henan University of Technology, Zhengzhou, 450001, China
- Hua Zhang
- Institute for Complexity Science, Henan University of Technology, Zhengzhou, 450001, China
- Peng Li
- Institute for Complexity Science, Henan University of Technology, Zhengzhou, 450001, China.
7
Chu Y, Wang J, Xiong Y, Gao Y, Liu X, Luo G, Gao X, Zhao M, Huang C, Qiu Z, Meng X. Point-annotation supervision for robust 3D pulmonary infection segmentation by CT-based cascading deep learning. Comput Biol Med 2025; 187:109760. [PMID: 39923589] [DOI: 10.1016/j.compbiomed.2025.109760]
Abstract
Infected region segmentation is crucial for pulmonary infection diagnosis, severity assessment, and monitoring treatment progression. High-performance segmentation methods rely heavily on fully annotated, large-scale training datasets, yet manual labeling of pulmonary infections demands substantial time and labor. While weakly supervised learning can greatly reduce annotation effort, previous work has focused mainly on natural or medical images with distinct boundaries and consistent textures. These approaches are not applicable to pulmonary infection segmentation, which must contend with high topological and intensity variation, irregular and ambiguous boundaries, and poor contrast in 3D contexts. In this study, we propose a cascading point-annotation framework to segment pulmonary infections, enabling optimization on larger datasets and superior performance on external data. By comparing the representations of annotated points and unlabeled voxels, and by establishing global uncertainty, we develop two regularization strategies that constrain the network toward a more holistic understanding of lesion patterns under sparse annotations. We further include an enhancement module to improve global anatomical perception and adaptability to spatial anisotropy, alongside a texture-aware variational module that determines more regionally consistent boundaries based on common infection textures. Experiments on a large dataset of 1,072 CT volumes demonstrate that our method outperforms state-of-the-art weakly supervised approaches by approximately 3%-6% in Dice score and is comparable to fully supervised methods on external datasets. Moreover, our approach remains robust even when applied to an unseen infection subtype, Mycoplasma pneumoniae, which was not included in the training datasets. These results collectively underscore rapid and promising applicability to emerging pulmonary infections.
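The usual starting point for point-annotation supervision is a partial cross-entropy that scores only the sparsely annotated voxels; the paper's regularizers then propagate signal to the unlabeled ones. A minimal numpy sketch (the `-1` "unlabelled" convention is an assumption for illustration):

```python
import numpy as np

def partial_ce(probs, point_labels, eps=1e-8):
    """Partial cross-entropy for point supervision: evaluated only at the
    sparse annotated voxels; anything labelled -1 (unlabelled) contributes
    nothing to the loss.

    probs: (V,) foreground probabilities; point_labels: (V,) in {-1, 0, 1}.
    """
    mask = point_labels >= 0
    if not mask.any():
        return 0.0
    p = probs[mask]
    t = point_labels[mask].astype(float)
    ce = -(t * np.log(p + eps) + (1 - t) * np.log(1 - p + eps))
    return float(ce.mean())

probs = np.array([0.9, 0.5, 0.2, 0.7])
labels = np.array([1, -1, 0, -1])        # only two voxels annotated
loss = partial_ce(probs, labels)
```

Only the first and third voxels enter the loss; the two unlabelled voxels are ignored entirely, which is why sparse supervision alone underconstrains the network and regularization is needed.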
Affiliation(s)
- Yuetan Chu
- Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Jianpeng Wang
- The Department of Critical Care Medicine, First Affiliated Hospital of Harbin Medical University, Harbin, China
- Yaxin Xiong
- The Department of Critical Care Medicine, First Affiliated Hospital of Harbin Medical University, Harbin, China
- Yuan Gao
- The Department of Critical Care Medicine, First Affiliated Hospital of Harbin Medical University, Harbin, China
- Xin Liu
- The Department of Prosthodontics, First Affiliated Hospital of Harbin Medical University, Harbin, China
- Gongning Luo
- Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Xin Gao
- Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Mingyan Zhao
- The Department of Critical Care Medicine, First Affiliated Hospital of Harbin Medical University, Harbin, China.
- Chao Huang
- Ningbo Institute of Information Technology Application, Chinese Academy of Sciences (CAS), Ningbo, China.
- Zhaowen Qiu
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, China.
- Xianglin Meng
- The Department of Critical Care Medicine, First Affiliated Hospital of Harbin Medical University, Harbin, China; The Cancer Institute and Department of Nuclear Medicine, Fudan University Shanghai Cancer Center, Shanghai, China.
8
García-Barberán V, Gómez Del Pulgar ME, Guamán HM, Benito-Martin A. The times they are AI-changing: AI-powered advances in the application of extracellular vesicles to liquid biopsy in breast cancer. Extracell Vesicles Circ Nucleic Acids 2025; 6:128-140. [PMID: 40206803] [PMCID: PMC11977355] [DOI: 10.20517/evcna.2024.51]
Abstract
Artificial intelligence (AI) is revolutionizing scientific research by facilitating a paradigm shift in data analysis and discovery. This transformation is characterized by a fundamental change in scientific methods and concepts due to AI's ability to process vast datasets with unprecedented speed and accuracy. In breast cancer research, AI aids in early detection, prognosis, and personalized treatment strategies. Liquid biopsy, a noninvasive tool for detecting circulating tumor traits, could ideally benefit from AI's analytical capabilities, enhancing the detection of minimal residual disease and improving treatment monitoring. Extracellular vesicles (EVs), which are key elements in cell communication and cancer progression, could be analyzed with AI to identify disease-specific biomarkers. AI combined with EV analysis promises an enhancement in diagnosis precision, aiding in early detection and treatment monitoring. Studies show that AI can differentiate cancer types and predict drug efficacy, exemplifying its potential in personalized medicine. Overall, the integration of AI in biomedical research and clinical practice promises significant changes and advancements in diagnostics, personalized medicine-based approaches, and our understanding of complex diseases like cancer.
Affiliation(s)
- Vanesa García-Barberán
- Molecular Oncology Laboratory, Medical Oncology Department, Hospital Clínico Universitario San Carlos, Instituto de Investigación Sanitaria San Carlos (IdISSC), Madrid 28040, Spain
- María Elena Gómez Del Pulgar
- Molecular Oncology Laboratory, Medical Oncology Department, Hospital Clínico Universitario San Carlos, Instituto de Investigación Sanitaria San Carlos (IdISSC), Madrid 28040, Spain
- Heidy M. Guamán
- Molecular Oncology Laboratory, Medical Oncology Department, Hospital Clínico Universitario San Carlos, Instituto de Investigación Sanitaria San Carlos (IdISSC), Madrid 28040, Spain
- Alberto Benito-Martin
- Molecular Oncology Laboratory, Medical Oncology Department, Hospital Clínico Universitario San Carlos, Instituto de Investigación Sanitaria San Carlos (IdISSC), Madrid 28040, Spain
- Facultad de Medicina, Universidad Alfonso X el Sabio, Madrid 28691, Spain
9
Zhong W, Ren X, Zhang H. Automatic X-ray teeth segmentation with grouped attention. Sci Rep 2025; 15:64. [PMID: 39747360] [PMCID: PMC11696191] [DOI: 10.1038/s41598-024-84629-0]
Abstract
Detection and segmentation of teeth from X-rays aid healthcare professionals in accurately determining the shape and growth trends of teeth. However, small dataset sizes due to patient privacy, high noise, and blurred boundaries between periodontal tissue and teeth pose challenges to models' transferability and generalizability, making them prone to overfitting. To address these issues, we propose a novel model named the Grouped Attention and Cross-Layer Fusion Network (GCNet). GCNet effectively handles numerous noise points and significant individual differences in the data, achieving stable and precise segmentation on small-scale datasets. The model comprises two core modules: Grouped Global Attention (GGA) modules and Cross-Layer Fusion (CLF) modules. The GGA modules capture and group texture and contour features, while the CLF modules combine these features with deep semantic information to improve prediction. Experimental results on the Children's Dental Panoramic Radiographs dataset show that our model outperformed existing models such as GT-U-Net and Teeth U-Net, with a Dice coefficient of 0.9338, sensitivity of 0.9426, and specificity of 0.9821. The GCNet model also produces clearer segmentation boundaries than other models.
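The grouped-attention idea can be sketched as SE-style gating applied per channel group rather than per channel. The parameter-free version below is a simplification for illustration; GCNet's GGA module learns the gating and uses it to group texture and contour features:

```python
import numpy as np

def grouped_channel_attention(x, n_groups):
    """SE-style gating per channel group: split channels into groups,
    summarize each group by global average pooling, and rescale the whole
    group by a sigmoid gate. Parameter-free sketch of the idea -- a real
    module would learn the gating weights.

    x: (C, H, W) feature map; C must be divisible by n_groups.
    """
    C, H, W = x.shape
    g = x.reshape(n_groups, C // n_groups, H, W)
    summary = g.mean(axis=(1, 2, 3), keepdims=True)   # per-group pooling
    gate = 1.0 / (1.0 + np.exp(-summary))             # sigmoid gate
    return (g * gate).reshape(C, H, W)

x = np.ones((4, 2, 2))
x[2:] *= -1.0                    # second group has negative activations
out = grouped_channel_attention(x, n_groups=2)
```

Groups with strong positive responses are amplified (gate near 1) while weakly or negatively responding groups are suppressed, which is the selective emphasis the GGA module applies to feature groups.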
Affiliation(s)
- XiaoXiao Ren
- The University of New South Wales, Sydney, Australia
- HanWen Zhang
- The University of New South Wales, Sydney, Australia
10
Islam MS, Al Farid F, Shamrat FMJM, Islam MN, Rashid M, Bari BS, Abdullah J, Nazrul Islam M, Akhtaruzzaman M, Nomani Kabir M, Mansor S, Abdul Karim H. Challenges issues and future recommendations of deep learning techniques for SARS-CoV-2 detection utilising X-ray and CT images: a comprehensive review. PeerJ Comput Sci 2024; 10:e2517. [PMID: 39896401] [PMCID: PMC11784792] [DOI: 10.7717/peerj-cs.2517]
Abstract
The global spread of SARS-CoV-2 has prompted a crucial need for accurate medical diagnosis, particularly of the respiratory system. Current diagnostic methods rely heavily on imaging techniques like CT scans and X-rays, but identifying SARS-CoV-2 in these images is challenging and time-consuming. In this context, artificial intelligence (AI) models, specifically deep learning (DL) networks, emerge as a promising solution for medical image analysis. This article provides a meticulous and comprehensive review of imaging-based SARS-CoV-2 diagnosis using deep learning techniques up to May 2024. It begins with an overview of imaging-based SARS-CoV-2 diagnosis, covering the basic steps of deep learning-based diagnosis, SARS-CoV-2 data sources, data pre-processing methods, the taxonomy of deep learning techniques, findings, research gaps, and performance evaluation. We also address current privacy issues, limitations, and challenges in SARS-CoV-2 diagnosis. Following the taxonomy, each deep learning model is discussed, encompassing its core functionality and a critical assessment of its suitability for imaging-based SARS-CoV-2 detection. A comparative analysis summarizing all relevant studies provides an overall visualization. Given the difficulty of identifying the best deep learning model for imaging-based SARS-CoV-2 detection, the article reports an experiment with twelve contemporary deep learning techniques; MobileNetV3 outperforms the other models with an accuracy of 98.11%. Finally, the article elaborates on the current challenges of deep learning-based SARS-CoV-2 diagnosis and explores potential future directions and methodological recommendations for research and advancement.
Affiliation(s)
- Md Shofiqul Islam
- Computer Science and Engineering (CSE), Military Institute of Science and Technology (MIST), Dhaka, Bangladesh
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Warun Ponds, Victoria, Australia
- Fahmid Al Farid
- Faculty of Engineering, Multimedia University, Cyberjaya, Selangor, Malaysia
- Md Nahidul Islam
- Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang Al-Sultan Abdullah (UMPSA), Pekan, Pahang, Malaysia
- Mamunur Rashid
- Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang Al-Sultan Abdullah (UMPSA), Pekan, Pahang, Malaysia
- Electrical and Computer Engineering, Tennessee Tech University, Cookeville, TN, United States
- Bifta Sama Bari
- Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang Al-Sultan Abdullah (UMPSA), Pekan, Pahang, Malaysia
- Electrical and Computer Engineering, Tennessee Tech University, Cookeville, TN, United States
- Junaidi Abdullah
- Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Selangor, Malaysia
- Muhammad Nazrul Islam
- Computer Science and Engineering (CSE), Military Institute of Science and Technology (MIST), Dhaka, Bangladesh
- Md Akhtaruzzaman
- Computer Science and Engineering (CSE), Military Institute of Science and Technology (MIST), Dhaka, Bangladesh
- Muhammad Nomani Kabir
- Department of Computer Science & Engineering, United International University (UIU), Dhaka, Bangladesh
- Sarina Mansor
- Faculty of Engineering, Multimedia University, Cyberjaya, Selangor, Malaysia
- Hezerul Abdul Karim
- Faculty of Engineering, Multimedia University, Cyberjaya, Selangor, Malaysia
| |
|
11
|
Zhong W, Zhang H. EF-net: Accurate edge segmentation for segmenting COVID-19 lung infections from CT images. Heliyon 2024; 10:e40580. [PMID: 39669151 PMCID: PMC11635652 DOI: 10.1016/j.heliyon.2024.e40580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2024] [Revised: 11/19/2024] [Accepted: 11/19/2024] [Indexed: 12/14/2024] Open
Abstract
Despite advances in modern medicine, including the use of computed tomography for detecting COVID-19, precise identification and segmentation of lesions remain a significant challenge owing to indistinct boundaries and low contrast between infected and healthy lung tissues. This study introduces a novel model, the edge-based dual-parallel attention (EDA)-guided feature-filtering network (EF-Net), specifically designed to accurately segment the edges of COVID-19 lesions. The proposed model comprises two modules: an EDA module and a feature-filtering module (FFM). The EDA module efficiently extracts structural and textural features from low-level features, enabling precise identification of lesion boundaries. The FFM receives semantically rich features from the deep-level encoder and integrates them with the texture- and contour-rich features obtained from the EDA module. After filtering through the FFM's gating mechanism, the EDA features are fused with deep-level features, yielding features rich in both semantic and textural information. Experiments demonstrate that our model outperforms existing models, including Inf_Net, GFNet, and BSNet, across various metrics, offering better and clearer segmentation results, particularly for lesion edges. Moreover, it achieves superior performance on the three datasets, with Dice coefficients of 98.1%, 97.3%, and 72.1%.
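Several entries in this list report Dice coefficients. For reference, the metric can be computed in a few lines; a minimal NumPy sketch on toy binary masks (not data from any cited paper):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # eps guards against division by zero when both masks are empty.
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))

# Toy 4x4 masks: the prediction covers 2 of the 3 ground-truth pixels.
pred = np.zeros((4, 4), dtype=np.uint8)
target = np.zeros((4, 4), dtype=np.uint8)
pred[1, 1:3] = 1        # 2 predicted lesion pixels
target[1, 1:4] = 1      # 3 ground-truth lesion pixels
print(round(dice_coefficient(pred, target), 3))  # 2*2/(2+3) = 0.8
```

A Dice of 1.0 means perfect overlap; the 98.1% figures above correspond to near-total agreement with the reference masks.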
|
12
|
Oliveira ADS, Costa MGF, Costa JPGF, Costa Filho CFF. Comparing Different Data Partitioning Strategies for Segmenting Areas Affected by COVID-19 in CT Scans. Diagnostics (Basel) 2024; 14:2791. [PMID: 39767152 PMCID: PMC11674714 DOI: 10.3390/diagnostics14242791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2024] [Revised: 12/09/2024] [Accepted: 12/10/2024] [Indexed: 01/11/2025] Open
Abstract
BACKGROUND/OBJECTIVES According to the World Health Organization, the gold standard for diagnosing COVID-19 is the Reverse Transcription Polymerase Chain Reaction (RT-PCR) test. However, to confirm the diagnosis in patients who have negative results but still show symptoms, imaging tests, especially computed tomography (CT), are used. In this study, using convolutional neural networks with both manual and automatic lung segmentation, we compared: (1) the performance of automatic segmentation of COVID-19 areas under two data-partitioning strategies, the CT-scan strategy and the slice strategy; (2) the performance of automatic COVID-19 segmentation against the interobserver agreement between two groups of radiologists; and (3) the performance in estimating the area affected by COVID-19. METHODS Two datasets and two deep neural network architectures are used to evaluate the automatic segmentation of lungs and COVID-19 areas. The performance of the U-Net architecture is compared with that of a new architecture proposed by the research group. RESULTS With automatic lung segmentation, the Dice metrics for segmentation of the COVID-19 area were 73.01 ± 9.47% and 84.66 ± 5.41% for the CT-scan strategy and the slice strategy, respectively. With manual lung segmentation, the corresponding Dice metrics were 74.47 ± 9.94% and 85.35 ± 5.41%. CONCLUSIONS COVID-19 segmentation was slightly better for the slice strategy than for the CT-scan strategy, and a comparison of automatic COVID-19 segmentation with the interobserver agreement, in a group of 7 CT scans, revealed no statistically significant difference on any metric.
Affiliation(s)
- Anne de Souza Oliveira
- R&D Center in Electronic and Information Technology, Federal University of Amazonas, Manaus 69077-000, Brazil; (A.d.S.O.); (M.G.F.C.)
- Marly Guimarães Fernandes Costa
- R&D Center in Electronic and Information Technology, Federal University of Amazonas, Manaus 69077-000, Brazil; (A.d.S.O.); (M.G.F.C.)
|
13
|
Wang C, Xu R, Xu S, Meng W, Xiao J, Zhang X. Accurate Lung Nodule Segmentation With Detailed Representation Transfer and Soft Mask Supervision. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:18381-18393. [PMID: 37824321 DOI: 10.1109/tnnls.2023.3315271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/14/2023]
Abstract
Accurate lung lesion segmentation from computed tomography (CT) images is crucial to the analysis and diagnosis of lung diseases such as COVID-19 and lung cancer. However, the small size and variety of lung nodules and the lack of high-quality labeling make accurate lung nodule segmentation difficult. To address these issues, we first introduce a novel segmentation mask named the "soft mask," which describes edge details more richly and accurately and offers better visualization, and we develop a universal automatic soft-mask annotation pipeline to handle different datasets accordingly. Then, a novel network with detailed representation transfer and soft mask supervision (DSNet) is proposed to process low-resolution input images of lung nodules into high-quality segmentation results. Our DSNet contains a detailed representation transfer module (DRTM) that reconstructs detailed representations to mitigate the small size of lung nodule images, and an adversarial training framework with soft-mask supervision that further improves segmentation accuracy. Extensive experiments validate that our DSNet outperforms other state-of-the-art methods for accurate lung nodule segmentation and has strong generalization ability in other medical segmentation tasks, with competitive results. In addition, we provide a new, challenging lung nodule segmentation dataset for further studies (https://drive.google.com/file/d/15NNkvDTb_0Ku0IoPsNMHezJRTH1Oi1wm/view?usp=sharing).
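The "soft mask" idea, graded values along lesion edges rather than a hard 0/1 boundary, can be illustrated with a simple mean-filter sketch. This is only an assumption for illustration; the paper's actual annotation pipeline is not reproduced here, and the 3x3 mean filter is a stand-in choice:

```python
import numpy as np

def soften_mask(mask: np.ndarray, iterations: int = 1) -> np.ndarray:
    """Turn a hard 0/1 mask into a soft mask by 3x3 mean filtering,
    leaving graded values along the boundary (illustrative stand-in,
    not the paper's annotation pipeline)."""
    soft = mask.astype(np.float64)
    for _ in range(iterations):
        padded = np.pad(soft, 1, mode="edge")
        # Average the 3x3 neighbourhood via shifted views of the padded array.
        soft = sum(
            padded[i:i + soft.shape[0], j:j + soft.shape[1]]
            for i in range(3) for j in range(3)
        ) / 9.0
    return soft

hard = np.zeros((7, 7))
hard[2:5, 2:5] = 1.0          # hard 3x3 lesion block
soft = soften_mask(hard)
print(soft[3, 3], soft[0, 0])  # interior stays 1.0; far background stays 0.0
```

Pixels strictly inside or far outside the lesion keep their hard values, while boundary pixels take fractional values, which is the extra edge information a soft mask carries.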
|
14
|
Abdel-Salam M, Houssein EH, Emam MM, Samee NA, Jamjoom MM, Hu G. An adaptive enhanced human memory algorithm for multi-level image segmentation for pathological lung cancer images. Comput Biol Med 2024; 183:109272. [PMID: 39405733 DOI: 10.1016/j.compbiomed.2024.109272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2024] [Revised: 10/06/2024] [Accepted: 10/10/2024] [Indexed: 11/20/2024]
Abstract
Lung cancer is a critical health issue that demands swift and accurate diagnosis for effective treatment. In medical imaging, segmentation is crucial for identifying and isolating regions of interest, which is essential for precise diagnosis and treatment planning. Traditional metaheuristic-based segmentation methods often struggle with slow convergence, poorly optimized thresholds, and balancing exploration against exploitation, leading to suboptimal performance in multi-threshold segmentation of lung cancer images. This study presents ASG-HMO, an enhanced variant of the Human Memory Optimization (HMO) algorithm, selected for its simplicity, versatility, and minimal parameters. Although HMO has never been applied to multi-threshold image segmentation, its characteristics make it well suited to improving the segmentation of pathological lung cancer images. ASG-HMO incorporates four innovative strategies that address key challenges in the segmentation process. First, an enhanced adaptive mutualism phase balances exploration and exploitation to accurately delineate tumor boundaries without becoming trapped in suboptimal solutions. Second, a spiral motion strategy adaptively refines segmentation solutions by focusing on both the overall lung structure and intricate tumor details. Third, a Gaussian mutation strategy introduces diversity into the search process, enabling exploration of a broader range of segmentation thresholds and enhancing the accuracy of segmented regions. Finally, an adaptive t-distribution disturbance strategy helps the algorithm avoid local optima and refine segmentation in later stages. The effectiveness of ASG-HMO is validated through rigorous testing on the IEEE CEC'17 and CEC'20 benchmark suites, followed by its application to multilevel thresholding segmentation of nine histopathology lung cancer images.
In these experiments, six different segmentation thresholds were tested, and the algorithm was compared with several classical, recent, and advanced segmentation algorithms. In addition, ASG-HMO leverages 2D Renyi entropy and 2D histograms to enhance the precision of the segmentation process. Quantitative analysis of pathological lung cancer segmentation showed that ASG-HMO achieved a superior maximum Peak Signal-to-Noise Ratio (PSNR) of 31.924, Structural Similarity Index Measure (SSIM) of 0.919, Feature Similarity Index Measure (FSIM) of 0.990, and Probability Rand Index (PRI) of 0.924. These results indicate that ASG-HMO significantly outperforms existing algorithms in both convergence speed and segmentation accuracy, demonstrating its robustness as a framework for precise segmentation of pathological lung cancer images, with substantial potential for improving clinical diagnostic processes.
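Of the image-quality metrics reported above, PSNR is the most direct to compute: it is the log-scaled ratio of the maximum pixel value to the mean squared error between two images. A minimal NumPy sketch on toy data (not the paper's images):

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))

# Toy example: a uniform 8-bit image vs. a copy off by 1 grey level everywhere.
ref = np.full((8, 8), 100, dtype=np.uint8)
out = ref + 1                      # MSE = 1
print(round(psnr(ref, out), 2))    # 10 * log10(255^2) ≈ 48.13 dB
```

Higher is better; the 31.9 dB reported above sits in the range typical of good-quality segmented reconstructions, while identical images give infinite PSNR.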
Affiliation(s)
- Mahmoud Abdel-Salam
- Faculty of Computers and Information Science, Mansoura University, Mansoura, Egypt.
- Essam H Houssein
- Faculty of Computers and Information, Minia University, Minia, Egypt.
- Marwa M Emam
- Faculty of Computers and Information, Minia University, Minia, Egypt.
- Nagwan Abdel Samee
- Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia.
- Mona M Jamjoom
- Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, 11671, Saudi Arabia.
- Gang Hu
- Department of Applied Mathematics, Xi'an University of Technology, Xi'an, 710054, PR China.
|
15
|
Liu X, Liu Y, Fu W, Liu S. RETRACTED ARTICLE: SCTV-UNet: a COVID-19 CT segmentation network based on attention mechanism. Soft comput 2024; 28:473. [PMID: 37362261 PMCID: PMC10028784 DOI: 10.1007/s00500-023-07991-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 02/27/2023] [Indexed: 03/24/2023]
Affiliation(s)
- Xiangbin Liu
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Ying Liu
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Weina Fu
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Shuai Liu
- School of Educational Science, Hunan Normal University, Changsha, 410081, China
|
16
|
Alam MS, Wang D, Arzhaeva Y, Ende JA, Kao J, Silverstone L, Yates D, Salvado O, Sowmya A. Attention-based multi-residual network for lung segmentation in diseased lungs with custom data augmentation. Sci Rep 2024; 14:28983. [PMID: 39578613 PMCID: PMC11584877 DOI: 10.1038/s41598-024-79494-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2024] [Accepted: 11/11/2024] [Indexed: 11/24/2024] Open
Abstract
Lung disease analysis in chest X-rays (CXR) using deep learning presents significant challenges due to the wide variation in lung appearance caused by disease progression and differing X-ray settings. While deep learning models have shown remarkable success in segmenting lungs from CXR images with normal or mildly abnormal findings, their performance declines when faced with complex structures, such as pulmonary opacifications. In this study, we propose AMRU++, an attention-based multi-residual UNet++ network designed for robust and accurate lung segmentation in CXR images with both normal and severe abnormalities. The model incorporates attention modules to capture relevant spatial information and multi-residual blocks to extract rich contextual and discriminative features of lung regions. To further enhance segmentation performance, we introduce a data augmentation technique that simulates the features and characteristics of CXR pathologies, addressing the issue of limited annotated data. Extensive experiments on public and private datasets comprising 350 cases of pneumoconiosis, COVID-19, and tuberculosis validate the effectiveness of our proposed framework and data augmentation technique.
Affiliation(s)
- Md Shariful Alam
- School of Computer Science and Engineering, University of New South Wales, Sydney, Australia.
- Jesse Alexander Ende
- Department of Radiology, St Vincent's Hospital Sydney, Darlinghurst, NSW, 2010, Australia
- Joanna Kao
- Department of Radiology, St Vincent's Hospital Sydney, Darlinghurst, NSW, 2010, Australia
- Liz Silverstone
- Department of Radiology, St Vincent's Hospital Sydney, Darlinghurst, NSW, 2010, Australia
- Deborah Yates
- Department of Thoracic Medicine, St Vincent's Hospital Sydney, Darlinghurst, NSW, 2010, Australia
- Olivier Salvado
- School of Electrical Engineering & Robotics, Queensland University of Technology, Brisbane, QLD, 4001, Australia
- Arcot Sowmya
- School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
|
17
|
Gao Y, Gong M, Ong YS, Qin AK, Wu Y, Xie F. A Collaborative Multimodal Learning-Based Framework for COVID-19 Diagnosis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:15883-15895. [PMID: 37402198 DOI: 10.1109/tnnls.2023.3290188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/06/2023]
Abstract
The pandemic of coronavirus disease 2019 (COVID-19) has led to a global public health crisis, which caused millions of deaths and billions of infections, greatly increasing the pressure on medical resources. With the continuous emergence of viral mutations, developing automated tools for COVID-19 diagnosis is highly desired to assist the clinical diagnosis and reduce the tedious workload of image interpretation. However, medical images in a single site are usually of a limited amount or weakly labeled, while integrating data scattered around different institutions to build effective models is not allowed due to data policy restrictions. In this article, we propose a novel privacy-preserving cross-site framework for COVID-19 diagnosis with multimodal data, seeking to effectively leverage heterogeneous data from multiple parties while preserving patients' privacy. Specifically, a Siamese branched network is introduced as the backbone to capture inherent relationships across heterogeneous samples. The redesigned network is capable of handling semisupervised inputs in multimodalities and conducting task-specific training, in order to improve the model performance of various scenarios. The framework achieves significant improvement compared with state-of-the-art methods, as we demonstrate through extensive simulations on real-world datasets.
|
18
|
Huang W, Zhang L, Wang Z, Wang L. Exploring Inherent Consistency for Semi-Supervised Anatomical Structure Segmentation in Medical Imaging. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:3731-3741. [PMID: 38743533 DOI: 10.1109/tmi.2024.3400840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2024]
Abstract
Due to the exorbitant expense of obtaining labeled data in the field of medical image analysis, semi-supervised learning has emerged as a favorable method for the segmentation of anatomical structures. Although semi-supervised learning techniques have shown great potential in this field, existing methods only utilize image-level spatial consistency to impose unsupervised regularization on data in label space. Considering that anatomical structures often possess inherent anatomical properties that have not been focused on in previous works, this study introduces the inherent consistency into semi-supervised anatomical structure segmentation. First, the prediction and the ground-truth are projected into an embedding space to obtain latent representations that encapsulate the inherent anatomical properties of the structures. Then, two inherent consistency constraints are designed to leverage these inherent properties by aligning these latent representations. The proposed method is plug-and-play and can be seamlessly integrated with existing methods, thereby collaborating to improve segmentation performance and enhance the anatomical plausibility of the results. To evaluate the effectiveness of the proposed method, experiments are conducted on three public datasets (ACDC, LA, and Pancreas). Extensive experimental results demonstrate that the proposed method exhibits good generalizability and outperforms several state-of-the-art methods.
|
19
|
Lan X, Jin W. Multi-scale input layers and dense decoder aggregation network for COVID-19 lesion segmentation from CT scans. Sci Rep 2024; 14:23729. [PMID: 39390053 PMCID: PMC11467340 DOI: 10.1038/s41598-024-74701-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2024] [Accepted: 09/27/2024] [Indexed: 10/12/2024] Open
Abstract
Accurate segmentation of COVID-19 lesions from medical images is essential for precise diagnosis and the development of effective treatment strategies. Unfortunately, this task presents significant challenges owing to the complex and diverse characteristics of opaque areas, subtle differences between infected and healthy tissue, and the presence of noise in CT images. To address these difficulties, this paper designs a new deep-learning architecture (named MD-Net) based on multi-scale input layers and a dense decoder aggregation network for COVID-19 lesion segmentation. In our framework, the U-shaped structure serves as the cornerstone, facilitating the complex hierarchical representations essential for accurate segmentation. By introducing multi-scale input layers (MIL), the network can effectively analyze both fine-grained details and contextual information in the original image. Furthermore, we introduce an SE-Conv module in the encoder network, which enhances the ability to identify relevant information while suppressing the transmission of extraneous or non-lesion information. Additionally, we design a dense decoder aggregation (DDA) module to integrate feature distributions and important COVID-19 lesion information from adjacent encoder layers. Finally, we conducted a comprehensive quantitative analysis and comparison on two publicly available datasets, Vid-QU-EX and QaTa-COV19-v2, to assess the robustness and versatility of MD-Net in segmenting COVID-19 lesions. The experimental results show that the proposed MD-Net outperforms its competitors, exhibiting higher Dice, Matthews correlation coefficient (Mcc), and Jaccard index scores. In addition, we conducted ablation studies on the Vid-QU-EX dataset to evaluate the contributions of each key component of the proposed architecture.
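The Dice and Jaccard scores that this and other entries report are not independent metrics: for binary masks they are monotonically related by D = 2J / (1 + J). A small NumPy sketch on toy masks (illustrative only):

```python
import numpy as np

def jaccard(pred: np.ndarray, target: np.ndarray) -> float:
    """Jaccard index (IoU) of two binary masks: |A∩B| / |A∪B|."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(inter / union) if union else 1.0

def dice_from_jaccard(j: float) -> float:
    # For binary masks, Dice and Jaccard determine each other: D = 2J / (1 + J).
    return 2.0 * j / (1.0 + j)

pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 1, 1], [0, 0, 0]])
j = jaccard(pred, target)            # intersection 2, union 4 -> J = 0.5
print(j, dice_from_jaccard(j))       # Dice = 2*0.5/1.5 ≈ 0.667
```

Because the mapping is monotone, Dice and Jaccard always rank methods identically; reporting both mainly aids comparison across papers that prefer one or the other.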
Affiliation(s)
- Xiaoke Lan
- College of Internet of Things Technology, Hangzhou Polytechnic, Hangzhou, 311402, China.
- Wenbing Jin
- College of Internet of Things Technology, Hangzhou Polytechnic, Hangzhou, 311402, China
|
20
|
Pang C, Lu X, Liu X, Zhang R, Lyu L. IIAM: Intra and Inter Attention With Mutual Consistency Learning Network for Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:5971-5983. [PMID: 38985557 DOI: 10.1109/jbhi.2024.3426074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/12/2024]
Abstract
Medical image segmentation provides a reliable basis for diagnosis analysis and disease treatment by capturing the global and local features of the target region. To learn global features, convolutional neural networks are replaced with pure transformers, or transformer layers are stacked at the deepest layers of convolutional neural networks. Nevertheless, they are deficient in exploring local-global cues at each scale and the interaction among consensual regions in multiple scales, hindering the learning about the changes in size, shape, and position of target objects. To cope with these defects, we propose a novel Intra and Inter Attention with Mutual Consistency Learning Network (IIAM). Concretely, we design an intra attention module to aggregate the CNN-based local features and transformer-based global information on each scale. In addition, to capture the interaction among consensual regions in multiple scales, we devise an inter attention module to explore the cross-scale dependency of the object and its surroundings. Moreover, to reduce the impact of blurred regions in medical images on the final segmentation results, we introduce multiple decoders to estimate the model uncertainty, where we adopt a mutual consistency learning strategy to minimize the output discrepancy during the end-to-end training and weight the outputs of the three decoders as the final segmentation result. Extensive experiments on three benchmark datasets verify the efficacy of our method and demonstrate superior performance of our model to state-of-the-art techniques.
|
21
|
Chen W, Song H, Dai C, Huang Z, Wu A, Shan G, Liu H, Jiang A, Liu X, Ru C, Abdalla K, Dhanani SN, Moosavi KF, Pathak S, Librach C, Zhang Z, Sun Y. CP-Net: Instance-aware part segmentation network for biological cell parsing. Med Image Anal 2024; 97:103243. [PMID: 38954941 DOI: 10.1016/j.media.2024.103243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2023] [Revised: 05/15/2024] [Accepted: 06/11/2024] [Indexed: 07/04/2024]
Abstract
Instance segmentation of biological cells is important in medical image analysis for identifying and segmenting individual cells, and quantitative measurement of subcellular structures requires further cell-level subcellular part segmentation. Subcellular structure measurements are critical for cell phenotyping and quality analysis. For these purposes, an instance-aware part segmentation network is first introduced to distinguish individual cells and segment subcellular structures for each detected cell. This approach is demonstrated on human sperm cells, since the World Health Organization has established quantitative standards for sperm quality assessment. Specifically, a novel Cell Parsing Net (CP-Net) is proposed for accurate instance-level cell parsing. An attention-based feature fusion module is designed to alleviate contour misalignments for cells with irregular shapes by using instance masks as spatial cues rather than as strict constraints to differentiate instances. A coarse-to-fine segmentation module is developed to effectively segment tiny subcellular structures within a cell through hierarchical whole-to-part segmentation, instead of directly segmenting each cell part. Moreover, a sperm parsing dataset is built, comprising 320 annotated sperm images with five semantic subcellular part labels. Extensive experiments on the collected dataset demonstrate that the proposed CP-Net outperforms state-of-the-art instance-aware part segmentation networks.
Affiliation(s)
- Wenyuan Chen
- Department of Computer Science, University of Toronto, Toronto, M5S 2E4, Canada
- Haocong Song
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada
- Changsheng Dai
- School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, China
- Zongjie Huang
- Suzhou Boundless Medical Technology Co., Ltd., Suzhou 215000, China
- Andrew Wu
- Division of Engineering Science, University of Toronto, Toronto, ON M5S 2E4, Canada
- Guanqiao Shan
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada
- Hang Liu
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada
- Aojun Jiang
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada
- Xingjian Liu
- School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, China
- Changhai Ru
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Shruti Pathak
- CReATe Fertility Centre, Toronto, ON M5G 1N8, Canada
- Zhuoran Zhang
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Yu Sun
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada.
|
22
|
Huang X, Zhu Y, Shao M, Xia M, Shen X, Wang P, Wang X. Dual-branch Transformer for semi-supervised medical image segmentation. J Appl Clin Med Phys 2024; 25:e14483. [PMID: 39133901 PMCID: PMC11466465 DOI: 10.1002/acm2.14483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2024] [Revised: 06/06/2024] [Accepted: 07/01/2024] [Indexed: 10/12/2024] Open
Abstract
PURPOSE In recent years, deep learning has become a popular approach to medical image segmentation, but its development faces challenges. First, owing to the specialized nature of medical data, precise annotation is time-consuming and labor-intensive, and training neural networks effectively with limited labeled data is a significant challenge in medical image analysis. Second, the convolutional neural networks commonly used for medical image segmentation focus on local features, whereas recognizing complex anatomical structures or irregular lesions often requires both local and global information, which has created a bottleneck. Addressing these two issues, we propose a novel network architecture. METHODS We integrate a shifted-window mechanism to learn more comprehensive semantic information and employ a semi-supervised learning strategy that incorporates a flexible amount of unlabeled data. Specifically, a typical U-shaped encoder-decoder structure is applied to obtain rich feature maps. Each encoder is designed as a dual-branch structure containing Swin modules equipped with windows of different sizes to capture features at multiple scales. To effectively utilize unlabeled data, a level set function is introduced to establish consistency between function regression and pixel classification. RESULTS We conducted experiments on the COVID-19 CT dataset and the DRIVE dataset and compared our approach with various semi-supervised and fully supervised learning models. On the COVID-19 CT dataset, we achieved a segmentation accuracy of up to 74.56%; on the DRIVE dataset, 79.79%. CONCLUSIONS The results demonstrate the outstanding performance of our method on several commonly used evaluation metrics. The high segmentation accuracy of our model shows that Swin modules with different window sizes enhance its feature extraction capability, and that the level set function enables semi-supervised models to utilize unlabeled data more effectively. This provides meaningful insights for the application of deep learning in medical image segmentation. Our code will be released once the manuscript is accepted for publication.
Affiliation(s)
- Xiaojie Huang
- The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Yating Zhu
- Zhejiang University of Technology, Hangzhou, China
- Ming Xia
- Zhejiang University of Technology, Hangzhou, China
- Xiaoting Shen
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Hangzhou, China
- Pingli Wang
- The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
|
23
|
Rai S, Bhatt JS, Patra SK. An AI-Based Low-Risk Lung Health Image Visualization Framework Using LR-ULDCT. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024; 37:2047-2062. [PMID: 38491236 PMCID: PMC11522248 DOI: 10.1007/s10278-024-01062-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/23/2023] [Revised: 01/18/2024] [Accepted: 02/12/2024] [Indexed: 03/18/2024]
Abstract
In this article, we propose an AI-based low-risk visualization framework for lung health monitoring using low-resolution ultra-low-dose CT (LR-ULDCT). We present a novel deep cascade processing workflow that achieves diagnostic visualization on LR-ULDCT (<0.3 mSv) on par with high-resolution CT (HRCT) acquired at 100 mSv. To this end, we build a low-risk and affordable deep cascade network comprising three sequential deep processes: restoration, super-resolution (SR), and segmentation. Given a degraded LR-ULDCT, the first novel network learns the restoration function in an unsupervised manner from augmented patch-based dictionaries and residuals. The restored version is then super-resolved to the target (sensor) resolution. Here, we combine perceptual and adversarial losses in a novel GAN to establish closeness between the probability distributions of the generated SR-ULDCT and the restored LR-ULDCT. The SR-ULDCT is then presented to the segmentation network, which first separates the chest portion from the SR-ULDCT and then performs lobe-wise colorization. Finally, we extract five lobes to account for the presence of ground-glass opacity (GGO) in the lung. Hence, our AI-based system provides low-risk visualization of the degraded input LR-ULDCT at various stages, i.e., restored LR-ULDCT, restored SR-ULDCT, and segmented SR-ULDCT, and achieves the diagnostic power of HRCT. We perform case studies on real datasets of COVID-19, pneumonia, and pulmonary edema/congestion, comparing our results with the state of the art. Ablation experiments are conducted to better visualize the different operating pipelines. Finally, we present a verification report by fourteen (14) experienced radiologists and pulmonologists.
Affiliation(s)
- Swati Rai
- Indian Institute of Information Technology Vadodara, Vadodara, India.
- Jignesh S Bhatt
- Indian Institute of Information Technology Vadodara, Vadodara, India.

24
Wang H, Cao P, Yang J, Zaiane O. Narrowing the semantic gaps in U-Net with learnable skip connections: The case of medical image segmentation. Neural Netw 2024; 178:106546. [PMID: 39053196 DOI: 10.1016/j.neunet.2024.106546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2023] [Revised: 04/13/2024] [Accepted: 07/14/2024] [Indexed: 07/27/2024]
Abstract
Current state-of-the-art medical image segmentation techniques predominantly employ the encoder-decoder architecture. Despite its widespread use, this U-shaped framework exhibits limitations in effectively capturing multi-scale features through simple skip connections. In this study, we conduct a thorough analysis of the potential weaknesses of skip connections across various segmentation tasks and identify two key semantic gaps that are crucial to consider: the semantic gap among multi-scale features in different encoding stages and the semantic gap between the encoder and the decoder. To bridge these semantic gaps, we introduce a novel segmentation framework that incorporates a Dual Attention Transformer (DAT) module for capturing channel-wise and spatial-wise relationships, and a Decoder-guided Recalibration Attention module for fusing DAT tokens and decoder features. These modules establish a principle of learnable skip connections that resolves the semantic gaps, leading to a high-performance segmentation model for medical images. Furthermore, the framework provides a new paradigm for effectively incorporating the attention mechanism into traditional convolution-based architectures. Comprehensive experimental results demonstrate that our model achieves consistent, significant gains and outperforms state-of-the-art methods with relatively few parameters. This study contributes to the advancement of medical image segmentation by offering a more effective and efficient framework that addresses the limitations of current encoder-decoder architectures. Code: https://github.com/McGregorWwww/UDTransNet.
Affiliation(s)
- Haonan Wang
- School of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China.
- Peng Cao
- School of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China.
- Jinzhu Yang
- School of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China.

25
You C, Dai W, Liu F, Min Y, Dvornek NC, Li X, Clifton DA, Staib L, Duncan JS. Mine Your Own Anatomy: Revisiting Medical Image Segmentation With Extremely Limited Labels. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; PP:11136-11151. [PMID: 39269798 PMCID: PMC11903367 DOI: 10.1109/tpami.2024.3461321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/15/2024]
Abstract
Recent studies on contrastive learning have achieved remarkable performance solely by leveraging few labels in the context of medical image segmentation. Existing methods mainly focus on instance discrimination and invariant mapping (i.e., pulling positive samples closer and pushing negative samples apart in the feature space). However, they face three common pitfalls: (1) tailness: medical image data usually follow an implicit long-tail class distribution, so blindly leveraging all pixels in training can lead to data imbalance and degraded performance; (2) consistency: it remains unclear whether a segmentation model has learned meaningful yet consistent anatomical features, given the intra-class variations between different anatomical features; and (3) diversity: the intra-slice correlations within the entire dataset have received significantly less attention. This motivates us to seek a principled approach for strategically making use of the dataset itself to discover similar yet distinct samples from different anatomical views. In this paper, we introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA) and make three contributions. First, prior work argues that every pixel matters equally to model training; we observe empirically that this alone is unlikely to define meaningful anatomical features, mainly because of the lack of a supervision signal. We show two simple solutions towards learning invariances: stronger data augmentations and nearest neighbors. Second, we construct a set of objectives that encourage the model to decompose medical images into a collection of anatomical features in an unsupervised manner. Lastly, we demonstrate, both empirically and theoretically, the efficacy of MONA on three benchmark datasets, achieving a new state of the art under different labeled semi-supervised settings.
MONA makes minimal assumptions on domain expertise and hence constitutes a practical and versatile solution for medical image analysis. We provide PyTorch-like pseudo-code in the supplementary material.
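The instance-discrimination objective described above (pulling positive samples together and pushing negatives apart in feature space) is commonly implemented as an InfoNCE-style contrastive loss. The sketch below is a generic plain-Python illustration of that loss family, not MONA's actual objective:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: -log( exp(s+/t) / (exp(s+/t) + sum_k exp(s-_k/t)) ).

    The loss is small when the anchor is far more similar to its positive
    than to any negative: the 'pull together / push apart' objective.
    """
    s_pos = math.exp(cosine(anchor, positive) / temperature)
    s_negs = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(s_pos / (s_pos + s_negs))
```

Driving the positive similarity toward 1 and the negative similarities toward -1 pushes the loss toward 0, which is what makes the embedding discriminative.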
26
Sankaramurthy P, Palaniswamy R, Sellamuthu S, Chelladurai F, Murugadhas A. Lung disease prediction based on CT images using REInf-net and world cup optimization based BI-LSTM classification. NETWORK (BRISTOL, ENGLAND) 2024:1-34. [PMID: 39252464 DOI: 10.1080/0954898x.2024.2392782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/02/2023] [Revised: 06/11/2024] [Accepted: 08/08/2024] [Indexed: 09/11/2024]
Abstract
Respiratory illness is a major global source of disability and mortality. Although visual evaluation of computed tomography (CT) images and chest radiographs is a primary diagnostic approach for respiratory illnesses, it is limited in its ability to assess severity and predict patient outcomes, owing to low specificity for the underlying infectious organisms. To address these problems, world cup optimization (WCO)-based Bi-LSTM classification and lung disease prediction on CT images using REInf-Net are employed. To enhance image quality, the gathered lung CT images are pre-processed using the Lucy-Richardson and CLAHE algorithms. The pre-processed images are then segmented using REInf-Net for lung infection segmentation. The gray-level run-length matrix (GLRLM) method is used to extract features from the segmented images. To predict lung disease from CT images, the extracted features are used to train the WCO-based Bi-LSTM. The accuracy, precision, recall, error, and specificity of the proposed model are 97.8%, 96.7%, 96.7%, 2.2%, and 98.3%, respectively. These values are compared with the results of existing methods such as WCO-BiLSTM, MLP, CNN, and LSTM. Overall, lung disease prediction based on CT images using REInf-Net and WCO-based Bi-LSTM classification outperforms the existing models.
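The GLRLM step above builds a gray-level run-length matrix: for each gray level it counts how many runs of each length occur along a scan direction, and texture descriptors such as Short Run Emphasis (SRE) are derived from it. A minimal horizontal-direction sketch in plain Python (the abstract does not specify the directions or exact feature set used, so this is illustrative only):

```python
from collections import defaultdict

def glrlm_horizontal(image):
    """Gray-level run-length matrix along rows.

    Returns a dict mapping (gray_level, run_length) -> run count.
    """
    matrix = defaultdict(int)
    for row in image:
        i = 0
        while i < len(row):
            j = i
            while j < len(row) and row[j] == row[i]:
                j += 1
            matrix[(row[i], j - i)] += 1  # one run of length j - i
            i = j
    return dict(matrix)

def short_run_emphasis(matrix):
    """SRE = (1/N) * sum_{g,r} P(g, r) / r^2; larger for fine textures."""
    total_runs = sum(matrix.values())
    return sum(c / (r * r) for (_, r), c in matrix.items()) / total_runs
```

For example, the image [[0, 0, 1], [1, 1, 1]] contains the runs (0, 2), (1, 1), and (1, 3), giving SRE = (1/4 + 1 + 1/9) / 3 ≈ 0.454.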
Affiliation(s)
- Padmini Sankaramurthy
- Department of Computing Technologies, SRM Institute of Science and Technology, Kattankulathur, Chennai, India
- Renukadevi Palaniswamy
- Department of Computing Technologies, SRM Institute of Science and Technology, Kattankulathur, Chennai, India
- Suseela Sellamuthu
- School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
- Fancy Chelladurai
- Department of Networking and Communications, School of Computing, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chennai, India
- Anand Murugadhas
- Department of Networking and Communications, School of Computing, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chennai, India

27
Zhou S, Yao S, Shen T, Wang Q. A Novel End-to-End Deep Learning Framework for Chip Packaging Defect Detection. SENSORS (BASEL, SWITZERLAND) 2024; 24:5837. [PMID: 39275746 PMCID: PMC11398187 DOI: 10.3390/s24175837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/12/2024] [Revised: 09/04/2024] [Accepted: 09/06/2024] [Indexed: 09/16/2024]
Abstract
As semiconductor chip manufacturing technology advances, chip structures are becoming more complex, leading to an increased likelihood of void defects in the solder layer during packaging. However, identifying void defects in packaged chips remains a significant challenge due to the complex chip background, varying defect sizes and shapes, and blurred boundaries between voids and their surroundings. To address these challenges, we present a deep-learning-based framework for void defect segmentation in chip packaging. The framework consists of two main components: a solder region extraction method and a void defect segmentation network. The solder region extraction method includes a lightweight segmentation network and a rotation correction algorithm that eliminates background noise and accurately captures the solder region of the chip. The void defect segmentation network is designed for efficient and accurate defect segmentation. To cope with the variability of void defect shapes and sizes, we propose a Mamba model-based encoder that uses a visual state space module for multi-scale information extraction. In addition, we propose an interactive dual-stream decoder that uses a feature correlation cross gate module to fuse the streams' features to improve their correlation and produce more accurate void defect segmentation maps. The effectiveness of the framework is evaluated through quantitative and qualitative experiments on our custom X-ray chip dataset. Furthermore, the proposed void defect segmentation framework for chip packaging has been applied to a real factory inspection line, achieving an accuracy of 93.3% in chip qualification.
Affiliation(s)
- Siyi Zhou
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming 650500, China
- Shunhua Yao
- Wuxi Aiyingna Electromechanical Equipment Co., Ltd., Wuxi 214028, China
- Tao Shen
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming 650500, China
- Qingwang Wang
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming 650500, China

28
Vásquez-Venegas C, Sotomayor CG, Ramos B, Castañeda V, Pereira G, Cabrera-Vives G, Härtel S. Human-in-the-Loop-A Deep Learning Strategy in Combination with a Patient-Specific Gaussian Mixture Model Leads to the Fast Characterization of Volumetric Ground-Glass Opacity and Consolidation in the Computed Tomography Scans of COVID-19 Patients. J Clin Med 2024; 13:5231. [PMID: 39274444 PMCID: PMC11396404 DOI: 10.3390/jcm13175231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2024] [Revised: 08/02/2024] [Accepted: 09/02/2024] [Indexed: 09/16/2024] Open
Abstract
Background/Objectives: The accurate quantification of ground-glass opacities (GGOs) and consolidation volumes has prognostic value in COVID-19 patients. Nevertheless, the accurate manual quantification of the corresponding volumes remains a time-consuming task. Deep learning (DL) has demonstrated good performance in the segmentation of normal lung parenchyma and COVID-19 pneumonia. We introduce a Human-in-the-Loop (HITL) strategy for the segmentation of normal lung parenchyma and COVID-19 pneumonia that is both time-efficient and quality-effective. Furthermore, we propose a Gaussian Mixture Model (GMM) to classify GGO and consolidation based on a probabilistic characterization and case-sensitive thresholds. Methods: A total of 65 Computed Tomography (CT) scans from 64 patients, acquired between March 2020 and June 2021, were randomly selected. We pretrained a 3D-UNet with an international dataset and implemented a HITL strategy to refine the local dataset with delineations by teams of medical interns, radiology residents, and radiologists. Following each HITL cycle, the 3D-UNet was re-trained until the Dice Similarity Coefficients (DSCs) reached the quality criteria set by radiologists (DSC = 0.95/0.8 for the normal lung parenchyma/COVID-19 pneumonia). For the probabilistic characterization, a Gaussian Mixture Model (GMM) was fitted to the Hounsfield Units (HUs) of voxels from the CT scans of patients with COVID-19 pneumonia on the assumption that two distinct populations were superimposed: one for GGO and one for consolidation. Results: Manual delineation of the normal lung parenchyma and COVID-19 pneumonia was performed by seven teams on 65 CT scans from 64 patients (56 ± 16 years old (μ ± σ), 46 males, 62 with reported symptoms). Automated lung/COVID-19 pneumonia segmentation with a DSC > 0.96/0.81 was achieved after three HITL cycles. The HITL strategy improved the DSC by 0.2 and 0.5 for the normal lung parenchyma and COVID-19 pneumonia segmentation, respectively.
The distribution of the patient-specific thresholds derived from the GMM yielded a mean of -528.4 ± 99.5 HU (μ ± σ), which is below most of the reported fixed HU thresholds. Conclusions: The HITL strategy allowed for fast and effective annotations, thereby enhancing the quality of segmentation for a local CT dataset. Probabilistic characterization of COVID-19 pneumonia by the GMM enabled patient-specific segmentation of GGO and consolidation. The combination of both approaches is essential to gain confidence in DL approaches in our local environment. The patient-specific probabilistic approach, when combined with the automatic quantification of COVID-19 imaging findings, enhances the understanding of GGO and consolidation during the course of the disease, with the potential to improve the accuracy of clinical predictions.
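The patient-specific threshold described above lies where the two fitted Gaussian components (GGO vs. consolidation) are equally probable. The sketch below fits a two-component 1-D mixture by EM and locates that crossing on synthetic HU values; all numbers here (component means, spreads, sample counts) are invented for illustration and are not the study's data:

```python
import math
import random

def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm2(samples, iters=100):
    """EM for a two-component 1-D Gaussian mixture: (weights, means, variances)."""
    xs = sorted(samples)
    n = len(xs)
    mu = [xs[n // 4], xs[3 * n // 4]]                  # quartile initialisation
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    var = [var_all, var_all]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of component 0 for every sample
        resp = []
        for x in xs:
            p0 = w[0] * normal_pdf(x, mu[0], var[0])
            p1 = w[1] * normal_pdf(x, mu[1], var[1])
            resp.append(p0 / (p0 + p1))
        # M-step: re-estimate weights, means, and variances
        n0 = sum(resp)
        n1 = n - n0
        w = [n0 / n, n1 / n]
        mu = [sum(r * x for r, x in zip(resp, xs)) / n0,
              sum((1 - r) * x for r, x in zip(resp, xs)) / n1]
        var = [sum(r * (x - mu[0]) ** 2 for r, x in zip(resp, xs)) / n0,
               sum((1 - r) * (x - mu[1]) ** 2 for r, x in zip(resp, xs)) / n1]
    return w, mu, var

def equal_posterior_threshold(w, mu, var, steps=10000):
    """Grid-search the HU value between the means where the weighted densities cross."""
    lo, hi = min(mu), max(mu)
    best_t, best_gap = lo, float("inf")
    for k in range(steps + 1):
        t = lo + (hi - lo) * k / steps
        gap = abs(w[0] * normal_pdf(t, mu[0], var[0]) -
                  w[1] * normal_pdf(t, mu[1], var[1]))
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t

# Synthetic voxel HU values (invented numbers, for illustration only):
random.seed(0)
hus = ([random.gauss(-650, 60) for _ in range(500)]     # GGO-like component
       + [random.gauss(-100, 70) for _ in range(300)])  # consolidation-like component
w, mu, var = fit_gmm2(hus)
threshold = equal_posterior_threshold(w, mu, var)
```

Voxels below the resulting threshold would be labelled GGO and those above it consolidation, giving a per-patient cut rather than a fixed HU threshold.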
Affiliation(s)
- Constanza Vásquez-Venegas
- Department of Computer Science, Faculty of Engineering, University of Concepción, Concepción 4030000, Chile
- Laboratory for Scientific Image Analysis SCIAN-Lab, Integrative Biology Program, Institute of Biomedical Sciences, Faculty of Medicine, University of Chile, Santiago 8380453, Chile
- Camilo G. Sotomayor
- Laboratory for Scientific Image Analysis SCIAN-Lab, Integrative Biology Program, Institute of Biomedical Sciences, Faculty of Medicine, University of Chile, Santiago 8380453, Chile
- Radiology Department, University of Chile Clinical Hospital, University of Chile, Santiago 8380420, Chile
- Baltasar Ramos
- School of Medicine, Faculty of Medicine, University of Chile, Santiago 8380453, Chile
- Víctor Castañeda
- Center of Medical Informatics and Telemedicine & National Center of Health Information Systems, Faculty of Medicine, University of Chile, Santiago 8380453, Chile
- Department of Medical Technology, Faculty of Medicine, University of Chile, Santiago 8380453, Chile
- Gonzalo Pereira
- Radiology Department, University of Chile Clinical Hospital, University of Chile, Santiago 8380420, Chile
- Guillermo Cabrera-Vives
- Department of Computer Science, Faculty of Engineering, University of Concepción, Concepción 4030000, Chile
- Steffen Härtel
- Laboratory for Scientific Image Analysis SCIAN-Lab, Integrative Biology Program, Institute of Biomedical Sciences, Faculty of Medicine, University of Chile, Santiago 8380453, Chile
- Center of Medical Informatics and Telemedicine & National Center of Health Information Systems, Faculty of Medicine, University of Chile, Santiago 8380453, Chile
- Biomedical Neuroscience Institute, Faculty of Medicine, University of Chile, Santiago 8380453, Chile
- National Center for Health Information Systems, Santiago 8380453, Chile
- Center of Mathematical Modelling, University of Chile, Santiago 8380453, Chile

29
Senthil R, Anand T, Somala CS, Saravanan KM. Bibliometric analysis of artificial intelligence in healthcare research: Trends and future directions. Future Healthc J 2024; 11:100182. [PMID: 39310219 PMCID: PMC11414662 DOI: 10.1016/j.fhj.2024.100182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2024] [Revised: 08/06/2024] [Accepted: 08/30/2024] [Indexed: 09/25/2024]
Abstract
Objective The presence of artificial intelligence (AI) in healthcare is a powerful, game-changing force that is transforming the industry as a whole. Using sophisticated algorithms and data analytics, AI offers unparalleled prospects for improving patient care, streamlining operational efficiency, and fostering innovation across the healthcare ecosystem. This study conducts a comprehensive bibliometric analysis of research on AI in healthcare, utilising the Scopus database as the primary data source. Methods Preliminary findings from 2013 identified 153 publications on AI and healthcare. Between 2019 and 2023, the number of publications increased exponentially, indicating significant growth and development in the field. The analysis employs various bibliometric indicators to assess research production performance, science mapping techniques, and thematic mapping analysis. Results The study reveals insights into research hotspots, thematic focus, and emerging trends in AI and healthcare research. Based on an extensive examination of the Scopus database, it provides a brief overview of the field and suggests potential avenues for further investigation. Conclusion This article provides valuable contributions to understanding the current landscape of AI in healthcare, offering insights for future research directions and informing strategic decision-making in the field.
Affiliation(s)
- Renganathan Senthil
- Department of Bioinformatics, School of Lifesciences, Vels Institute of Science Technology and Advanced Studies (VISTAS), Pallavaram, Chennai 600117, Tamil Nadu, India
- Thirunavukarasou Anand
- SRIIC Lab, Faculty of Clinical Research, Sri Ramachandra Institute of Higher Education and Research, Chennai 600116, Tamil Nadu, India
- B Aatral Biosciences Private Limited, Bangalore 560091, Karnataka, India
- Konda Mani Saravanan
- B Aatral Biosciences Private Limited, Bangalore 560091, Karnataka, India
- Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai 600073, Tamil Nadu, India

30
Xiao Z, Sun H, Liu F. Semi-supervised CT image segmentation via contrastive learning based on entropy constraints. Biomed Eng Lett 2024; 14:1023-1035. [PMID: 39220023 PMCID: PMC11362456 DOI: 10.1007/s13534-024-00387-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2023] [Revised: 04/01/2024] [Accepted: 04/30/2024] [Indexed: 09/04/2024] Open
Abstract
Deep learning-based methods for fast target segmentation of computed tomography (CT) imaging have become increasingly popular. The success of current deep learning methods usually depends on a large amount of labeled data, and labeling medical data is a time-consuming and laborious task. Therefore, this paper aims to enhance the segmentation of CT images by using a semi-supervised learning method. To utilize the valid information in unlabeled data, we design a semi-supervised network model for contrastive learning based on entropy constraints. We use a CNN and a Transformer to capture the image's local and global feature information, respectively. In addition, the pseudo-labels generated by the teacher networks can be unreliable and will degrade model performance if they are used directly in training. Therefore, unreliable samples with high entropy values are discarded to prevent the model from extracting the wrong features. In the student network, we also introduce a residual squeeze-and-excitation module to learn the connections between the channels of each layer's features and obtain better segmentation performance. We demonstrate the effectiveness of the proposed method on the public COVID-19 CT dataset, considering three evaluation metrics: DSC, HD95, and JC. Compared with several existing state-of-the-art semi-supervised methods, our method improves DSC by 2.3% and JC by 2.5%, and reduces HD95 by 1.9 mm. In summary, a semi-supervised medical image segmentation method is designed by fusing a CNN and a Transformer and utilizing an entropy-constrained contrastive learning loss, which improves the utilization of unlabeled medical images.
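The entropy constraint described above can be sketched as follows: each pixel's predicted class distribution is scored by its Shannon entropy, and pixels whose entropy exceeds a threshold are excluded from the pseudo-label before it supervises the student network. This is an illustrative plain-Python sketch, not the paper's implementation:

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one pixel's softmax distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def filter_pseudo_labels(prob_map, threshold):
    """Keep a pixel's argmax pseudo-label only when its entropy is low.

    prob_map: list of per-pixel class-probability lists.
    Returns a list of class indices, with None for discarded pixels.
    """
    labels = []
    for probs in prob_map:
        if entropy(probs) <= threshold:
            labels.append(max(range(len(probs)), key=probs.__getitem__))
        else:
            labels.append(None)  # unreliable pixel: excluded from training
    return labels
```

A confident prediction such as (0.95, 0.03, 0.02) has low entropy and is kept, while a near-uniform prediction approaches the maximum entropy ln(3) and is dropped.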
Affiliation(s)
- Zhiyong Xiao
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122 Jiangsu China
- Hao Sun
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122 Jiangsu China
- Fei Liu
- Wuxi Hospital of Traditional Chinese Medicine, Wuxi, 214071 Jiangsu China

31
Beste NC, Jende J, Kronlage M, Kurz F, Heiland S, Bendszus M, Meredig H. Automated peripheral nerve segmentation for MR-neurography. Eur Radiol Exp 2024; 8:97. [PMID: 39186183 PMCID: PMC11347527 DOI: 10.1186/s41747-024-00503-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2023] [Accepted: 08/01/2024] [Indexed: 08/27/2024] Open
Abstract
BACKGROUND Magnetic resonance neurography (MRN) is increasingly used as a diagnostic tool for peripheral neuropathies. Quantitative measures enhance MRN interpretation but require nerve segmentation which is time-consuming and error-prone and has not become clinical routine. In this study, we applied neural networks for the automated segmentation of peripheral nerves. METHODS A neural segmentation network was trained to segment the sciatic nerve and its proximal branches on the MRN scans of the right and left upper leg of 35 healthy individuals, resulting in 70 training examples, via 5-fold cross-validation (CV). The model performance was evaluated on an independent test set of one-sided MRN scans of 60 healthy individuals. RESULTS Mean Dice similarity coefficient (DSC) in CV was 0.892 (95% confidence interval [CI]: 0.888-0.897) with a mean Jaccard index (JI) of 0.806 (95% CI: 0.799-0.814) and mean Hausdorff distance (HD) of 2.146 (95% CI: 2.184-2.208). For the independent test set, DSC and JI were lower while HD was higher, with a mean DSC of 0.789 (95% CI: 0.760-0.815), mean JI of 0.672 (95% CI: 0.642-0.699), and mean HD of 2.118 (95% CI: 2.047-2.190). CONCLUSION The deep learning-based segmentation model showed a good performance for the task of nerve segmentation. Future work will focus on extending training data and including individuals with peripheral neuropathies in training to enable advanced peripheral nerve disease characterization. RELEVANCE STATEMENT The results will serve as a baseline to build upon while developing an automated quantitative MRN feature analysis framework for application in routine reading of MRN examinations. KEY POINTS Quantitative measures enhance MRN interpretation, requiring complex and challenging nerve segmentation. We present a deep learning-based segmentation model with good performance. Our results may serve as a baseline for clinical automated quantitative MRN segmentation.
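For binary masks, the Dice and Jaccard metrics reported above are related in closed form: DSC = 2|A∩B|/(|A|+|B|), JI = |A∩B|/|A∪B|, and DSC = 2·JI/(1 + JI). A small illustrative sketch (not the study's evaluation code):

```python
def dice(a, b):
    """Dice similarity coefficient of two binary masks (flat 0/1 lists)."""
    inter = sum(x & y for x, y in zip(a, b))
    size = sum(a) + sum(b)
    return 2 * inter / size if size else 1.0

def jaccard(a, b):
    """Jaccard index (intersection over union) of two binary masks."""
    inter = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return inter / union if union else 1.0
```

For example, a prediction [1, 1, 1, 0] against ground truth [1, 1, 0, 0] gives DSC = 2·2/5 = 0.8 and JI = 2/3, and 2·(2/3)/(1 + 2/3) recovers 0.8, matching the identity.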
Affiliation(s)
- Nedim Christoph Beste
- Institute of Neuroradiology, University Hospital of Heidelberg, Heidelberg, Germany.
- Johann Jende
- Institute of Neuroradiology, University Hospital of Heidelberg, Heidelberg, Germany
- Moritz Kronlage
- Institute of Neuroradiology, University Hospital of Heidelberg, Heidelberg, Germany
- Felix Kurz
- DKFZ German Cancer Research Center, Heidelberg, Germany
- Sabine Heiland
- Institute of Neuroradiology, University Hospital of Heidelberg, Heidelberg, Germany
- Martin Bendszus
- Institute of Neuroradiology, University Hospital of Heidelberg, Heidelberg, Germany
- Hagen Meredig
- Institute of Neuroradiology, University Hospital of Heidelberg, Heidelberg, Germany

32
Fu J, Deng H. MASDF-Net: A Multi-Attention Codec Network with Selective and Dynamic Fusion for Skin Lesion Segmentation. SENSORS (BASEL, SWITZERLAND) 2024; 24:5372. [PMID: 39205066 PMCID: PMC11359664 DOI: 10.3390/s24165372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/10/2024] [Revised: 08/15/2024] [Accepted: 08/16/2024] [Indexed: 09/04/2024]
Abstract
Automated segmentation algorithms for dermoscopic images serve as effective tools that assist dermatologists in clinical diagnosis. While existing deep learning-based skin lesion segmentation algorithms have achieved certain success, challenges remain in accurately delineating the boundaries of lesion regions in dermoscopic images with irregular shapes, blurry edges, and occlusions by artifacts. To address these issues, a multi-attention codec network with selective and dynamic fusion (MASDF-Net) is proposed for skin lesion segmentation in this study. In this network, we use the pyramid vision transformer as the encoder to model the long-range dependencies between features, and we innovatively designed three modules to further enhance the performance of the network. Specifically, the multi-attention fusion (MAF) module allows for attention to be focused on high-level features from various perspectives, thereby capturing more global contextual information. The selective information gathering (SIG) module improves the existing skip-connection structure by eliminating the redundant information in low-level features. The multi-scale cascade fusion (MSCF) module dynamically fuses features from different levels of the decoder part, further refining the segmentation boundaries. We conducted comprehensive experiments on the ISIC 2016, ISIC 2017, ISIC 2018, and PH2 datasets. The experimental results demonstrate the superiority of our approach over existing state-of-the-art methods.
Affiliation(s)
- Hongmin Deng
- School of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China

33
Alshemaimri BK. Novel Deep CNNs Explore Regions, Boundaries, and Residual Learning for COVID-19 Infection Analysis in Lung CT. Tomography 2024; 10:1205-1221. [PMID: 39195726 PMCID: PMC11359787 DOI: 10.3390/tomography10080091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2024] [Revised: 07/06/2024] [Accepted: 07/17/2024] [Indexed: 08/29/2024] Open
Abstract
COVID-19 poses a global health crisis, necessitating precise diagnostic methods for timely containment. However, accurately delineating COVID-19-affected regions in lung CT scans is challenging due to contrast variations and significant texture diversity. In this regard, this study introduces a novel two-stage classification and segmentation CNN approach for COVID-19 lung radiological pattern analysis. A novel Residual-BRNet is developed to integrate boundary and regional operations with residual learning, capturing key COVID-19 radiological homogeneous regions, texture variations, and structural contrast patterns in the classification stage. Subsequently, infectious CT images undergo lesion segmentation using the newly proposed RESeg segmentation CNN in the second stage. The RESeg leverages both average- and max-pooling implementations to simultaneously learn region homogeneity and boundary-related patterns. Furthermore, novel pixel attention (PA) blocks are integrated into RESeg to effectively address mildly COVID-19-infected regions. The evaluation of the proposed Residual-BRNet CNN in the classification stage demonstrates promising performance, achieving an accuracy of 97.97%, an F1-score of 98.01%, a sensitivity of 98.42%, and an MCC of 96.81%. Meanwhile, PA-RESeg in the segmentation phase achieves strong segmentation performance, with an IoU score of 98.43% and a Dice similarity score of 95.96% for the lesion region. The framework's effectiveness in detecting and segmenting COVID-19 lesions highlights its potential for clinical applications.
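The classification metrics quoted above (accuracy, F1-score, sensitivity, MCC) all derive from the binary confusion matrix. A brief illustrative sketch using the standard formulas (the example counts in any usage are invented, not taken from the paper):

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, F1-score, sensitivity (recall), and Matthews correlation
    coefficient from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den
    return accuracy, f1, sensitivity, mcc
```

Unlike accuracy, MCC stays informative on imbalanced data because it uses all four confusion-matrix cells symmetrically.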
Affiliation(s)
- Bader Khalid Alshemaimri
- Software Engineering Department, College of Computing and Information Sciences, King Saud University, Riyadh 11671, Saudi Arabia

34
Zheng J, Wang L, Gui J, Yussuf AH. Study on lung CT image segmentation algorithm based on threshold-gradient combination and improved convex hull method. Sci Rep 2024; 14:17731. [PMID: 39085327 PMCID: PMC11291637 DOI: 10.1038/s41598-024-68409-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2023] [Accepted: 07/23/2024] [Indexed: 08/02/2024] Open
Abstract
Lung images often exhibit strong noise, uneven grayscale distribution, and complex pathological structures, which makes lung image segmentation a challenging task. To solve these problems, this paper proposes an initial lung mask extraction algorithm that combines threshold and gradient information. The gradient used in the algorithm is obtained by a time-series feature extraction method based on differential memory (TFDM), which combines a grayscale threshold with image grayscale features. We also propose a lung contour repair algorithm based on an improved convex hull method to address the contour loss caused by solid nodules and other lesions. Experimental results show that, on the COVID-19 CT segmentation dataset, the lung segmentation algorithm proposed in this article achieves better segmentation results and greatly improves the consistency and accuracy of lung segmentation. Our method can obtain more lung information, resulting in ideal segmentation effects with improved accuracy and robustness.
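The convex-hull repair idea can be illustrated with Andrew's monotone-chain algorithm: boundary points of a lung mask whose contour is dented by a solid juxtapleural nodule are replaced by their convex hull, which bridges the notch. This is a generic sketch; the paper's improved convex hull method adds refinements not reproduced here:

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:                       # build lower hull left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):             # build upper hull right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]      # endpoints shared, drop duplicates
```

A contour point pushed inward by a lesion lies strictly inside the hull of the remaining boundary, so taking the hull removes the dent and restores a plausible lung outline across the lesion.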
Affiliation(s)
- Junbao Zheng
- School of Computer Science and Technology (School of Artificial Intelligence), Zhejiang Sci-tech University, Hangzhou, 310018, Zhejiang, People's Republic of China
- Lixian Wang
- School of Information Science and Engineering, Zhejiang Sci-tech University, Hangzhou, 310018, Zhejiang, People's Republic of China
- Jiangsheng Gui
- School of Computer Science and Technology (School of Artificial Intelligence), Zhejiang Sci-tech University, Hangzhou, 310018, Zhejiang, People's Republic of China.
- Abdulla Hamad Yussuf
- School of Computer Science and Technology (School of Artificial Intelligence), Zhejiang Sci-tech University, Hangzhou, 310018, Zhejiang, People's Republic of China

35
Newson KS, Benoit DM, Beavis AW. Encoder-decoder convolutional neural network for simple CT segmentation of COVID-19 infected lungs. PeerJ Comput Sci 2024; 10:e2178. [PMID: 39145207] [PMCID: PMC11323195] [DOI: 10.7717/peerj-cs.2178] [Received: 08/06/2023] [Accepted: 06/17/2024] [Indexed: 08/16/2024]
Abstract
This work presents the application of an encoder-decoder convolutional neural network (ED-CNN) model to automatically segment COVID-19 computed tomography (CT) data. In doing so we produce an alternative to models in the current literature that is easy to follow and reproduce, making it more accessible for real-world applications, as little training would be required to use it. Our simple approach achieves results comparable to those of previously published studies that use more complex deep-learning networks. We demonstrate high-quality automated segmentation of thoracic CT scans that correctly delineates the infected regions of the lungs. This automation can be used as a tool to speed up the contouring process, either to check manual contours in place of peer checking when that is not possible, or to give a rapid indication of infection for referral to further treatment, thus saving time and resources. In contrast, manual contouring is a time-consuming process in which a professional contours each patient one by one, to be checked later by another professional. The proposed model uses approximately 49k parameters, while others average over 1,000 times more. As our approach relies on a very compact model, shorter training times are observed, which makes it possible to retrain the model easily on other data and potentially afford "personalised medicine" workflows. The model achieves similarity scores of specificity (Sp) = 0.996 ± 0.001, accuracy (Acc) = 0.994 ± 0.002 and mean absolute error (MAE) = 0.0075 ± 0.0005.
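The similarity scores quoted above (Sp, Acc, MAE) are standard pixel-wise measures on binary masks; a minimal numpy sketch follows (not the authors' evaluation code; the toy masks are invented):

```python
import numpy as np

def mask_scores(pred, gt):
    """Pixel-wise specificity, accuracy and MAE for binary masks.

    pred, gt: arrays of the same shape (True = infected pixel).
    """
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tn = np.sum(~pred & ~gt)
    fp = np.sum(pred & ~gt)
    sp = tn / (tn + fp)                # specificity: TN / (TN + FP)
    acc = np.mean(pred == gt)          # fraction of correctly labelled pixels
    mae = np.mean(np.abs(pred.astype(float) - gt.astype(float)))
    return sp, acc, mae

gt = np.zeros((4, 4), bool); gt[1:3, 1:3] = True      # 4 true-positive pixels
pred = np.zeros((4, 4), bool); pred[1:3, 1:4] = True  # one spurious extra column
sp, acc, mae = mask_scores(pred, gt)
# TN = 10, FP = 2, so Sp = 10/12; 14 of 16 pixels agree, so Acc = 0.875, MAE = 0.125
```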
Affiliation(s)
- Kiri S. Newson
- Department of Physics and Mathematics, University of Hull, Hull, United Kingdom
- David M. Benoit
- E. A. Milne Centre for Astrophysics, Department of Physics and Mathematics, University of Hull, Hull, United Kingdom
- Andrew W. Beavis
- Medical Physics Department, Queen’s Centre for Oncology, Hull University Teaching Hospitals NHS Trust, Cottingham, Hull, United Kingdom
- Medical Physics and Biomedical Engineering, University College London, University of London, London, United Kingdom
- Hull York Medical School, University of Hull, Hull, United Kingdom

36
Kanwal K, Asif M, Khalid SG, Liu H, Qurashi AG, Abdullah S. Current Diagnostic Techniques for Pneumonia: A Scoping Review. Sensors (Basel) 2024; 24:4291. [PMID: 39001069] [PMCID: PMC11244398] [DOI: 10.3390/s24134291] [Received: 05/29/2024] [Revised: 06/22/2024] [Accepted: 06/28/2024] [Indexed: 07/16/2024]
Abstract
Community-acquired pneumonia is one of the most lethal infectious diseases, especially for infants and the elderly. Given the variety of causative agents, the accurate early detection of pneumonia is an active research area. To the best of our knowledge, scoping reviews on diagnostic techniques for pneumonia are lacking. In this scoping review, three major electronic databases were searched and the resulting research was screened. We categorized these diagnostic techniques into four classes (i.e., lab-based methods, imaging-based techniques, acoustic-based techniques, and physiological-measurement-based techniques) and summarized their recent applications. Major research has been skewed towards imaging-based techniques, especially after COVID-19. Currently, chest X-rays and blood tests are the most common tools in the clinical setting to establish a diagnosis; however, there is a need to look for safe, non-invasive, and more rapid techniques for diagnosis. Recently, some non-invasive techniques based on wearable sensors achieved reasonable diagnostic accuracy that could open a new chapter for future applications. Consequently, further research and technology development are still needed for pneumonia diagnosis using non-invasive physiological parameters to attain a better point of care for pneumonia patients.
Affiliation(s)
- Kehkashan Kanwal
- College of Speech, Language, and Hearing Sciences, Ziauddin University, Karachi 75000, Pakistan
- Muhammad Asif
- Faculty of Computing and Applied Sciences, Sir Syed University of Engineering and Technology, Karachi 75300, Pakistan
- Syed Ghufran Khalid
- Department of Engineering, Faculty of Science and Technology, Nottingham Trent University, Nottingham B15 3TN, UK
- Haipeng Liu
- Research Centre for Intelligent Healthcare, Coventry University, Coventry CV1 5FB, UK
- Saad Abdullah
- School of Innovation, Design and Engineering, Mälardalen University, 721 23 Västerås, Sweden

37
Kumar S, Bhowmik B. Automated Segmentation of COVID-19 Infected Lungs via Modified U-Net Model. 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT) 2024:1-7. [DOI: 10.1109/icccnt61001.2024.10724997] [Indexed: 05/02/2025]
Affiliation(s)
- Sunil Kumar
- National Institute of Technology, Surathkal, Maharshi Patanjali CPS Lab, BRICS Laboratory, Department of Computer Science and Engineering, Mangalore, Karnataka, Bharat, 575025
- Biswajit Bhowmik
- National Institute of Technology, Surathkal, Maharshi Patanjali CPS Lab, BRICS Laboratory, Department of Computer Science and Engineering, Mangalore, Karnataka, Bharat, 575025

38
Bougourzi F, Dornaika F, Distante C, Taleb-Ahmed A. D-TrAttUnet: Toward hybrid CNN-transformer architecture for generic and subtle segmentation in medical images. Comput Biol Med 2024; 176:108590. [PMID: 38763066] [DOI: 10.1016/j.compbiomed.2024.108590] [Received: 11/14/2023] [Revised: 04/16/2024] [Accepted: 05/09/2024] [Indexed: 05/21/2024]
Abstract
Over the past two decades, machine analysis of medical imaging has advanced rapidly, opening up significant potential for several important medical applications. As complicated diseases increase and the number of cases rises, the role of machine-based imaging analysis has become indispensable. It serves as both a tool and an assistant to medical experts, providing valuable insights and guidance. A particularly challenging task in this area is lesion segmentation, a task that is challenging even for experienced radiologists. The complexity of this task highlights the urgent need for robust machine learning approaches to support medical staff. In response, we present our novel solution: the D-TrAttUnet architecture. This framework is based on the observation that different diseases often target specific organs. Our architecture includes an encoder-decoder structure with a composite Transformer-CNN encoder and dual decoders. The encoder includes two paths: the Transformer path and the Encoders Fusion Module path. The Dual-Decoder configuration uses two identical decoders, each with attention gates. This allows the model to simultaneously segment lesions and organs and integrate their segmentation losses. To validate our approach, we performed evaluations on the Covid-19 and Bone Metastasis segmentation tasks. We also investigated the adaptability of the model by testing it without the second decoder in the segmentation of glands and nuclei. The results confirmed the superiority of our approach, especially in Covid-19 infections and the segmentation of bone metastases. In addition, the hybrid encoder showed exceptional performance in the segmentation of glands and nuclei, solidifying its role in modern medical image analysis.
Affiliation(s)
- Fares Bougourzi
- Junia, UMR 8520, CNRS, Centrale Lille, University of Polytechnique Hauts-de-France, 59000 Lille, France
- Fadi Dornaika
- University of the Basque Country UPV/EHU, San Sebastian, Spain; IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
- Cosimo Distante
- Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, 73100 Lecce, Italy
- Abdelmalik Taleb-Ahmed
- Université Polytechnique Hauts-de-France, Université de Lille, CNRS, Valenciennes, 59313, Hauts-de-France, France

39
Qiu Y, Liu Y, Li S, Xu J. MiniSeg: An Extremely Minimum Network Based on Lightweight Multiscale Learning for Efficient COVID-19 Segmentation. IEEE Trans Neural Netw Learn Syst 2024; 35:8570-8584. [PMID: 37015641] [DOI: 10.1109/tnnls.2022.3230821] [Indexed: 06/19/2023]
Abstract
The rapid spread of the new pandemic, i.e., coronavirus disease 2019 (COVID-19), has severely threatened global health. Deep-learning-based computer-aided screening, e.g., segmentation of COVID-19-infected areas from computed tomography (CT) images, has attracted much attention as an adjunct to increase the accuracy of COVID-19 screening and clinical diagnosis. Although lesion segmentation is a hot topic, traditional deep learning methods are usually data-hungry, with millions of parameters, and easy to overfit under the limited available COVID-19 training data. On the other hand, fast training/testing and low computational cost are also necessary for quick deployment and development of COVID-19 screening systems, but traditional methods are usually computationally intensive. To address these two problems, we propose MiniSeg, a lightweight model for efficient COVID-19 segmentation from CT images. Our efforts start with the design of an attentive hierarchical spatial pyramid (AHSP) module for lightweight, efficient, effective multiscale learning that is essential for image segmentation. Then, we build a two-path (TP) encoder for deep feature extraction, where one path uses AHSP modules for learning multiscale contextual features and the other is a shallow convolutional path for capturing fine details. The two paths interact with each other to learn effective representations. Based on the extracted features, a simple decoder is added for COVID-19 segmentation. To compare MiniSeg with previous methods, we build a comprehensive COVID-19 segmentation benchmark. Extensive experiments demonstrate that the proposed MiniSeg achieves better accuracy because its only 83k parameters make it less prone to overfitting. Its high efficiency also makes it easy to deploy and develop. The code has been released at https://github.com/yun-liu/MiniSeg.
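The parameter budgets contrasted here (83k versus millions) come from tallying layer weights; for a 2-D convolution the count is out_ch x (in_ch x k x k) plus out_ch biases. A quick sketch, with layer shapes invented purely for illustration (not MiniSeg's actual architecture):

```python
def conv2d_params(in_ch, out_ch, k, bias=True):
    """Parameter count of a k x k 2-D convolution layer."""
    return out_ch * in_ch * k * k + (out_ch if bias else 0)

# A toy three-layer encoder; channel widths are illustrative only:
layers = [(1, 16, 3), (16, 32, 3), (32, 64, 3)]
total = sum(conv2d_params(i, o, k) for i, o, k in layers)
# 160 + 4640 + 18496 = 23296 parameters
```

Doubling channel widths roughly quadruples the convolution count, which is why compact designs like the one above stay in the tens of thousands of parameters while conventional segmentation networks reach millions.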
40
Ying X, Peng H, Xie J. Big data analysis for Covid-19 in hospital information systems. PLoS One 2024; 19:e0294481. [PMID: 38776299] [PMCID: PMC11111070] [DOI: 10.1371/journal.pone.0294481] [Received: 06/25/2023] [Accepted: 10/31/2023] [Indexed: 05/24/2024]
Abstract
The COVID-19 pandemic has triggered a global public health crisis, affecting hundreds of countries. With the increasing number of infected cases, developing automated COVID-19 identification tools based on CT images can effectively assist clinical diagnosis and reduce the tedious workload of image interpretation. To expand the dataset for machine learning methods, it is necessary to aggregate cases from different medical systems to learn robust and generalizable models. This paper proposes a novel deep learning joint framework that can effectively handle heterogeneous datasets with distribution discrepancies for accurate COVID-19 identification. We address the cross-site domain shift by redesigning COVID-Net's network architecture and learning strategy, and by applying independent feature normalization in latent space, to improve prediction accuracy and learning efficiency. Additionally, we propose a contrastive training objective to enhance the domain invariance of semantic embeddings and boost classification performance on each dataset. We develop and evaluate our method on two large-scale public COVID-19 diagnosis datasets containing CT images. Extensive experiments show that our method consistently improves performance on both datasets, outperforming the original COVID-Net trained on each dataset by 13.27% and 15.15% in AUC, respectively, and also exceeding existing state-of-the-art multi-site learning methods.
Affiliation(s)
- Xinpa Ying
- Hospital of Chengdu University of TCM, Chengdu, Sichuan, China
- Haiyang Peng
- Hospital of Chengdu University of TCM, Chengdu, Sichuan, China
- Jun Xie
- Hospital of Chengdu University of TCM, Chengdu, Sichuan, China

41
Shafi SM, Chinnappan SK. Segmenting and classifying lung diseases with M-Segnet and Hybrid Squeezenet-CNN architecture on CT images. PLoS One 2024; 19:e0302507. [PMID: 38753712] [PMCID: PMC11098347] [DOI: 10.1371/journal.pone.0302507] [Received: 02/20/2024] [Accepted: 04/07/2024] [Indexed: 05/18/2024]
Abstract
Diagnosing lung diseases accurately and promptly is essential for effectively managing this significant public health challenge on a global scale. This paper introduces a new framework called Modified Segnet-based Lung Disease Segmentation and Severity Classification (MSLDSSC). The MSLDSSC model comprises four phases: preprocessing, segmentation, feature extraction, and classification. Initially, the input image undergoes preprocessing using an improved Wiener filter technique, which estimates the power spectral density of the noisy and original images and computes the SNR, assisted by the PSNR, to evaluate image quality. Next, the preprocessed image undergoes segmentation to identify and separate the region of interest (RoI) from the background objects in the lung image. We employ a modified Segnet mechanism that utilizes a proposed hard tanh-Softplus activation function for effective segmentation. Following segmentation, features such as MLDN, entropy with MRELBP, shape features, and deep features are extracted. The retrieved feature set is then input into a hybrid severity classification model comprising two classifiers, SDPA-Squeezenet and DCNN, which train on the retrieved feature set and effectively classify the severity level of lung diseases.
Affiliation(s)
- Syed Mohammed Shafi
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India

42
Swinburne N. When the Student Becomes the Master: Boosting Intracranial Hemorrhage Detection Generalizability with Teacher-Student Learning. Radiol Artif Intell 2024; 6:e240126. [PMID: 38597790] [PMCID: PMC11140502] [DOI: 10.1148/ryai.240126] [Received: 02/29/2024] [Revised: 03/07/2024] [Accepted: 03/08/2024] [Indexed: 04/11/2024]
Affiliation(s)
- Nathaniel Swinburne
- Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065

43
Du W, Huo Y, Zhou R, Sun Y, Tang S, Zhao X, Li Y, Li G. Consistency label-activated region generating network for weakly supervised medical image segmentation. Comput Biol Med 2024; 173:108380. [PMID: 38555701] [DOI: 10.1016/j.compbiomed.2024.108380] [Received: 11/28/2023] [Revised: 03/04/2024] [Accepted: 03/24/2024] [Indexed: 04/02/2024]
Abstract
Current methods for auto-segmenting medical images are limited by insufficient and ambiguous pathomorphological labeling. In clinical practice, rough classification labels (such as disease or normal) are more commonly used than precise segmentation masks. However, much remains to be explored in utilizing these weak clinical labels to accurately determine the lesion mask and guide medical image segmentation. In this paper, we propose a weakly supervised medical image segmentation model that directly generates the lesion mask through a class activation map (CAM)-guided cycle-consistency label-activated region transferring network. Cycle-consistency enforces that the mappings between the two domains be reversible, ensuring that the original image can be reconstructed from the translated image. We developed a complementary branches fusion module to address the issue of blurry boundaries in CAM-guided segmentation. The complementary branch preserves the original semantic information of the non-lesion region and fuses the transferred feature of the lesion region with a complementary mask-constrained fake image generation process, clarifying the boundary between lesion and non-lesion regions. This module allows the class transformation to focus solely on the label-activated region, resulting in more explicit segmentation. The model can accurately identify different regions of medical images at the pixel level while preserving the overall semantic structure, and it organizes disease labels and corresponding regions during image synthesis. Our method utilizes a joint discrimination strategy that significantly enhances the precision of the produced lesion mask. Extensive experiments on the BraTS, ISIC and COVID-19 datasets demonstrate superior performance over existing state-of-the-art methods. The code and datasets are available at: https://github.com/mlcb-jlu/MedImgSeg.
Affiliation(s)
- Wei Du
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Yongkang Huo
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Rixin Zhou
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Yu Sun
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Shiyi Tang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Xuan Zhao
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Ying Li
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
- Gaoyang Li
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai, 200092, China; Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China

44
Hu Y, Mu N, Liu L, Zhang L, Jiang J, Li X. Slimmable transformer with hybrid axial-attention for medical image segmentation. Comput Biol Med 2024; 173:108370. [PMID: 38564854] [PMCID: PMC11772084] [DOI: 10.1016/j.compbiomed.2024.108370] [Received: 08/27/2023] [Revised: 03/14/2024] [Accepted: 03/24/2024] [Indexed: 04/04/2024]
Abstract
The transformer architecture has achieved remarkable success in medical image analysis owing to its powerful capability for capturing long-range dependencies. However, due to the lack of intrinsic inductive bias in modeling visual structural information, the transformer generally requires a large-scale pre-training schedule, limiting clinical applications over expensive small-scale medical data. To this end, we propose a slimmable transformer that explores intrinsic inductive bias via position information for medical image segmentation. Specifically, we empirically investigate how different position encoding strategies affect the prediction quality of the region of interest (ROI) and observe that ROIs are sensitive to the choice of position encoding strategy. Motivated by this, we present a novel Hybrid Axial-Attention (HAA) that can be equipped with pixel-level spatial structure and relative position information as inductive bias. Moreover, we introduce a gating mechanism to achieve efficient feature selection and further improve representation quality over small-scale datasets. Experiments on the LGG and COVID-19 datasets prove the superiority of our method over the baseline and previous works. Internal workflow visualization with interpretability is conducted to further validate our approach; the proposed slimmable transformer has the potential to be developed into a visual software tool for improving computer-aided lesion diagnosis and treatment planning.
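Position encoding, the inductive bias this work probes, can be made concrete with the classic absolute sinusoidal scheme, one of the standard strategies such studies compare (this is the textbook formulation, not the paper's proposed HAA):

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    """Classic absolute sinusoidal position encoding (d_model must be even).

    pe[pos, 2i]   = sin(pos / 10000**(2i / d_model))
    pe[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
    """
    pos = np.arange(seq_len)[:, None]                 # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]             # even feature indices
    angles = pos / np.power(10000.0, i / d_model)     # (seq_len, d_model // 2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = sinusoidal_pe(seq_len=16, d_model=8)
```

Each position gets a unique, smoothly varying code (at position 0 the sine channels are 0 and the cosine channels are 1), which is exactly the kind of spatial prior a transformer otherwise has to learn from large-scale pre-training.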
Affiliation(s)
- Yiyue Hu
- College of Computer Science, Sichuan Normal University, Chengdu, 610101, China
- Nan Mu
- College of Computer Science, Sichuan Normal University, Chengdu, 610101, China; Department of Biomedical Engineering, Michigan Technological University, Houghton, MI, 49931, USA; Visual Computing and Virtual Reality Key Laboratory of Sichuan, Sichuan Normal University, Chengdu, 610068, China
- Lei Liu
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Shenzhen, 518172, China
- Lei Zhang
- College of Computer Science, Sichuan Normal University, Chengdu, 610101, China
- Jingfeng Jiang
- Department of Biomedical Engineering, Michigan Technological University, Houghton, MI, 49931, USA
- Xiaoning Li
- College of Computer Science, Sichuan Normal University, Chengdu, 610101, China; Education Big Data Collaborative Innovation Center of Sichuan 2011, Chengdu, 610101, China

45
Zhang J, Wang S, Jiang Z, Chen Z, Bai X. CD-Net: Cascaded 3D Dilated convolutional neural network for pneumonia lesion segmentation. Comput Biol Med 2024; 173:108311. [PMID: 38513395] [DOI: 10.1016/j.compbiomed.2024.108311] [Received: 09/25/2023] [Revised: 02/22/2024] [Accepted: 03/12/2024] [Indexed: 03/23/2024]
Abstract
COVID-19 is a global pandemic that has caused significant global social and economic disruption. To effectively assist in screening and monitoring diagnosed cases, it is crucial to accurately segment lesions from computed tomography (CT) scans. Due to the lack of labeled data and the presence of redundant parameters in 3D CT, significant challenges remain in diagnosing COVID-19 in related fields. To address this problem, we have developed a new model called the Cascaded 3D Dilated convolutional neural network (CD-Net) for directly processing CT volume data. To reduce the memory consumption of cutting volume data into small patches, we first design a cascade architecture in CD-Net to preserve global information. Then, we construct a Multi-scale Parallel Dilated Convolution (MPDC) block to aggregate features of different sizes while simultaneously reducing the number of parameters. Moreover, to alleviate the shortage of labeled data, we employ classical transfer learning, which requires only a small amount of data while achieving better performance. Experimental results on different publicly available datasets verify that the proposed CD-Net reduces the negative-positive ratio and outperforms other existing segmentation methods while requiring less data.
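The multi-scale dilated convolutions behind blocks like MPDC trade parameters for context: a k x k kernel at dilation rate d spans an effective extent of k + (k - 1)(d - 1) pixels while keeping only k x k weights. A small sketch (the dilation rates are illustrative, not the paper's exact configuration):

```python
def effective_kernel(k, d):
    """Effective spatial extent of a k x k kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

# Parallel 3x3 branches at increasing dilation rates, as in a
# multi-scale dilated block (rates here are illustrative):
rates = [1, 2, 4, 8]
extents = [effective_kernel(3, d) for d in rates]
# 3x3 weights per branch, but extents of 3, 5, 9 and 17 pixels
```

Running the branches in parallel and aggregating their outputs captures lesions of different sizes without the parameter cost of genuinely larger kernels, which is the stated goal of the MPDC design.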
Affiliation(s)
- Jinli Zhang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Shaomeng Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Zongli Jiang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Zhijie Chen
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Xiaolu Bai
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

46
Zhang Y, Feng X, Dong Y, Chen Y, Zhao Z, Yang B, Chang Y, Bai Y. SM-GRSNet: sparse mapping-based graph representation segmentation network for honeycomb lung lesion. Phys Med Biol 2024; 69:085020. [PMID: 38417177] [DOI: 10.1088/1361-6560/ad2e6b] [Received: 10/21/2023] [Accepted: 02/28/2024] [Indexed: 03/01/2024]
Abstract
Objective. Honeycomb lung is a rare but severe disease characterized by honeycomb-like imaging features and distinct radiological characteristics. This study aims to develop a deep-learning model capable of segmenting honeycomb lung lesions from computed tomography (CT) scans, addressing the efficacy issue in honeycomb lung segmentation. Methods. This study proposes a sparse mapping-based graph representation segmentation network (SM-GRSNet). SM-GRSNet integrates an attention affinity mechanism to effectively filter redundant features at a coarse-grained region level; the attention encoder generated by this mechanism focuses specifically on the lesion area. Additionally, we introduce a graph representation module based on sparse links, on which graph representation operations yield detailed lesion segmentation results. Finally, we construct a pyramid-structured cascaded decoder in SM-GRSNet that combines features from the sparse-link graph representation modules and attention encoders to generate the final segmentation mask. Results. Experimental results demonstrate that the proposed SM-GRSNet achieves state-of-the-art performance on a dataset comprising 7170 honeycomb lung CT images, attaining the highest IoU (87.62%) and Dice (93.41%) as well as the lowest HD95 (6.95) and ASD (2.47). Significance. The proposed SM-GRSNet can be used for automatic segmentation of honeycomb lung CT images and enhances segmentation performance on small-sample datasets. It will help doctors with early screening, accurate diagnosis, and customized treatment. The method maintains high correlation and consistency between the automatic segmentation results and expert manual segmentations; accurate automatic segmentation of the honeycomb lung lesion area is clinically important.
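The overlap metrics reported here are tightly related: Dice = 2·IoU/(1 + IoU), which is why IoU of 87.62% corresponds to Dice of 93.41%. A minimal numpy sketch of both (not the authors' evaluation code; HD95 and ASD require boundary distance transforms and are omitted):

```python
import numpy as np

def dice_iou(pred, gt):
    """Dice coefficient and IoU (Jaccard index) for binary masks."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.sum(pred & gt)
    union = np.sum(pred | gt)
    iou = inter / union
    dice = 2 * inter / (pred.sum() + gt.sum())
    return dice, iou

# Toy masks, invented for illustration:
gt = np.zeros((4, 4), bool); gt[1:3, 1:3] = True
pred = np.zeros((4, 4), bool); pred[1:3, 1:4] = True
dice, iou = dice_iou(pred, gt)
# inter = 4, union = 6: IoU = 4/6, Dice = 2*4/(6+4) = 0.8
```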
Affiliation(s)
- Yuanrong Zhang
- School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Xiufang Feng
- School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Yunyun Dong
- School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Ying Chen
- School of International Education, Beijing University of Chemical Technology, Beijing 100029, People's Republic of China
- Zian Zhao
- School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Bingqian Yang
- School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Yunqing Chang
- School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China
- Yujie Bai
- School of Software, Taiyuan University of Technology, Taiyuan 030024, People's Republic of China

47
Miao J, Zhou SP, Zhou GQ, Wang KN, Yang M, Zhou S, Chen Y. SC-SSL: Self-Correcting Collaborative and Contrastive Co-Training Model for Semi-Supervised Medical Image Segmentation. IEEE Trans Med Imaging 2024; 43:1347-1364. [PMID: 37995173] [DOI: 10.1109/tmi.2023.3336534] [Indexed: 11/25/2023]
Abstract
Image segmentation achieves significant improvements with deep neural networks on the premise of a large amount of labeled training data, which is laborious to assure in medical image tasks. Recently, semi-supervised learning (SSL) has shown great potential in medical image segmentation. However, the quality of the learning target for unlabeled data is usually neglected in these SSL methods. Therefore, this study proposes a novel self-correcting co-training scheme to learn a better target, one more similar to ground-truth labels, from collaborative network outputs. Our work has three highlights. First, we advance learning-target generation as a learning task, improving the learning confidence for unannotated data with a self-correcting module. Second, we impose a structure constraint to further encourage shape similarity between the improved learning target and the collaborative network outputs. Finally, we propose an innovative pixel-wise contrastive learning loss to boost the representation capacity under the guidance of the improved learning target, thus exploring unlabeled data more efficiently with awareness of semantic context. We have extensively evaluated our method against state-of-the-art semi-supervised approaches on four publicly available datasets: the ACDC, M&Ms, Pancreas-CT, and Task_07 CT datasets. Experimental results with different labeled-data ratios show our proposed method's superiority over existing methods, demonstrating its effectiveness in semi-supervised medical image segmentation.
48
Ike CS, Muhammad N, Bibi N, Alhazmi S, Eoghan F. Discriminative context-aware network for camouflaged object detection. Front Artif Intell 2024; 7:1347898. [PMID: 38601112] [PMCID: PMC11004367] [DOI: 10.3389/frai.2024.1347898] [Received: 12/01/2023] [Accepted: 03/12/2024] [Indexed: 04/12/2024]
Abstract
Introduction: Animals use camouflage (background matching, disruptive coloration, etc.) for protection, confusing predators and making detection difficult. Camouflaged Object Detection (COD) tackles this challenge by identifying objects that blend seamlessly into their surroundings. Existing COD techniques struggle with hidden objects due to the noisy inferences inherent in natural environments. To address this, we propose the Discriminative Context-aware Network (DiCANet) for improved COD performance.
Methods: DiCANet addresses camouflage challenges through a two-stage approach. First, an adaptive restoration block intelligently learns feature weights, prioritizing informative channels and pixels. This enhances the ability of convolutional neural networks to represent diverse data and handle complex camouflage. Second, a cascaded detection module with an enlarged receptive field refines the object prediction map, achieving clear boundaries without post-processing.
Results: Without post-processing, DiCANet achieves state-of-the-art performance on challenging COD datasets (CAMO, CHAMELEON, COD10K) by generating accurate saliency maps with rich contextual details and precise boundaries.
Discussion: DiCANet tackles the challenge of identifying camouflaged objects in noisy environments with its two-stage restoration and cascaded detection approach. This innovative architecture surpasses existing methods in COD tasks, as demonstrated by experiments on benchmark datasets.
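The channel- and pixel-level re-weighting performed by the adaptive restoration block can be sketched as follows. This is a simplified stand-in under assumed shapes, not DiCANet's actual block: the squeeze-excite-style channel gate, the per-location pixel gate, and the weight matrices `w_channel` and `w_pixel` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_feature_weighting(feat, w_channel, w_pixel):
    """Re-weight a (C, H, W) feature map along channels and pixels.

    The (learned) weights are passed in rather than trained here:
    w_channel (C, C) produces per-channel gates from globally pooled
    statistics; w_pixel (C,) produces a per-location gate.
    """
    C, H, W = feat.shape
    # channel gate: squeeze (global average pool) then excite
    squeezed = feat.mean(axis=(1, 2))                 # (C,)
    channel_gate = sigmoid(w_channel @ squeezed)      # (C,) in (0, 1)
    feat = feat * channel_gate[:, None, None]
    # pixel gate: per-location score across channels
    pixel_gate = sigmoid(np.tensordot(w_pixel, feat, axes=([0], [0])))  # (H, W)
    return feat * pixel_gate[None, :, :]
```

Because both gates lie in (0, 1), the block can only attenuate (never amplify) activations, which is one common way to suppress uninformative channels and pixels.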
Affiliation(s)
- Nazeer Muhammad
- School of Computing, Pak-Austria Fachhochschule Institute of Applied Sciences and Technology, Haripur, Pakistan
- Nargis Bibi
- Department of Computer Science, Fatima Jinnah Women University, Rawalpindi, Pakistan
- Samah Alhazmi
- Computer Science Department, College of Computing and Informatics, Saudi Electronic University, Riyadh, Saudi Arabia
- Furey Eoghan
- Department of Computing, Atlantic Technological University, Letterkenny, Ireland
|
49
|
Chen J, Huang G, Yuan X, Zhong G, Zheng Z, Pun CM, Zhu J, Huang Z. Quaternion Cross-Modality Spatial Learning for Multi-Modal Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:1412-1423. [PMID: 38145537 DOI: 10.1109/jbhi.2023.3346529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2023]
Abstract
Recently, Deep Neural Networks (DNNs) have had a large impact on image processing, including medical image segmentation, and the real-valued convolution of DNNs has been extensively utilized in multi-modal medical image segmentation to accurately segment lesions by learning from data. However, the weighted-summation operation in such convolution limits the ability to maintain the spatial dependence that is crucial for identifying different lesion distributions. In this paper, we propose a novel Quaternion Cross-modality Spatial Learning (Q-CSL) method that explores spatial information while considering the linkage between multi-modal images. Specifically, we introduce quaternions to represent data and coordinates that contain spatial information. Additionally, we propose a Quaternion Spatial-association Convolution to learn this spatial information. Subsequently, the proposed De-level Quaternion Cross-modality Fusion (De-QCF) module excavates inner-space features and fuses cross-modality spatial dependency. Our experimental results demonstrate that our approach performs well compared to competitive methods, with only 0.01061M parameters and 9.95G FLOPs.
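The core mixing operation behind quaternion convolutions of this kind is the Hamilton product, which ties the four feature components to a shared set of weight components; this sharing is where the parameter savings of quaternion layers generally come from. A minimal sketch (standard quaternion algebra, not the paper's Q-CSL layer itself):

```python
import numpy as np

def hamilton_product(q, w):
    """Hamilton product of quaternions q = (r, x, y, z) and w.

    In a quaternion convolution, a 4-component feature vector is mixed
    with a 4-component weight via this product, so all four output
    components reuse the same four weight scalars.
    """
    r1, x1, y1, z1 = q
    r2, x2, y2, z2 = w
    return np.array([
        r1 * r2 - x1 * x2 - y1 * y2 - z1 * z2,   # real part
        r1 * x2 + x1 * r2 + y1 * z2 - z1 * y2,   # i component
        r1 * y2 - x1 * z2 + y1 * r2 + z1 * x2,   # j component
        r1 * z2 + x1 * y2 - y1 * x2 + z1 * r2,   # k component
    ])
```

Two standard properties make this easy to sanity-check: the identity quaternion (1, 0, 0, 0) leaves any quaternion unchanged, and norms are multiplicative under the Hamilton product.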
|
50
|
Wang Z, Yu L, Tian S, Huo X. CRMEFNet: A coupled refinement, multiscale exploration and fusion network for medical image segmentation. Comput Biol Med 2024; 171:108202. [PMID: 38402839 DOI: 10.1016/j.compbiomed.2024.108202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2023] [Revised: 12/22/2023] [Accepted: 02/18/2024] [Indexed: 02/27/2024]
Abstract
Accurate segmentation of target areas in medical images, such as lesions, is essential for disease diagnosis and clinical analysis. In recent years, deep learning methods have been intensively researched and have generated significant progress in medical image segmentation tasks. However, most existing methods have limitations in modeling multilevel feature representations and in identifying complex textured pixels at contrasting boundaries. This paper proposes a novel coupled refinement, multiscale exploration and fusion network (CRMEFNet) for medical image segmentation, which explores the optimization and fusion of multiscale features to address the abovementioned limitations. The CRMEFNet consists of three main innovations: a coupled refinement module (CRM), a multiscale exploration and fusion module (MEFM), and a cascaded progressive decoder (CPD). The CRM decouples features into low-frequency body features and high-frequency edge features, and performs targeted optimization of both to enhance the intraclass uniformity and interclass differentiation of features. The MEFM performs a two-stage exploration and fusion of multiscale features using our proposed multiscale aggregation attention mechanism, which explores the differentiated information within cross-level features and enhances the contextual connections between them, to achieve adaptive feature fusion. Compared to existing complex decoders, the CPD decoder (consisting of the CRM and MEFM) can perform fine-grained pixel recognition while retaining complete semantic location information. It also has a simple design and excellent performance. The experimental results from five medical image segmentation tasks, ten datasets and twelve comparison models demonstrate the state-of-the-art performance, interpretability, flexibility and versatility of our CRMEFNet.
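The decoupling of a feature map into a low-frequency "body" and a high-frequency "edge" component, as the CRM performs, can be illustrated with a simple local-mean split. This is a generic stand-in under assumed details (a box-filter smoother with edge padding and an assumed kernel size), not the CRM itself; by construction the two parts sum back to the original map.

```python
import numpy as np

def decouple_body_edge(feat, k=3):
    """Split a 2-D feature map into low- and high-frequency parts.

    body: local k x k mean (low-frequency, smooth "body" component)
    edge: residual feat - body (high-frequency "edge" component)
    The split is exact: body + edge reconstructs feat.
    """
    pad = k // 2
    padded = np.pad(feat, pad, mode="edge")     # replicate borders
    H, W = feat.shape
    body = np.empty_like(feat, dtype=float)
    for i in range(H):
        for j in range(W):
            body[i, j] = padded[i:i + k, j:j + k].mean()
    edge = feat - body
    return body, edge
```

On a constant map the edge component is zero everywhere, which matches the intuition that the high-frequency branch only carries boundary detail.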
Affiliation(s)
- Zhi Wang
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Long Yu
- College of Network Center, Xinjiang University, Urumqi, 830000, China; Signal and Signal Processing Laboratory, College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China
- Shengwei Tian
- College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Xiangzuo Huo
- Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China; Signal and Signal Processing Laboratory, College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China
|