1. He A, Wu Y, Wang Z, Li T, Fu H. DVPT: Dynamic Visual Prompt Tuning of large pre-trained models for medical image analysis. Neural Netw 2025; 185:107168. PMID: 39827840. DOI: 10.1016/j.neunet.2025.107168.
Abstract
Pre-training and fine-tuning have become popular due to the rich representations embedded in large pre-trained models, which can be leveraged for downstream medical tasks. However, existing methods typically either fine-tune all parameters or only task-specific layers of pre-trained models, overlooking the variability in input medical images. As a result, these approaches may lack efficiency or effectiveness. In this study, our goal is to explore parameter-efficient fine-tuning (PEFT) for medical image analysis. To address this challenge, we introduce a novel method called Dynamic Visual Prompt Tuning (DVPT). It can extract knowledge beneficial to downstream tasks from large models with only a few trainable parameters. First, the frozen features are transformed by a lightweight bottleneck layer to learn the domain-specific distribution of downstream medical tasks. Then, a few learnable visual prompts are employed as dynamic queries to conduct cross-attention with the transformed features, aiming to acquire sample-specific features. This DVPT module can be shared across different Transformer layers, further reducing the number of trainable parameters. We conduct extensive experiments with various pre-trained models on medical classification and segmentation tasks. We find that this PEFT method not only efficiently adapts pre-trained models to the medical domain but also enhances data efficiency with limited labeled data. For example, with only 0.5% additional trainable parameters, our method not only outperforms state-of-the-art PEFT methods but also surpasses full fine-tuning by more than 2.20% in Kappa score on the medical classification task. It can save up to 60% of the labeled data and 99% of the storage cost of ViT-B/16.
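The prompt-as-query mechanism described above can be sketched in a few lines of PyTorch. The code below is reconstructed from the abstract alone; the dimensions, module names, and single-module design are assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class DynamicVisualPrompt(nn.Module):
    """Illustrative sketch of a DVPT-style module: a lightweight bottleneck
    transforms frozen backbone features, then learnable prompts act as
    queries in cross-attention to extract sample-specific features."""

    def __init__(self, dim=768, bottleneck=64, num_prompts=8, num_heads=8):
        super().__init__()
        # Lightweight bottleneck: down-project then up-project frozen features.
        self.bottleneck = nn.Sequential(
            nn.Linear(dim, bottleneck), nn.GELU(), nn.Linear(bottleneck, dim)
        )
        # A few learnable visual prompts used as dynamic queries.
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, frozen_tokens):             # (B, N, dim) from a frozen ViT layer
        adapted = self.bottleneck(frozen_tokens)  # domain-specific transformation
        queries = self.prompts.expand(frozen_tokens.size(0), -1, -1)
        out, _ = self.cross_attn(queries, adapted, adapted)
        return out                                # (B, num_prompts, dim) sample-specific features

# Only the DVPT module is trainable; the backbone itself stays frozen.
tokens = torch.randn(2, 197, 768)                 # e.g. a ViT-B/16 token sequence
module = DynamicVisualPrompt()
print(module(tokens).shape)                       # torch.Size([2, 8, 768])
```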
Affiliation(s)
- Along He: College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, Tianjin, 300350, China
- Yanlin Wu: College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, Tianjin, 300350, China
- Zhihong Wang: College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, Tianjin, 300350, China
- Tao Li: College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, Tianjin, 300350, China
- Huazhu Fu: Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), 138632, Singapore
2. Zhang X, Xiao Z, Wu X, Chen Y, Zhao J, Hu Y, Liu J. Pyramid Pixel Context Adaption Network for Medical Image Classification With Supervised Contrastive Learning. IEEE Trans Neural Netw Learn Syst 2025; 36:6802-6815. PMID: 38829749. DOI: 10.1109/tnnls.2024.3399164.
Abstract
The spatial attention (SA) mechanism has been widely incorporated into deep neural networks (DNNs), significantly improving performance in computer vision tasks via long-range dependency modeling. However, it may perform poorly in medical image analysis, and existing efforts often overlook the fact that long-range dependency modeling has limitations in highlighting subtle lesion regions. To overcome this limitation, we propose a practical yet lightweight architectural unit, the pyramid pixel context adaption (PPCA) module, which exploits multiscale pixel context information to dynamically recalibrate pixel positions in a pixel-independent manner. PPCA first applies a well-designed cross-channel pyramid pooling (CCPP) to aggregate multiscale pixel context information, then eliminates the inconsistency among the pooled features through a well-designed pixel normalization (PN), and finally estimates a per-pixel attention weight via pixel context integration. By embedding PPCA into a DNN with negligible overhead, the PPCA network (PPCANet) is developed for medical image classification. In addition, we introduce supervised contrastive learning to enhance feature representation by exploiting the potential of label information via a supervised contrastive loss. Extensive experiments on six medical image datasets show that PPCANet outperforms state-of-the-art (SOTA) attention-based networks and recent DNNs. We also provide a visual analysis and an ablation study to explain the behavior of PPCANet in the decision-making process.
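A minimal sketch of a PPCA-style unit, reconstructed from this abstract alone, might look as follows; the exact forms of CCPP and PN in the paper differ, and the pooling scales, normalization, and names here are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPixelContextAttention(nn.Module):
    """Illustrative PPCA-style unit: gather pixel context at several pyramid
    scales, normalize the context maps per pixel, and integrate them into a
    per-pixel attention weight applied to the input feature map."""

    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels
        self.integrate = nn.Conv2d(len(levels), 1, kernel_size=1)  # pixel context integration

    def forward(self, x):                                  # (B, C, H, W)
        b, c, h, w = x.shape
        ctx = []
        for k in self.levels:
            # Pool to a k x k grid, average over channels, restore resolution:
            # one context map per pyramid level.
            pooled = F.adaptive_avg_pool2d(x, k).mean(dim=1, keepdim=True)
            ctx.append(F.interpolate(pooled, size=(h, w), mode="nearest"))
        ctx = torch.cat(ctx, dim=1)                        # (B, levels, H, W)
        # Pixel normalization: standardize the pyramid responses at each pixel.
        ctx = (ctx - ctx.mean(dim=1, keepdim=True)) / (ctx.std(dim=1, keepdim=True) + 1e-5)
        weight = torch.sigmoid(self.integrate(ctx))        # per-pixel attention in [0, 1]
        return x * weight

x = torch.randn(2, 64, 32, 32)
print(PyramidPixelContextAttention()(x).shape)             # torch.Size([2, 64, 32, 32])
```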
3. Ma Y, Gu Y, Guo S, Qin X, Wen S, Shi N, Dai W, Chen Y. Grade-Skewed Domain Adaptation via Asymmetric Bi-Classifier Discrepancy Minimization for Diabetic Retinopathy Grading. IEEE Trans Med Imaging 2025; 44:1115-1126. PMID: 39441682. DOI: 10.1109/tmi.2024.3485064.
Abstract
Diabetic retinopathy (DR) is a leading cause of preventable low vision worldwide. Deep learning has exhibited promising performance in the grading of DR. Certain deep learning strategies have facilitated convenient regular eye check-ups, which are crucial for managing DR and preventing severe visual impairment. However, the generalization performance on cross-center, cross-vendor, and cross-user test datasets is compromised due to domain shift. Furthermore, the presence of small lesions and the imbalanced grade distribution, resulting from the characteristics of DR grading (e.g., the progressive nature of DR disease and the design of grading standards), complicates image-level domain adaptation for DR grading. The general predictions of the models trained on grade-skewed source domains will be significantly biased toward the majority grades, which further increases the adaptation difficulty. We formulate this problem as a grade-skewed domain adaptation challenge. Under the grade-skewed domain adaptation problem, we propose a novel method for image-level supervised DR grading via Asymmetric Bi-Classifier Discrepancy Minimization (ABiD). First, we propose optimizing the feature extractor by minimizing the discrepancy between the predictions of the asymmetric bi-classifier based on two classification criteria to encourage the exploration of crucial features in adjacent grades and stretch the distribution of adjacent grades in the latent space. Moreover, the classifier difference is maximized by using the forward and inverse distribution compensation mechanism to locate easily confused instances, which avoids pseudo-label bias on the target domain. The experimental results on two public DR datasets and one private DR dataset demonstrate that our method outperforms state-of-the-art methods significantly.
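The bi-classifier discrepancy idea at the core of ABiD can be illustrated with a generic minimization step, in the spirit of maximum-classifier-discrepancy methods; the paper's asymmetric criteria and distribution compensation mechanisms are not reproduced, and all shapes here are placeholders:

```python
import torch
import torch.nn as nn

# Minimal sketch of bi-classifier discrepancy training: two classifiers with
# different decision criteria sit on a shared feature extractor, and the
# extractor is adapted to minimize their disagreement on target-domain data.
feature_extractor = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
clf_a = nn.Linear(128, 5)   # five DR grades
clf_b = nn.Linear(128, 5)   # second classifier, ideally trained under a different criterion

def discrepancy(p, q):
    """Mean L1 distance between two softmax predictions."""
    return (p.softmax(dim=1) - q.softmax(dim=1)).abs().mean()

opt_f = torch.optim.Adam(feature_extractor.parameters(), lr=1e-4)
target_images = torch.randn(8, 1, 28, 28)       # unlabeled target-domain batch

# Adaptation step: only the feature extractor is updated, pushing target
# features into regions where both classifiers agree.
feats = feature_extractor(target_images)
loss = discrepancy(clf_a(feats), clf_b(feats))
opt_f.zero_grad(); loss.backward(); opt_f.step()
print(float(loss))
```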
4. Ramshankar N, Murugesan S, Praveen KV, Prathap PMJ. Coinciding Diabetic Retinopathy and Diabetic Macular Edema Grading With Rat Swarm Optimization Algorithm for Enhanced Capsule Generation Adversarial Network. Microsc Res Tech 2025; 88:555-563. PMID: 39487733. DOI: 10.1002/jemt.24709.
Abstract
In the worldwide working-age population, visual disability and blindness are common conditions caused by diabetic retinopathy (DR) and diabetic macular edema (DME). Among the many eye conditions associated with diabetes, DR and DME are the two foremost diseases, and their severity may lead to vision problems and blindness, so early detection of DR and DME is essential to prevent vision loss. Therefore, an enhanced capsule generation adversarial network (ECGAN) optimized with the rat swarm optimization (RSO) algorithm is proposed in this article for joint DR and DME grading (DR-DME-ECGAN-RSO-ISBI 2018 IDRiD). The input images are obtained from the ISBI 2018 unbalanced DR grading data set. The input fundus images are first preprocessed with a Savitzky-Golay (SG) filtering technique, which reduces noise in the input image. The preprocessed image is fed to a discrete shearlet transform (DST) for feature extraction. The extracted DR-DME features are passed to the ECGAN-RSO algorithm to categorize the DR and DME grades. The proposed approach is implemented in Python and improves accuracy by 7.94%, 36.66%, and 4.88% over existing models: combined DR-DME grading with a cross-disease attention network (DR-DME-CANet-ISBI 2018 IDRiD), a category attention block for unbalanced DR grading (DR-DME-HDLCNN-MGMO-ISBI 2018 IDRiD), and combined DR-DME classification with a deep convolutional neural network and a modified gray-wolf optimizer with variable weights (DR-DME-ANN-ISBI 2018 IDRiD).
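The Savitzky-Golay preprocessing step can be illustrated with SciPy; the window length, polynomial order, and per-axis application below are assumptions, not the paper's settings:

```python
import numpy as np
from scipy.signal import savgol_filter

# Illustrative Savitzky-Golay smoothing of a fundus image, applied row-wise
# and then column-wise to reduce noise while preserving local structure.
image = np.random.rand(512, 512, 3)             # stand-in for a fundus photograph

smoothed = savgol_filter(image, window_length=7, polyorder=2, axis=0)
smoothed = savgol_filter(smoothed, window_length=7, polyorder=2, axis=1)

print(smoothed.shape, float(np.abs(image - smoothed).mean()))
```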
Affiliation(s)
- N Ramshankar: Department of Computer Science and Engineering, R.M.D. Engineering College, Tiruvallur, Tamil Nadu, India
- S Murugesan: Department of Computer Science and Engineering, R.M.D. Engineering College, Tiruvallur, Tamil Nadu, India
- Praveen K V: Department of Information Technology, St. Peter's College of Engineering and Technology, Avadi, Tamil Nadu, India
- P M Joe Prathap: Department of Computer Science and Engineering, R.M.D. Engineering College, Tiruvallur, Tamil Nadu, India
5. Men Y, Fhima J, Celi LA, Ribeiro LZ, Nakayama LF, Behar JA. Deep learning generalization for diabetic retinopathy staging from fundus images. Physiol Meas 2025; 13:015001. PMID: 39788077. DOI: 10.1088/1361-6579/ada86a.
Abstract
Objective. Diabetic retinopathy (DR) is a serious diabetes complication that can lead to vision loss, making timely identification crucial. Existing data-driven algorithms for DR staging from digital fundus images (DFIs) often struggle with generalization due to distribution shifts between training and target domains. Approach. To address this, DRStageNet, a deep learning model, was developed using six public and independent datasets with 91,984 DFIs from diverse demographics. Five pretrained self-supervised vision transformers (ViTs) were benchmarked, with the best further trained using a multi-source domain (MSD) fine-tuning strategy. Main results. DINOv2 showed a 27.4% improvement in L-Kappa versus the other pretrained ViTs. MSD fine-tuning improved performance in four of five target domains. The error analysis revealed that 60% of errors were due to incorrect labels, 77.5% of which were correctly classified by DRStageNet. Significance. We developed DRStageNet, a DL model for DR designed to accurately stage the condition while addressing the challenge of generalizing performance across target domains. The model and explainability heatmaps are available at www.aimlab-technion.com/lirot-ai.
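The benchmarking setup, a self-supervised ViT backbone with a small staging head, can be sketched as follows; loading DINOv2 through torch.hub requires network access, and the five-class head and input size are assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

# Sketch of a DINOv2-based DR staging probe: the hub model returns a CLS
# embedding, on top of which a small head predicts the five DR stages.
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")
head = nn.Linear(768, 5)                         # five DR stages

images = torch.randn(2, 3, 224, 224)             # fundus batch (224 = 16 x 14 patches)
with torch.no_grad():
    embeddings = backbone(images)                # (2, 768) CLS embeddings
logits = head(embeddings)
print(logits.shape)                              # torch.Size([2, 5])
```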
Affiliation(s)
- Yevgeniy Men: Andrew and Erna Viterbi Faculty of Electrical & Computer Engineering, Technion, Israel Institute of Technology, Haifa 3200003, Israel; Faculty of Biomedical Engineering, Technion, Israel Institute of Technology, Haifa 3200003, Israel
- Jonathan Fhima: Faculty of Biomedical Engineering, Technion, Israel Institute of Technology, Haifa 3200003, Israel; Department of Applied Mathematics, Technion, Israel Institute of Technology, Haifa 3200003, Israel
- Leo Anthony Celi: Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA 02139, United States of America; Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215, United States of America; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02215, United States of America
- Lucas Zago Ribeiro: Ophthalmology Department, São Paulo Federal University, São Paulo 610101, Brazil
- Luis Filipe Nakayama: Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA 02139, United States of America; Ophthalmology Department, São Paulo Federal University, São Paulo 610101, Brazil
- Joachim A Behar: Faculty of Biomedical Engineering, Technion, Israel Institute of Technology, Haifa 3200003, Israel
6. Yi S, Zhou L. Multi-step framework for glaucoma diagnosis in retinal fundus images using deep learning. Med Biol Eng Comput 2025; 63:1-13. PMID: 39098859. DOI: 10.1007/s11517-024-03172-2.
Abstract
Glaucoma is one of the most common causes of blindness in the world, and screening for it from retinal fundus images with deep learning is now a common approach. In deep learning-based glaucoma diagnosis, the blood vessels within the optic disc interfere with the diagnosis, while fundus images also carry pathological information outside the optic disc. Therefore, integrating the original fundus image with a vessel-removed optic disc image can improve diagnostic efficiency. In this paper, we propose a novel multi-step framework named MSGC-CNN for better glaucoma diagnosis. In the framework, (1) we combine glaucoma pathological knowledge with a deep learning model, fuse the features of the original fundus image and of the optic disc region in which vessel interference is specifically removed by U-Net, and diagnose glaucoma from the fused features; (2) in view of the characteristics of glaucoma fundus images, such as small data volume, high resolution, and rich feature information, we design a new feature extraction network, RA-ResNet, and combine it with transfer learning. To verify our method, we conduct binary classification experiments on three public datasets, Drishti-GS, RIM-ONE-R3, and ACRIMA, with accuracies of 92.01%, 93.75%, and 97.87%, respectively. The results demonstrate a significant improvement over earlier results.
Affiliation(s)
- Sanli Yi: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, China; Key Laboratory of Computer Technology Application of Yunnan Province, Kunming, Yunnan, China
- Lingxiang Zhou: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, China; Key Laboratory of Computer Technology Application of Yunnan Province, Kunming, Yunnan, China
7. Wen C, Ye M, Li H, Chen T, Xiao X. Concept-Based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis. IEEE Trans Med Imaging 2025; 44:57-68. PMID: 39012729. DOI: 10.1109/tmi.2024.3429148.
Abstract
Existing deep learning methods have achieved remarkable results in diagnosing retinal diseases, showcasing the potential of advanced AI in ophthalmology. However, the black-box nature of these methods obscures the decision-making process, compromising their trustworthiness and acceptability. Inspired by the concept-based approaches and recognizing the intrinsic correlation between retinal lesions and diseases, we regard retinal lesions as concepts and propose an inherently interpretable framework designed to enhance both the performance and explainability of diagnostic models. Leveraging the transformer architecture, known for its proficiency in capturing long-range dependencies, our model can effectively identify lesion features. By integrating with image-level annotations, it achieves the alignment of lesion concepts with human cognition under the guidance of a retinal foundation model. Furthermore, to attain interpretability without losing lesion-specific information, our method employs a classifier built on a cross-attention mechanism for disease diagnosis and explanation, where explanations are grounded in the contributions of human-understandable lesion concepts and their visual localization. Notably, due to the structure and inherent interpretability of our model, clinicians can implement concept-level interventions to correct the diagnostic errors by simply adjusting erroneous lesion predictions. Experiments conducted on four fundus image datasets demonstrate that our method achieves favorable performance against state-of-the-art methods while providing faithful explanations and enabling concept-level interventions. Our code is publicly available at https://github.com/Sorades/CLAT.
8. Deng T, Huang Y, Yang C. Parallel Multi-Path Network for Ocular Disease Detection Inspired by Visual Cognition Mechanism. IEEE J Biomed Health Inform 2025; 29:345-357. PMID: 39348249. DOI: 10.1109/jbhi.2024.3471807.
Abstract
Various ocular diseases such as cataracts, glaucoma, and diabetic retinopathy have become major causes of non-congenital visual impairment, seriously threatening people's vision health. The shortage of ophthalmic medical resources creates huge obstacles to large-scale ocular disease screening. Therefore, it is necessary to use computer-aided diagnosis (CAD) technology to achieve large-scale screening and diagnosis of ocular diseases. In this work, inspired by the human visual cognition mechanism, we propose a parallel multi-path network for multiple ocular disease detection, called PMP-OD, which integrates the detection of multiple common ocular diseases, including cataracts, glaucoma, diabetic retinopathy, and pathological myopia. The bottom-up features of the fundus image are extracted by a common convolutional module, the Low-level Feature Extraction module, which simulates the non-selective pathway. Simultaneously, the top-down vessel and other lesion features are extracted by the High-level Feature Extraction module, which simulates the selective pathway. The retinal vessel and lesion features can be regarded as task-driven high-level semantic information in the physician's diagnostic process. The features are then fused by an attention-based feature fusion module. Finally, the disease classifier produces predictions from the integrated multi-features. The experimental results indicate that our PMP-OD model outperforms other state-of-the-art (SOTA) models on an ocular disease dataset reconstructed from ODIR-5K, APTOS-2019, ORIGA-light, and Kaggle.
9. Liu J, Xu Y, Liu Y, Luo H, Huang W, Yao L. Attention-Guided 3D CNN With Lesion Feature Selection for Early Alzheimer's Disease Prediction Using Longitudinal sMRI. IEEE J Biomed Health Inform 2025; 29:324-332. PMID: 39412975. DOI: 10.1109/jbhi.2024.3482001.
Abstract
Predicting the progression from mild cognitive impairment (MCI) to Alzheimer's disease (AD) is critical for early intervention. To this end, various deep learning models have been applied in this domain, typically relying on structural magnetic resonance imaging (sMRI) data from a single time point while neglecting the dynamic changes in brain structure over time. Current longitudinal studies inadequately explore disease evolution dynamics and are burdened by high computational complexity. This paper introduces a novel lightweight 3D convolutional neural network specifically designed to capture the evolution of brain diseases for modeling the progression of MCI. First, a longitudinal lesion feature selection strategy is proposed to extract core features from temporal data, facilitating the detection of subtle differences in brain structure between two time points. Next, to focus the model more sharply on lesion features, a disease trend attention mechanism is introduced to learn the dependencies between overall disease trends and local variation features. Finally, disease prediction visualization techniques are employed to improve the interpretability of the final predictions. Extensive experiments demonstrate that the proposed model achieves state-of-the-art performance in terms of area under the curve (AUC), accuracy, specificity, precision, and F1 score. This study confirms the efficacy of our early diagnostic method, which uses only two follow-up sMRI scans to predict the disease status of MCI patients 24 months later with an AUC of 79.03%.
10. Cheng Y, Guo Q, Juefei-Xu F, Fu H, Lin SW, Lin W. Adversarial Exposure Attack on Diabetic Retinopathy Imagery Grading. IEEE J Biomed Health Inform 2025; 29:297-309. PMID: 39331557. DOI: 10.1109/jbhi.2024.3469630.
Abstract
Diabetic Retinopathy (DR) is a leading cause of vision loss around the world. To help diagnose it, numerous cutting-edge works have built powerful deep neural networks (DNNs) to automatically grade DR via retinal fundus images (RFIs). However, RFIs are commonly affected by camera exposure issues that may lead to incorrect grades, and mis-graded results can pose a high risk of aggravating the condition. In this paper, we study this problem from the viewpoint of adversarial attacks. We identify and introduce a novel solution to an entirely new task, termed adversarial exposure attack, which is able to produce natural-looking exposure-shifted images that mislead state-of-the-art DNNs. We validate the proposed method on a real-world public DR dataset with three DNNs (ResNet50, MobileNet, and EfficientNet), demonstrating that our method achieves both high image quality and a high success rate in transferring the attacks. Our method reveals the potential threats to DNN-based automatic DR grading and should benefit the development of exposure-robust DR grading methods in the future.
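The core idea, perturbing exposure rather than individual pixels, can be conveyed with a toy optimization over a per-image gamma parameter; the paper's attack is considerably more elaborate, and the stand-in model and step count below are placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy exposure-style adversarial attack: instead of per-pixel noise, optimize a
# single exposure (gamma) parameter per image to maximize the grading loss,
# producing natural-looking over/under-exposed images.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 5))  # stand-in grader
model.eval()

images = torch.rand(4, 3, 64, 64)               # fundus batch in [0, 1]
labels = torch.randint(0, 5, (4,))
log_gamma = torch.zeros(4, 1, 1, 1, requires_grad=True)

opt = torch.optim.Adam([log_gamma], lr=0.05)
for _ in range(20):
    attacked = images.clamp(1e-6, 1.0) ** torch.exp(log_gamma)  # exposure change
    loss = -F.cross_entropy(model(attacked), labels)            # maximize grading error
    opt.zero_grad(); loss.backward(); opt.step()

print(torch.exp(log_gamma).flatten())           # learned per-image gamma values
```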
11. Quan X, Ou X, Gao L, Yin W, Hou G, Zhang H. SCINet: A Segmentation and Classification Interaction CNN Method for Arteriosclerotic Retinopathy Grading. Interdiscip Sci 2024; 16:926-935. PMID: 39222258. DOI: 10.1007/s12539-024-00650-x.
Abstract
As common diseases, cardiovascular and cerebrovascular diseases pose a great threat to human health, and mortality remains high even with advanced and comprehensive treatment. Arteriosclerosis is an important indicator of the severity of cardiovascular and cerebrovascular diseases, so detecting arteriosclerotic retinopathy is imperative. However, its detection requires expensive and time-consuming manual evaluation, while end-to-end deep learning detection methods also need interpretable designs to highlight task-related features. Given the importance of automatic arteriosclerotic retinopathy grading, we propose a segmentation and classification interaction network (SCINet) built on a segmentation and classification interaction architecture. After IterNet segments the retinal vessels from the original fundus image, the backbone feature extractor coarsely extracts features from the segmented and original fundus arteriosclerosis images and further enhances them through a vessel aware module; a final classifier module generates the fundus arteriosclerosis grading results. Specifically, the vessel aware module is designed to highlight important vessel features segmented from the original images via an attention mechanism, thereby achieving information interaction: the attention mechanism selectively learns the vessel features of the segmented regions under the proposed interactive architecture, reweighting the extracted features and enhancing salient feature information. Extensive experiments confirm the effectiveness of our model: SCINet achieves the best performance on the arteriosclerotic retinopathy grading task. Additionally, the CNN method is scalable to similar tasks by incorporating segmented images as auxiliary information.
Affiliation(s)
- Xiongwen Quan: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tianjin, 300000, China
- Xingyuan Ou: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tianjin, 300000, China
- Li Gao: Ophthalmology, Tianjin Huanhu Hospital, Tianjin, 300000, China
- Wenya Yin: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tianjin, 300000, China
- Guangyao Hou: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tianjin, 300000, China
- Han Zhang: National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tianjin, 300000, China
12. Guo M, Gong D, Yang W. In-depth analysis of research hotspots and emerging trends in AI for retinal diseases over the past decade. Front Med (Lausanne) 2024; 11:1489139. PMID: 39635592. PMCID: PMC11614663. DOI: 10.3389/fmed.2024.1489139.
Abstract
Background. The application of Artificial Intelligence (AI) in diagnosing retinal diseases represents a significant advancement in ophthalmological research, with the potential to reshape future practices in the field. This study explores the extensive applications and emerging research frontiers of AI in retinal diseases. Objective. This study aims to uncover the developments and predict future directions of AI research in retinal disease over the past decade. Methods. This study analyzes AI utilization in retinal disease research through articles, using citation data sourced from the Web of Science (WOS) Core Collection database, covering the period from January 1, 2014, to December 31, 2023. A combination of WOS analyzer, CiteSpace 6.2 R4, and VOSviewer 1.6.19 was used for a bibliometric analysis focusing on citation frequency, collaborations, and keyword trends from an expert perspective. Results. A total of 2,861 articles across 93 countries or regions were cataloged, with notable growth in article numbers since 2017. China leads with 926 articles, constituting 32% of the total. The United States has the highest h-index at 66, while England has the most significant network centrality at 0.24. Notably, the University of London is the leading institution with 99 articles and shares the highest h-index (25) with University College London. The National University of Singapore stands out for its central role with a score of 0.16. Research primarily spans ophthalmology and computer science, with "network," "transfer learning," and "convolutional neural networks" being prominent burst keywords from 2021 to 2023. Conclusion. China leads globally in article counts, while the United States has a significant research impact. The University of London and University College London have made significant contributions to the literature. Diabetic retinopathy is the retinal disease with the highest volume of research. AI applications have focused on developing algorithms for diagnosing retinal diseases and investigating abnormal physiological features of the eye. Future research should pivot toward more advanced diagnostic systems for ophthalmic diseases.
Affiliation(s)
- Mingkai Guo: The Third School of Clinical Medicine, Guangzhou Medical University, Guangzhou, China
- Di Gong: Shenzhen Eye Institute, Shenzhen Eye Hospital, Jinan University, Shenzhen, China
- Weihua Yang: Shenzhen Eye Institute, Shenzhen Eye Hospital, Jinan University, Shenzhen, China
13. Kui X, Hai Z, Zou B, Liang W, Chen L. DFC-Net: a dual-path frequency-domain cross-attention fusion network for retinal image quality assessment. Biomed Opt Express 2024; 15:6399-6415. PMID: 39553870. PMCID: PMC11563343. DOI: 10.1364/boe.531292.
Abstract
Retinal image quality assessment (RIQA) is crucial for diagnosing various eye diseases and ensuring the accuracy of diagnostic analyses based on retinal fundus images. Traditional deep convolutional neural networks (CNNs) for RIQA face challenges such as over-reliance on RGB image brightness and difficulty in differentiating closely ranked image quality categories. To address these issues, we introduced the Dual-Path Frequency-domain Cross-attention Network (DFC-Net), which integrates RGB images and contrast-enhanced images using contrast-limited adaptive histogram equalization (CLAHE) as dual inputs. This approach improves structure detail detection and feature extraction. We also incorporated a frequency-domain attention mechanism (FDAM) to focus selectively on frequency components indicative of quality degradations and a cross-attention mechanism (CAM) to optimize the integration of dual inputs. Our experiments on the EyeQ and RIQA-RFMiD datasets demonstrated significant improvements, achieving a precision of 0.8895, recall of 0.8923, F1-score of 0.8909, and a Kappa score of 0.9191 on the EyeQ dataset. On the RIQA-RFMiD dataset, the precision was 0.702, recall 0.6729, F1-score 0.6869, and Kappa score 0.7210, outperforming current state-of-the-art approaches.
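The dual-input preparation, an RGB image paired with its CLAHE-enhanced copy, can be sketched with OpenCV; applying CLAHE to the LAB lightness channel and the clip/tile settings below are assumptions, not the paper's exact pipeline:

```python
import cv2
import numpy as np

# Illustrative dual-input preparation: the RGB image plus a contrast-enhanced
# copy (CLAHE on the LAB lightness channel), which would feed the two branches
# of a DFC-Net-style model.
rgb = (np.random.rand(256, 256, 3) * 255).astype(np.uint8)  # stand-in fundus image

lab = cv2.cvtColor(rgb, cv2.COLOR_RGB2LAB)
l, a, b = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2RGB)

dual_input = np.stack([rgb, enhanced])          # two views of the same retina
print(dual_input.shape)                         # (2, 256, 256, 3)
```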
Affiliation(s)
- Xiaoyan Kui: School of Computer Science and Engineering, Central South University, Changsha 410083, Hunan, China
- Zeru Hai: School of Informatics, Hunan University of Chinese Medicine, Changsha 410208, Hunan, China
- Beiji Zou: School of Computer Science and Engineering, Central South University, Changsha 410083, Hunan, China; School of Informatics, Hunan University of Chinese Medicine, Changsha 410208, Hunan, China
- Wei Liang: School of Advanced Interdisciplinary Studies, Hunan University of Technology and Business, Changsha 410205, China; The Xiangjiang Laboratory, Changsha 410205, China
- Liming Chen: Ecole Centrale de Lyon, University of Lyon, Lyon 69134, France
14. Xia T, Dang T, Han J, Qendro L, Mascolo C. Uncertainty-Aware Health Diagnostics via Class-Balanced Evidential Deep Learning. IEEE J Biomed Health Inform 2024; 28:6417-6428. PMID: 38319779. DOI: 10.1109/jbhi.2024.3360002.
Abstract
Uncertainty quantification is critical for ensuring the safety of deep learning-enabled health diagnostics, as it helps the model account for unknown factors and reduces the risk of misdiagnosis. However, existing uncertainty quantification studies often overlook the significant issue of class imbalance, which is common in medical data. In this paper, we propose a class-balanced evidential deep learning framework to achieve fair and reliable uncertainty estimates for health diagnostic models. This framework advances the state-of-the-art uncertainty quantification method of evidential deep learning with two novel mechanisms to address the challenges posed by class imbalance. Specifically, we introduce a pooling loss that enables the model to learn less biased evidence among classes and a learnable prior to regularize the posterior distribution that accounts for the quality of uncertainty estimates. Extensive experiments using benchmark data with varying degrees of imbalance and various naturally imbalanced health data demonstrate the effectiveness and superiority of our method. Our work pushes the envelope of uncertainty quantification from theoretical studies to realistic healthcare application scenarios. By enhancing uncertainty estimation for class-imbalanced data, we contribute to the development of more reliable and practical deep learning-enabled health diagnostic systems.
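For orientation, vanilla evidential deep learning (the baseline this framework extends) derives class probabilities and a per-sample uncertainty from Dirichlet parameters; the paper's pooling loss and learnable prior for class imbalance are not reproduced in this sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of standard evidential deep learning: the network outputs non-negative
# evidence interpreted as Dirichlet concentration parameters, from which class
# probabilities and a per-sample uncertainty are derived.
num_classes = 5
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))

x = torch.randn(4, 3, 32, 32)
evidence = F.softplus(net(x))                   # non-negative evidence per class
alpha = evidence + 1.0                          # Dirichlet parameters
strength = alpha.sum(dim=1, keepdim=True)

probs = alpha / strength                        # expected class probabilities
uncertainty = num_classes / strength            # high when total evidence is low
print(probs.sum(dim=1), uncertainty.flatten())
```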
15. Fan J, Yang T, Wang H, Zhang H, Zhang W, Ji M, Miao J. A Self-Supervised Equivariant Refinement Classification Network for Diabetic Retinopathy Classification. J Imaging Inform Med 2024. PMID: 39299958. DOI: 10.1007/s10278-024-01270-z.
Abstract
Diabetic retinopathy (DR) is a retinal disease caused by diabetes that, without intervention, may even lead to blindness, so DR detection is of great significance for preventing blindness in patients. Most existing DR detection methods are supervised and usually require a large number of accurate pixel-level annotations. To solve this problem, we propose a self-supervised Equivariant Refinement Classification Network (ERCN) for DR classification. First, we use an unsupervised contrastive pre-training network to learn a more generalized representation. Second, the class activation map (CAM) is refined by self-supervised learning: a spatial masking method first suppresses low-confidence predictions, and the feature similarity between pixels then encourages fine-grained activation to localize lesions more accurately. We propose a hybrid equivariant regularization loss to alleviate the degradation caused by local minima in the CAM refinement process. To further improve classification accuracy, we propose an attention-based multi-instance learning (MIL) scheme that weights each element of the feature map as an instance, which is more effective than the traditional patch-based instance extraction. We evaluate our method on the EyePACS and DAVIS datasets, achieving 87.4% test accuracy on EyePACS and 88.7% on DAVIS, showing that the proposed method outperforms other state-of-the-art self-supervised DR detection methods.
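The attention-based MIL component, treating each feature-map element as an instance, can be sketched as follows; the hidden size and scoring network are assumptions:

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Sketch of attention-based multi-instance pooling: every spatial element
    of a feature map is an instance, and a learned attention weight aggregates
    the instances into a bag-level representation."""

    def __init__(self, dim=512, hidden=128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, feat_map):                 # (B, C, H, W)
        instances = feat_map.flatten(2).transpose(1, 2)        # (B, H*W, C)
        weights = torch.softmax(self.score(instances), dim=1)  # one weight per instance
        return (weights * instances).sum(dim=1)                # (B, C) bag feature

pool = AttentionMILPooling()
print(pool(torch.randn(2, 512, 7, 7)).shape)     # torch.Size([2, 512])
```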
Affiliation(s)
- Jiacheng Fan: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, 450001, China
- Tiejun Yang: School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, China; Key Laboratory of Grain Information Processing and Control (HAUT), Ministry of Education, Zhengzhou, China; Henan Key Laboratory of Grain Photoelectric Detection and Control (HAUT), 100 Lianhua Street, High-Tech Zone, Zhengzhou, 450001, Henan, China
- Heng Wang: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, 450001, China
- Huiyao Zhang: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, 450001, China
- Wenjie Zhang: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, 450001, China
- Mingzhu Ji: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, 450001, China
- Jianyu Miao: School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, China
16. Pavithra S, Jaladi D, Tamilarasi K. Optical imaging for diabetic retinopathy diagnosis and detection using ensemble models. Photodiagnosis Photodyn Ther 2024; 48:104259. PMID: 38944405. DOI: 10.1016/j.pdpdt.2024.104259.
Abstract
Diabetes, characterized by heightened blood sugar levels, can lead to diabetic retinopathy (DR), which damages the retinal blood vessels and is thought to be the most common cause of blindness in diabetics, particularly among working-age individuals in poor nations. People with type 1 or type 2 diabetes may develop this illness, and the risk rises with the duration of diabetes and inadequate blood sugar management. Traditional approaches to early DR identification have limits. This research therefore diagnoses DR with a convolutional neural network (CNN) model used in a novel way: several deep learning (DL) models, VGG19, ResNet50, and InceptionV3, extract features that are concatenated and passed to the CNN algorithm for classification. By combining the advantages of several models, ensemble approaches can be effective tools for detecting DR and can increase overall performance and resilience, achieving high accuracy on tasks such as classification and image recognition. The proposed model is evaluated on a publicly accessible collection of fundus images. VGG19, ResNet50, and InceptionV3 differ in their neural network architectures, feature extraction capabilities, object detection methods, and approaches to retinal delineation: VGG19 may excel at capturing fine details, ResNet50 at recognizing complex patterns, and InceptionV3 at efficiently capturing multi-scale features. Their combined use in an ensemble provides a comprehensive analysis of retinal images, aiding the delineation of retinal regions and the identification of abnormalities associated with DR. For instance, microaneurysms, the earliest signs of DR, often require precise detection of subtle vascular abnormalities, and VGG19's proficiency with fine details allows these minute changes in retinal morphology to be identified; ResNet50's strength in intricate patterns makes it effective for detecting neovascularization and complex haemorrhagic lesions; and InceptionV3's multi-scale feature extraction enables the comprehensive analysis crucial for assessing macular oedema and ischaemic changes across different retinal layers.
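The ensemble feature extraction described above can be sketched with torchvision; weights=None keeps the sketch self-contained, whereas pretrained ImageNet weights would be used in practice, and the head size is an assumption:

```python
import torch
import torch.nn as nn
from torchvision import models

# Sketch of the ensemble feature extractor: VGG19, ResNet50, and InceptionV3
# embeddings concatenated and fed to a small classification head.
vgg = models.vgg19(weights=None)
vgg.classifier = nn.Identity()                   # (B, 25088) pooled conv features
resnet = models.resnet50(weights=None)
resnet.fc = nn.Identity()                        # (B, 2048)
inception = models.inception_v3(weights=None, aux_logits=True)
inception.fc = nn.Identity()                     # (B, 2048); expects 299x299 input
for m in (vgg, resnet, inception):
    m.eval()                                     # eval mode: inception returns plain logits

head = nn.Linear(25088 + 2048 + 2048, 5)         # five DR severity classes

x224, x299 = torch.randn(2, 3, 224, 224), torch.randn(2, 3, 299, 299)
with torch.no_grad():
    feats = torch.cat([vgg(x224), resnet(x224), inception(x299)], dim=1)
print(head(feats).shape)                         # torch.Size([2, 5])
```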
Affiliation(s)
- S Pavithra: School of Computer Science and Engineering, VIT University, Chennai, Tamil Nadu, India
- Deepika Jaladi: School of Computer Science and Engineering, VIT University, Chennai, Tamil Nadu, India
- K Tamilarasi: School of Computer Science and Engineering, VIT University, Chennai, Tamil Nadu, India
17. Zhang X, Zhao J, Li Y, Wu H, Zhou X, Liu J. Efficient pyramid channel attention network for pathological myopia recognition with pretraining-and-finetuning. Artif Intell Med 2024; 154:102926. PMID: 38964193. DOI: 10.1016/j.artmed.2024.102926.
Abstract
Pathological myopia (PM) is the leading ocular cause of impaired vision worldwide. Clinically, pathology in PM follows a global-local distribution on the fundus image, which plays a significant role in assisting clinicians in diagnosing PM. However, most existing deep neural networks have focused on designing complex architectures and rarely exploit this pathology distribution prior. To tackle this issue, we propose an efficient pyramid channel attention (EPCA) module, which fully leverages the clinical pathology prior of PM with pyramid pooling and multi-scale context fusion. We then construct EPCA-Net for automatic PM recognition from fundus images by stacking a sequence of EPCA modules. Moreover, motivated by the recent pretraining-and-finetuning paradigm, we adapt pre-trained natural image models for PM recognition by freezing them and treating the EPCA and other attention modules as adapters. In addition, we construct a PM recognition benchmark termed PM-fundus by collecting fundus images of PM from publicly available datasets. Comprehensive experiments demonstrate the superiority of EPCA-Net over state-of-the-art methods on the PM recognition task; for example, EPCA-Net achieves 97.56% accuracy, outperforming ViT by 2.85% on the PM-fundus dataset. The results also show that our pretraining-and-finetuning method achieves performance competitive with previous methods based on the traditional fine-tuning paradigm while using fewer tunable parameters, and thus has the potential to leverage more natural image foundation models for PM recognition in the limited-medical-data regime.
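The adapter-style recipe, freezing the pretrained backbone and training only small attention modules plus the head, can be sketched as follows; the channel-attention block here merely stands in for EPCA, which is not reproduced:

```python
import torch
import torch.nn as nn
from torchvision import models

# Sketch of the pretraining-and-finetuning paradigm: the backbone is frozen
# and only a lightweight attention adapter and the classifier head train.
backbone = models.resnet18(weights=None)        # pretrained weights in practice
backbone.fc = nn.Identity()                      # expose 512-d pooled features
for p in backbone.parameters():
    p.requires_grad = False                      # backbone stays frozen

class ChannelAttentionAdapter(nn.Module):
    """Generic channel-attention adapter standing in for EPCA."""

    def __init__(self, dim=512, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim // reduction), nn.ReLU(),
                                  nn.Linear(dim // reduction, dim), nn.Sigmoid())

    def forward(self, feats):                    # recalibrate the frozen features
        return feats * self.gate(feats)

adapter, head = ChannelAttentionAdapter(), nn.Linear(512, 2)  # PM vs. non-PM
logits = head(adapter(backbone(torch.randn(2, 3, 224, 224))))
print(logits.shape)                              # torch.Size([2, 2])
```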
Affiliation(s)
- Xiaoqing Zhang: Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China; Center for High Performance Computing and Shenzhen Key Laboratory of Intelligent Bioinformatics, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Jilu Zhao: Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
- Yan Li: National Clinical Research Center for Ocular Diseases, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
- Hao Wu: National Clinical Research Center for Ocular Diseases, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
- Xiangtian Zhou: National Clinical Research Center for Ocular Diseases, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; Research Unit of Myopia Basic Research and Clinical Prevention and Control, Chinese Academy of Medical Sciences, Wenzhou, 325027, China
- Jiang Liu: Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China; National Clinical Research Center for Ocular Diseases, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; Singapore Eye Research Institute, 169856, Singapore
18. Zhang Y, Ma X, Huang K, Li M, Heng PA. Semantic-Oriented Visual Prompt Learning for Diabetic Retinopathy Grading on Fundus Images. IEEE Trans Med Imaging 2024; 43:2960-2969. PMID: 38564346. DOI: 10.1109/tmi.2024.3383827.
Abstract
Diabetic retinopathy (DR) is a serious ocular condition that requires effective monitoring and treatment by ophthalmologists. However, constructing a reliable DR grading model remains a challenging and costly task, heavily reliant on high-quality training sets and adequate hardware resources. In this paper, we investigate the knowledge transferability of large-scale pre-trained models (LPMs) to fundus images based on prompt learning to construct a DR grading model efficiently. Unlike full-tuning, which fine-tunes all parameters of LPMs, prompt learning involves only a minimal number of additional learnable parameters while achieving performance competitive with full-tuning. Inspired by visual prompt tuning, we propose Semantic-oriented Visual Prompt Learning (SVPL) to enhance the semantic perception ability for better extraction of task-specific knowledge from LPMs, without any additional annotations. Specifically, SVPL assigns a group of learnable prompts for each DR level to fit the complex pathological manifestations and then aligns each prompt group to task-specific semantic space via a contrastive group alignment (CGA) module. We also propose a plug-and-play adapter module, Hierarchical Semantic Delivery (HSD), which allows the semantic transition of prompt groups from shallow to deep layers to facilitate efficient knowledge mining and model convergence. Our extensive experiments on three public DR grading datasets demonstrate that SVPL achieves superior results compared to other transfer tuning and DR grading methods. Further analysis suggests that the generalized knowledge from LPMs is advantageous for constructing the DR grading model on fundus images.
19. Jian M, Chen H, Zhang Z, Yang N, Zhang H, Ma L, Xu W, Zhi H. A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation. Sci Data 2024; 11:824. PMID: 39068171. PMCID: PMC11283520. DOI: 10.1038/s41597-024-03658-6.
Abstract
Recently, Computer-Aided Diagnosis (CAD) systems have emerged as indispensable tools in clinical diagnostic workflows, significantly alleviating the burden on radiologists. Nevertheless, despite their integration into clinical settings, CAD systems encounter limitations: while they can achieve high performance in the detection of lung nodules, they face challenges in accurately predicting multiple cancer types. This limitation can be attributed to the scarcity of publicly available datasets annotated with expert-level cancer type information. This research aims to bridge this gap by providing publicly accessible datasets and reliable tools for medical diagnosis, facilitating a finer categorization of different types of lung diseases so as to offer precise treatment recommendations. To achieve this objective, we curated a diverse dataset of lung Computed Tomography (CT) images comprising 330 annotated nodules (labeled with bounding boxes) from 95 distinct patients. The quality of the dataset was evaluated with a variety of classical classification and detection models, and the promising results demonstrate that the dataset is practically usable and can further facilitate intelligent auxiliary diagnosis.
Affiliation(s)
- Muwei Jian: School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China; School of Information Science and Technology, Linyi University, Linyi, China
- Hongyu Chen: School of Information Science and Technology, Linyi University, Linyi, China
- Zaiyong Zhang: Thoracic Surgery Department, Linyi Central Hospital, Linyi, China
- Nan Yang: School of Information Science and Technology, Linyi University, Linyi, China
- Haorang Zhang: School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
- Lifu Ma: Personnel Department, Linyi Central Hospital, Linyi, China
- Wenjing Xu: School of Information Science and Technology, Linyi University, Linyi, China
- Huixiang Zhi: School of Information Science and Technology, Linyi University, Linyi, China
20. Romero-Oraá R, Herrero-Tudela M, López MI, Hornero R, García M. Attention-based deep learning framework for automatic fundus image processing to aid in diabetic retinopathy grading. Comput Methods Programs Biomed 2024; 249:108160. PMID: 38583290. DOI: 10.1016/j.cmpb.2024.108160.
Abstract
BACKGROUND AND OBJECTIVE. Early detection and grading of Diabetic Retinopathy (DR) is essential to determine an adequate treatment and prevent severe vision loss. However, the manual analysis of fundus images is time-consuming and DR screening programs are challenged by the availability of human graders. Current automatic approaches for DR grading attempt the joint detection of all signs at the same time. However, the classification can be optimized if red lesions and bright lesions are independently processed, since the task gets divided and simplified. Furthermore, clinicians would greatly benefit from explainable artificial intelligence (XAI) to support the automatic model predictions, especially when the type of lesion is specified. As a novelty, we propose an end-to-end deep learning framework for automatic DR grading (5 severity degrees) based on separating the attention of the dark structures from the bright structures of the retina. As the main contribution, this approach allowed us to generate independent interpretable attention maps for red lesions, such as microaneurysms and hemorrhages, and bright lesions, such as hard exudates, while using image-level labels only. METHODS. Our approach is based on a novel attention mechanism which focuses separately on the dark and the bright structures of the retina by performing a previous image decomposition. This mechanism can be seen as a XAI approach which generates independent attention maps for red lesions and bright lesions. The framework includes an image quality assessment stage and deep learning-related techniques, such as data augmentation, transfer learning and fine-tuning. We used the Xception architecture as a feature extractor and the focal loss function to deal with data imbalance. RESULTS. The Kaggle DR detection dataset was used for method development and validation. The proposed approach achieved 83.7% accuracy and a Quadratic Weighted Kappa of 0.78 to classify DR among 5 severity degrees, which outperforms several state-of-the-art approaches. Nevertheless, the main result of this work is the generated attention maps, which reveal the pathological regions on the image, distinguishing the red lesions and the bright lesions. These maps provide explainability to the model predictions. CONCLUSIONS. Our results suggest that our framework is effective to automatically grade DR. The separate attention approach has proven useful for optimizing the classification. On top of that, the obtained attention maps facilitate visual interpretation for clinicians. Therefore, the proposed method could be a diagnostic aid for the early detection and grading of DR.
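The focal loss used to handle class imbalance has a compact standard form; a minimal version (without the optional class-weighting term alpha) looks like this:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Multi-class focal loss: down-weights easy, well-classified examples
    by the modulating factor (1 - p_t) ** gamma."""
    log_probs = F.log_softmax(logits, dim=1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-((1.0 - pt) ** gamma) * log_pt).mean()

logits = torch.randn(8, 5, requires_grad=True)   # 5 DR severity degrees
targets = torch.randint(0, 5, (8,))
loss = focal_loss(logits, targets)
loss.backward()
print(float(loss))
```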
Affiliation(s)
- Roberto Romero-Oraá: Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain; Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Spain
- María Herrero-Tudela: Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain
- María I López: Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain; Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Spain
- Roberto Hornero: Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain; Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Spain
- María García: Biomedical Engineering Group, University of Valladolid, Valladolid, 47011, Spain; Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Spain
21. Zhou Z, Islam MT, Xing L. Multibranch CNN With MLP-Mixer-Based Feature Exploration for High-Performance Disease Diagnosis. IEEE Trans Neural Netw Learn Syst 2024; 35:7351-7362. PMID: 37028335. PMCID: PMC11779602. DOI: 10.1109/tnnls.2023.3250490.
Abstract
Deep learning-based diagnosis is becoming an indispensable part of modern healthcare. For high-performance diagnosis, the optimal design of deep neural networks (DNNs) is a prerequisite. Despite its success in image analysis, existing supervised DNNs based on convolutional layers often suffer from their rudimentary feature exploration ability caused by the limited receptive field and biased feature extraction of conventional convolutional neural networks (CNNs), which compromises the network performance. Here, we propose a novel feature exploration network named manifold embedded multilayer perceptron (MLP) mixer (ME-Mixer), which utilizes both supervised and unsupervised features for disease diagnosis. In the proposed approach, a manifold embedding network is employed to extract class-discriminative features; then, two MLP-Mixer-based feature projectors are adopted to encode the extracted features with the global reception field. Our ME-Mixer network is quite general and can be added as a plugin to any existing CNN. Comprehensive evaluations on two medical datasets are performed. The results demonstrate that our approach greatly enhances the classification accuracy in comparison with different configurations of DNNs with acceptable computational complexity.
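A minimal MLP-Mixer block of the kind the ME-Mixer feature projectors build on looks as follows; token and channel dimensions are placeholders:

```python
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    """Minimal MLP-Mixer block: token mixing gives every feature a global
    receptive field, channel mixing recombines features per token."""

    def __init__(self, tokens, dim, hidden=256):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(nn.Linear(tokens, hidden), nn.GELU(), nn.Linear(hidden, tokens))
        self.channel_mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):                        # (B, tokens, dim)
        # Token mixing operates across the token axis, hence the transposes.
        x = x + self.token_mlp(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))  # channel mixing per token
        return x

block = MixerBlock(tokens=49, dim=512)
print(block(torch.randn(2, 49, 512)).shape)      # torch.Size([2, 49, 512])
```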
22. Huang Y, Lyu J, Cheng P, Tam R, Tang X. SSiT: Saliency-Guided Self-Supervised Image Transformer for Diabetic Retinopathy Grading. IEEE J Biomed Health Inform 2024; 28:2806-2817. PMID: 38319784. DOI: 10.1109/jbhi.2024.3362878.
Abstract
Self-supervised learning (SSL) has been widely applied to learn image representations by exploiting unlabeled images, but it has not been fully explored in the medical image analysis field. In this work, a Saliency-guided Self-Supervised image Transformer (SSiT) is proposed for Diabetic Retinopathy (DR) grading from fundus images. We introduce saliency maps into SSL, with the goal of guiding self-supervised pre-training with domain-specific prior knowledge. Specifically, two saliency-guided learning tasks are employed in SSiT: 1) Saliency-guided contrastive learning is conducted based on momentum contrast, wherein the saliency maps of fundus images are utilized to remove trivial patches from the input sequences of the momentum-updated key encoder. The key encoder is thus constrained to provide target representations focusing on salient regions, guiding the query encoder to capture salient features. 2) The query encoder is trained to predict the saliency segmentation, encouraging the preservation of fine-grained information in the learned representations. To assess the proposed method, four publicly accessible fundus image datasets are adopted. One dataset is employed for pre-training, while the other three are used to evaluate the pre-trained models' performance on downstream DR grading. The proposed SSiT significantly outperforms other representative state-of-the-art SSL methods on all downstream datasets and under various evaluation settings. For example, SSiT achieves a Kappa score of 81.88% on the DDR dataset under fine-tuning evaluation, outperforming all other ViT-based SSL methods by at least 9.48%.
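A minimal sketch of the saliency-guided patch filtering idea, assuming patch tokens and a per-patch saliency score have already been computed (the names and keep ratio are illustrative, not the authors' code):

```python
import torch

def filter_trivial_patches(tokens, saliency, keep_ratio=0.75):
    """Drop the least-salient patch tokens from the key-encoder input.

    tokens:   (B, N, D) patch embeddings
    saliency: (B, N) mean saliency per patch, from a fundus saliency map
    """
    B, N, D = tokens.shape
    k = max(1, int(N * keep_ratio))
    idx = saliency.topk(k, dim=1).indices                 # (B, k) salient patches
    return tokens.gather(1, idx.unsqueeze(-1).expand(B, k, D))
```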
Collapse
|
23
|
Xu X, Liu D, Huang G, Wang M, Lei M, Jia Y. Computer aided diagnosis of diabetic retinopathy based on multi-view joint learning. Comput Biol Med 2024; 174:108428. [PMID: 38631117 DOI: 10.1016/j.compbiomed.2024.108428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Revised: 04/02/2024] [Accepted: 04/04/2024] [Indexed: 04/19/2024]
Abstract
Diabetic retinopathy (DR) is an ocular complication of diabetes, and its severity grade is an essential basis for early diagnosis. Manual diagnosis is a long and expensive process with a certain risk of misdiagnosis; computer-aided diagnosis can provide more accurate and practical treatment recommendations. In this paper, we propose a multi-view joint learning DR diagnostic model called RT2Net, which integrates the global features of fundus images and the local detailed features of vascular images to reduce the limitations of learning from a single fundus image. First, the original image is preprocessed using operations such as contrast-limited adaptive histogram equalization, and the vascular structure is segmented from the DR image. Then, the vascular image and fundus image are input into the two branch networks of RT2Net for feature extraction, and a feature fusion module adaptively fuses the feature vectors output by the branch networks. Finally, the optimized classification model identifies the five categories of DR. Extensive experiments on the public datasets EyePACS and APTOS 2019 demonstrate the method's effectiveness. The accuracy of RT2Net on the two datasets reaches 88.2% and 85.4%, and the area under the receiver operating characteristic curve (AUC) is 0.98 and 0.96, respectively. The excellent classification ability of RT2Net for DR can significantly help patients detect and treat lesions early and provide doctors with a more reliable basis for diagnosis, which has significant clinical value for diagnosing DR.
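The contrast-limited adaptive histogram equalization step can be reproduced with OpenCV; a sketch applying CLAHE on the lightness channel of a fundus image, with clip limit and tile size as common illustrative defaults (the paper's full preprocessing pipeline includes further steps such as vessel segmentation):

```python
import cv2

def preprocess_fundus(path, size=512):
    """CLAHE on the L channel of LAB space, a common fundus enhancement."""
    img = cv2.imread(path)
    img = cv2.resize(img, (size, size))
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lab = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
```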
Collapse
Affiliation(s)
- Xuebin Xu
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China.
| | - Dehua Liu
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China.
| | - Guohua Huang
- Weinan Central Hospital, Xi'an 714099, Shaanxi, China.
| | - Muyu Wang
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China.
| | - Meng Lei
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China.
| | - Yang Jia
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an 710121, Shaanxi, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an 710121, Shaanxi, China.
| |
Collapse
|
24
|
Hai Z, Zou B, Xiao X, Peng Q, Yan J, Zhang W, Yue K. A novel approach for intelligent diagnosis and grading of diabetic retinopathy. Comput Biol Med 2024; 172:108246. [PMID: 38471350 DOI: 10.1016/j.compbiomed.2024.108246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2023] [Revised: 03/05/2024] [Accepted: 03/05/2024] [Indexed: 03/14/2024]
Abstract
Diabetic retinopathy (DR) is a severe ocular complication of diabetes that can lead to vision damage and even blindness. Currently, traditional deep convolutional neural networks (CNNs) used for DR grading tasks face two primary challenges: (1) insensitivity to minority classes due to imbalanced data distribution, and (2) neglecting the relationship between the left and right eyes by utilizing the fundus image of only one eye for training without differentiating between them. To tackle these challenges, we propose the DRGCNN (DR Grading CNN) model. To solve the problem caused by imbalanced data distribution, our model adopts a more balanced strategy by allocating an equal number of channels to feature maps representing the various DR categories. Furthermore, we introduce a CAM-EfficientNetV2-M encoder dedicated to encoding input retinal fundus images for feature vector generation. The number of parameters of our encoder is 52.88 M, which is less than RegNet_y_16gf (80.57 M) and EfficientNetB7 (63.79 M), but the corresponding kappa value is higher. Additionally, in order to take advantage of the binocular relationship, we input fundus retinal images from both eyes of the patient into the network for feature fusion during the training phase. We achieved a kappa value of 86.62% on the EyePACS dataset and 86.16% on the Messidor-2 dataset. Experimental results on these representative DR datasets demonstrate the exceptional performance of our DRGCNN model, establishing it as a highly competitive intelligent classification model in the field of DR. The code is available for use at https://github.com/Fat-Hai/DRGCNN.
Collapse
Affiliation(s)
- Zeru Hai
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China
| | - Beiji Zou
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China; School of Computer Science and Engineering, Central South University, Changsha, Hunan Province, 410083, China
| | - Xiaoxia Xiao
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China.
| | - Qinghua Peng
- School of Traditional Chinese Medicine, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China
| | - Junfeng Yan
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China
| | - Wensheng Zhang
- School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan Province, 410208, China; University of Chinese Academy of Sciences (UCAS), Beijing, 100049, China; Research Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
| | - Kejuan Yue
- School of Computer Science, Hunan First Normal University, Changsha, Hunan Province, 410205, China
| |
Collapse
|
25
|
Zang F, Ma H. CRA-Net: Transformer guided category-relation attention network for diabetic retinopathy grading. Comput Biol Med 2024; 170:107993. [PMID: 38277925 DOI: 10.1016/j.compbiomed.2024.107993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2023] [Revised: 12/30/2023] [Accepted: 01/13/2024] [Indexed: 01/28/2024]
Abstract
Automated grading of diabetic retinopathy (DR) is an important means of assisting clinical diagnosis and preventing further retinal damage. However, imbalances and similarities between categories in DR datasets make it highly challenging to accurately grade the severity of the condition. Furthermore, DR images encompass various lesions, and the pathological relationship information among these lesions can easily be overlooked. For instance, under different severity levels, the contributions of different lesions to accurate model grading differ significantly. To address the aforementioned issues, we design a transformer guided category-relation attention network (CRA-Net). Specifically, we propose a novel category attention block that enhances feature information within each class from the perspective of DR image categories, thereby alleviating the class imbalance problem. Additionally, we design a lesion relation attention block that captures relationships between lesions by incorporating attention mechanisms in two primary aspects: capsule attention models the relative importance of different lesions, allowing the model to focus on more "informative" ones, while spatial attention captures the global positional relationships between lesion features under transformer guidance, facilitating more accurate localization of lesions. Experimental and ablation studies on the DDR and APTOS 2019 datasets demonstrate the effectiveness of CRA-Net, which achieves competitive performance.
Collapse
Affiliation(s)
- Feng Zang
- School of Electronic Engineering, Heilongjiang University, Harbin 150080, China.
| | - Hui Ma
- School of Electronic Engineering, Heilongjiang University, Harbin 150080, China.
| |
Collapse
|
26
|
Bhati A, Gour N, Khanna P, Ojha A, Werghi N. An interpretable dual attention network for diabetic retinopathy grading: IDANet. Artif Intell Med 2024; 149:102782. [PMID: 38462283 DOI: 10.1016/j.artmed.2024.102782] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2023] [Revised: 01/05/2024] [Accepted: 01/15/2024] [Indexed: 03/12/2024]
Abstract
Diabetic retinopathy (DR) is the most prevalent cause of visual impairment in adults worldwide. Typically, patients with DR do not show symptoms until later stages, by which time it may be too late to receive effective treatment. DR grading is challenging because of the small size and variation of lesion patterns. The key to fine-grained DR grading is to discover discriminating elements such as cotton wool spots, hard exudates, hemorrhages, and microaneurysms. Although deep learning models like convolutional neural networks (CNNs) seem ideal for the automated detection of abnormalities in advanced clinical imaging, small lesions are very hard to distinguish using traditional networks. This work proposes a bi-directional spatial and channel-wise parallel attention-based network to learn discriminative features for diabetic retinopathy grading. The proposed attention block, plugged into a backbone network, helps extract features specific to fine-grained DR grading. This scheme boosts classification performance along with the detection of small lesion parts. Extensive experiments are performed on four widely used benchmark datasets for DR grading, and performance is evaluated on different quality metrics. For model interpretability, activation maps are generated using the LIME method to visualize the predicted lesion parts. In comparison with state-of-the-art methods, the proposed IDANet exhibits better performance for DR grading and lesion detection.
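Lesion visualization with LIME follows the standard lime-image workflow; in this sketch `image` (an HxWx3 fundus array) and `predict_fn` (a wrapper mapping an image batch to class probabilities for the trained grader) are assumed to exist, and the sampling parameters are illustrative:

```python
from lime import lime_image
from skimage.segmentation import mark_boundaries

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,            # assumed: HxWx3 uint8 fundus image
    predict_fn,       # assumed: batch of images -> class probabilities
    top_labels=5, hide_color=0, num_samples=1000)
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True,
    num_features=5, hide_rest=False)
overlay = mark_boundaries(img / 255.0, mask)  # outlines regions driving the prediction
```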
Collapse
Affiliation(s)
- Amit Bhati
- PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur 482005, India
| | - Neha Gour
- Department of Electrical Engineering and Computer Science, Khalifa University, Abu Dhabi, United Arab Emirates
| | - Pritee Khanna
- PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur 482005, India.
| | - Aparajita Ojha
- PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur 482005, India
| | - Naoufel Werghi
- Department of Electrical Engineering and Computer Science, Khalifa University, Abu Dhabi, United Arab Emirates
| |
Collapse
|
27
|
Ma Y, He J, Tan D, Han X, Feng R, Xiong H, Peng X, Pu X, Zhang L, Li Y, Chen S. The clinical and imaging data fusion model for single-period cerebral CTA collateral circulation assessment. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2024; 32:953-971. [PMID: 38820061 DOI: 10.3233/xst-240083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2024]
Abstract
BACKGROUND The Chinese population ranks among the highest globally in terms of stroke prevalence. In the clinical diagnostic process, radiologists utilize computed tomography angiography (CTA) images to precisely assess collateral circulation in the brains of stroke patients. Recent studies frequently combine imaging and machine learning methods to develop computer-aided diagnostic algorithms. However, in studies of collateral circulation assessment, the extracted imaging features are primarily manually designed statistical features, which exhibit significant limitations in representational capacity. Accurately assessing collateral circulation from image features in brain CTA images therefore remains challenging. METHODS To tackle this issue, and considering the scarcity of publicly accessible medical datasets, we combined clinical data with imaging data to establish a dataset named RadiomicsClinicCTA. We devised two collateral circulation assessment models to exploit the synergistic potential of patients' clinical information and imaging data: data-level fusion and feature-level fusion. To remove redundant features from the dataset, we employed Levene's test and t-tests for feature pre-screening. Subsequently, we performed feature dimensionality reduction using the LASSO and random forest algorithms and trained classification models with various machine learning algorithms on the data-level fusion dataset after feature engineering. RESULTS Experimental results on the RadiomicsClinicCTA dataset demonstrate that the optimized data-level fusion model achieves an accuracy and AUC value exceeding 86%. We then trained and assessed the feature-level fusion classification model, which outperforms the optimized data-level fusion model. Comparative experiments show that the fused dataset differentiates good from poor collateral circulation better than the pure radiomics dataset. CONCLUSIONS Our study underscores the efficacy of integrating clinical and imaging data through fusion models, significantly enhancing the accuracy of collateral circulation assessment in stroke patients.
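A sketch of the described pre-screening with SciPy, using Levene's test to choose between equal- and unequal-variance t-tests; the threshold and data layout are assumptions, not the authors' exact protocol:

```python
from scipy import stats

def prescreen_features(X, y, p_thresh=0.05):
    """Keep features that differ significantly between the two
    collateral-circulation classes.

    X: (n_samples, n_features) pandas DataFrame of radiomic/clinical features
    y: binary numpy array (0 = poor, 1 = good collateral circulation)
    """
    g0, g1 = X[y == 0], X[y == 1]
    keep = []
    for col in X.columns:
        # Levene's test decides whether equal variances can be assumed
        equal_var = stats.levene(g0[col], g1[col]).pvalue >= p_thresh
        p = stats.ttest_ind(g0[col], g1[col], equal_var=equal_var).pvalue
        if p < p_thresh:
            keep.append(col)
    return X[keep]
```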
Collapse
Affiliation(s)
- Yuqi Ma
- College of Computer and Information Science, Southwest University, Chongqing, China
| | - Jingliu He
- College of Computer and Information Science, Southwest University, Chongqing, China
| | - Duo Tan
- The Second People's Hospital of Guizhou Province, Guizhou, China
| | - Xu Han
- School of Electrical and Information Engineering, Tianjin University, Tianjin, China
| | - Ruiqi Feng
- College of Computer and Information Science, Southwest University, Chongqing, China
| | - Hailing Xiong
- College of Electronic and Information Engineering, Southwest University, Chongqing, China
| | - Xihua Peng
- College of Computer and Information Science, Southwest University, Chongqing, China
| | - Xun Pu
- College of Computer and Information Science, Southwest University, Chongqing, China
| | - Lin Zhang
- College of Computer and Information Science, Southwest University, Chongqing, China
| | - Yongmei Li
- Department of Radiology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Shanxiong Chen
- College of Computer and Information Science, Southwest University, Chongqing, China
- Big Data & Intelligence Engineering School, Chongqing College of International Business and Economics, Chongqing, China
| |
Collapse
|
28
|
Xia H, Long J, Song S, Tan Y. Multi-scale multi-attention network for diabetic retinopathy grading. Phys Med Biol 2023; 69:015007. [PMID: 38035368 DOI: 10.1088/1361-6560/ad111d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2023] [Accepted: 11/30/2023] [Indexed: 12/02/2023]
Abstract
Objective. Diabetic retinopathy (DR) grading plays an important role in clinical diagnosis. However, automatic grading of DR is challenging due to the presence of intra-class variation and small lesions. On the one hand, deep features learned by convolutional neural networks often lose valid information about these small lesions. On the other hand, the great variability of lesion features, including differences in type and quantity, can exhibit considerable divergence even among fundus images of the same grade. To address these issues, we propose a novel multi-scale multi-attention network (MMNet). Approach. Firstly, to focus on different lesion features of fundus images, we propose a lesion attention module, which aims to encode multiple different lesion attention feature maps by combining channel attention and spatial attention, thus extracting global feature information and preserving diverse lesion features. Secondly, we propose a multi-scale feature fusion module to learn more feature information for small lesion regions, which combines complementary relationships between different convolutional layers to capture more detailed feature information. Furthermore, we introduce a cross-layer consistency constraint loss to overcome semantic differences between multi-scale features. Main results. The proposed MMNet obtains a high accuracy of 86.4% and a high kappa score of 88.4% for multi-class DR grading tasks on the EyePACS dataset, as well as 98.6% AUC, 95.3% accuracy, 92.7% recall, 95.0% precision, and 93.3% F1-score for referral and non-referral classification on the Messidor-1 dataset. Extensive experiments on two challenging benchmarks demonstrate that our MMNet achieves significant improvements and outperforms other state-of-the-art DR grading methods. Significance. MMNet improves the efficiency and accuracy of diabetic retinopathy diagnosis and promotes the application of computer-aided medical diagnosis in DR screening.
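The multi-scale feature fusion module is described only at a high level; a generic PyTorch sketch of the underlying pattern (project feature maps from several convolutional stages to a common width, upsample, and sum at the finest resolution) is shown below, with all layer sizes illustrative:

```python
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusion(nn.Module):
    """Fuse shallow (detail-rich) and deep (semantic) feature maps by
    projecting each to a common width and summing at the shallow resolution."""
    def __init__(self, channels, out_channels=256):
        super().__init__()
        self.proj = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in channels)

    def forward(self, feats):            # feats: list of (B, C_i, H_i, W_i)
        target = feats[0].shape[-2:]     # the highest-resolution map
        fused = 0
        for f, proj in zip(feats, self.proj):
            f = proj(f)
            if f.shape[-2:] != target:
                f = F.interpolate(f, size=target, mode='bilinear',
                                  align_corners=False)
            fused = fused + f
        return fused
```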
Collapse
Affiliation(s)
- Haiying Xia
- School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, People's Republic of China
| | - Jie Long
- School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, People's Republic of China
| | - Shuxiang Song
- School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, People's Republic of China
| | - Yumei Tan
- School of Computer Science and Engineering, Guangxi Normal University, Guilin 541004, People's Republic of China
| |
Collapse
|
29
|
An X, Li P, Zhang C. Deep Cascade-Learning Model via Recurrent Attention for Immunofixation Electrophoresis Image Analysis. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3847-3859. [PMID: 37698964 DOI: 10.1109/tmi.2023.3314507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/14/2023]
Abstract
Immunofixation Electrophoresis (IFE) analysis has been an indispensable prerequisite for the diagnosis of M-protein, which is an important criterion for recognizing diverse plasma cell diseases. Existing intelligent methods of IFE diagnosis commonly employ a single unified classifier to directly classify whether M-protein exists and which isotype it is. However, this unified classification is not optimal because the two tasks have different characteristics and require different feature extraction techniques. Classifying M-protein existence depends on the presence or absence of dense bands in IFE data, while classifying the M-protein isotype depends on the location of dense bands. Consequently, a cascading two-classifier framework, with each classifier suited to its respective task, may achieve better performance. In this paper, we propose a novel deep cascade-learning model, which sequentially integrates a positive-negative classifier based on deep collocative learning and an isotype classifier based on a recurrent attention model to address these two tasks respectively. Specifically, the attention mechanism can mimic the visual perception of clinicians, where only the most informative local regions are extracted through sequential partial observations. This not only avoids the interference of redundant regions but also saves computational power. Further, domain knowledge about the SP lane and heavy-light-chain lanes is also introduced to assist attention location. Extensive numerical experiments show that our deep cascade-learning outperforms state-of-the-art methods on recognized evaluation metrics and can effectively capture the co-location of dense bands in different lanes.
Collapse
|
30
|
Kothadiya D, Rehman A, Abbas S, Alamri FS, Saba T. Attention-based deep learning framework to recognize diabetes disease from cellular retinal images. Biochem Cell Biol 2023; 101:550-561. [PMID: 37473447 DOI: 10.1139/bcb-2023-0151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/22/2023] Open
Abstract
A medical disorder known as diabetic retinopathy (DR) affects people who suffer from diabetes, and many people are visually impaired due to DR. The primary cause of DR is high blood sugar, which damages the blood vessels of the retina. Recent advances in deep learning and computer vision methods, and their automated applications, make it possible to recognize the presence of DR in retinal cell and vessel images. The authors propose an attention-based hybrid model to recognize DR at an early stage and prevent harmful consequences. The proposed methodology uses the DenseNet121 architecture for convolutional feature learning; the feature vector is then enhanced with channel and spatial attention models. The proposed architecture also supports binary and multiclass classification to recognize the infection and the spreading of the disease. Binary classification recognizes DR images as either positive or negative, while multiclass classification grades the infection on a scale of 0-4. Simulations of the proposed methodology achieved 98.57% and 99.01% accuracy for multiclass and binary classification, respectively. The study also explored the impact of data augmentation to make the proposed model robust and generalizable. The attention-based deep learning model achieved remarkable accuracy in detecting diabetic infection from retinal cellular images.
Collapse
Affiliation(s)
- Deep Kothadiya
- Artificial Intelligence and Data Analytics Lab (AIDA), CCIS, Prince Sultan University, Riyadh 11586, Saudi Arabia
- U & P U Patel Department of Computer Engineering, Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, India
| | - Amjad Rehman
- Artificial Intelligence and Data Analytics Lab (AIDA), CCIS, Prince Sultan University, Riyadh 11586, Saudi Arabia
| | - Sidra Abbas
- Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
| | - Faten S Alamri
- Department of Mathematical Sciences, College of Science, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
| | - Tanzila Saba
- Artificial Intelligence and Data Analytics Lab (AIDA), CCIS, Prince Sultan University, Riyadh 11586, Saudi Arabia
| |
Collapse
|
31
|
Yuan W, Yang J, Yin B, Fan X, Yang J, Sun H, Liu Y, Su M, Li S, Huang X. Noninvasive diagnosis of oral squamous cell carcinoma by multi-level deep residual learning on optical coherence tomography images. Oral Dis 2023; 29:3223-3231. [PMID: 35842738 DOI: 10.1111/odi.14318] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2021] [Revised: 06/10/2022] [Accepted: 07/13/2022] [Indexed: 11/29/2022]
Abstract
BACKGROUND Oral Squamous Cell Carcinoma (OSCC) is one of the most severe cancers in the world, and its early detection is crucial for saving patients' lives. There is a pressing need to develop automatic noninvasive OSCC diagnosis approaches that identify malignant tissue on Optical Coherence Tomography (OCT) images. METHODS This study presents a novel Multi-Level Deep Residual Learning (MDRL) network to identify malignant and benign (normal) tissues from OCT images and trains the network on 460 OCT images captured from 37 patients. The diagnostic performance is compared with different methods at the image level and the resected-patch level. RESULTS The MDRL system achieves excellent diagnostic performance, with 91.2% sensitivity, 83.6% specificity, 87.5% accuracy, 85.3% PPV, and 90.2% NPV at the image level, with an AUC value of 0.92. It also achieves 100% sensitivity, 86.7% specificity, 93.1% accuracy, 87.5% PPV, and 100% NPV at the resected-patch level. CONCLUSION The developed deep learning system shows superior performance in noninvasive oral squamous cell carcinoma diagnosis compared with traditional CNNs and a specialist.
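The image-level and patch-level figures quoted above are all derived from binary confusion-matrix counts; for reference, a small helper computing them:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, PPV, and NPV from
    binary confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # recall on malignant tissue
        "specificity": tn / (tn + fp),
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
        "ppv":         tp / (tp + fp),   # positive predictive value
        "npv":         tn / (tn + fn),   # negative predictive value
    }
```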
Collapse
Affiliation(s)
- Wei Yuan
- Department of Oral and Maxillofacial-Head and Neck Oncology, Beijing Stomatological Hospital, School of Stomatology, Capital Medical University, Beijing, China
| | - Jinsuo Yang
- Department of Oral and Maxillofacial-Head and Neck Oncology, Beijing Stomatological Hospital, School of Stomatology, Capital Medical University, Beijing, China
| | - Boya Yin
- Department of Oral and Maxillofacial-Head and Neck Oncology, Beijing Stomatological Hospital, School of Stomatology, Capital Medical University, Beijing, China
| | - Xingyu Fan
- Department of Oral and Maxillofacial-Head and Neck Oncology, Beijing Stomatological Hospital, School of Stomatology, Capital Medical University, Beijing, China
| | - Jing Yang
- Department of Oral and Maxillofacial-Head and Neck Oncology, Beijing Stomatological Hospital, School of Stomatology, Capital Medical University, Beijing, China
| | - Haibin Sun
- Department of Oral and Maxillofacial-Head and Neck Oncology, Beijing Stomatological Hospital, School of Stomatology, Capital Medical University, Beijing, China
| | - Yanbin Liu
- Department of Oral and Maxillofacial-Head and Neck Oncology, Beijing Stomatological Hospital, School of Stomatology, Capital Medical University, Beijing, China
| | - Ming Su
- Department of Oral and Maxillofacial-Head and Neck Oncology, Beijing Stomatological Hospital, School of Stomatology, Capital Medical University, Beijing, China
| | - Sen Li
- College of Science, Harbin Institute of Technology, Shenzhen, China
| | - Xin Huang
- Department of Oral and Maxillofacial-Head and Neck Oncology, Beijing Stomatological Hospital, School of Stomatology, Capital Medical University, Beijing, China
| |
Collapse
|
32
|
Suresh T, Brijet Z, Subha TD. Imbalanced medical disease dataset classification using enhanced generative adversarial network. Comput Methods Biomech Biomed Engin 2023; 26:1702-1718. [PMID: 36322625 DOI: 10.1080/10255842.2022.2134729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2022] [Revised: 09/17/2022] [Accepted: 10/06/2022] [Indexed: 11/05/2022]
Abstract
Imbalanced datasets are a major issue in healthcare applications. In medical data classification, at least one class often forms only a very small minority of the samples, which is a drawback for most machine learning algorithms: when the dataset is imbalanced, existing classification algorithms typically perform badly on minority-class cases. To deal with the class imbalance issue, an enhanced generative adversarial network (E-GAN) is proposed in this article. The proposed approach consolidates a deep convolutional generative adversarial network and a modified convolutional neural network (DCG-MCNN). Initially, the imbalanced data are converted into balanced data in a preprocessing step comprising data cleaning, normalization, transformation, and reduction using the radius synthetic minority oversampling technique (RSMOTE). The DCG balances the dataset by generating extra samples for the training set, on which a modified CNN diagnosis model then performs the medical disease classification. The proposed system is implemented in MATLAB. Performance analysis on the Breast Cancer Wisconsin dataset shows improvements in maximum geometric mean (MGM) of 8.686%, 2.931%, and 5.413%, and in Matthews correlation coefficient (MCC) of 9.776%, 1.841%, and 5.413%, compared to the existing methods.
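RSMOTE is a radius-based variant of SMOTE; standard SMOTE from the imbalanced-learn library, shown below on the same Breast Cancer Wisconsin data, illustrates the oversampling step (a stand-in, since RSMOTE itself is not part of that library):

```python
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
print("before:", Counter(y))                      # imbalanced class counts
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print("after: ", Counter(y_res))                  # balanced by synthetic samples
```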
Collapse
Affiliation(s)
- T Suresh
- Department of Electronics and Communication Engineering, R.M.K. Engineering college, Kavaraipettai, Tamil Nadu, India
| | - Z Brijet
- Department of Electronics and Instrumentation Engineering, Velammal Engineering College, Surapet, Chennai, Tamil Nadu, India
| | - T D Subha
- Department of Electronics and Communication Engineering, R.M.K. Engineering college, Kavaraipettai, Tamil Nadu, India
| |
Collapse
|
33
|
Kukkar A, Gupta D, Beram SM, Soni M, Singh NK, Sharma A, Neware R, Shabaz M, Rizwan A. Optimizing Deep Learning Model Parameters Using Socially Implemented IoMT Systems for Diabetic Retinopathy Classification Problem. IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS 2023; 10:1654-1665. [DOI: 10.1109/tcss.2022.3213369] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/09/2024]
Affiliation(s)
- Ashima Kukkar
- Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
| | - Dinesh Gupta
- Computer Science & Engineering, I. K. Gujral Punjab Technical University, Kapurthala, Punjab, India
| | - Shehab Mohamed Beram
- Department of Computing and Information Systems, Research Centre for Human-Machine Collaboration (HUMAC), Sunway University, Kuala Lumpur, Malaysia
| | - Mukesh Soni
- Department of Computer Science and Engineering, University Centre for Research and Development, Chandigarh University, Mohali, Punjab, India
| | - Nikhil Kumar Singh
- Department of Computer Science and Engineering, Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India
| | - Ashutosh Sharma
- School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India
| | - Rahul Neware
- Department of Computer Science and Engineering, G. H. Raisoni College of Engineering, Nagpur, India
| | - Mohammad Shabaz
- Model Institute of Engineering and Technology, Jammu, Jammu and Kashmir, India
| | - Ali Rizwan
- Department of Industrial Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah, Saudi Arabia
| |
Collapse
|
34
|
Tian M, Wang H, Sun Y, Wu S, Tang Q, Zhang M. Fine-grained attention & knowledge-based collaborative network for diabetic retinopathy grading. Heliyon 2023; 9:e17217. [PMID: 37449186 PMCID: PMC10336422 DOI: 10.1016/j.heliyon.2023.e17217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2023] [Revised: 06/09/2023] [Accepted: 06/10/2023] [Indexed: 07/18/2023] Open
Abstract
Accurate diabetic retinopathy (DR) grading is crucial for making a proper treatment plan to reduce the damage caused by vision loss. The task is challenging because DR-related lesions are often small, with subtle visual differences and large intra-class variation. Moreover, the relationships between the lesions and the DR levels are complicated. Although many deep learning (DL) DR grading systems have been developed with some success, there is still room for improvement in grading accuracy. A common issue is that little medical knowledge is used in these DL DR grading systems. As a result, the grading results are not properly interpretable by ophthalmologists, hindering their potential for practical application. This paper proposes a novel fine-grained attention and knowledge-based collaborative network (FA+KC-Net) to address this concern. The fine-grained attention network dynamically divides the extracted feature maps into smaller patches and effectively captures small, meaningful image features learned from a large number of retinopathy fundus images. The knowledge-based collaborative network extracts a priori medical knowledge features, i.e., lesions such as microaneurysms (MAs), soft exudates (SEs), hard exudates (EXs), and hemorrhages (HEs). Finally, decision rules are developed to fuse the DR grading results from the fine-grained network and the knowledge-based collaborative network into the final grading. Extensive experiments are carried out on four widely used datasets, DDR, Messidor, APTOS, and EyePACS, to evaluate the efficacy of our method and compare it with other state-of-the-art (SOTA) DL models. Simulation results show that the proposed FA+KC-Net is accurate and stable, achieving the best performance on the DDR, Messidor, and APTOS datasets.
Collapse
Affiliation(s)
- Miao Tian
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
| | - Hongqiu Wang
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
| | - Yingxue Sun
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
| | - Shaozhi Wu
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
| | - Qingqing Tang
- Department of Ophthalmology, West China Hospital, Sichuan University, Chengdu, 610041, China
| | - Meixia Zhang
- Department of Ophthalmology, West China Hospital, Sichuan University, Chengdu, 610041, China
| |
Collapse
|
35
|
Huang Y, Lin L, Cheng P, Lyu J, Tam R, Tang X. Identifying the Key Components in ResNet-50 for Diabetic Retinopathy Grading from Fundus Images: A Systematic Investigation. Diagnostics (Basel) 2023; 13:diagnostics13101664. [PMID: 37238149 DOI: 10.3390/diagnostics13101664] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2023] [Revised: 04/29/2023] [Accepted: 05/01/2023] [Indexed: 05/28/2023] Open
Abstract
Although deep learning-based diabetic retinopathy (DR) classification methods typically benefit from well-designed architectures of convolutional neural networks, the training setting also has a non-negligible impact on prediction performance. The training setting includes various interdependent components, such as an objective function, a data sampling strategy, and a data augmentation approach. To identify the key components in a standard deep learning framework (ResNet-50) for DR grading, we systematically analyze the impact of several major components. Extensive experiments are conducted on the publicly available EyePACS dataset. We demonstrate that (1) the DR grading framework is sensitive to input resolution, objective function, and composition of data augmentation; (2) using mean square error as the loss function can effectively improve performance with respect to a task-specific evaluation metric, namely the quadratically weighted Kappa; (3) utilizing eye pairs boosts the performance of DR grading; and (4) using data resampling to address the problem of imbalanced data distribution in EyePACS hurts the performance. Based on these observations and an optimal combination of the investigated components, our framework, without any specialized network design, achieves a state-of-the-art result (0.8631 for Kappa) on the EyePACS test set (a total of 42,670 fundus images) with only image-level labels. We also examine the proposed training practices on other fundus datasets and other network architectures to evaluate their generalizability. Our codes and pre-trained model are available online.
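Two of these findings are easy to make concrete: the quadratically weighted Kappa is available in scikit-learn, and an MSE-trained network outputs a continuous severity score that must be mapped back to the five grades at inference time (the rounding rule below is a common choice, not necessarily the authors'):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Quadratically weighted Kappa, the task-specific DR grading metric
y_true = np.array([0, 1, 2, 3, 4, 2, 1])
y_pred = np.array([0, 1, 2, 2, 4, 3, 1])
qwk = cohen_kappa_score(y_true, y_pred, weights="quadratic")

# With an MSE objective the network predicts a continuous severity score,
# which is rounded and clipped back to the five grades at inference.
scores = np.array([-0.2, 1.4, 2.6, 3.1, 4.3])
grades = np.clip(np.round(scores), 0, 4).astype(int)
```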
Collapse
Affiliation(s)
- Yijin Huang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- School of Biomedical Engineering, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Li Lin
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China
| | - Pujin Cheng
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| | - Junyan Lyu
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Roger Tam
- School of Biomedical Engineering, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Xiaoying Tang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| |
Collapse
|
36
|
Liu R, Wang T, Li H, Zhang P, Li J, Yang X, Shen D, Sheng B. TMM-Nets: Transferred Multi- to Mono-Modal Generation for Lupus Retinopathy Diagnosis. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:1083-1094. [PMID: 36409801 DOI: 10.1109/tmi.2022.3223683] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Rare diseases, which are severely underrepresented in basic and clinical research, can particularly benefit from machine learning techniques. However, current learning-based approaches usually focus on either mono-modal image data or matched multi-modal data, whereas the diagnosis of rare diseases necessitates the aggregation of unstructured and unmatched multi-modal image data due to their rare and diverse nature. In this study, we therefore propose diagnosis-guided multi-to-mono modal generation networks (TMM-Nets) along with training and testing procedures. TMM-Nets can transfer data from multiple sources to a single modality for diagnostic data structurization. To demonstrate their potential in the context of rare diseases, TMM-Nets were deployed to diagnose lupus retinopathy (LR-SLE), leveraging unmatched regular and ultra-wide-field fundus images for transfer learning. The TMM-Nets encoded the transfer learning from diabetic retinopathy to LR-SLE based on the similarity of the fundus lesions. In addition, a lesion-aware multi-scale attention mechanism was developed for clinical alerts, enabling TMM-Nets not only to inform patient care, but also to provide insights consistent with those of clinicians. An adversarial strategy was also developed to refine multi- to mono-modal image generation based on diagnostic results and the data distribution to enhance the data augmentation performance. Compared to the baseline model, the TMM-Nets showed 35.19% and 33.56% F1 score improvements on the test and external validation sets, respectively. In addition, the TMM-Nets can be used to develop diagnostic models for other rare diseases.
Collapse
|
37
|
Wang X, Han Y, Deng Y. CSGSA-Net: Canonical-structured graph sparse attention network for fetal ECG estimation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
|
38
|
Jian M, Chen H, Tao C, Li X, Wang G. Triple-DRNet: A triple-cascade convolution neural network for diabetic retinopathy grading using fundus images. Comput Biol Med 2023; 155:106631. [PMID: 36805216 DOI: 10.1016/j.compbiomed.2023.106631] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2022] [Revised: 01/29/2023] [Accepted: 02/04/2023] [Indexed: 02/10/2023]
Abstract
Diabetic Retinopathy (DR) is a common ocular complication in diabetes patients and a leading cause of blindness worldwide. Automatic and efficient DR grading plays a vital role in timely treatment. However, it is difficult to effectively distinguish different types of distinct lesions (such as neovascularization in proliferative DR, microaneurysms in mild NPDR, etc.) using traditional convolutional neural networks (CNNs), which greatly affects the ultimate classification results. In this article, we propose a triple-cascade network model (Triple-DRNet) to solve the aforementioned issue. The Triple-DRNet effectively subdivides the five-class DR classification and improves grading performance through the following stages: (1) In the first stage, the network carries out a two-class classification, namely DR and No DR. (2) In the second stage, a cascade network distinguishes between PDR and NPDR. (3) The final cascade network differentiates the mild, moderate, and severe types of NPDR. Experimental results show that the accuracy of the Triple-DRNet on the APTOS 2019 Blindness Detection dataset achieves 92.08%, and the QWK metric reaches 93.62%, which proves the effectiveness of the devised Triple-DRNet compared with other mainstream models.
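A sketch of the three-stage inference logic implied by the cascade; the stage networks and the grade encoding (the standard 0-4 international scale) are illustrative assumptions, not the authors' exact code:

```python
def triple_cascade_grade(image, net1, net2, net3):
    """Three-stage cascade inference.

    net1: DR vs. no-DR     net2: PDR vs. NPDR     net3: NPDR severity
    Returns a grade in {0: no DR, 1: mild, 2: moderate, 3: severe, 4: PDR}.
    """
    if not net1(image):       # stage 1: any retinopathy at all?
        return 0
    if net2(image):           # stage 2: proliferative DR?
        return 4
    return 1 + net3(image)    # stage 3: NPDR severity in {0, 1, 2}
```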
Collapse
Affiliation(s)
- Muwei Jian
- School of Information Science and Technology, Linyi University, Linyi, China; School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China.
| | - Hongyu Chen
- School of Information Science and Technology, Linyi University, Linyi, China
| | - Chen Tao
- School of Information Science and Technology, Linyi University, Linyi, China
| | - Xiaoguang Li
- Faculty of Information Tecnology, Beijing University of Technology, Beijing, China.
| | - Gaige Wang
- School of Computer Science and Technology, Ocean University of China, Qingdao, China
| |
Collapse
|
39
|
Li M, Chen C, Cao Y, Zhou P, Deng X, Liu P, Wang Y, Lv X, Chen C. CIABNet: Category imbalance attention block network for the classification of multi-differentiated types of esophageal cancer. Med Phys 2023; 50:1507-1527. [PMID: 36272103 DOI: 10.1002/mp.16067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2022] [Revised: 08/25/2022] [Accepted: 09/09/2022] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND Esophageal cancer has become one of the cancers that most seriously threaten human life and health, and its incidence and mortality rates remain among the highest of malignant tumors. Histopathological image analysis is the gold standard for diagnosing the different differentiation types of esophageal cancer. PURPOSE The grading accuracy and interpretability of auxiliary diagnostic models for esophageal cancer are seriously affected by small interclass differences and imbalanced data distribution. We therefore focused on developing the category imbalance attention block network (CIABNet) model to address these problems. METHODS First, quantitative metrics and model visualization results are integrated to transfer knowledge from source domain images so as to better identify the regions of interest (ROI) in the target domain of esophageal cancer. Second, in order to attend to the subtle interclass differences, we propose the concatenate fusion attention block, which can simultaneously focus on contextual local feature relationships and the changes of channel attention weights among different regions. Third, we propose a category imbalance attention module, which treats each esophageal cancer differentiation class fairly by aggregating information of different intensities at multiple scales and explores more representative regional features for each class, effectively mitigating the negative impact of category imbalance. Finally, we use feature map visualization to interpret whether the ROIs are the same or similar between the model and pathologists, thus improving the interpretability of the model. RESULTS The experimental results show that the CIABNet model outperforms other state-of-the-art models, achieving the best results in classifying the differentiation types of esophageal cancer with an average classification accuracy of 92.24%, an average precision of 93.52%, an average recall of 90.31%, an average F1 value of 91.73%, and an average AUC value of 97.43%. In addition, the CIABNet model attends to ROIs essentially similar or identical to those of pathologists in histopathological images of esophageal cancer. CONCLUSIONS Our experimental results show that the proposed computer-aided diagnostic algorithm has great potential for histopathological images of multi-differentiated types of esophageal cancer.
Collapse
Affiliation(s)
- Min Li
- College of Information Science and Engineering, Xinjiang University, Urumqi, China
- Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi, China
| | - Chen Chen
- College of Information Science and Engineering, Xinjiang University, Urumqi, China
- Xinjiang Cloud Computing Application Laboratory, Karamay, China
| | - Yanzhen Cao
- Department of Pathology, The Affiliated Tumor Hospital of Xinjiang Medical University, Urumqi, China
| | - Panyun Zhou
- College of Software, Xinjiang University, Urumqi, China
| | - Xin Deng
- College of Software, Xinjiang University, Urumqi, China
| | - Pei Liu
- College of Information Science and Engineering, Xinjiang University, Urumqi, China
| | - Yunling Wang
- The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
| | - Xiaoyi Lv
- College of Information Science and Engineering, Xinjiang University, Urumqi, China
- Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi, China
- Xinjiang Cloud Computing Application Laboratory, Karamay, China
- College of Software, Xinjiang University, Urumqi, China
- Key Laboratory of software engineering technology, Xinjiang University, Urumqi, China
| | - Cheng Chen
- College of Software, Xinjiang University, Urumqi, China
| |
Collapse
|
40
|
A Medical Image Segmentation Method Based on Improved UNet 3+ Network. Diagnostics (Basel) 2023; 13:diagnostics13030576. [PMID: 36766681 PMCID: PMC9914627 DOI: 10.3390/diagnostics13030576] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2022] [Revised: 01/16/2023] [Accepted: 02/01/2023] [Indexed: 02/07/2023] Open
Abstract
In recent years, segmentation details and computing efficiency have become more important in medical image segmentation for clinical applications. In deep learning, UNet, based on a convolutional neural network, is one of the most commonly used models. UNet 3+ was designed as a modified UNet by adopting the architecture of full-scale skip connections. However, full-scale feature fusion can result in excessively redundant computations. This study aimed to reduce the network parameters of UNet 3+ while further improving its feature extraction capability. First, to eliminate redundancy and improve computational efficiency, we prune the full-scale skip connections of UNet 3+. In addition, we use the Convolutional Block Attention Module (CBAM) to capture more essential features and thus improve the feature expression capability. The performance of the proposed model was validated on three different types of datasets: skin cancer segmentation, breast cancer segmentation, and lung segmentation. The parameters are reduced by about 36% and 18% compared to UNet and UNet 3+, respectively. The results show that the proposed method not only outperformed the comparison models on a variety of evaluation metrics but also achieved more accurate segmentation results. The proposed models have fewer network parameters while enhancing feature extraction and improving segmentation performance. Furthermore, the models have great potential for application in medical imaging computer-aided diagnosis.
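CBAM, the attention module adopted here, is a published design (Woo et al., 2018); a compact PyTorch rendering of its channel-then-spatial attention, with the reduction ratio and kernel size as the paper's usual defaults:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed
    by spatial attention."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False))
        self.spatial = nn.Conv2d(2, 1, kernel_size,
                                 padding=kernel_size // 2, bias=False)

    def forward(self, x):
        # channel attention from average- and max-pooled descriptors
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # spatial attention from channel-pooled maps
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```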
Collapse
|
41
|
Jasper Gnana Chandran J, Jabez J, Srinivasulu S. Auto-Metric Graph Neural Network optimized with Capuchin search optimization algorithm for coinciding diabetic retinopathy and diabetic Macular edema grading. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
42
|
Attention-Driven Cascaded Network for Diabetic Retinopathy Grading from Fundus Images. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
43
|
Han Z, Yang B, Deng S, Li Z, Tong Z. Category weighted network and relation weighted label for diabetic retinopathy screening. Comput Biol Med 2023; 152:106408. [PMID: 36516580 DOI: 10.1016/j.compbiomed.2022.106408] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Revised: 11/10/2022] [Accepted: 12/03/2022] [Indexed: 12/08/2022]
Abstract
Diabetic retinopathy (DR) is the primary cause of blindness in adults. Incorporating machine learning into DR grading can improve the accuracy of medical diagnosis. However, problems such as severe data imbalance persist, and existing studies on DR grading ignore the correlation among its labels. In this study, a category weighted network (CWN) is proposed to achieve data balance at the model level. In the CWN, a reference for weight settings is provided by calculating the category gradient norm, reducing the experimental overhead. We further propose using relation weighted labels instead of one-hot labels to exploit the distance relationship between labels. Experiments revealed that the proposed CWN achieves excellent performance on various DR datasets. Furthermore, relation weighted labels exhibit broad applicability and can improve other methods that use one-hot labels. The proposed method achieved kappa scores of 0.9431 and 0.9226 and accuracies of 90.94% and 86.12% on the DDR and APTOS datasets, respectively.
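The paper's exact relation weighting is not given in the abstract; a plausible minimal sketch replaces the one-hot target with a soft label whose mass decays with grade distance (the exponential kernel and temperature are assumptions):

```python
import numpy as np

def relation_weighted_label(y, num_classes=5, tau=1.0):
    """Soft target decaying with grade distance, in place of one-hot.

    For y=2, grades 1 and 3 receive more mass than grades 0 and 4,
    encoding that adjacent DR severities are closer than distant ones.
    """
    dist = np.abs(np.arange(num_classes) - y)
    w = np.exp(-dist / tau)
    return w / w.sum()
```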
Collapse
Affiliation(s)
- Zhike Han
- Zhejiang University, Hangzhou, 310027, Zhejiang, China; Zhejiang University City College, Hangzhou, 310015, Zhejiang, China
| | - Bin Yang
- Zhejiang University, Hangzhou, 310027, Zhejiang, China
| | | | - Zhuorong Li
- Zhejiang University City College, Hangzhou, 310015, Zhejiang, China.
| | - Zhou Tong
- The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310058, Zhejiang, China
| |
Collapse
|
44
|
Alwakid G, Gouda W, Humayun M, Jhanjhi NZ. Deep learning-enhanced diabetic retinopathy image classification. Digit Health 2023; 9:20552076231194942. [PMID: 37588156 PMCID: PMC10426308 DOI: 10.1177/20552076231194942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 07/28/2023] [Indexed: 08/18/2023] Open
Abstract
Objective Diabetic retinopathy (DR) can often be prevented from causing irreversible vision loss if caught and treated properly. In this work, a deep learning (DL) model is employed to accurately identify all five stages of DR. Methods The suggested methodology presents two cases, one with and one without image augmentation; a balanced dataset meeting the same criteria in both cases is generated using augmentative methods. The DenseNet-121-based model performed exceptionally well on the Asia Pacific Tele-Ophthalmology Society (APTOS) dataset and the Dataset for Diabetic Retinopathy (DDR) when compared to other methods for identifying the five stages of DR. Results Our proposed model achieved the highest test accuracy of 98.36%, top-2 accuracy of 100%, and top-3 accuracy of 100% on the APTOS dataset, and the highest test accuracy of 79.67%, top-2 accuracy of 92.76%, and top-3 accuracy of 98.94% on the DDR dataset. Additional criteria (precision, recall, and F1-score) for gauging the efficacy of the proposed model were also computed on APTOS and DDR. Conclusions Feeding the model higher-quality images increased its efficiency and learning ability, compared with both state-of-the-art methods and the non-enhanced model.
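The quoted top-2/top-3 figures are standard top-k accuracies; a small PyTorch helper for computing them:

```python
def topk_accuracy(logits, targets, k=2):
    """Fraction of samples whose true grade is among the k highest-scored
    classes (the 'top-2'/'top-3' accuracy quoted above).

    logits:  (N, C) class scores    targets: (N,) integer labels
    """
    topk = logits.topk(k, dim=1).indices              # (N, k)
    hits = (topk == targets.unsqueeze(1)).any(dim=1)  # (N,)
    return hits.float().mean().item()
```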
Collapse
Affiliation(s)
- Ghadah Alwakid
- Department of Computer Science, College of Computer and Information Sciences, Jouf University, Sakakah, Saudi Arabia
| | - Walaa Gouda
- Department of Electrical Engineering, Faculty of Engineering at Shoubra, Benha University, Cairo, Egypt
| | - Mamoona Humayun
- Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakakah, Saudi Arabia
| | | |
Collapse
|
45
|
Guo X, Li X, Lin Q, Li G, Hu X, Che S. Joint grading of diabetic retinopathy and diabetic macular edema using an adaptive attention block and semisupervised learning. APPL INTELL 2022. [DOI: 10.1007/s10489-022-04295-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
|
46
|
Zhao S, Wu Y, Tong M, Yao Y, Qian W, Qi S. CoT-XNet: contextual transformer with Xception network for diabetic retinopathy grading. Phys Med Biol 2022; 67. [PMID: 36322995 DOI: 10.1088/1361-6560/ac9fa0] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2022] [Accepted: 11/01/2022] [Indexed: 11/07/2022]
Abstract
Objective. Diabetic retinopathy (DR) grading is primarily performed by assessing fundus images. Many types of lesions, such as microaneurysms, hemorrhages, and soft exudates, can appear simultaneously in a single image. However, their sizes may be small, making it difficult to differentiate adjacent DR grades even with deep convolutional neural networks (CNNs). Recently, vision transformers have shown comparable or even superior performance to CNNs, and they also learn visual representations different from those of CNNs. Inspired by this finding, we propose a two-path contextual transformer with Xception network (CoT-XNet) to improve the accuracy of DR grading. Approach. The representations learned by CoT through one path and those learned by the Xception network through another path are concatenated before the fully connected layer. Meanwhile, dedicated pre-processing, data resampling, and test-time augmentation strategies are implemented. The performance of CoT-XNet is evaluated on the publicly available DDR, APTOS2019, and EyePACS datasets, which together include over 50 000 images. Ablation experiments and comprehensive comparisons with various state-of-the-art (SOTA) models have also been performed. Main results. Our proposed CoT-XNet outperforms the available SOTA models, with accuracy and Kappa of 83.10% and 0.8496 on DDR, 84.18% and 0.9000 on APTOS2019, and 84.10% and 0.7684 on EyePACS, respectively. Class activation maps of the CoT and Xception networks are different and complementary in most images. Significance. By concatenating the different visual representations learned by the CoT and Xception networks, CoT-XNet can accurately grade DR from fundus images and generalizes well. CoT-XNet will promote the application of artificial-intelligence-based systems in DR screening of large-scale populations.
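A minimal sketch of the two-path fusion idea, with placeholder backbones standing in for the CoT and Xception paths (neither ships with torchvision); each path is assumed to emit a flat feature vector per image, which matches the paper's concatenate-before-the-classifier description but nothing more.

```python
import torch
import torch.nn as nn

class TwoPathFusion(nn.Module):
    """Concatenate features from two backbones before a shared classifier."""
    def __init__(self, path_a: nn.Module, path_b: nn.Module,
                 dim_a: int, dim_b: int, num_classes: int = 5):
        super().__init__()
        self.path_a, self.path_b = path_a, path_b
        self.fc = nn.Linear(dim_a + dim_b, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([self.path_a(x), self.path_b(x)], dim=1)
        return self.fc(feats)

# Dummy stand-ins for the CoT and Xception feature extractors.
def make_backbone(dim: int) -> nn.Module:
    return nn.Sequential(nn.Conv2d(3, dim, 3, 2, 1),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())

model = TwoPathFusion(make_backbone(64), make_backbone(32), 64, 32)
logits = model(torch.randn(2, 3, 224, 224))  # (2, 5)
```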
Collapse
Affiliation(s)
- Shuiqing Zhao
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, People's Republic of China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, People's Republic of China
| | - Yanan Wu
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, People's Republic of China
| | - Mengmeng Tong
- Ningbo Blue Illumination Tech Co., Ltd, Ningbo, People's Republic of China
| | - Yudong Yao
- Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ, United States of America
| | - Wei Qian
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, People's Republic of China
| | - Shouliang Qi
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, People's Republic of China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, People's Republic of China
| |
Collapse
|
47
|
Wu C, Fu L, Tian Z, Liu J, Song J, Guo W, Zhao Y, Zheng D, Jin Y, Yi D, Jiang X. LWMA-Net: Light-weighted morphology attention learning for human embryo grading. Comput Biol Med 2022; 151:106242. [PMID: 36436483 DOI: 10.1016/j.compbiomed.2022.106242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2022] [Revised: 09/23/2022] [Accepted: 10/22/2022] [Indexed: 11/16/2022]
Abstract
Visual inspection of embryo morphology is routinely used in embryo assessment and selection. However, due to the complexity of morphologies and large inter- and intra-observer variances among embryologists, manual evaluation remains subjective and time-consuming. Thus, we proposed a light-weighted morphology attention learning network (LWMA-Net) for automatic assistance in embryo grading. The LWMA-Net integrates a morphology attention module (MAM) to seek informative features and their locations, and a multiscale fusion module (MFM) to increase the feature flow through the model. The LWMA-Net was trained on a primary set of 3599 embryos from 2318 couples clinically enrolled between Sep. 2016 and Dec. 2018, and generated areas under the receiver operating characteristic curve (AUCs) of 96.88% and 97.58% on 4- and 3-category grading, respectively. An independent test set comprising 691 embryos from 321 couples enrolled between Jan. 2019 and Jan. 2021 was used to assess the model's value in assisting embryo grading. Five experienced embryologists were invited to regrade the embryos in the independent set with and without the aid of the LWMA-Net, three months apart. Embryologists aided by our LWMA-Net significantly improved their grading capability, with average AUCs improved by 4.98% and 5.32% on the 4- and 3-category grading tasks, respectively, which suggests good potential of our LWMA-Net for assisted human reproduction.
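The paper does not spell out the MAM internals here, but attention modules of this kind typically weight "what" (channels) and "where" (spatial positions). A CBAM-style block is sketched below purely as a stand-in for the morphology attention module; the reduction ratio and 7x7 spatial kernel are assumptions.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """CBAM-style stand-in for a morphology attention module:
    channel attention selects informative features, spatial
    attention highlights their locations."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        # Channel attention from pooled descriptors (avg + max).
        ca = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                           self.mlp(x.amax(dim=(2, 3)))).view(n, c, 1, 1)
        x = x * ca
        # Spatial attention from per-position avg/max over channels.
        sa = torch.sigmoid(self.spatial(torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)))
        return x * sa
```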
Collapse
Affiliation(s)
- Chongwei Wu
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, Shenyang, 110122, China
| | - Langyuan Fu
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, Shenyang, 110122, China
| | - Zhiying Tian
- Key Laboratory of Reproductive Health and Medical Genetics, National Health and Family Planning Commission, Liaoning Research Institute of Family Planning, Shenyang, 110031, China
| | - Jiao Liu
- Department of Reproductive Medicine, Dalian Municipal Women and Children's Medical Center (Group), Dalian, 116083, China
| | - Jiangdian Song
- School of Medical Informatics, China Medical University, Shenyang, 110122, China
| | - Wei Guo
- College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
| | - Yu Zhao
- Department of Reproductive Medicine, Dalian Municipal Women and Children's Medical Center (Group), Dalian, 116083, China
| | - Duo Zheng
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, Shenyang, 110122, China
| | - Ying Jin
- Key Laboratory of Reproductive Health and Medical Genetics, National Health and Family Planning Commission, Liaoning Research Institute of Family Planning, Shenyang, 110031, China
| | - Dongxu Yi
- Key Laboratory of Reproductive Health and Medical Genetics, National Health and Family Planning Commission, Liaoning Research Institute of Family Planning, Shenyang, 110031, China
| | - Xiran Jiang
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, Shenyang, 110122, China.
| |
Collapse
|
48
|
A novel original feature fusion network for joint diabetic retinopathy and diabetic macular edema grading. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-08038-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
49
|
Wu C, Li S, Liu X, Jiang F, Shi B. DMs-MAFM+EfficientNet: a hybrid model for predicting dysthyroid optic neuropathy. Med Biol Eng Comput 2022; 60:3217-3230. [PMID: 36129645 PMCID: PMC9490694 DOI: 10.1007/s11517-022-02663-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2022] [Accepted: 08/19/2022] [Indexed: 11/24/2022]
Abstract
Thyroid-associated ophthalmopathy (TAO) is a very common autoimmune orbital disease. Approximately 4%-8% of TAO patients deteriorate and develop its most severe form, dysthyroid optic neuropathy (DON). According to current data provided by clinical experts, a certain proportion of suspected DON patients still cannot be diagnosed, and clinical evaluation has low sensitivity and specificity, so an efficient and accurate method to assist physicians in identifying DON is urgently needed. This study proposes a hybrid deep learning model to accurately identify suspected DON patients using computed tomography (CT). The hybrid model is mainly composed of the double multiscale and multi-attention fusion module (DMs-MAFM) and a deep convolutional neural network. The DMs-MAFM is the feature extraction module proposed in this study; it contains a multiscale feature fusion algorithm and improved channel and spatial attention, which can capture the features of tiny objects in images. Multiscale feature fusion is combined with an attention mechanism to form a multilevel feature extraction module: the multiscale fusion algorithm aggregates features from different receptive fields, and the channel and spatial correlations of the feature map are then fully captured by the multiscale channel attention aggregation module and the spatial attention module, respectively. According to the experimental results, the proposed hybrid model can accurately identify suspected DON patients, with accuracy reaching 96%, specificity 99.5%, sensitivity 94%, precision 98.9%, and F1-score 96.4%. According to expert evaluation, the proposed hybrid model offers useful guidance for the diagnosis and prediction of clinically suspected DON.
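A minimal sketch of the multiscale-aggregation idea described above, assuming parallel convolution branches with different kernel sizes whose outputs are concatenated and fused; the kernel sizes and 1x1 fusion are illustrative assumptions, not the published DMs-MAFM configuration.

```python
import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    """Aggregates parallel branches with different receptive fields.

    Kernel sizes are illustrative; in the paper this fusion feeds
    channel- and spatial-attention stages, omitted here for brevity.
    """
    def __init__(self, channels: int, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2)
            for k in kernel_sizes)
        self.fuse = nn.Conv2d(channels * len(kernel_sizes), channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```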
Collapse
Affiliation(s)
- Cong Wu
- School of Computer Science, Hubei University of Technology, Nanli Street 28, Wuhan, 430068, China.
| | - Shijun Li
- School of Computer Science, Hubei University of Technology, Nanli Street 28, Wuhan, 430068, China
| | - Xiao Liu
- School of Computer Science, Hubei University of Technology, Nanli Street 28, Wuhan, 430068, China
| | - Fagang Jiang
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Zhongshan Park, Wuhan, 430022, China
| | - Bingjie Shi
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Zhongshan Park, Wuhan, 430022, China
| |
Collapse
|
50
|
Li F, Tang S, Chen Y, Zou H. Deep attentive convolutional neural network for automatic grading of imbalanced diabetic retinopathy in retinal fundus images. BIOMEDICAL OPTICS EXPRESS 2022; 13:5813-5835. [PMID: 36733744 PMCID: PMC9872872 DOI: 10.1364/boe.472176] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Revised: 09/25/2022] [Accepted: 10/06/2022] [Indexed: 06/18/2023]
Abstract
Automated fine-grained diabetic retinopathy (DR) grading is of great significance for assisting ophthalmologists in monitoring DR and designing tailored treatments for patients. Nevertheless, it is a challenging task as a result of high intra-class variation, high inter-class similarity, small lesions, and imbalanced data distributions. The pivotal factor for success in fine-grained DR grading is to discern the subtler associated lesion features, such as microaneurysms (MA), hemorrhages (HM), soft exudates (SE), and hard exudates (HE). In this paper, we constructed a simple yet effective deep attentive convolutional neural network (DACNN) for DR grading and lesion discovery with only image-wise supervision. Designed as a top-down architecture, our model incorporates stochastic atrous spatial pyramid pooling (sASPP), a global attention mechanism (GAM), a category attention mechanism (CAM), and a learnable connected module (LCM) to better extract lesion-related features and maximize DR grading performance. Concretely, we devised sASPP, combining randomness with atrous spatial pyramid pooling (ASPP), to accommodate the various scales of the lesions and to mitigate the co-adaptation of multiple atrous convolutions. GAM was introduced to extract class-agnostic global attention feature details, whilst CAM was explored to seek class-specific, distinctive region-level lesion features and treat each DR severity grade equally, which tackles the problem of imbalanced DR data distributions. Further, the LCM was designed to automatically and adaptively search for the optimal connections among layers for better extraction of detailed small-lesion feature representations. The proposed approach obtained an accuracy of 88.0% and a kappa score of 88.6% on the multi-class DR grading task on the EyePACS dataset, and 98.5% AUC, 93.8% accuracy, 87.9% kappa, 90.7% recall, 94.6% precision, and 92.6% F1-score for referral versus non-referral classification on the Messidor dataset. Extensive experimental results on three challenging benchmarks demonstrated that the proposed approach achieves competitive performance in DR grading and lesion discovery from retinal fundus images compared with existing cutting-edge methods, and generalizes well to unseen DR datasets. These promising results highlight its potential as an efficient and reliable tool to assist ophthalmologists in large-scale DR screening.
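One plausible reading of "combining randomness with ASPP" is to randomly drop atrous branches during training, which would directly discourage their co-adaptation. The sketch below implements that reading; the dilation rates, drop probability, and mean-fusion are all assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class StochasticASPP(nn.Module):
    """ASPP whose dilated branches are randomly skipped during training,
    a hypothetical reconstruction of sASPP to reduce branch co-adaptation."""
    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18),
                 p_drop: float = 0.25):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates)
        self.p_drop = p_drop

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        outs = []
        for branch in self.branches:
            if self.training and torch.rand(()) < self.p_drop:
                continue  # randomly skip this dilation branch
            outs.append(branch(x))
        if not outs:  # guard: keep at least one branch active
            outs.append(self.branches[0](x))
        return torch.stack(outs).mean(dim=0)  # average the surviving branches
```

At inference time all branches are active, analogous to how dropout averages over the sub-networks sampled during training.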
Collapse
Affiliation(s)
- Feng Li
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
| | - Shiqing Tang
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
| | - Yuyang Chen
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
| | - Haidong Zou
- Shanghai Eye Disease Prevention & Treatment Center, Shanghai 200040, China
- Ophthalmology Center, Shanghai General Hospital, Shanghai 200080, China
| |
Collapse
|