1
Cao Z, Shi Y, Zhang S, Chen H, Liu W, Yue G, Lin H. Decentralized learning for medical image classification with prototypical contrastive network. Med Phys 2025. [PMID: 40089972] [DOI: 10.1002/mp.17753]
Abstract
BACKGROUND Recently, deep convolutional neural networks (CNNs) have shown great potential in medical image classification tasks. However, the practical usage of the methods is constrained by two challenges: 1) the challenge of using nonindependent and identically distributed (non-IID) datasets from various medical institutions while ensuring privacy, and 2) the data imbalance problem due to the frequency of different diseases. PURPOSE The objective of this paper is to present a novel approach for addressing these challenges through a decentralized learning method using a prototypical contrastive network to achieve precise medical image classification while mitigating the non-IID problem across different clients. METHODS We propose a prototype contrastive network that minimizes disparities among heterogeneous clients. This network utilizes an approximate global prototype to alleviate the non-IID dataset problem for each local client by projecting data onto a balanced prototype space. To validate the effectiveness of our algorithm, we employed three distinct datasets of color fundus photographs for diabetic retinopathy: the EyePACS, APTOS, and IDRiD datasets. During training, we incorporated 35k images from EyePACS, 3662 from APTOS, and 516 from IDRiD. For testing, we used 53k images from EyePACS. Additionally, we included the COVIDx dataset of chest X-rays for comparative analysis, comprising 29 986 training images and 400 test samples. RESULTS In this study, we conducted comprehensive comparisons with existing works using four medical image datasets. Specifically, on the EyePACS dataset under the balanced IID setting, our method outperformed the FedAvg baseline by 3.7% in accuracy. In the Dirichlet non-IID setting, which presents an extremely unbalanced distribution, our method showed a notable 6.6% enhancement in accuracy over FedAvg. 
Similarly, on the APTOS dataset, our method achieved a 3.7% improvement in accuracy over FedAvg under the balanced IID setting and a 5.0% improvement under the Dirichlet non-IID setting. Notably, on the DCC non-IID and COVID-19 datasets, our method established a new state-of-the-art across all evaluation metrics, including WAccuracy, WPrecision, WRecall, and WF-score. CONCLUSIONS Our proposed prototypical contrastive loss guides the local client's data distribution to align with the global distribution. Additionally, our method uses an approximate global prototype to address unbalanced dataset distribution across local clients by projecting all data onto a new balanced prototype space. Our model achieves state-of-the-art performance on the EyePACS, APTOS, IDRiD, and COVIDx datasets.
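The projection-onto-prototypes idea in this abstract can be illustrated with a minimal sketch of a prototypical contrastive objective. This is not the paper's implementation; the function name, normalization, and temperature value are assumptions, and a real federated system would compute the approximate global prototypes across clients.

```python
import numpy as np

def prototypical_contrastive_loss(features, labels, prototypes, tau=0.5):
    """InfoNCE-style loss pulling each sample toward its class prototype.

    features:   (N, D) local embeddings
    labels:     (N,)   integer class ids
    prototypes: (C, D) one (approximate global) prototype per class
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = f @ p.T / tau                       # (N, C) scaled cosine similarity
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()
```

Embeddings that sit near their own class prototype incur a small loss, which is the pull toward a shared, balanced prototype space that the abstract describes.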
Affiliation(s)
- Zhantao Cao
- Institutions for Research, CETC Cyberspace Security Technology CO., LTD., Chengdu, China
- Chengdu Westone Information Security Technology Co., Ltd., Chengdu, China
- Ubiquitous Intelligence and Trusted Services Key Laboratory of Sichuan Province, Chengdu, China
- Yuanbing Shi
- Institutions for Research, CETC Cyberspace Security Technology CO., LTD., Chengdu, China
- Chengdu Westone Information Security Technology Co., Ltd., Chengdu, China
- Shuli Zhang
- Institutions for Research, CETC Cyberspace Security Technology CO., LTD., Chengdu, China
- Huanan Chen
- Institutions for Research, CETC Cyberspace Security Technology CO., LTD., Chengdu, China
- Weide Liu
- Institute for Infocomm Research, A*STAR, Singapore, Singapore
- Guanghui Yue
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Huazhen Lin
- Center of Statistical Research and School of Statistics, Southwestern University of Finance and Economics, Chengdu, China
2
Ma Y, Gu Y, Guo S, Qin X, Wen S, Shi N, Dai W, Chen Y. Grade-Skewed Domain Adaptation via Asymmetric Bi-Classifier Discrepancy Minimization for Diabetic Retinopathy Grading. IEEE Trans Med Imaging 2025; 44:1115-1126. [PMID: 39441682] [DOI: 10.1109/tmi.2024.3485064]
Abstract
Diabetic retinopathy (DR) is a leading cause of preventable low vision worldwide. Deep learning has exhibited promising performance in the grading of DR. Certain deep learning strategies have facilitated convenient regular eye check-ups, which are crucial for managing DR and preventing severe visual impairment. However, the generalization performance on cross-center, cross-vendor, and cross-user test datasets is compromised due to domain shift. Furthermore, the presence of small lesions and the imbalanced grade distribution, resulting from the characteristics of DR grading (e.g., the progressive nature of DR disease and the design of grading standards), complicates image-level domain adaptation for DR grading. The general predictions of the models trained on grade-skewed source domains will be significantly biased toward the majority grades, which further increases the adaptation difficulty. We formulate this problem as a grade-skewed domain adaptation challenge. Under the grade-skewed domain adaptation problem, we propose a novel method for image-level supervised DR grading via Asymmetric Bi-Classifier Discrepancy Minimization (ABiD). First, we propose optimizing the feature extractor by minimizing the discrepancy between the predictions of the asymmetric bi-classifier based on two classification criteria to encourage the exploration of crucial features in adjacent grades and stretch the distribution of adjacent grades in the latent space. Moreover, the classifier difference is maximized by using the forward and inverse distribution compensation mechanism to locate easily confused instances, which avoids pseudo-label bias on the target domain. The experimental results on two public DR datasets and one private DR dataset demonstrate that our method outperforms state-of-the-art methods significantly.
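The bi-classifier discrepancy term at the heart of this family of methods reduces, in its generic form, to an L1 distance between two classifiers' predicted distributions. The sketch below is that generic form only; the paper's version is asymmetric and adds forward/inverse distribution compensation, which is not reproduced here.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def classifier_discrepancy(logits1, logits2):
    """Mean L1 distance between two classifiers' predicted distributions.

    In MCD-style adaptation, the feature extractor is trained to
    *minimize* this on target data while the classifiers are trained to
    *maximize* it, so disagreement localizes ambiguous samples
    (e.g. adjacent DR grades).
    """
    return np.abs(softmax(logits1) - softmax(logits2)).mean()
```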
3
Wen C, Ye M, Li H, Chen T, Xiao X. Concept-Based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis. IEEE Trans Med Imaging 2025; 44:57-68. [PMID: 39012729] [DOI: 10.1109/tmi.2024.3429148]
Abstract
Existing deep learning methods have achieved remarkable results in diagnosing retinal diseases, showcasing the potential of advanced AI in ophthalmology. However, the black-box nature of these methods obscures the decision-making process, compromising their trustworthiness and acceptability. Inspired by the concept-based approaches and recognizing the intrinsic correlation between retinal lesions and diseases, we regard retinal lesions as concepts and propose an inherently interpretable framework designed to enhance both the performance and explainability of diagnostic models. Leveraging the transformer architecture, known for its proficiency in capturing long-range dependencies, our model can effectively identify lesion features. By integrating with image-level annotations, it achieves the alignment of lesion concepts with human cognition under the guidance of a retinal foundation model. Furthermore, to attain interpretability without losing lesion-specific information, our method employs a classifier built on a cross-attention mechanism for disease diagnosis and explanation, where explanations are grounded in the contributions of human-understandable lesion concepts and their visual localization. Notably, due to the structure and inherent interpretability of our model, clinicians can implement concept-level interventions to correct the diagnostic errors by simply adjusting erroneous lesion predictions. Experiments conducted on four fundus image datasets demonstrate that our method achieves favorable performance against state-of-the-art methods while providing faithful explanations and enabling concept-level interventions. Our code is publicly available at https://github.com/Sorades/CLAT.
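The cross-attention classifier described here can be sketched at its core: each lesion-concept query attends over patch features, yielding both a concept representation and a spatial attention map that serves as the visual grounding. The single-head, unprojected form and all names below are simplifications, not the paper's code.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def concept_cross_attention(patch_feats, concept_queries):
    """Each lesion-concept query attends over image patch features.

    patch_feats:     (N, D) patch embeddings from the transformer
    concept_queries: (K, D) one learnable query per lesion concept
    Returns per-concept representations (K, D) and attention maps (K, N)
    that ground each concept's contribution in image regions.
    """
    d = patch_feats.shape[1]
    attn = softmax(concept_queries @ patch_feats.T / np.sqrt(d), axis=1)
    return attn @ patch_feats, attn
```

A concept-level intervention then amounts to overwriting a concept's activation before the final diagnosis head.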
4
Anaya-Sánchez H, Altamirano-Robles L, Díaz-Hernández R, Zapotecas-Martínez S. WGAN-GP for Synthetic Retinal Image Generation: Enhancing Sensor-Based Medical Imaging for Classification Models. Sensors (Basel) 2024; 25:167. [PMID: 39796958] [PMCID: PMC11723073] [DOI: 10.3390/s25010167]
Abstract
Accurate synthetic image generation is crucial for addressing data scarcity challenges in medical image classification tasks, particularly in sensor-derived medical imaging. In this work, we propose a novel method using a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) and nearest-neighbor interpolation to generate high-quality synthetic images for diabetic retinopathy classification. Our approach enhances training datasets by generating realistic retinal images that retain critical pathological features. We evaluated the method across multiple retinal image datasets, including Retinal-Lesions, Fine-Grained Annotated Diabetic Retinopathy (FGADR), Indian Diabetic Retinopathy Image Dataset (IDRiD), and the Kaggle Diabetic Retinopathy dataset. The proposed method outperformed traditional generative models, such as conditional GANs and PathoGAN, achieving the best performance on key metrics: a Fréchet Inception Distance (FID) of 15.21, a Mean Squared Error (MSE) of 0.002025, and a Structural Similarity Index (SSIM) of 0.89 in the Kaggle dataset. Additionally, expert evaluations revealed that only 56.66% of synthetic images could be distinguished from real ones, demonstrating the high fidelity and clinical relevance of the generated data. These results highlight the effectiveness of our approach in improving medical image classification by generating realistic and diverse synthetic datasets.
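The gradient penalty that distinguishes WGAN-GP from the original weight-clipped WGAN can be written down compactly. The sketch below uses a toy linear critic whose input gradient is analytic, so no autograd is needed; in a real model the gradient is obtained by backpropagating through the critic network, and all names here are illustrative.

```python
import numpy as np

def gradient_penalty(critic_grad, real, fake, lam=10.0, rng=None):
    """WGAN-GP penalty: push the critic's input-gradient norm toward 1
    at random interpolates between real and fake samples."""
    if rng is None:
        rng = np.random.default_rng(0)
    eps = rng.random((len(real), 1))
    x_hat = eps * real + (1.0 - eps) * fake        # random interpolates
    norms = np.linalg.norm(critic_grad(x_hat), axis=1)
    return lam * ((norms - 1.0) ** 2).mean()

# Toy linear critic f(x) = x @ w: its input gradient is w everywhere.
w = np.array([0.6, 0.8])                           # ||w|| = 1
grad_fn = lambda x: np.broadcast_to(w, x.shape)
real = np.zeros((4, 2)); fake = np.ones((4, 2))
penalty = gradient_penalty(grad_fn, real, fake)    # ~0, since ||w|| = 1
```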
Affiliation(s)
- Héctor Anaya-Sánchez
- Computer Science Department, Instituto Nacional de Astrofísica Óptica y Electrónica, Luis Enrrique Erro No. 1, Sta. María Tonantzintla, Puebla 72840, Mexico; (H.A.-S.); (S.Z.-M.)
- Leopoldo Altamirano-Robles
- Computer Science Department, Instituto Nacional de Astrofísica Óptica y Electrónica, Luis Enrrique Erro No. 1, Sta. María Tonantzintla, Puebla 72840, Mexico; (H.A.-S.); (S.Z.-M.)
- Raquel Díaz-Hernández
- Optics Department, Instituto Nacional de Astrofísica Óptica y Electrónica, Luis Enrrique Erro No. 1, Sta. María Tonantzintla, Puebla 72840, Mexico;
- Saúl Zapotecas-Martínez
- Computer Science Department, Instituto Nacional de Astrofísica Óptica y Electrónica, Luis Enrrique Erro No. 1, Sta. María Tonantzintla, Puebla 72840, Mexico; (H.A.-S.); (S.Z.-M.)
5
Kamal SA, Du Y, Khalid M, Farrash M, Dhelim S. DRSegNet: A cutting-edge approach to Diabetic Retinopathy segmentation and classification using parameter-aware Nature-Inspired optimization. PLoS One 2024; 19:e0312016. [PMID: 39637079] [PMCID: PMC11620556] [DOI: 10.1371/journal.pone.0312016]
Abstract
Diabetic retinopathy (DR) is a prominent cause of blindness globally and a diagnostically challenging disease, owing to the intricate process of its development and the complexity of the human eye, which consists of nearly forty connected components such as the retina, iris, and optic nerve. This study proposes a novel approach to the identification of DR employing synthetic data generation, a K-Means Clustering-Based Binary Grey Wolf Optimizer (KCBGWO), and Fully Convolutional Encoder-Decoder Networks (FCEDN). This is achieved using Generative Adversarial Networks (GANs) to generate high-quality synthetic data and transfer learning for accurate feature extraction and classification, integrating these with Extreme Learning Machines (ELM). Our extensive evaluation on the IDRiD dataset gives exceptional outcomes: the proposed model achieves 99.87% accuracy and 99.33% sensitivity, with a specificity of 99.78%. These results are promising for the further development of the proposed approach to DR diagnosis, establishing a new reference point in medical image analysis and enabling more effective and timely treatment.
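Of the pipeline's components, the Extreme Learning Machine is the simplest to make concrete: a fixed random hidden layer followed by a closed-form least-squares solve for the output weights. The sketch below is a generic ELM on toy data, not the paper's configuration; the hidden size and activation are assumptions.

```python
import numpy as np

def train_elm(X, y_onehot, hidden=64, rng=None):
    """Extreme Learning Machine: sample a fixed random hidden layer,
    then solve the output weights in closed form by least squares."""
    if rng is None:
        rng = np.random.default_rng(0)
    W = rng.standard_normal((X.shape[1], hidden))
    b = rng.standard_normal(hidden)
    H = np.tanh(X @ W + b)                    # fixed hidden activations
    beta, *_ = np.linalg.lstsq(H, y_onehot, rcond=None)
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
```

Because only `beta` is fit, and in closed form, training is fast, which is the usual argument for pairing an ELM head with a deep feature extractor.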
Affiliation(s)
- Sundreen Asad Kamal
- School of Electronics and Information Technology, Xi’an Jiaotong University, Xian, China
- Youtian Du
- School of Electronics and Information Technology, Xi’an Jiaotong University, Xian, China
- Majdi Khalid
- Department of Computer Science and Artificial Intelligence, College of Computing, Umm Al-Qura University, Makkah, Saudi Arabia
- Majed Farrash
- Department of Computer Science and Artificial Intelligence, College of Computing, Umm Al-Qura University, Makkah, Saudi Arabia
6
Singh A, Gorade V, Mishra D. MLVICX: Multi-Level Variance-Covariance Exploration for Chest X-Ray Self-Supervised Representation Learning. IEEE J Biomed Health Inform 2024; 28:7480-7490. [PMID: 39240749] [DOI: 10.1109/jbhi.2024.3455337]
Abstract
Self-supervised learning (SSL) reduces the need for manual annotation in deep learning models for medical image analysis. By learning representations from unlabelled data, self-supervised models perform well on tasks that require little to no fine-tuning. However, for medical images like chest X-rays, characterised by complex anatomical structures and diverse clinical conditions, a need arises for representation learning techniques that encode fine-grained details while preserving the broader contextual information. In this context, we introduce MLVICX (Multi-Level Variance-Covariance Exploration for Chest X-ray Self-Supervised Representation Learning), an approach to capture rich representations in the form of embeddings from chest X-ray images. Central to our approach is a novel multi-level variance and covariance exploration strategy that effectively enables the model to detect diagnostically meaningful patterns while reducing redundancy. MLVICX promotes the retention of critical medical insights by adapting global and local contextual details and enhancing the variance and covariance of the learned embeddings. We demonstrate the performance of MLVICX in advancing self-supervised chest X-ray representation learning through comprehensive experiments. The performance enhancements we observe across various downstream tasks highlight the significance of the proposed approach in enhancing the utility of chest X-ray embeddings for precision medical diagnosis and comprehensive image analysis. For pretraining, we used the NIH-Chest X-ray dataset; downstream tasks utilized the NIH-Chest X-ray, Vinbig-CXR, RSNA pneumonia, and SIIM-ACR Pneumothorax datasets. Overall, we observe up to a 3% performance gain over SOTA SSL approaches on various downstream tasks. Additionally, to demonstrate the generalizability of our method, we conducted additional experiments on fundus images and observed superior performance on multiple datasets. Codes are available at GitHub.
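The variance-covariance exploration named in the title belongs to the family of VICReg-style regularizers. A single-level sketch of the two terms looks like this; the paper applies them at multiple levels with its own adaptation strategy, and the hinge target `gamma` is an assumption.

```python
import numpy as np

def variance_covariance_terms(z, gamma=1.0, eps=1e-4):
    """VICReg-style regularizers on a batch of embeddings z (N, D):
    keep each dimension's std above gamma (prevents representation
    collapse) and decorrelate dimensions (reduces redundancy)."""
    z = z - z.mean(axis=0)
    std = np.sqrt(z.var(axis=0) + eps)
    var_loss = np.maximum(0.0, gamma - std).mean()   # hinge on per-dim std
    cov = (z.T @ z) / (len(z) - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = (off_diag ** 2).sum() / z.shape[1]    # off-diagonal penalty
    return var_loss, cov_loss
```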
7
Kumar PR, Shilpa B, Jha RK, Chellibouina VS. Spatial attention U-Net model with Harris hawks optimization for retinal blood vessel and optic disc segmentation in fundus images. Int Ophthalmol 2024; 44:359. [PMID: 39207645] [DOI: 10.1007/s10792-024-03279-3]
Abstract
BACKGROUND The state of the human eye's blood vessels is a crucial aspect in the diagnosis of ophthalmological illnesses. For many computer-aided diagnostic systems, precise retinal vessel segmentation is an essential task. However, it remains difficult due to the intricate vascular system of the eye. Although many vascular segmentation techniques have already been presented, additional study is still required to address the problem of inadequate segmentation of thin and tiny vessels. METHODS In this work, we introduce the Spatial Attention U-Net (SAU-Net) model with Harris hawks optimization (HHO), a lightweight network that can be applied as a data augmentation technique to improve the efficiency of the existing annotated samples, without the need for thousands of training instances, for retinal blood vessel and optic disc segmentation. The SAU-Net-HHO implementation uses a spatially inferred attention map multiplied by the input feature map for adaptive feature enhancement. U-Net convolutional blocks have been replaced with structured dropout blocks in the proposed network to prevent overfitting. Data from both the DRIVE (Digital Retinal Images for Vessel Extraction) and STARE (Structured Analysis of the Retina) datasets are used to evaluate SAU-Net-HHO performance. RESULTS The results show that the proposed SAU-Net-HHO performs well on both datasets. An average accuracy of 98.5% and specificity of 96.7% were achieved on the DRIVE dataset, and 97.8% accuracy and 94.5% specificity on the STARE dataset. The proposed method yields numerical results with average values on par with those of state-of-the-art methods. CONCLUSION Visual inspection has revealed that the suggested method can segment thin and tiny vessels with greater accuracy than previous methods. It also demonstrates its potential for real-world clinical application.
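The "spatially inferred attention map multiplied by the input feature map" admits a compact CBAM-style sketch. Here the usual learned convolution over the pooled maps is replaced by a fixed weighted sum for brevity, so the weights `w_avg` and `w_max` are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def spatial_attention(fmap, w_avg=0.5, w_max=0.5):
    """Fuse channel-wise average and max maps into a spatial attention
    map (a weighted sum stands in for the learned convolution), squash
    it with a sigmoid, and multiply it back onto the feature map to
    emphasize vessel-like regions."""
    avg = fmap.mean(axis=0)                           # (H, W)
    mx = fmap.max(axis=0)                             # (H, W)
    attn = 1.0 / (1.0 + np.exp(-(w_avg * avg + w_max * mx)))
    return fmap * attn[None, :, :]
```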
Affiliation(s)
- Puranam Revanth Kumar
- Department of Artificial Intelligence and Machine Learning, School of Engineering, Malla Reddy University, Hyderabad, India.
- B Shilpa
- Department of Computer Science and Engineering, AVN Institute of Engineering and Technology, Hyderabad, India
- Rajesh Kumar Jha
- Department of Electronics and Communication Engineering, Faculty of Science and Technology (IcfaiTech), ICFAI Foundation for Higher Education, Hyderabad, India
8
Lepetit-Aimon G, Playout C, Boucher MC, Duval R, Brent MH, Cheriet F. MAPLES-DR: MESSIDOR Anatomical and Pathological Labels for Explainable Screening of Diabetic Retinopathy. Sci Data 2024; 11:914. [PMID: 39179588] [PMCID: PMC11343847] [DOI: 10.1038/s41597-024-03739-6]
Abstract
Reliable automatic diagnosis of Diabetic Retinopathy (DR) and Macular Edema (ME) is an invaluable asset in improving the rate of monitored patients among at-risk populations and in enabling earlier treatments before the pathology progresses and threatens vision. However, the explainability of screening models is still an open question, and specifically designed datasets are required to support the research. We present MAPLES-DR (MESSIDOR Anatomical and Pathological Labels for Explainable Screening of Diabetic Retinopathy), which contains, for 198 images of the MESSIDOR public fundus dataset, new diagnoses for DR and ME as well as new pixel-wise segmentation maps for 10 anatomical and pathological biomarkers related to DR. This paper documents the design choices and the annotation procedure that produced MAPLES-DR, discusses the interobserver variability and the overall quality of the annotations, and provides guidelines on using the dataset in a machine learning context.
Affiliation(s)
- Gabriel Lepetit-Aimon
- Department of Computer and Software Engineering, Polytechnique Montréal, Montréal, QC, Canada.
- Clément Playout
- Department of Ophthalmology, Université de Montréal, Montréal, Canada
- Centre Universitaire d'Ophtalmologie, Hôpital Maisonneuve-Rosemont, Montréal, Canada
- Marie Carole Boucher
- Department of Ophthalmology, Université de Montréal, Montréal, Canada
- Centre Universitaire d'Ophtalmologie, Hôpital Maisonneuve-Rosemont, Montréal, Canada
- Renaud Duval
- Department of Ophthalmology, Université de Montréal, Montréal, Canada
- Centre Universitaire d'Ophtalmologie, Hôpital Maisonneuve-Rosemont, Montréal, Canada
- Michael H Brent
- Department of Ophthalmology and Vision Science, University of Toronto, Toronto, Canada
- Farida Cheriet
- Department of Computer and Software Engineering, Polytechnique Montréal, Montréal, QC, Canada
9
Pavithra S, Jaladi D, Tamilarasi K. Optical imaging for diabetic retinopathy diagnosis and detection using ensemble models. Photodiagnosis Photodyn Ther 2024; 48:104259. [PMID: 38944405] [DOI: 10.1016/j.pdpdt.2024.104259]
Abstract
Diabetes, characterized by heightened blood sugar levels, can lead to Diabetic Retinopathy (DR), which adversely impacts the eyes as elevated blood sugar affects the retinal blood vessels. DR is thought to be the most common cause of blindness in diabetics, particularly among working-age individuals in low-income nations. People with type 1 or type 2 diabetes may develop this illness, and the risk rises with the duration of diabetes and inadequate blood sugar management. Traditional approaches to the early identification of DR have limits. To diagnose DR, this research uses a Convolutional Neural Network (CNN)-based model in a novel way. The suggested model uses a number of deep learning (DL) models, such as VGG19, ResNet50, and InceptionV3, to extract features. After concatenation, these features are sent through the CNN algorithm for classification. By combining the advantages of several models, ensemble approaches can be effective tools for detecting diabetic retinopathy and increase overall performance and resilience. Classification and image recognition are just a few of the tasks that may be accomplished with ensemble approaches such as the combination of VGG19, InceptionV3, and ResNet50 to achieve high accuracy. The proposed model is evaluated using a publicly accessible collection of fundus images. VGG19, ResNet50, and InceptionV3 differ in their neural network architectures, feature extraction capabilities, object detection methods, and approaches to retinal delineation. VGG19 may excel in capturing fine details, ResNet50 in recognizing complex patterns, and InceptionV3 in efficiently capturing multi-scale features. Their combined use in an ensemble approach can provide a comprehensive analysis of retinal images, aiding in the delineation of retinal regions and identification of abnormalities associated with diabetic retinopathy.
For instance, microaneurysms, the earliest signs of DR, often require precise detection of subtle vascular abnormalities. VGG19's proficiency in capturing fine details allows for the identification of these minute changes in retinal morphology. On the other hand, ResNet50's strength lies in recognizing intricate patterns, making it effective in detecting neovascularization and complex haemorrhagic lesions. Meanwhile, InceptionV3's multi-scale feature extraction enables comprehensive analysis, crucial for assessing macular oedema and ischaemic changes across different retinal layers.
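The concatenate-then-classify step can be sketched with stand-in feature extractors; a real pipeline would substitute globally pooled features from pretrained VGG19, ResNet50, and InceptionV3 backbones, and the toy "backbones" below are pure illustration.

```python
import numpy as np

def fuse_features(image, extractors):
    """Concatenate per-backbone feature vectors into one descriptor,
    which a downstream classifier head then grades."""
    return np.concatenate([f(image) for f in extractors])

# Stand-ins for VGG19 / ResNet50 / InceptionV3 pooled features:
backbones = [
    lambda img: img.mean(axis=(0, 1)),    # coarse color statistics
    lambda img: img.max(axis=(0, 1)),     # strongest responses
    lambda img: img.std(axis=(0, 1)),     # texture-like spread
]
img = np.random.default_rng(0).random((8, 8, 3))
desc = fuse_features(img, backbones)      # one fused descriptor
```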
Affiliation(s)
- S Pavithra
- School of Computer Science and Engineering, VIT University, Chennai, Tamil Nadu, India.
- Deepika Jaladi
- School of Computer Science and Engineering, VIT University, Chennai, Tamil Nadu, India.
- K Tamilarasi
- School of Computer Science and Engineering, VIT University, Chennai, Tamil Nadu, India.
10
Zhang X, Zhao J, Li Y, Wu H, Zhou X, Liu J. Efficient pyramid channel attention network for pathological myopia recognition with pretraining-and-finetuning. Artif Intell Med 2024; 154:102926. [PMID: 38964193] [DOI: 10.1016/j.artmed.2024.102926]
Abstract
Pathological myopia (PM) is the leading ocular cause of impaired vision worldwide. Clinically, pathology in PM exhibits a global-local distribution on the fundus image, which plays a significant role in assisting clinicians in diagnosing PM. However, most existing deep neural networks have focused on designing complex architectures and have rarely explored this pathology-distribution prior of PM. To tackle this issue, we propose an efficient pyramid channel attention (EPCA) module, which fully leverages the potential of the clinical pathology prior of PM with pyramid pooling and multi-scale context fusion. We then construct EPCA-Net for automatic PM recognition from fundus images by stacking a sequence of EPCA modules. Moreover, motivated by the recent pretraining-and-finetuning paradigm, we adapt pre-trained natural image models for PM recognition by freezing them and treating the EPCA and other attention modules as adapters. In addition, we construct a PM recognition benchmark termed PM-fundus by collecting fundus images of PM from publicly available datasets. Comprehensive experiments demonstrate the superiority of EPCA-Net over state-of-the-art methods in the PM recognition task. For example, EPCA-Net achieves 97.56% accuracy, outperforming ViT by 2.85% on the PM-fundus dataset. The results also show that our method based on the pretraining-and-finetuning paradigm achieves competitive performance with fewer tunable parameters compared with previous methods based on the traditional fine-tuning paradigm, and thus has the potential to leverage more natural-image foundation models for the PM recognition task in the limited-medical-data regime.
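A rough shape for a pyramid-pooled channel attention module: pool each channel at several pyramid scales, fuse the multi-scale descriptors into per-channel weights, and rescale the channels. The actual EPCA fusion is learned; simple averaging stands in here, and the pyramid scales are illustrative assumptions.

```python
import numpy as np

def pyramid_channel_attention(fmap, scales=(1, 2)):
    """Channel attention from multi-scale pooled descriptors.

    fmap: (C, H, W) with H and W divisible by every scale. A learned
    fusion layer is replaced by averaging the per-scale descriptors.
    """
    C, H, W = fmap.shape
    descs = []
    for s in scales:
        # average-pool each channel down to an s x s grid
        pooled = fmap.reshape(C, s, H // s, s, W // s).mean(axis=(2, 4))
        descs.append(pooled.reshape(C, -1).mean(axis=1))
    fused = np.mean(descs, axis=0)                   # (C,) fused descriptor
    weights = 1.0 / (1.0 + np.exp(-fused))           # sigmoid gating
    return fmap * weights[:, None, None]
```

Treating such a module as an adapter means only its (here, notional) fusion parameters would be trained while the frozen backbone supplies `fmap`.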
Affiliation(s)
- Xiaoqing Zhang
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China; Center for High Performance Computing and Shenzhen Key Laboratory of Intelligent Bioinformatics, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Jilu Zhao
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
- Yan Li
- National Clinical Research Center for Ocular Diseases, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
- Hao Wu
- National Clinical Research Center for Ocular Diseases, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
- Xiangtian Zhou
- National Clinical Research Center for Ocular Diseases, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; State Key Laboratory of Ophthalmology, Optometry and Visual Science, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; Research Unit of Myopia Basic Research and Clinical Prevention and Control, Chinese Academy of Medical Sciences, Wenzhou, 325027, China
- Jiang Liu
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China; National Clinical Research Center for Ocular Diseases, Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China; Singapore Eye Research Institute, 169856, Singapore.
11
Zhang Y, Ma X, Huang K, Li M, Heng PA. Semantic-Oriented Visual Prompt Learning for Diabetic Retinopathy Grading on Fundus Images. IEEE Trans Med Imaging 2024; 43:2960-2969. [PMID: 38564346] [DOI: 10.1109/tmi.2024.3383827]
Abstract
Diabetic retinopathy (DR) is a serious ocular condition that requires effective monitoring and treatment by ophthalmologists. However, constructing a reliable DR grading model remains a challenging and costly task, heavily reliant on high-quality training sets and adequate hardware resources. In this paper, we investigate the knowledge transferability of large-scale pre-trained models (LPMs) to fundus images based on prompt learning to construct a DR grading model efficiently. Unlike full-tuning which fine-tunes all parameters of LPMs, prompt learning only involves a minimal number of additional learnable parameters while achieving a competitive effect as full-tuning. Inspired by visual prompt tuning, we propose Semantic-oriented Visual Prompt Learning (SVPL) to enhance the semantic perception ability for better extracting task-specific knowledge from LPMs, without any additional annotations. Specifically, SVPL assigns a group of learnable prompts for each DR level to fit the complex pathological manifestations and then aligns each prompt group to task-specific semantic space via a contrastive group alignment (CGA) module. We also propose a plug-and-play adapter module, Hierarchical Semantic Delivery (HSD), which allows the semantic transition of prompt groups from shallow to deep layers to facilitate efficient knowledge mining and model convergence. Our extensive experiments on three public DR grading datasets demonstrate that SVPL achieves superior results compared to other transfer tuning and DR grading methods. Further analysis suggests that the generalized knowledge from LPMs is advantageous for constructing the DR grading model on fundus images.
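Mechanically, visual prompt tuning prepends learnable tokens to the frozen backbone's token sequence, and SVPL additionally pulls each DR level's prompt group toward a task-specific semantic space. The sketch below shows only those two mechanics in toy form; the names and the cosine-similarity stand-in for the contrastive group alignment (CGA) module are assumptions, not the paper's formulation.

```python
import numpy as np

def prepend_prompt_group(patch_tokens, prompt_group):
    """Prepend a group of learnable prompt tokens to the frozen
    backbone's patch-token sequence; during tuning, only the prompts
    (and the head) would receive gradients."""
    return np.vstack([prompt_group, patch_tokens])

def group_alignment_score(prompt_group, class_anchor):
    """Toy stand-in for group alignment: mean cosine similarity between
    one DR level's prompts and that level's semantic anchor vector."""
    p = prompt_group / np.linalg.norm(prompt_group, axis=1, keepdims=True)
    a = class_anchor / np.linalg.norm(class_anchor)
    return float((p @ a).mean())
```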
12
Chen T, Bai Y, Mao H, Liu S, Xu K, Xiong Z, Ma S, Yang F, Zhao Y. Cross-modality transfer learning with knowledge infusion for diabetic retinopathy grading. Front Med (Lausanne) 2024; 11:1400137. [PMID: 38808141] [PMCID: PMC11130363] [DOI: 10.3389/fmed.2024.1400137]
Abstract
Background Ultra-wide-field (UWF) fundus photography represents an emerging retinal imaging technique offering a broader field of view, thus enhancing its utility in screening and diagnosing various eye diseases, notably diabetic retinopathy (DR). However, the application of computer-aided diagnosis for DR using UWF images confronts two major challenges. The first challenge arises from the limited availability of labeled UWF data, making it daunting to train diagnostic models due to the high cost associated with manual annotation of medical images. Secondly, existing models' performance requires enhancement due to the absence of prior knowledge to guide the learning process. Purpose By leveraging extensively annotated datasets within the field, which encompass large-scale, high-quality color fundus image datasets annotated at either image-level or pixel-level, our objective is to transfer knowledge from these datasets to our target domain through unsupervised domain adaptation. Methods Our approach presents a robust model for assessing the severity of diabetic retinopathy (DR) by leveraging unsupervised lesion-aware domain adaptation in ultra-wide-field (UWF) images. Furthermore, to harness the wealth of detailed annotations in publicly available color fundus image datasets, we integrate an adversarial lesion map generator. This generator supplements the grading model by incorporating auxiliary lesion information, drawing inspiration from the clinical methodology of evaluating DR severity by identifying and quantifying associated lesions. Results We conducted both quantitative and qualitative evaluations of our proposed method. In particular, among the six representative DR grading methods, our approach achieved an accuracy (ACC) of 68.18% and a precision (pre) of 67.43%. Additionally, we conducted extensive experiments in ablation studies to validate the effectiveness of each component of our proposed method. 
Conclusion In conclusion, our method not only improves the accuracy of DR grading, but also enhances the interpretability of the results, providing clinicians with a reliable DR grading scheme.
Affiliation(s)
- Tao Chen
  - Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
  - Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Yanmiao Bai
  - Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Haiting Mao
  - Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
  - Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Shouyue Liu
  - Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
  - Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Keyi Xu
  - Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
  - Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Zhouwei Xiong
  - Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
  - Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Shaodong Ma
  - Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Fang Yang
  - Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
  - Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Yitian Zhao
  - Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
  - Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
13
Du F, Zhao L, Luo H, Xing Q, Wu J, Zhu Y, Xu W, He W, Wu J. Recognition of eye diseases based on deep neural networks for transfer learning and improved D-S evidence theory. BMC Med Imaging 2024; 24:19. [PMID: 38238662 PMCID: PMC10797809 DOI: 10.1186/s12880-023-01176-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Accepted: 12/06/2023] [Indexed: 01/22/2024] Open
Abstract
BACKGROUND Human vision has inspired significant advancements in computer vision, yet the human eye is prone to various silent eye diseases. With the advent of deep learning, computer vision for detecting human eye diseases has gained prominence, but most studies have focused on only a limited number of eye diseases. METHODS In deep learning-based eye disease recognition, to improve the learning efficiency of the model, we train and fine-tune the network by transfer learning. To eliminate the decision bias of the models and improve the credibility of their decisions, we propose a model decision fusion method based on Dempster-Shafer (D-S) theory. However, because classical D-S theory is incomplete and can yield conflicting results, we resolve its existing paradoxes, propose the improved D-S evidence theory (ID-SET), and apply it to the decision fusion of eye disease recognition models. RESULTS Our model demonstrated a reduction in inherent bias and enhanced robustness. The fused network achieved an Accuracy of 0.9237, Kappa of 0.878, F1 Score of 0.914 (95% CI [0.875-0.954]), Precision of 0.945 (95% CI [0.928-0.963]), Recall of 0.89 (95% CI [0.821-0.958]), and an ROC AUC of 0.987. These metrics are notably higher than those of comparable studies. CONCLUSIONS Our deep neural network-based model exhibited improvements in eye disease recognition metrics over models from peer research, highlighting its potential application in this field.
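For orientation, the classical Dempster combination rule that ID-SET builds on (this sketch is the textbook rule, not the authors' improved variant) fuses two basic probability assignments by intersecting their focal sets and renormalizing away the conflict mass:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions given as {frozenset(hypotheses): mass}.

    Mass landing on the empty set (conflicting evidence) is discarded and
    the remainder renormalized by 1 - K, per Dempster's rule.
    """
    combined, conflict = {}, 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {a: m / (1.0 - conflict) for a, m in combined.items()}
```

Here the hypotheses would be disease classes and each mass function one recognition model's output; the high-conflict case where the rule misbehaves is exactly the paradox regime that motivates improved variants like ID-SET.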
Affiliation(s)
- Fanyu Du
  - School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
  - Faculty of Data Science, City University of Macau, Macau, 999078, China
  - Guangdong Provincial Key Laboratory of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000, China
- Lishuai Zhao
  - School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
- Hui Luo
  - Faculty of Data Science, City University of Macau, Macau, 999078, China
  - School of Information and Management, Guangxi Medical University, Nanning, 530021, China
- Qijia Xing
  - Affiliated Hospital of North Sichuan Medical College, Nanchong, 637000, China
- Jun Wu
  - School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
- Yuanzhong Zhu
  - School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
- Wansong Xu
  - School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
- Wenjing He
  - School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
- Jianfang Wu
  - Faculty of Data Science, City University of Macau, Macau, 999078, China
14
Shi D, Zhang W, He S, Chen Y, Song F, Liu S, Wang R, Zheng Y, He M. Translation of Color Fundus Photography into Fluorescein Angiography Using Deep Learning for Enhanced Diabetic Retinopathy Screening. OPHTHALMOLOGY SCIENCE 2023; 3:100401. [PMID: 38025160 PMCID: PMC10630672 DOI: 10.1016/j.xops.2023.100401] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/09/2023] [Revised: 08/23/2023] [Accepted: 09/08/2023] [Indexed: 12/01/2023]
Abstract
Purpose To develop and validate a deep learning model that can transform color fundus (CF) photography into corresponding venous and late-phase fundus fluorescein angiography (FFA) images. Design Cross-sectional study. Participants We included 51 370 CF-venous FFA pairs and 14 644 CF-late FFA pairs from 4438 patients for model development. External testing involved 50 eyes with CF-FFA pairs and 2 public datasets for diabetic retinopathy (DR) classification, with 86 952 CF from EyePACs and 1744 CF from MESSIDOR2. Methods We trained a deep learning model to transform CF into corresponding venous and late-phase FFA images. The quality of the translated FFA images was evaluated quantitatively on the internal test set and subjectively on 100 eyes with CF-FFA paired images (50 from the external set), based on the realism of the global image, anatomical landmarks (macula, optic disc, and vessels), and lesions. Moreover, we validated the clinical utility of the translated FFA for classifying 5-class DR and diabetic macular edema (DME) in the EyePACs and MESSIDOR2 datasets. Main Outcome Measures Image generation was quantitatively assessed by structural similarity measures (SSIM) and subjectively by 2 clinical experts on a 5-point scale (a score of 1 corresponds to a real FFA); intragrader agreement was assessed by kappa. DR classification accuracy was assessed by the area under the receiver operating characteristic curve. Results The SSIM of the translated FFA images was > 0.6, and the subjective quality scores ranged from 1.37 to 2.60. Both experts reported similar quality scores with substantial agreement (all kappas > 0.8). Adding the generated FFA on top of CF improved DR classification in the EyePACs and MESSIDOR2 datasets, with the area under the receiver operating characteristic curve increasing from 0.912 to 0.939 on EyePACs and from 0.952 to 0.972 on MESSIDOR2.
The DME area under the receiver operating characteristic curve also increased from 0.927 to 0.974 in the MESSIDOR2 dataset. Conclusions Our CF-to-FFA framework produced realistic FFA images. Moreover, adding the translated FFA images on top of CF improved the accuracy of DR screening. These results suggest that CF-to-FFA translation could be used as a surrogate method when FFA examination is not feasible and as a simple add-on to improve DR screening. Financial Disclosures Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
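The SSIM measure used for the quantitative assessment has a closed form. As a point of reference only, a minimal global (single-window) numpy version, rather than the sliding-window implementation used in practice, looks like this:

```python
import numpy as np

def ssim_global(x, y, data_range=255.0):
    """Global structural similarity between two images.

    One window over the whole image; production code (e.g. the usual
    scikit-image routine) averages the same statistic over local windows.
    """
    c1 = (0.01 * data_range) ** 2   # standard stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

SSIM is 1 for identical images and falls toward 0 as luminance, contrast, or structure diverge, which is why a threshold such as "> 0.6" serves as a coarse realism criterion for generated FFA frames.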
Affiliation(s)
- Danli Shi
  - School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
  - Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Weiyi Zhang
  - School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Shuang He
  - State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Guangdong Provincial Clinical Research Center for Ocular Diseases, Sun Yat-sen University, Guangzhou, China
- Yanxian Chen
  - School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Fan Song
  - School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Shunming Liu
  - Department of Ophthalmology, Guangdong Academy of Medical Sciences, Guangdong Provincial People's Hospital, Guangzhou, China
- Ruobing Wang
  - Department of Ophthalmology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Yingfeng Zheng
  - State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Guangdong Provincial Clinical Research Center for Ocular Diseases, Sun Yat-sen University, Guangzhou, China
- Mingguang He
  - School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
  - Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Kowloon, Hong Kong
  - Department of Ophthalmology, Guangdong Academy of Medical Sciences, Guangdong Provincial People's Hospital, Guangzhou, China
15
Ma F, Wang S, Dai C, Qi F, Meng J. A new retinal OCT-angiography diabetic retinopathy dataset for segmentation and DR grading. JOURNAL OF BIOPHOTONICS 2023; 16:e202300052. [PMID: 37421596 DOI: 10.1002/jbio.202300052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/17/2023] [Revised: 06/29/2023] [Accepted: 06/30/2023] [Indexed: 07/10/2023]
Abstract
PURPOSE Diabetic retinopathy (DR) is one of the most common diseases caused by diabetes and can lead to vision loss or even blindness. Wide-field optical coherence tomography (OCT) angiography is a non-invasive imaging technology that is convenient for diagnosing DR. METHODS A newly constructed Retinal OCT-Angiography Diabetic retinopathy (ROAD) dataset is utilized for segmentation and grading tasks. It contains 1200 normal images, 1440 DR images, and 1440 ground truths for DR image segmentation. To handle the problem of grading DR, we propose a novel and effective framework, named projective map attention-based convolutional neural network (PACNet). RESULTS The experimental results demonstrate the effectiveness of our PACNet. The accuracy of the proposed framework for grading DR is 87.5% on the ROAD dataset. CONCLUSIONS Information on ROAD can be viewed at https://mip2019.github.io/ROAD. The ROAD dataset will be helpful for the development of early DR detection and future research. TRANSLATIONAL RELEVANCE The novel framework for grading DR is a valuable research and clinical diagnosis method.
Affiliation(s)
- Fei Ma
  - Qufu Normal University, Rizhao, Shandong, China
- Cuixia Dai
  - College of Science, Shanghai Institute of Technology, Shanghai, China
- Fumin Qi
  - National Supercomputing Center in Shenzhen, Shenzhen, Guangdong, China
- Jing Meng
  - Qufu Normal University, Rizhao, Shandong, China
16
Dao QT, Trinh HQ, Nguyen VA. An effective and comprehensible method to detect and evaluate retinal damage due to diabetes complications. PeerJ Comput Sci 2023; 9:e1585. [PMID: 37810367 PMCID: PMC10557496 DOI: 10.7717/peerj-cs.1585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2023] [Accepted: 08/20/2023] [Indexed: 10/10/2023]
Abstract
The leading cause of vision loss globally is diabetic retinopathy. Researchers are making great efforts to automatically detect and correctly diagnose diabetic retinopathy. Diabetic retinopathy includes five stages: no diabetic retinopathy, mild, moderate, severe, and proliferative diabetic retinopathy. Recent studies have offered several multi-tasking deep learning models to detect and assess the level of diabetic retinopathy. However, these models' explanations of disease severity are limited, stopping at showing lesions in images; they do not explain on what basis the appraisal of disease severity is made. In this article, we present a system for assessing and interpreting the five stages of diabetic retinopathy. The proposed system is built from internal models: a deep learning model that detects lesions and an explanatory model that assesses the disease stage. The lesion-detection model uses the Mask R-CNN deep learning network to specify the location and shape of each lesion and classify lesion types. This model combines two networks: one used to detect hemorrhagic and exudative lesions, and one used to detect vascular lesions such as aneurysms and proliferation. The explanatory model appraises disease severity based on the severity of each lesion type and the associations between types; the severity decision is based on the number, density, and area of the lesions. Experimental results on real-world datasets show that our proposed method achieves accuracy in assessing the five stages of diabetic retinopathy comparable to existing state-of-the-art methods and is capable of explaining the causes of disease severity.
Affiliation(s)
- Quang Toan Dao
  - Institute of Information Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Hoang Quan Trinh
  - Vietnam Space Center, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Viet Anh Nguyen
  - Institute of Information Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
17
Li R, Gu Y, Wang X, Pan J. A Cross-Domain Weakly Supervised Diabetic Retinopathy Lesion Identification Method Based on Multiple Instance Learning and Domain Adaptation. Bioengineering (Basel) 2023; 10:1100. [PMID: 37760202 PMCID: PMC10525098 DOI: 10.3390/bioengineering10091100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2023] [Revised: 09/11/2023] [Accepted: 09/12/2023] [Indexed: 09/29/2023] Open
Abstract
Accurate identification of lesions and their use across different medical institutions are the foundation and key to the clinical application of automatic diabetic retinopathy (DR) detection. Existing detection or segmentation methods can achieve acceptable results in DR lesion identification, but they rely heavily on large numbers of fine-grained annotations that are not easily accessible and suffer severe performance degradation in cross-domain application. In this paper, we propose a cross-domain weakly supervised DR lesion identification method using only easily accessible coarse-grained lesion attribute labels. We first propose the novel lesion-patch multiple instance learning method (LpMIL), which leverages the lesion attribute label for patch-level supervision to complete weakly supervised lesion identification. Then, we design a semantic constraint adaptation method (LpSCA) that improves the lesion identification performance of our model in different domains with a semantic constraint loss. Finally, we perform secondary annotation on the open-source dataset EyePACS to obtain the largest fine-grained annotated dataset, EyePACS-pixel, and validate the performance of our model on it. Extensive experimental results on the public dataset FGADR and our EyePACS-pixel demonstrate that, compared with existing detection and segmentation methods, the proposed method identifies lesions accurately and comprehensively, obtaining competitive results using only coarse-grained annotations.
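The multiple-instance view can be made concrete: an image is a bag of patches, and a coarse image-level lesion-attribute label supervises an aggregate of per-patch scores instead of pixel masks. A generic max-pooling MIL step (an illustration of the paradigm, not the authors' LpMIL) in numpy:

```python
import numpy as np

def mil_bag_probability(patch_logits):
    """Bag-level probability from per-patch logits via max pooling:
    the image is positive if its most suspicious patch is positive."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    return sigmoid(np.max(patch_logits))

def mil_bag_loss(patch_logits, bag_label):
    """Binary cross-entropy between the pooled probability and the
    image-level (coarse-grained) lesion-attribute label."""
    p = mil_bag_probability(patch_logits)
    eps = 1e-12
    return -(bag_label * np.log(p + eps) + (1 - bag_label) * np.log(1 - p + eps))
```

Backpropagating such a bag loss pushes the gradient onto the highest-scoring patches, which is how image-level labels end up providing patch-level supervision.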
Affiliation(s)
- Renyu Li
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China
- Yunchao Gu
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China
  - Hangzhou Innovation Institute, Beihang University, Hangzhou 310051, China
  - Research Unit of Virtual Body and Virtual Surgery Technologies, Chinese Academy of Medical Sciences, 2019RU004, Beijing 100191, China
- Xinliang Wang
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China
- Junjun Pan
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China
18
Batool S, Gilani SO, Waris A, Iqbal KF, Khan NB, Khan MI, Eldin SM, Awwad FA. Deploying efficient net batch normalizations (BNs) for grading diabetic retinopathy severity levels from fundus images. Sci Rep 2023; 13:14462. [PMID: 37660096 PMCID: PMC10475020 DOI: 10.1038/s41598-023-41797-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2023] [Accepted: 08/31/2023] [Indexed: 09/04/2023] Open
Abstract
Diabetic retinopathy (DR) is one of the main causes of blindness in people around the world. Early diagnosis and treatment of DR can be accomplished by organizing large regular screening programs. Still, it is difficult to spot diabetic retinopathy in time because the disease may show no signs in its primary stages. Due to a drastic increase in diabetic patients, there is an urgent need for efficient DR detection systems. Past deep learning (DL) approaches to DR classification used auto-encoders, sparse coding, and restricted Boltzmann machines; convolutional neural networks (CNNs) have since been identified as a promising solution for detecting and classifying DR. We employ the deep learning capabilities of EfficientNet batch-normalization (BN) pre-trained models to automatically acquire discriminative features from fundus images. In this paper, we improved the accuracy and F1 score of the EfficientNet BN pre-trained models on the EYE-PACS dataset by applying a Gaussian smoothing filter and data augmentation transforms. Using the proposed technique, we achieved F1 scores above 80% on all EfficientNet BN variants, exceeding previous studies: 84% on EYE-PACS and 87% on the DeepDRiD dataset.
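Gaussian smoothing of fundus images is a standard preprocessing step; as a rough illustration of what such a filter does (the kernel size and sigma below are invented for the example, not the paper's settings), here is a minimal separable version in numpy:

```python
import numpy as np

def gaussian_blur(img, sigma=2.0):
    """Separable Gaussian smoothing of a 2-D grayscale image."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()                        # preserve overall brightness
    padded = np.pad(img, radius, mode="reflect")  # avoid darkened borders
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")
```

In practice one would reach for `scipy.ndimage.gaussian_filter` (or the equivalent OpenCV call); the point of the sketch is that the filter suppresses high-frequency noise while leaving coarse retinal structure for the CNN.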
Affiliation(s)
- Summiya Batool
  - National University of Sciences and Technology, Islamabad, 44000, Pakistan
- Syed Omer Gilani
  - National University of Sciences and Technology, Islamabad, 44000, Pakistan
- Asim Waris
  - National University of Sciences and Technology, Islamabad, 44000, Pakistan
- Niaz B Khan
  - National University of Sciences and Technology, Islamabad, 44000, Pakistan
  - Mechanical Engineering Department, College of Engineering, University of Bahrain, Isa Town, 32038, Bahrain
- M Ijaz Khan
  - Department of Mechanical Engineering, Lebanese American University, Kraytem, Beirut, 1102-2801, Lebanon
  - Department of Mathematics and Statistics, Riphah International University I-14, Islamabad, 44000, Pakistan
  - Department of Mechanics and Engineering Science, Peking University, Beijing, 100871, China
- Sayed M Eldin
  - Faculty of Engineering, Center of Research, Future University in Egypt, New Cairo, 11835, Egypt
- Fuad A Awwad
  - Department of Quantitative Analysis, College of Business Administration, King Saud University, P.O. Box 71115, 11587, Riyadh, Saudi Arabia
19
Tian M, Wang H, Sun Y, Wu S, Tang Q, Zhang M. Fine-grained attention & knowledge-based collaborative network for diabetic retinopathy grading. Heliyon 2023; 9:e17217. [PMID: 37449186 PMCID: PMC10336422 DOI: 10.1016/j.heliyon.2023.e17217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2023] [Revised: 06/09/2023] [Accepted: 06/10/2023] [Indexed: 07/18/2023] Open
Abstract
Accurate diabetic retinopathy (DR) grading is crucial for making a proper treatment plan to reduce the damage caused by vision loss. The task is challenging because DR-related lesions are often small, subtle in their visual differences, and subject to intra-class variation; moreover, the relationships between lesions and DR levels are complicated. Although many deep learning (DL) DR grading systems have been developed with some success, there is still room for improvement in grading accuracy. A common issue is that little medical knowledge is used in these DL DR grading systems; as a result, the grading results are not readily interpretable by ophthalmologists, which hinders their potential for practical application. This paper proposes a novel fine-grained attention & knowledge-based collaborative network (FA+KC-Net) to address this concern. The fine-grained attention network dynamically divides the extracted feature maps into smaller patches and effectively captures small image features that are meaningful in the sense of its training on a large number of retinal fundus images. The knowledge-based collaborative network extracts a-priori medical knowledge features, i.e., lesions such as microaneurysms (MAs), soft exudates (SEs), hard exudates (EXs), and hemorrhages (HEs). Finally, decision rules are developed to fuse the DR grading results from the fine-grained network and the knowledge-based collaborative network into the final grade. Extensive experiments are carried out on four widely used datasets, DDR, Messidor, APTOS, and EyePACS, to evaluate the efficacy of our method and compare it with other state-of-the-art (SOTA) DL models. Results show that the proposed FA+KC-Net is accurate and stable, achieving the best performance on the DDR, Messidor, and APTOS datasets.
Affiliation(s)
- Miao Tian
  - School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Hongqiu Wang
  - School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Yingxue Sun
  - School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Shaozhi Wu
  - School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Qingqing Tang
  - Department of Ophthalmology, West China Hospital, Sichuan University, Chengdu, 610041, China
- Meixia Zhang
  - Department of Ophthalmology, West China Hospital, Sichuan University, Chengdu, 610041, China
20
Liu R, Wang T, Li H, Zhang P, Li J, Yang X, Shen D, Sheng B. TMM-Nets: Transferred Multi- to Mono-Modal Generation for Lupus Retinopathy Diagnosis. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:1083-1094. [PMID: 36409801 DOI: 10.1109/tmi.2022.3223683] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Rare diseases, which are severely underrepresented in basic and clinical research, can particularly benefit from machine learning techniques. However, current learning-based approaches usually focus on either mono-modal image data or matched multi-modal data, whereas the diagnosis of rare diseases necessitates the aggregation of unstructured and unmatched multi-modal image data due to their rare and diverse nature. In this study, we therefore propose diagnosis-guided multi-to-mono modal generation networks (TMM-Nets) along with training and testing procedures. TMM-Nets can transfer data from multiple sources to a single modality for diagnostic data structurization. To demonstrate their potential in the context of rare diseases, TMM-Nets were deployed to diagnose lupus retinopathy (LR-SLE), leveraging unmatched regular and ultra-wide-field fundus images for transfer learning. The TMM-Nets encoded the transfer learning from diabetic retinopathy to LR-SLE based on the similarity of the fundus lesions. In addition, a lesion-aware multi-scale attention mechanism was developed for clinical alerts, enabling TMM-Nets not only to inform patient care but also to provide insights consistent with those of clinicians. An adversarial strategy was also developed to refine multi- to mono-modal image generation based on diagnostic results and the data distribution to enhance the data augmentation performance. Compared to the baseline model, the TMM-Nets showed 35.19% and 33.56% F1 score improvements on the test and external validation sets, respectively. In addition, the TMM-Nets can be used to develop diagnostic models for other rare diseases.
21
Hou Q, Cao P, Jia L, Chen L, Yang J, Zaiane OR. Image Quality Assessment Guided Collaborative Learning of Image Enhancement and Classification for Diabetic Retinopathy Grading. IEEE J Biomed Health Inform 2023; 27:1455-1466. [PMID: 37015399 DOI: 10.1109/jbhi.2022.3231276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
Abstract
Diabetic retinopathy (DR) is one of the most serious complications of diabetes and is a prominent cause of permanent blindness. However, low-quality fundus images increase the uncertainty of clinical diagnosis, resulting in a significant decrease in grading performance. Therefore, enhancing image quality is essential for predicting the grade level in DR diagnosis. In essence, we are faced with three challenges: (I) how to appropriately evaluate the quality of fundus images; (II) how to effectively enhance low-quality fundus images so as to provide reliable fundus images to ophthalmologists or automated analysis systems; and (III) how to jointly train the quality assessment and enhancement to improve DR grading performance. Considering the importance of image quality assessment and enhancement for DR grading, we propose a collaborative learning framework that jointly trains the subnetworks for image quality assessment, enhancement, and DR disease grading in a unified framework. The key contribution of the proposed framework lies in modelling the potential correlation of these tasks and jointly training the subnetworks, which significantly improves both fundus image quality and DR grading performance. Our framework is a general learning model that may be useful for other medical imaging tasks with low-quality data. Extensive experimental results show that our method outperforms state-of-the-art DR grading methods, achieving a considerable 73.6% ACC/71.2% Kappa on the Messidor benchmark and 88.5% ACC/86.3% Kappa on the EyeQ benchmark. In addition, our method significantly enhances low-quality fundus images while preserving fundus structure features and lesion information. To make the framework more general, we also evaluate the enhancement results in further downstream tasks, such as vessel segmentation.
22
Detecting red-lesions from retinal fundus images using unique morphological features. Sci Rep 2023; 13:3487. [PMID: 36859429 PMCID: PMC9977778 DOI: 10.1038/s41598-023-30459-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2022] [Accepted: 02/23/2023] [Indexed: 03/03/2023] Open
Abstract
One of the most important retinal diseases is diabetic retinopathy (DR), which can lead to serious damage to vision if it remains untreated. Red lesions are among the important manifestations of DR, helping identify it in its early stages, and their detection and verification aid in evaluating disease severity and progression. In this paper, a novel image processing method is proposed for extracting red lesions from fundus images. The method works by finding and extracting the unique morphological features of red lesions. After image quality improvement, a pixel-based verification is performed to find pixels that exhibit a significant intensity change across a curve-like neighborhood. To do so, a curve is considered around each pixel and the intensity changes across the curve boundary are examined; pixels for which such curves can be found in at least two directions are considered parts of red lesions. The simplicity of its computations, the high accuracy of its results, and the absence of post-processing operations are important characteristics endorsing the proposed method's good performance.
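The per-pixel test described here, a significant intensity change across a curve-like neighborhood, can be caricatured as comparing a pixel against samples on a surrounding circle. A deliberately simplified sketch (the radius, sample count, and threshold below are invented for illustration, and the real method checks multiple curve directions rather than one ring):

```python
import numpy as np

def red_lesion_candidate(img, y, x, radius=3, n_samples=16, thresh=30.0):
    """Flag (y, x) if it is markedly darker than a circle of pixels around
    it, a crude proxy for the curve-boundary intensity-change criterion."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    ys = np.clip(np.rint(y + radius * np.sin(angles)).astype(int),
                 0, img.shape[0] - 1)
    xs = np.clip(np.rint(x + radius * np.cos(angles)).astype(int),
                 0, img.shape[1] - 1)
    ring = img[ys, xs].astype(float)           # intensities on the curve
    return ring.mean() - float(img[y, x]) > thresh
```

Because red lesions appear as small dark blobs against the brighter retinal background, a pixel whose surrounding ring is consistently brighter is a plausible lesion candidate, while vessel pixels fail the test in the along-vessel direction.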
23
Zhao Y, Wang X, Che T, Bao G, Li S. Multi-task deep learning for medical image computing and analysis: A review. Comput Biol Med 2023; 153:106496. [PMID: 36634599 DOI: 10.1016/j.compbiomed.2022.106496] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2022] [Revised: 12/06/2022] [Accepted: 12/27/2022] [Indexed: 12/29/2022]
Abstract
The renaissance of deep learning has provided promising solutions to various tasks. While conventional deep learning models are constructed for a single specific task, multi-task deep learning (MTDL), which is capable of simultaneously accomplishing at least two tasks, has attracted research attention. MTDL is a joint learning paradigm that harnesses the inherent correlation of multiple related tasks to achieve reciprocal benefits in improving performance, enhancing generalizability, and reducing overall computational cost. This review focuses on advanced applications of MTDL for medical image computing and analysis. We first summarize four popular MTDL network architectures (i.e., cascaded, parallel, interacted, and hybrid). Then, we review representative MTDL-based networks for eight application areas, including the brain, eye, chest, cardiac, abdomen, musculoskeletal, pathology, and other human body regions. While MTDL-based medical image processing has flourished and demonstrated outstanding performance in many tasks, performance gaps remain in others, from which we identify open challenges and perspective trends. For instance, in the 2018 Ischemic Stroke Lesion Segmentation challenge, the reported top Dice score of 0.51 and top recall of 0.55 achieved by the cascaded MTDL model indicate that further research efforts are in high demand to escalate the performance of current models.
Affiliation(s)
- Yan Zhao
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Xiuying Wang
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia.
- Tongtong Che
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Guoqing Bao
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia
- Shuyu Li
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China.
24
Atasever S, Azginoglu N, Terzi DS, Terzi R. A comprehensive survey of deep learning research on medical image analysis with focus on transfer learning. Clin Imaging 2023; 94:18-41. [PMID: 36462229 DOI: 10.1016/j.clinimag.2022.11.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2022] [Revised: 10/17/2022] [Accepted: 11/01/2022] [Indexed: 11/13/2022]
Abstract
This survey aims to identify commonly used methods, datasets, future trends, knowledge gaps, constraints, and limitations in the field to provide an overview of current solutions used in medical image analysis in parallel with the rapid developments in transfer learning (TL). Unlike previous studies, this survey groups the studies of the last five years (January 2017 to February 2021) according to different anatomical regions and details the modality, medical task, TL method, source data, target data, and public or private datasets used in medical imaging. It also provides readers with detailed information on technical challenges, opportunities, and future research trends. In this way, an overview of recent developments is provided to help researchers select the most effective and efficient methods, and to access widely used and publicly available medical datasets, research gaps, and limitations of the available literature.
Affiliation(s)
- Sema Atasever
- Computer Engineering Department, Nevsehir Hacı Bektas Veli University, Nevsehir, Turkey.
- Nuh Azginoglu
- Computer Engineering Department, Kayseri University, Kayseri, Turkey.
- Ramazan Terzi
- Computer Engineering Department, Amasya University, Amasya, Turkey.
25
Hou B. High-fidelity diabetic retina fundus image synthesis from freestyle lesion maps. BIOMEDICAL OPTICS EXPRESS 2023; 14:533-549. [PMID: 36874499 PMCID: PMC9979677 DOI: 10.1364/boe.477906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/27/2022] [Revised: 12/08/2022] [Accepted: 12/15/2022] [Indexed: 06/18/2023]
Abstract
Retina fundus imaging for diagnosing diabetic retinopathy (DR) is an efficient and patient-friendly modality, as many high-resolution images can be easily obtained for accurate diagnosis. With the advancements of deep learning, data-driven models may facilitate high-throughput diagnosis, especially in areas with less availability of certified human experts. Many DR datasets already exist for training learning-based models. However, most are unbalanced, lack a sufficient sample count, or both. This paper proposes a two-stage pipeline for generating photo-realistic retinal fundus images based on either artificially generated or free-hand drawn semantic lesion maps. The first stage uses a conditional StyleGAN to generate synthetic lesion maps based on a DR severity grade. The second stage then uses GauGAN to convert the synthetic lesion maps into high-resolution fundus images. We evaluate the photo-realism of generated images using the Fréchet inception distance (FID), and show the efficacy of our pipeline through downstream tasks such as dataset augmentation for automatic DR grading and lesion segmentation.
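The FID mentioned above is the Fréchet distance between Gaussian fits of real and generated feature statistics: FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2}). The sketch below is a simplified illustration assuming *diagonal* covariances; real FID uses full Inception-feature covariance matrices and a matrix square root.

```python
import numpy as np

def fid_diagonal(mu1, var1, mu2, var2):
    """Fréchet distance between two Gaussians with diagonal covariances.
    For diagonal S1, S2 the trace term reduces to an elementwise form:
    sum(var1 + var2 - 2*sqrt(var1*var2))."""
    mu1, var1 = np.asarray(mu1, float), np.asarray(var1, float)
    mu2, var2 = np.asarray(mu2, float), np.asarray(var2, float)
    mean_term = np.sum((mu1 - mu2) ** 2)
    cov_term = np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))
    return mean_term + cov_term
```

Identical statistics give a distance of zero; larger values indicate the generated distribution drifts from the real one.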
26
Jin K, Yan Y, Wang S, Yang C, Chen M, Liu X, Terasaki H, Yeo TH, Singh NG, Wang Y, Ye J. iERM: An Interpretable Deep Learning System to Classify Epiretinal Membrane for Different Optical Coherence Tomography Devices: A Multi-Center Analysis. J Clin Med 2023; 12:400. [PMID: 36675327 PMCID: PMC9862104 DOI: 10.3390/jcm12020400] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Revised: 12/29/2022] [Accepted: 01/03/2023] [Indexed: 01/06/2023] Open
Abstract
Background: Epiretinal membranes (ERM) are common among individuals >50 years old. However, severity grading of ERM from optical coherence tomography (OCT) images has remained a challenge due to the lack of reliable and interpretable analysis methods. This study therefore aimed to develop a two-stage deep learning (DL) system named iERM to provide accurate automatic grading of ERM for clinical practice. Methods: iERM was trained on human segmentations of key features to improve classification performance and simultaneously provide interpretability for the classification results. We developed and tested iERM using a total of 4547 OCT B-scans from four different commercial OCT devices, collected from nine international medical centers. Results: The integrated network effectively improved grading performance by 1-5.9% compared with a traditional classification DL model and achieved high accuracy scores of 82.9%, 87.0%, and 79.4% on the internal test dataset and the two external test datasets, respectively. This is comparable to retinal specialists, whose average accuracy scores are 87.8% and 79.4% on the two external test datasets. Conclusion: This study provides a benchmark method for improving the performance and enhancing the interpretability of a traditional DL model through segmentation based on prior human knowledge. It may have the potential to provide precise guidance for ERM diagnosis and treatment.
Affiliation(s)
- Kai Jin
- Department of Ophthalmology, The Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou 310009, China
- Yan Yan
- Department of Ophthalmology, The Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou 310009, China
- Shuai Wang
- School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China
- Ce Yang
- School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China
- Menglu Chen
- Department of Ophthalmology, The Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou 310009, China
- Xindi Liu
- Department of Ophthalmology, The Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou 310009, China
- Hiroto Terasaki
- Department of Ophthalmology, Kagoshima University Graduate School of Medical and Dental Sciences, Kagoshima 890-8520, Japan
- Tun-Hang Yeo
- Ophthalmology and Visual Sciences, Khoo Teck Puat Hospital, National Healthcare Group, Singapore 768828, Singapore
- Neha Gulab Singh
- Ophthalmology and Visual Sciences, Khoo Teck Puat Hospital, National Healthcare Group, Singapore 768828, Singapore
- Yao Wang
- Department of Ophthalmology, The Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou 310009, China
- Juan Ye
- Department of Ophthalmology, The Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou 310009, China
27
Han Z, Yang B, Deng S, Li Z, Tong Z. Category weighted network and relation weighted label for diabetic retinopathy screening. Comput Biol Med 2023; 152:106408. [PMID: 36516580 DOI: 10.1016/j.compbiomed.2022.106408] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Revised: 11/10/2022] [Accepted: 12/03/2022] [Indexed: 12/08/2022]
Abstract
Diabetic retinopathy (DR) is the primary cause of blindness in adults. Incorporating machine learning into DR grading can improve the accuracy of medical diagnosis. However, problems such as severe data imbalance persist, and existing studies on DR grading ignore the correlation between its labels. In this study, a category weighted network (CWN) was proposed to achieve data balance at the model level. In the CWN, a reference for weight settings is provided by calculating the category gradient norm, reducing experimental overhead. We also proposed using relation-weighted labels instead of one-hot labels to exploit the distance relationship between labels. Experiments revealed that the proposed CWN achieved excellent performance on various DR datasets. Furthermore, relation-weighted labels exhibit broad applicability and can improve other methods that use one-hot labels. The proposed method achieved kappa scores of 0.9431 and 0.9226 and accuracies of 90.94% and 86.12% on the DDR and APTOS datasets, respectively.
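The relation-weighted labels above replace one-hot targets with soft targets that respect the ordinal distance between DR grades. The sketch below assumes an exponential-decay weighting for illustration; the paper's exact scheme, and the `alpha` parameter, are assumptions.

```python
import numpy as np

def relation_weighted_label(true_grade, num_grades=5, alpha=1.0):
    """Soft label for an ordinal grade: each class receives weight
    decaying with its distance from the true grade, normalized to sum
    to 1. With alpha -> infinity this recovers the one-hot label."""
    grades = np.arange(num_grades)
    weights = np.exp(-alpha * np.abs(grades - true_grade))
    return weights / weights.sum()
```

Such a target still peaks at the true grade, but a prediction one grade away is penalized less than one four grades away, matching the ordinal nature of DR severity.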
Affiliation(s)
- Zhike Han
- Zhejiang University, Hangzhou, 310027, Zhejiang, China; Zhejiang University City College, Hangzhou, 310015, Zhejiang, China
- Bin Yang
- Zhejiang University, Hangzhou, 310027, Zhejiang, China
- Zhuorong Li
- Zhejiang University City College, Hangzhou, 310015, Zhejiang, China.
- Zhou Tong
- The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310058, Zhejiang, China
28
Saleem S, Amin J, Sharif M, Mallah GA, Kadry S, Gandomi AH. Leukemia segmentation and classification: A comprehensive survey. Comput Biol Med 2022; 150:106028. [PMID: 36126356 DOI: 10.1016/j.compbiomed.2022.106028] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2022] [Revised: 07/11/2022] [Accepted: 08/20/2022] [Indexed: 11/30/2022]
Abstract
Blood is made up of leukocytes (WBCs), erythrocytes (RBCs), and thrombocytes. The incidence of blood cancers is increasing rapidly; among them, leukemia is a well-known cancer that may lead to death. Leukemia is initiated by the unnecessary growth of immature WBCs in the spongy tissue of the bone marrow. It is generally analyzed by etiologists examining slides of blood smear images under a microscope, with morphological features and blood cell counts facilitating its detection. Due to late detection and the expensive instruments used for leukemia analysis, the death rate has risen significantly. The fluorescence-based cell sorting technique and manual recounts using a hemocytometer are error-prone and imprecise. Leukemia detection methods consist of pre-processing, segmentation, feature extraction, and classification. In this article, recent deep learning methodologies and challenges for leukemia detection are discussed. These methods help examine microscopic blood smear images and detect leukemia more accurately.
Affiliation(s)
- Saba Saleem
- Department of Computer Science, COMSATS University Islamabad, Wah Campus, Pakistan
- Javaria Amin
- Department of Computer Science, University of Wah, Wah Cantt, Pakistan
- Muhammad Sharif
- Department of Computer Science, COMSATS University Islamabad, Wah Campus, Pakistan
- Seifedine Kadry
- Department of Applied Data Science, Noroff University College, Kristiansand, Norway; Department of Electrical and Computer Engineering, Lebanese American University, Byblos, Lebanon
- Amir H Gandomi
- Faculty of Engineering & Information Technology, University of Technology Sydney, Ultimo, NSW, 2007, Australia.
29
Li F, Tang S, Chen Y, Zou H. Deep attentive convolutional neural network for automatic grading of imbalanced diabetic retinopathy in retinal fundus images. BIOMEDICAL OPTICS EXPRESS 2022; 13:5813-5835. [PMID: 36733744 PMCID: PMC9872872 DOI: 10.1364/boe.472176] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Revised: 09/25/2022] [Accepted: 10/06/2022] [Indexed: 06/18/2023]
Abstract
Automated fine-grained diabetic retinopathy (DR) grading is of great significance for assisting ophthalmologists in monitoring DR and designing tailored treatments for patients. Nevertheless, it is a challenging task as a result of high intra-class variation, high inter-class similarity, small lesions, and imbalanced data distributions. The pivotal factor for success in fine-grained DR grading is discerning the subtle associated lesion features, such as microaneurysms (MA), hemorrhages (HM), soft exudates (SE), and hard exudates (HE). In this paper, we constructed a simple yet effective deep attentive convolutional neural network (DACNN) for DR grading and lesion discovery with only image-wise supervision. Designed as a top-down architecture, our model incorporates stochastic atrous spatial pyramid pooling (sASPP), a global attention mechanism (GAM), a category attention mechanism (CAM), and a learnable connected module (LCM) to better extract lesion-related features and maximize DR grading performance. Concretely, we devised sASPP, combining randomness with atrous spatial pyramid pooling (ASPP), to accommodate the various scales of the lesions and counteract the co-adaptation of multiple atrous convolutions. GAM was introduced to extract class-agnostic global attention feature details, whilst CAM was explored to seek class-specific, distinctive region-level lesion feature information while treating each DR severity grade equally, which tackles the problem of imbalanced DR data distributions. Further, the LCM was designed to automatically and adaptively search for the optimal connections among layers to better extract detailed small-lesion feature representations.
The proposed approach obtained a high accuracy of 88.0% and a kappa score of 88.6% for the multi-class DR grading task on the EyePACS dataset, as well as 98.5% AUC, 93.8% accuracy, 87.9% kappa, 90.7% recall, 94.6% precision, and 92.6% F1-score for referral versus non-referral classification on the Messidor dataset. Extensive experimental results on three challenging benchmarks demonstrated that the proposed approach achieves competitive performance in DR grading and lesion discovery from retinal fundus images compared with existing cutting-edge methods and has good generalization capacity for unseen DR datasets. These promising results highlight its potential as an efficient and reliable tool to assist ophthalmologists in large-scale DR screening.
Affiliation(s)
- Feng Li
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Shiqing Tang
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Yuyang Chen
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Haidong Zou
- Shanghai Eye Disease Prevention & Treatment Center, Shanghai 200040, China
- Ophthalmology Center, Shanghai General Hospital, Shanghai 200080, China
30
Nirthika R, Manivannan S, Ramanan A. Siamese network based fine grained classification for Diabetic Retinopathy grading. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
31
OHGCNet: Optimal feature selection-based hybrid graph convolutional network model for joint DR-DME classification. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
32
Wang H, Zhou Y, Zhang J, Lei J, Sun D, Xu F, Xu X. Anomaly segmentation in retinal images with poisson-blending data augmentation. Med Image Anal 2022; 81:102534. [PMID: 35842977 DOI: 10.1016/j.media.2022.102534] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2021] [Revised: 03/14/2022] [Accepted: 07/08/2022] [Indexed: 11/24/2022]
Abstract
Diabetic retinopathy (DR) is one of the most important complications of diabetes. Accurate segmentation of DR lesions is of great importance for the early diagnosis of DR. However, simultaneous segmentation of multi-type DR lesions is technically challenging because of 1) the lack of pixel-level annotations and 2) the large diversity between different types of DR lesions. In this study, first, we propose a novel Poisson-blending data augmentation (PBDA) algorithm to generate synthetic images, which can be easily utilized to expand the existing training data for lesion segmentation. We perform extensive experiments to recognize the important attributes in the PBDA algorithm. We show that position constraints are of great importance and that the synthesis density of one type of lesion has a joint influence on the segmentation of other types of lesions. Second, we propose a convolutional neural network architecture, named DSR-U-Net++ (i.e., DC-SC residual U-Net++), for the simultaneous segmentation of multi-type DR lesions. Ablation studies showed that the mean area under precision recall curve (AUPR) for all four types of lesions increased by >5% with PBDA. The proposed DSR-U-Net++ with PBDA outperformed the state-of-the-art methods by 1.7%-9.9% on the Indian Diabetic Retinopathy Image Dataset (IDRiD) and 67.3% on the e-ophtha dataset with respect to mean AUPR. The developed method would be an efficient tool to generate large-scale task-specific training data for other medical anomaly segmentation tasks.
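The Poisson-blending idea behind PBDA can be illustrated with a minimal gradient-domain solve: inside the pasted-lesion mask, the result keeps the source's gradients while matching the target at the mask boundary. This is a toy Jacobi-iteration sketch (assuming the mask does not touch the image border), not the paper's implementation; production code typically uses a direct solver such as OpenCV's seamlessClone.

```python
import numpy as np

def poisson_blend(target, source, mask, iters=500):
    """Solve the discrete Poisson equation inside `mask` by Jacobi
    iteration: each masked pixel is updated from its neighbors plus the
    source image's Laplacian, with Dirichlet boundary values taken from
    the target. Assumes no masked pixel lies on the image border."""
    f = target.astype(float).copy()
    src = source.astype(float)
    ys, xs = np.where(mask)
    for _ in range(iters):
        new = f.copy()
        for y, x in zip(ys, xs):
            # source Laplacian at (y, x) preserves the lesion's gradients
            lap = 4 * src[y, x] - src[y-1, x] - src[y+1, x] - src[y, x-1] - src[y, x+1]
            new[y, x] = (f[y-1, x] + f[y+1, x] + f[y, x-1] + f[y, x+1] + lap) / 4.0
        f = new
    return f
```

Blending a flat source region into a flat target this way leaves the target unchanged, which is exactly the "seamless" behavior that makes the synthesized lesions photometrically consistent with the host image.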
Affiliation(s)
- Hualin Wang
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China
- Yuhong Zhou
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China
- Jiong Zhang
- Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, 315300, China
- Jianqin Lei
- Department of Ophthalmology, First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710049, China
- Dongke Sun
- Jiangsu Key Laboratory for Design and Manufacture of Micro-Nano Biomedical Instruments, Southeast University, Nanjing, 211189, China
- Feng Xu
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China
- Xiayu Xu
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China; Zhejiang Research Institute of Xi'an Jiaotong University, Hangzhou, 311215, China.
33
Dayana AM, Emmanuel WRS. Deep learning enabled optimized feature selection and classification for grading diabetic retinopathy severity in the fundus image. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07471-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
34
Payá E, Bori L, Colomer A, Meseguer M, Naranjo V. Automatic characterization of human embryos at day 4 post-insemination from time-lapse imaging using supervised contrastive learning and inductive transfer learning techniques. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 221:106895. [PMID: 35609359 DOI: 10.1016/j.cmpb.2022.106895] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/04/2022] [Revised: 05/03/2022] [Accepted: 05/15/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND Embryo morphology is a predictive marker for implantation success and, ultimately, live births. Viability evaluation and quality grading are commonly used to select the embryo with the highest implantation potential. However, the traditional method of manual embryo assessment is time-consuming and highly susceptible to inter- and intra-observer variability. Automating this process yields more objective and accurate predictions. METHOD In this paper, we propose a novel deep-learning-based methodology to automatically evaluate the morphological appearance of human embryos from time-lapse imaging. A supervised contrastive learning framework is implemented to predict embryo viability at day 4 and day 5, and an inductive transfer approach is applied to classify embryo quality at both times. RESULTS Both methods outperformed conventional approaches and improved on state-of-the-art embryology results for an independent test set. Viability prediction achieved accuracies of 0.8103 and 0.9330, and quality grading reached 0.7500 and 0.8001, for day 4 and day 5, respectively. Furthermore, qualitative results were consistent with clinical interpretation. CONCLUSIONS The proposed methods are in line with the current artificial intelligence literature and have proven promising. Our findings represent a breakthrough in embryology in that they study the possibility of embryo selection at day 4, and the Grad-CAM findings are directly in line with embryologists' decisions. Finally, our results demonstrate excellent potential for the inclusion of the models in clinical practice.
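The supervised contrastive objective used in frameworks like the one above pulls same-class embeddings together and pushes different-class embeddings apart. Below is a minimal NumPy sketch of that loss; the temperature value and batching details are illustrative assumptions, not this paper's settings.

```python
import numpy as np

def supcon_loss(z, labels, tau=0.1):
    """Supervised contrastive loss over a batch. `z` is an (N, d) array
    of L2-normalized embeddings; for each anchor, all other same-label
    samples are positives, and all non-anchor samples form the denominator."""
    z = np.asarray(z, float)
    labels = np.asarray(labels)
    n = len(labels)
    sim = z @ z.T / tau  # temperature-scaled cosine similarities
    loss = 0.0
    for i in range(n):
        pos = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not pos:
            continue  # anchors without positives contribute nothing
        others = [a for a in range(n) if a != i]
        log_denom = np.log(np.sum(np.exp(sim[i, others])))
        loss += -np.mean([sim[i, p] - log_denom for p in pos])
    return loss / n
```

When same-class embeddings coincide and cross-class embeddings are orthogonal the loss approaches zero, while mismatched labels on the same embeddings yield a much larger loss.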
Affiliation(s)
- Elena Payá
- Instituto de Investigación e Innovación en Bioingeniería, Universitat Politècnica de València, Valencia, 46022, Spain; IVI-RMA Valencia, Spain.
- Adrián Colomer
- Instituto de Investigación e Innovación en Bioingeniería, Universitat Politècnica de València, Valencia, 46022, Spain
- Valery Naranjo
- Instituto de Investigación e Innovación en Bioingeniería, Universitat Politècnica de València, Valencia, 46022, Spain
35

36
Necessity of Local Modification for Deep Learning Algorithms to Predict Diabetic Retinopathy. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph19031204. [PMID: 35162226 PMCID: PMC8834743 DOI: 10.3390/ijerph19031204] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/17/2021] [Revised: 01/12/2022] [Accepted: 01/18/2022] [Indexed: 11/16/2022]
Abstract
Deep learning (DL) algorithms are used to diagnose diabetic retinopathy (DR). However, most of these algorithms have been trained using global data or data from patients of a single region. Using different model architectures (e.g., Inception-v3, ResNet101, and DenseNet121), we assessed the necessity of locally modifying such algorithms before they are used for universal screening. We used the open-source dataset from the Kaggle Diabetic Retinopathy Detection competition to develop a model for the detection of DR severity, used a local dataset from Taipei City Hospital to verify the necessity of model localization, and validated the three aforementioned models with local datasets. The experimental results revealed that Inception-v3 outperformed ResNet101 and DenseNet121 on the foreign global dataset, whereas DenseNet121 outperformed Inception-v3 and ResNet101 on the local dataset. The quadratic weighted kappa score (κ) was used to evaluate model performance; all models had a 5-8% higher κ on the local dataset than on the foreign dataset. Confusion matrix analysis revealed that, compared with the local ophthalmologists' diagnoses, the severity predicted by the three models was overestimated. Thus, DL algorithms based on global data must be locally modified to ensure that a well-trained model remains applicable for diagnoses in clinical environments.
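The quadratic weighted kappa (κ) used above penalizes disagreements by the squared distance between grades, normalized by the disagreement expected by chance. A minimal implementation of the standard metric:

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Quadratic weighted kappa for ordinal labels (lists of ints in
    [0, num_classes)): 1 - (weighted observed disagreement) /
    (weighted chance disagreement), with weights ((i-j)/(K-1))^2."""
    n = len(y_true)
    # observed confusion matrix
    observed = [[0.0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    hist_t = [y_true.count(k) for k in range(num_classes)]
    hist_p = [y_pred.count(k) for k in range(num_classes)]
    num = den = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            w = (i - j) ** 2 / (num_classes - 1) ** 2
            expected = hist_t[i] * hist_p[j] / n  # chance agreement
            num += w * observed[i][j]
            den += w * expected
    return 1.0 - num / den
```

κ is 1 for perfect agreement, 0 for chance-level agreement, and negative when raters disagree more than chance would predict.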
37
Guo Y, Peng Y. CARNet: Cascade attentive RefineNet for multi-lesion segmentation of diabetic retinopathy images. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-021-00630-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Diabetic retinopathy is the leading cause of blindness in the working-age population. Lesion segmentation from fundus images helps ophthalmologists accurately diagnose and grade diabetic retinopathy. However, lesion segmentation is challenging due to complex lesion structure, varied lesion sizes, and inter-class similarity with other fundus tissues. To address this issue, this paper proposes a cascade attentive RefineNet (CARNet) for automatic and accurate multi-lesion segmentation of diabetic retinopathy, which makes full use of fine local details and coarse global information from the fundus image. CARNet is composed of a global image encoder, a local image encoder, and an attention refinement decoder. We take the whole image and patch images as dual inputs and feed them to ResNet50 and ResNet101, respectively, for downsampling to extract lesion features. The high-level refinement decoder uses a dual attention mechanism to integrate the same-level features of the two encoders with the output of the low-level attention refinement module for multiscale information fusion, focusing the model on the lesion area to generate accurate predictions. We evaluated the segmentation performance of the proposed CARNet on the IDRiD, e-ophtha, and DDR datasets. Extensive comparison experiments and ablation studies demonstrate that the proposed framework outperforms state-of-the-art approaches with better accuracy and robustness. It not only overcomes interference from similar tissues and noise to achieve accurate multi-lesion segmentation, but also preserves the contour details and shape features of small lesions without overloading GPU memory.
38
Huang X, Wang H, She C, Feng J, Liu X, Hu X, Chen L, Tao Y. Artificial intelligence promotes the diagnosis and screening of diabetic retinopathy. Front Endocrinol (Lausanne) 2022; 13:946915. [PMID: 36246896 PMCID: PMC9559815 DOI: 10.3389/fendo.2022.946915] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/18/2022] [Accepted: 09/12/2022] [Indexed: 11/13/2022] Open
Abstract
Deep learning has evolved into a new form of machine learning technology classified under artificial intelligence (AI), with substantial potential for large-scale healthcare screening; it may allow determination of the most appropriate specific treatment for individual patients. Recent developments in diagnostic technologies have facilitated studies on retinal conditions and ocular disease in metabolism and endocrinology. Globally, diabetic retinopathy (DR) is regarded as a major cause of vision loss. Deep learning systems are effective and accurate in the detection of DR from digital fundus photographs or optical coherence tomography. Thus, using AI techniques, systems with high accuracy and efficiency can be developed for diagnosing and screening DR at an early stage, without the resources that are only accessible in specialized clinics. Deep learning enables early diagnosis with high specificity and sensitivity, making decisions based on minimally handcrafted features and paving the way for personalized real-time monitoring of DR progression and timely ophthalmic or endocrine therapies. This review discusses cutting-edge AI algorithms, automated systems for DR stage grading and feature segmentation, the prediction of DR outcomes and therapeutics, and the ophthalmic indications of other systemic diseases revealed by AI.
Affiliation(s)
- Xuan Huang
- Department of Ophthalmology, Beijing Chaoyang Hospital, Capital Medical University, Beijing, China
- Medical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing, China
- Hui Wang
- Department of Ophthalmology, Beijing Chaoyang Hospital, Capital Medical University, Beijing, China
- Chongyang She
- Department of Ophthalmology, Beijing Chaoyang Hospital, Capital Medical University, Beijing, China
- Jing Feng
- Department of Ophthalmology, Beijing Chaoyang Hospital, Capital Medical University, Beijing, China
- Xuhui Liu
- Department of Ophthalmology, Beijing Chaoyang Hospital, Capital Medical University, Beijing, China
- Xiaofeng Hu
- Department of Ophthalmology, Beijing Chaoyang Hospital, Capital Medical University, Beijing, China
- Li Chen
- Department of Ophthalmology, Beijing Chaoyang Hospital, Capital Medical University, Beijing, China
- Yong Tao
- Department of Ophthalmology, Beijing Chaoyang Hospital, Capital Medical University, Beijing, China
- Correspondence: Yong Tao
Collapse
39
Chen Y, Long J, Guo J. RF-GANs: A Method to Synthesize Retinal Fundus Images Based on Generative Adversarial Network. Comput Intell Neurosci 2021; 2021:3812865. [PMID: 34804140] [PMCID: PMC8598326] [DOI: 10.1155/2021/3812865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Revised: 10/03/2021] [Accepted: 10/23/2021] [Indexed: 11/17/2022]
Abstract
Diabetic retinopathy (DR) is a diabetic complication affecting the eyes and the main cause of blindness in young and middle-aged people. To speed up DR diagnosis, many deep learning methods have been applied to detecting the disease, but they fail to attain excellent results because of unbalanced training data, i.e., a shortage of DR fundus images. To address this data imbalance, this paper proposes retinal fundus image generative adversarial networks (RF-GANs), a generative-adversarial-network-based method for synthesizing retinal fundus images. RF-GANs comprises two generative models, RF-GAN1 and RF-GAN2. First, RF-GAN1 translates retinal fundus images from a source domain (the domain of semantic segmentation datasets) to a target domain (the domain of the EyePACS dataset hosted on Kaggle). Then, semantic segmentation models are trained on the translated images and used to extract the structural and lesion masks (hereafter, Masks) of EyePACS. Finally, RF-GAN2 synthesizes retinal fundus images from the Masks and DR grading labels. Experiments verify the method's effectiveness: RF-GAN1 narrows the domain gap between datasets and improves the performance of the segmentation models, and RF-GAN2 synthesizes realistic retinal fundus images. When the synthesized images are adopted for data augmentation, the accuracy and quadratic weighted kappa of the state-of-the-art DR grading model on the EyePACS test set increase by 1.53% and 1.70%, respectively.
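Quadratic weighted kappa, the grading metric reported above, penalizes disagreements by the squared distance between ordinal grades. A minimal pure-Python sketch of the standard formula (an illustration only, not the authors' evaluation code; the five-class setting mirrors the usual DR severity grades):

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Quadratic weighted kappa for ordinal labels 0..n_classes-1."""
    n = n_classes
    # Observed matrix O[i][j]: count of samples with true grade i, predicted grade j.
    observed = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    # Quadratic disagreement weights: 0 on the diagonal, 1 at maximum distance.
    weights = [[(i - j) ** 2 / (n - 1) ** 2 for j in range(n)] for i in range(n)]
    # Expected matrix under chance agreement, from the marginal histograms.
    total = len(y_true)
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    expected = [[hist_true[i] * hist_pred[j] / total for j in range(n)]
                for i in range(n)]
    num = sum(weights[i][j] * observed[i][j] for i in range(n) for j in range(n))
    den = sum(weights[i][j] * expected[i][j] for i in range(n) for j in range(n))
    return 1.0 - num / den

# Perfect agreement on the five DR grades scores 1.0.
print(quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4], 5))  # → 1.0
```

A kappa of 1.0 means perfect agreement, 0 means chance-level agreement, and negative values mean systematic disagreement, which is why small gains such as the +1.70% reported above are meaningful on this scale.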
Affiliation(s)
- Yu Chen: Information and Computer Engineering College, Northeast Forestry University, Harbin, China
- Jun Long: Information and Computer Engineering College, Northeast Forestry University, Harbin, China
- Jifeng Guo: Information and Computer Engineering College, Northeast Forestry University, Harbin, China
40
Deep feed forward neural network-based screening system for diabetic retinopathy severity classification using the lion optimization algorithm. Graefes Arch Clin Exp Ophthalmol 2021; 260:1245-1263. [PMID: 34505925] [DOI: 10.1007/s00417-021-05375-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/21/2021] [Revised: 07/19/2021] [Accepted: 08/06/2021] [Indexed: 10/20/2022]
Abstract
Diabetic retinopathy (DR) has become a major cause of blindness in recent years. Diabetic patients should be screened regularly so that the disease is detected early, which can help them avoid blindness. Because the number of diabetic patients undergoing such screening is rising rapidly, the workload on ophthalmologists is increasing, and an efficient screening system that assists in DR diagnosis saves them considerable time and effort. An automatic DR detection system is therefore required to improve diagnosis speed and detection accuracy: if the severity levels of DR are accurately diagnosed in the early stages, appropriate treatment can be provided to prevent vision loss. A growing number of DR screening systems built on various deep learning models have been developed in recent years, yet most published work includes no optimization algorithm in the neural network for severity classification; adding one, with the necessary hyperparameter tuning, improves model performance. Motivated by this, we propose a five-phase DFNN-LOA model, a deep feed-forward neural network tuned by the lion optimization algorithm, with the phases: (i) pre-processing, (ii) optic disc detection, (iii) segmentation, (iv) feature extraction, and (v) severity classification. Experimental analysis is carried out on the MESSIDOR dataset and shows that the proposed DFNN-LOA model performs strongly, with maximum accuracy, sensitivity, specificity, F1-score, PPV, and NPV of 97.6%, 98.4%, 90.7%, 96.5%, 94.6%, and 97.1%, respectively.
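All six headline metrics above derive from the binary confusion-matrix counts of a screening run. A minimal sketch of those definitions, with hypothetical counts (illustrative only, not the paper's evaluation code):

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard screening metrics from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall on diseased eyes
    specificity = tn / (tn + fp)          # recall on healthy eyes
    ppv = tp / (tp + fp)                  # positive predictive value (precision)
    npv = tn / (tn + fn)                  # negative predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1, "ppv": ppv, "npv": npv}

# Hypothetical screening run: 8 true positives, 2 false positives,
# 1 false negative, 9 true negatives.
m = screening_metrics(tp=8, fp=2, fn=1, tn=9)
print(m["accuracy"])  # → 0.85
```

For screening, sensitivity matters most (a missed DR case loses the chance for early treatment), while NPV tells a patient how trustworthy a negative result is; reporting all six, as the paper does, exposes these trade-offs.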
41
Niu Y, Gu L, Zhao Y, Lu F. Explainable Diabetic Retinopathy Detection and Retinal Image Generation. IEEE J Biomed Health Inform 2021; 26:44-55. [PMID: 34495852] [DOI: 10.1109/jbhi.2021.3110593] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/07/2022]
Abstract
Though deep learning has shown successful performance in classifying the label and severity stage of certain diseases, most systems offer little explanation of how they make their predictions. Inspired by Koch's postulates, the foundation in evidence-based medicine (EBM) for identifying a pathogen, we propose to exploit the interpretability of deep learning in medical diagnosis. By isolating neuron activation patterns from a diabetic retinopathy (DR) detector and visualizing them, we can determine the symptoms that the DR detector identifies as evidence for its predictions. Specifically, we first define novel pathological descriptors using activated neurons of the DR detector to encode both spatial and appearance information of lesions. Then, to visualize the symptoms encoded in a descriptor, we propose Patho-GAN, a new network that synthesizes medically plausible retinal images. By manipulating these descriptors, we can arbitrarily control the position, quantity, and categories of generated lesions. We also show that our synthesized images carry the symptoms directly related to diabetic retinopathy diagnosis. Our generated images are both qualitatively and quantitatively superior to those produced by previous methods. Moreover, compared with existing methods that take hours to generate an image, our seconds-level generation speed gives the method the potential to be an effective solution for data augmentation.
42
Lakshminarayanan V, Kheradfallah H, Sarkar A, Jothi Balaji J. Automated Detection and Diagnosis of Diabetic Retinopathy: A Comprehensive Survey. J Imaging 2021; 7:165. [PMID: 34460801] [PMCID: PMC8468161] [DOI: 10.3390/jimaging7090165] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 06/30/2021] [Revised: 08/23/2021] [Accepted: 08/24/2021] [Indexed: 12/16/2022]
Abstract
Diabetic retinopathy (DR) is a leading cause of vision loss in the world. In the past few years, artificial intelligence (AI) based approaches have been used to detect and grade DR; early detection enables appropriate treatment and thus prevents vision loss. For this purpose, both fundus and optical coherence tomography (OCT) images are used to image the retina, and deep learning (DL) and machine learning (ML) approaches then extract features from the images to detect the presence of DR, grade its severity, and segment associated lesions. This review covers the literature on ML and DL approaches to DR classification and segmentation published in the open literature within six years (2016-2021). In addition, a comprehensive list of available DR datasets is reported, constructed using both the PICO (P-Patient, I-Intervention, C-Control, O-Outcome) and Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2009 search strategies. We summarize a total of 114 published articles that conformed to the scope of the review and present a list of 43 major datasets.
Affiliation(s)
- Vasudevan Lakshminarayanan: Theoretical and Experimental Epistemology Lab, School of Optometry and Vision Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Hoda Kheradfallah: Theoretical and Experimental Epistemology Lab, School of Optometry and Vision Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Arya Sarkar: Department of Computer Engineering, University of Engineering and Management, Kolkata 700 156, India
43
Zhou Y, Wang B, He X, Cui S, Shao L. DR-GAN: Conditional Generative Adversarial Network for Fine-Grained Lesion Synthesis on Diabetic Retinopathy Images. IEEE J Biomed Health Inform 2020; 26:56-66. [PMID: 33332280] [DOI: 10.1109/jbhi.2020.3045475] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 11/08/2022]
Abstract
Diabetic retinopathy (DR) is a complication of diabetes that severely affects the eyes. It can be graded into five levels of severity according to the international protocol. However, optimizing a grading model for strong generalizability requires a large amount of balanced training data, which is difficult to collect, particularly for the high severity levels. Typical data augmentation methods, including random flipping and rotation, cannot generate data with high diversity. In this paper, we propose a diabetic retinopathy generative adversarial network (DR-GAN) to synthesize high-resolution fundus images that can be manipulated with arbitrary grading and lesion information, so that large-scale generated data can be used for more meaningful augmentation when training DR grading and lesion segmentation models. The proposed retina generator is conditioned on structural and lesion masks, as well as on adaptive grading vectors sampled from the latent grading space, which can be adopted to control the synthesized grading severity. Moreover, a multi-scale spatial and channel attention module is devised to improve the ability to synthesize small details. Multi-scale discriminators are designed to operate from large to small receptive fields, and joint adversarial losses are adopted to optimize the whole network in an end-to-end manner. With extensive experiments on the EyePACS dataset hosted on Kaggle, as well as the FGADR dataset, we validate the effectiveness of our method, which can both synthesize highly realistic (1280 × 1280) controllable fundus images and contribute to the DR grading task.