1
Artificial intelligence in chorioretinal pathology through fundoscopy: a comprehensive review. Int J Retina Vitreous 2024; 10:36. [PMID: 38654344 PMCID: PMC11036694 DOI: 10.1186/s40942-024-00554-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 04/02/2024] [Indexed: 04/25/2024]
Abstract
BACKGROUND Applications for artificial intelligence (AI) in ophthalmology are continually evolving. Fundoscopy is one of the oldest ocular imaging techniques but remains a mainstay in posterior segment imaging due to its prevalence, ease of use, and ongoing technological advancement. AI has been leveraged for fundoscopy to accomplish core tasks including segmentation, classification, and prediction. MAIN BODY In this article we provide a review of AI in fundoscopy applied to representative chorioretinal pathologies, including diabetic retinopathy and age-related macular degeneration, among others. We conclude with a discussion of future directions and current limitations. SHORT CONCLUSION As AI evolves, it will become increasingly essential for the modern ophthalmologist to understand its applications and limitations to improve patient outcomes and continue to innovate.
2
An Updated Simplified Severity Scale for Age-Related Macular Degeneration, Incorporating Reticular Pseudodrusen: Age-Related Eye Disease Study Report No. 42. Ophthalmology 2024:S0161-6420(24)00263-X. [PMID: 38657840 DOI: 10.1016/j.ophtha.2024.04.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 03/25/2024] [Accepted: 04/15/2024] [Indexed: 04/26/2024]
Abstract
PURPOSE To update the Age-Related Eye Disease Study (AREDS) Simplified Severity Scale for risk of late age-related macular degeneration (AMD), including incorporation of reticular pseudodrusen (RPD), and to perform external validation on the AREDS2. DESIGN Post hoc analysis of two clinical trial cohorts: AREDS and AREDS2. PARTICIPANTS Participants with no late AMD in either eye at baseline in AREDS (n=2719) and AREDS2 (n=1472). METHODS Five-year rates of progression to late AMD were calculated according to levels 0-4 on the Simplified Severity Scale, following two updates: (i) non-central GA considered part of the outcome rather than a risk feature, and (ii) scale separation according to RPD status (determined by validated deep learning grading of color fundus photographs). MAIN OUTCOME MEASURES Five-year rate of progression to late AMD (defined as neovascular AMD or any GA). RESULTS In the AREDS, following the first scale update, the five-year rates of progression to late AMD for levels 0-4 were 0.3%, 4.5%, 12.9%, 32.2%, and 55.6%, respectively. Following both updates, the proportion progressing to late AMD by five years was 8.4% in participants without RPD and 40.6% in those with RPD. As the final Simplified Severity Scale, the five-year progression rates for levels 0-4, respectively, were 0.3%, 4.3%, 11.6%, 26.7%, and 50.0%, for participants without RPD at baseline, and 2.8%, 8.0%, 29.0%, 58.7%, and 72.2%, for participants with RPD at baseline. In external validation on the AREDS2, for levels 2-4, the progression rates were similar, at 15.0%, 27.7%, and 45.7% (RPD absent) and 26.2%, 46.0%, and 73.0% (RPD present), respectively. CONCLUSIONS The AREDS AMD Simplified Severity Scale has been modernized with two important updates. The new scale for individuals without RPD has five-year progression rates of ∼0.5%, 4%, 12%, ∼25%, and 50%, such that the rates on the original scale remain accurate. 
The new scale for individuals with RPD has five-year progression rates of 3%, 8%, ∼30%, ∼60%, and ∼70%, i.e., approximately double for most levels. This scale fits updated definitions of late AMD, has increased prognostic accuracy, appears generalizable to similar populations, but remains simple for broad risk categorization.
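The final scale reported in this abstract amounts to a two-way lookup (severity level by RPD status). A minimal sketch of that lookup follows; the rates are the AREDS figures quoted above, while the dictionary layout and the function name `five_year_rate` are illustrative assumptions, not part of the published scale.

```python
# Five-year progression rates to late AMD (%), as quoted in the abstract for
# the final Simplified Severity Scale in AREDS. The data layout and function
# name are hypothetical, chosen only for illustration.
FIVE_YEAR_RATE_PCT = {
    False: [0.3, 4.3, 11.6, 26.7, 50.0],  # RPD absent at baseline, levels 0-4
    True: [2.8, 8.0, 29.0, 58.7, 72.2],   # RPD present at baseline, levels 0-4
}


def five_year_rate(level: int, rpd_present: bool) -> float:
    """Return the reported five-year progression rate (%) for one person."""
    if level not in range(5):
        raise ValueError("level must be an integer from 0 to 4")
    return FIVE_YEAR_RATE_PCT[rpd_present][level]
```

For example, `five_year_rate(2, True)` looks up level 2 with RPD present. This is a table lookup only; it does not reproduce the grading that assigns a level.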
3
Multimodality Fusion Strategies in Eye Disease Diagnosis. Journal of Imaging Informatics in Medicine 2024:10.1007/s10278-024-01105-x. [PMID: 38639808 DOI: 10.1007/s10278-024-01105-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 03/08/2024] [Accepted: 03/26/2024] [Indexed: 04/20/2024]
Abstract
Multimodality fusion has gained significance in medical applications, particularly in diagnosing challenging conditions such as eye diseases, notably diabetic eye diseases that pose risks of vision loss and blindness. Mono-modality eye disease diagnosis proves difficult, often missing crucial disease indicators. In response, researchers advocate multimodality-based approaches to enhance diagnostics. This study evaluates three multimodality fusion strategies (early, joint, and late) in conjunction with state-of-the-art convolutional neural network models for automated binary detection of eye disease across three datasets: fundus fluorescein angiography, macula, and a combination of Digital Retinal Images for Vessel Extraction (DRIVE), Structured Analysis of the Retina (STARE), and High-Resolution Fundus (HRF). Findings reveal the efficacy of each fusion strategy: type 0 early fusion with DenseNet121 achieves a 99.45% average accuracy. InceptionResNetV2 emerges as the top-performing joint fusion architecture with an average accuracy of 99.58%. Late fusion ResNet50V2 achieves a perfect score of 100% across all metrics, surpassing both early and joint fusion. Comparative analysis demonstrates that late fusion ResNet50V2 matches the accuracy of the state-of-the-art feature-level fusion model for multiview learning. In conclusion, this study substantiates late fusion as the optimal strategy for eye disease diagnosis compared to early and joint fusion, showcasing its superiority in leveraging multimodal information.
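The three fusion strategies named in this abstract differ in where the modalities are combined. The toy sketch below illustrates the general idea only, under stated assumptions: inputs are plain lists of floats, late fusion uses simple probability averaging, and the function names are hypothetical. It does not reproduce the study's networks or its "type 0" variant.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def early_fusion(modalities: Sequence[Vector]) -> List[float]:
    """Combine raw inputs first, so one model sees every modality at once."""
    return [x for modality in modalities for x in modality]


def joint_fusion(
    modalities: Sequence[Vector],
    encoders: Sequence[Callable[[Vector], List[float]]],
) -> List[float]:
    """Encode each modality separately, then concatenate the learned features
    for a shared classification head (the head itself is omitted here)."""
    return [f for m, encode in zip(modalities, encoders) for f in encode(m)]


def late_fusion(probabilities: Sequence[float]) -> float:
    """Combine per-modality predictions; averaging is one common rule and is
    an assumption here, not necessarily the rule used in the study."""
    return sum(probabilities) / len(probabilities)
```

In this framing, early fusion needs one model, joint fusion needs one encoder per modality plus a shared head, and late fusion needs one complete classifier per modality.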
4
Artificial intelligence in age-related macular degeneration: state of the art and recent updates. BMC Ophthalmol 2024; 24:121. [PMID: 38491380 PMCID: PMC10943791 DOI: 10.1186/s12886-024-03381-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 03/06/2024] [Indexed: 03/18/2024]
Abstract
Age-related macular degeneration (AMD) represents a leading cause of vision loss and is expected to affect 288 million people by 2040. During the last decade, machine learning technologies have shown great potential to revolutionize clinical management of AMD and support research for a better understanding of the disease. The aim of this review is to provide a panoramic description of all the applications of AI to AMD management and screening that have been analyzed in the recent literature. Deep learning (DL) can be effectively used to diagnose AMD and to predict the short-term risk of exudation and the need for injections within the next 2 years. Moreover, DL technology has the potential to customize anti-VEGF treatment choice with a higher accuracy than human experts. In addition, accurate prediction of VA response to treatment can be provided to patients with the use of ML models, which could considerably increase patients' compliance with treatment in favorable cases. Lastly, AI, especially in the form of DL, can effectively predict conversion to GA within 12 months and also suggest new biomarkers of conversion with an innovative reverse engineering approach.
5
Biomarkers for the Progression of Intermediate Age-Related Macular Degeneration. Ophthalmol Ther 2023; 12:2917-2941. [PMID: 37773477 PMCID: PMC10640447 DOI: 10.1007/s40123-023-00807-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 08/30/2023] [Indexed: 10/01/2023]
Abstract
Age-related macular degeneration (AMD) is a leading cause of severe vision loss worldwide, with a global prevalence that is predicted to substantially increase. Identifying early biomarkers indicative of progression risk will improve our ability to assess which patients are at greatest risk of progressing from intermediate AMD (iAMD) to vision-threatening late-stage AMD. This is key to ensuring individualized management and timely intervention before substantial structural damage. Some structural biomarkers suggestive of AMD progression risk are well established, such as changes seen on color fundus photography and more recently optical coherence tomography (drusen volume, pigmentary abnormalities). Emerging biomarkers identified through multimodal imaging, including reticular pseudodrusen, hyperreflective foci, and drusen sub-phenotypes, are being intensively explored as risk factors for progression towards late-stage disease. Other structural biomarkers merit further research, such as ellipsoid zone reflectivity and choriocapillaris flow features. The measures of visual function that best detect change in iAMD and correlate with risk of progression remain under intense investigation, with tests such as dark adaptometry and cone-specific contrast tests being explored. Evidence on blood and plasma markers is preliminary, but there are indications that changes in levels of C-reactive protein and high-density lipoprotein cholesterol may be used to stratify patients and predict risk. With further research, some of these biomarkers may be used to monitor progression. Emerging artificial intelligence methods may help evaluate and validate these biomarkers; however, until we have large and well-curated longitudinal data sets, using artificial intelligence effectively to inform clinical trial design and detect outcomes will remain challenging. 
This is an exciting area of intense research, and further work is needed to establish the most promising biomarkers for disease progression and their use in clinical care and future trials. Ultimately, a multimodal approach may yield the most accurate means of monitoring and predicting future progression towards vision-threatening, late-stage AMD.
6
Deep-GA-Net for Accurate and Explainable Detection of Geographic Atrophy on OCT Scans. Ophthalmology Science 2023; 3:100311. [PMID: 37304045 PMCID: PMC10251072 DOI: 10.1016/j.xops.2023.100311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Revised: 04/06/2023] [Accepted: 04/07/2023] [Indexed: 06/13/2023]
Abstract
Objective To propose Deep-GA-Net, a 3-dimensional (3D) deep learning network with 3D attention layer, for the detection of geographic atrophy (GA) on spectral domain OCT (SD-OCT) scans, explain its decision making, and compare it with existing methods. Design Deep learning model development. Participants Three hundred eleven participants from the Age-Related Eye Disease Study 2 Ancillary SD-OCT Study. Methods A dataset of 1284 SD-OCT scans from 311 participants was used to develop Deep-GA-Net. Cross-validation was used to evaluate Deep-GA-Net, where each testing set contained no participant from the corresponding training set. En face heatmaps and important regions at the B-scan level were used to visualize the outputs of Deep-GA-Net, and 3 ophthalmologists graded the presence or absence of GA in them to assess the explainability (i.e., understandability and interpretability) of its detections. Main Outcome Measures Accuracy, area under receiver operating characteristic curve (AUC), area under precision-recall curve (APR). Results Compared with other networks, Deep-GA-Net achieved the best metrics, with accuracy of 0.93, AUC of 0.94, and APR of 0.91, and received the best gradings of 0.98 and 0.68 on the en face heatmap and B-scan grading tasks, respectively. Conclusions Deep-GA-Net was able to detect GA accurately from SD-OCT scans. The visualizations of Deep-GA-Net were more explainable, as suggested by 3 ophthalmologists. The code and pretrained models are publicly available at https://github.com/ncbi/Deep-GA-Net. Financial Disclosures The author(s) have no proprietary or commercial interest in any materials discussed in this article.
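The abstract states that each testing set contained no participant from the corresponding training set. One way to obtain such folds is to split by participant rather than by scan, as in the minimal sketch below. The round-robin assignment and the function name `participant_folds` are assumptions for illustration; the abstract does not describe how the authors built their folds.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def participant_folds(
    scan_participants: Sequence[Hashable], n_folds: int
) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) per fold, split by participant.

    scan_participants[i] is the participant ID of scan i, so all scans from
    one participant always land on the same side of a split.
    """
    scans_by_participant = defaultdict(list)
    for index, participant in enumerate(scan_participants):
        scans_by_participant[participant].append(index)

    participants = sorted(scans_by_participant)
    folds = []
    for k in range(n_folds):
        # Round-robin assignment of participants to folds (an assumption).
        test_ids = set(participants[k::n_folds])
        train, test = [], []
        for participant in participants:
            target = test if participant in test_ids else train
            target.extend(scans_by_participant[participant])
        folds.append((train, test))
    return folds
```

Splitting by scan instead would let scans from the same eye appear on both sides, which tends to inflate test metrics.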
7
Reticular Pseudodrusen Status, ARMS2/HTRA1 Genotype, and Geographic Atrophy Enlargement: Age-Related Eye Disease Study 2 Report 32. Ophthalmology 2023; 130:488-500. [PMID: 36481221 PMCID: PMC10121754 DOI: 10.1016/j.ophtha.2022.11.026] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/09/2022] [Revised: 10/27/2022] [Accepted: 11/28/2022] [Indexed: 12/12/2022]
Abstract
PURPOSE To determine whether reticular pseudodrusen (RPD) status, ARMS2/HTRA1 genotype, or both are associated with altered geographic atrophy (GA) enlargement rate and to analyze potential mediation of genetic effects by RPD status. DESIGN Post hoc analysis of an Age-Related Eye Disease Study 2 cohort. PARTICIPANTS Eyes with GA: n = 771 from 563 participants. METHODS Geographic atrophy area was measured from fundus photographs at annual visits. Reticular pseudodrusen presence was graded from fundus autofluorescence images. Mixed-model regression of square root of GA area was performed by RPD status, ARMS2 genotype, or both. MAIN OUTCOME MEASURES Change in square root of GA area. RESULTS Geographic atrophy enlargement was significantly faster in eyes with RPD (P < 0.0001): 0.379 mm/year (95% confidence interval [CI], 0.329-0.430 mm/year) versus 0.273 mm/year (95% CI, 0.256-0.289 mm/year). Enlargement was also significantly faster in individuals carrying ARMS2 risk alleles (P < 0.0001): 0.224 mm/year (95% CI, 0.198-0.250 mm/year), 0.287 mm/year (95% CI, 0.263-0.310 mm/year), and 0.307 mm/year (95% CI, 0.273-0.341 mm/year) for 0, 1, and 2, respectively. In mediation analysis, the direct effect of ARMS2 genotype was 0.074 mm/year (95% CI, 0.009-0.139 mm/year), whereas the indirect effect of ARMS2 genotype via RPD status was 0.002 mm/year (95% CI, -0.006 to 0.009 mm/year). In eyes with incident GA, RPD presence was not associated with an altered likelihood of central involvement (P = 0.29) or multifocality (P = 0.16) at incidence. In eyes with incident noncentral GA, RPD presence was associated with faster GA progression to the central macula (P = 0.009): 157 μm/year (95% CI, 126-188 μm/year) versus 111 μm/year (95% CI, 97-125 μm/year). Similar findings were observed in the Age-Related Eye Disease Study. CONCLUSIONS Geographic atrophy enlargement is faster in eyes with RPD and in individuals carrying ARMS2/HTRA1 risk alleles. 
However, RPD status does not mediate the association between ARMS2/HTRA1 genotype and faster enlargement. Reticular pseudodrusen presence and ARMS2/HTRA1 genotype are relatively independent risk factors, operating by distinct mechanisms. Reticular pseudodrusen presence does not predict central involvement or multifocality at GA incidence but is associated with faster progression toward the central macula. Reticular pseudodrusen status should be considered for improved predictions of enlargement rate. FINANCIAL DISCLOSURE(S) Proprietary or commercial disclosure may be found after the references.
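Because enlargement in this abstract is reported on the square-root scale (mm/year), turning a rate back into an area requires squaring. The sketch below shows that arithmetic under a strong simplifying assumption of constant linear growth in the square root of area; it is not the mixed-model regression used in the study, and the function name and the starting area in the usage note are hypothetical.

```python
import math


def project_ga_area(
    area_mm2: float, sqrt_rate_mm_per_year: float, years: float
) -> float:
    """Project GA area assuming sqrt(area) grows linearly over time.

    area(t) = (sqrt(area_0) + rate * t) ** 2
    """
    return (math.sqrt(area_mm2) + sqrt_rate_mm_per_year * years) ** 2
```

For a hypothetical lesion of 4 mm², `project_ga_area(4.0, 0.379, 2)` and `project_ga_area(4.0, 0.273, 2)` apply the reported mean rates for eyes with and without RPD, respectively.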
8
Classification of dry and wet macular degeneration based on the ConvNeXT model. Front Comput Neurosci 2022; 16:1079155. [PMID: 36568576 PMCID: PMC9773079 DOI: 10.3389/fncom.2022.1079155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 11/24/2022] [Indexed: 12/13/2022]
Abstract
Purpose To assess the value of an automated classification model for dry and wet macular degeneration based on the ConvNeXT model. Methods A total of 672 fundus images of normal, dry, and wet macular degeneration were collected from the Affiliated Eye Hospital of Nanjing Medical University, and the fundus images of dry macular degeneration were expanded. The ConvNeXT three-category model was trained on the original and expanded datasets, and compared to the results of the VGG16, ResNet18, ResNet50, EfficientNetB7, and RegNet three-category models. A total of 289 fundus images were used to test the models, and the classification results of the models on different datasets were compared. The main evaluation indicators were sensitivity, specificity, F1-score, area under the curve (AUC), accuracy, and kappa. Results Using 289 fundus images, three-category models trained on the original and expanded datasets were assessed. The ConvNeXT model trained on the expanded dataset was the most effective, with a diagnostic accuracy of 96.89%, kappa value of 94.99%, and high diagnostic consistency. The sensitivity, specificity, F1-score, and AUC values for normal fundus images were 100.00, 99.41, 99.59, and 99.80%, respectively. The sensitivity, specificity, F1-score, and AUC values for dry macular degeneration diagnosis were 87.50, 98.76, 90.32, and 97.10%, respectively. The sensitivity, specificity, F1-score, and AUC values for wet macular degeneration diagnosis were 97.52, 97.02, 96.72, and 99.10%, respectively. Conclusion The ConvNeXT-based three-category model automatically identified dry and wet macular degeneration, aiding rapid and accurate clinical diagnosis.
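The per-class sensitivity, specificity, and F1-score reported here, along with kappa, can all be derived from a multi-class confusion matrix. The sketch below shows the standard one-vs-rest definitions and Cohen's kappa. The function names are hypothetical and the matrix in the usage note is made up, not the study's data.

```python
from typing import Dict, List

Matrix = List[List[int]]  # rows: true class, columns: predicted class


def one_vs_rest_metrics(cm: Matrix, k: int) -> Dict[str, float]:
    """Sensitivity, specificity, and F1 for class k against all other classes."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp
    fp = sum(row[k] for row in cm) - tp
    tn = total - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def cohen_kappa(cm: Matrix) -> float:
    """Agreement between true and predicted labels, corrected for chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1 - expected)
```

With an illustrative matrix such as `[[5, 0, 0], [1, 3, 1], [0, 0, 5]]`, calling `one_vs_rest_metrics(cm, 1)` treats the middle class as positive and the other two as negative.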
9
Reticular Pseudodrusen: The Third Macular Risk Feature for Progression to Late Age-Related Macular Degeneration: Age-Related Eye Disease Study 2 Report 30. Ophthalmology 2022; 129:1107-1119. [PMID: 35660417 PMCID: PMC9509418 DOI: 10.1016/j.ophtha.2022.05.021] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 02/21/2022] [Revised: 05/17/2022] [Accepted: 05/25/2022] [Indexed: 10/18/2022]
Abstract
PURPOSE To analyze reticular pseudodrusen (RPD) as an independent risk factor for progression to late age-related macular degeneration (AMD), alongside traditional macular risk factors (soft drusen and pigmentary abnormalities) considered simultaneously. DESIGN Post hoc analysis of 2 clinical trial cohorts: Age-Related Eye Disease Study (AREDS) and AREDS2. PARTICIPANTS Eyes with no late AMD at baseline in AREDS (6959 eyes, 3780 participants) and AREDS2 (3355 eyes, 2056 participants). METHODS Color fundus photographs (CFPs) from annual visits were graded for soft drusen, pigmentary abnormalities, and late AMD. Presence of RPD was from grading of fundus autofluorescence images (AREDS2) and deep learning grading of CFPs (AREDS). Proportional hazards regression analyses were performed, considering AREDS AMD severity scales (modified simplified severity scale [person] and 9-step scale [eye]) and RPD presence simultaneously. MAIN OUTCOME MEASURES Progression to late AMD, geographic atrophy (GA), and neovascular AMD. RESULTS In AREDS, for late AMD analyses by person, in a model considering the simplified severity scale simultaneously, RPD presence was associated with a higher risk of progression: hazard ratio (HR), 2.15 (95% confidence interval [CI], 1.75-2.64). However, the risk associated with RPD presence differed at different severity scale levels: HR, 3.23 (95% CI, 1.60-6.51), HR, 3.81 (95% CI, 2.38-6.10), HR, 2.28 (95% CI, 1.59-3.27), and HR, 1.64 (95% CI, 1.20-2.24), at levels 0-1, 2, 3, and 4, respectively. Considering the 9-step scale (by eye), RPD presence was associated with higher risk: HR, 2.54 (95% CI, 2.07-3.13). The HRs were 5.11 (95% CI, 3.93-6.66) at levels 1-6 and 1.78 (95% CI, 1.43-2.22) at levels 7 and 8. In AREDS2, by person, RPD presence was not associated with higher risk: HR, 1.18 (95% CI, 0.90-1.56); by eye, it was HR, 1.57 (95% CI, 1.31-1.89). In both cohorts, RPD presence carried a higher risk for GA than neovascular AMD. 
CONCLUSIONS Reticular pseudodrusen represent an important risk factor for progression to late AMD, particularly GA. However, the added risk varies markedly by severity level, with highly increased risk at lower/moderate levels and less increased risk at higher levels. Reticular pseudodrusen status should be included in updated AMD classification systems, risk calculators, and clinical trials.
10
LitMC-BERT: Transformer-Based Multi-Label Classification of Biomedical Literature With An Application on COVID-19 Literature Curation. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2584-2595. [PMID: 35536809 PMCID: PMC9647722 DOI: 10.1109/tcbb.2022.3173562] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/11/2021] [Revised: 04/19/2022] [Accepted: 04/22/2022] [Indexed: 05/20/2023]
Abstract
The rapid growth of biomedical literature poses a significant challenge for curation and interpretation. This has become more evident during the COVID-19 pandemic. LitCovid, a literature database of COVID-19 related papers in PubMed, has accumulated over 200,000 articles with millions of accesses. Approximately 10,000 new articles are added to LitCovid every month. A main curation task in LitCovid is topic annotation, where an article is assigned up to eight topics, e.g., Treatment and Diagnosis. The annotated topics have been widely used both in LitCovid (e.g., accounting for ∼18% of total uses) and in downstream studies such as network generation. However, topic annotation has been a primary curation bottleneck due to the nature of the task and the rapid literature growth. This study proposes LITMC-BERT, a transformer-based multi-label classification method for biomedical literature. It uses a shared transformer backbone for all the labels while also capturing label-specific features and the correlations between label pairs. We compare LITMC-BERT with three baseline models on two datasets. Its micro-F1 and instance-based F1 are 5% and 4% higher than the current best results, respectively, and it requires only ∼18% of the inference time of the Binary BERT baseline. The related datasets and models are available via https://github.com/ncbi/ml-transformer.
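The two headline metrics in this abstract, micro-F1 and instance-based F1, aggregate multi-label errors differently. The sketch below uses their usual definitions over sets of topic labels. The function names are hypothetical, and scoring an article with no gold and no predicted labels as 1.0 is a convention assumed here, not taken from the paper.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Pool true/false positives and false negatives over all articles."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def instance_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Compute F1 per article, then average over articles."""
    scores = []
    for g, p in zip(gold, pred):
        denominator = len(g) + len(p)
        # Assumed convention: an article with no labels on either side is perfect.
        scores.append(2 * len(g & p) / denominator if denominator else 1.0)
    return sum(scores) / len(scores)
```

Micro-F1 weights every label assignment equally, so articles with many topics dominate, whereas instance-based F1 weights every article equally.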
11
Trustworthy AI: Closing the gap between development and integration of AI systems in ophthalmic practice. Prog Retin Eye Res 2021; 90:101034. [PMID: 34902546 DOI: 10.1016/j.preteyeres.2021.101034] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/09/2021] [Revised: 12/03/2021] [Accepted: 12/06/2021] [Indexed: 01/14/2023]
Abstract
An increasing number of artificial intelligence (AI) systems are being proposed in ophthalmology, motivated by the variety and amount of clinical and imaging data, as well as their potential benefits at the different stages of patient care. Despite these systems achieving performance close to or even superior to that of experts, there is a critical gap between development and integration of AI systems in ophthalmic practice. This work focuses on the importance of trustworthy AI to close that gap. We identify the main aspects or challenges that need to be considered along the AI design pipeline so as to generate systems that meet the requirements to be deemed trustworthy, including those concerning accuracy, resiliency, reliability, safety, and accountability. We elaborate on mechanisms and considerations to address those aspects or challenges, and define the roles and responsibilities of the different stakeholders involved in AI for ophthalmic care, i.e., AI developers, reading centers, healthcare providers, healthcare institutions, ophthalmological societies and working groups or committees, patients, regulatory bodies, and payers. Generating trustworthy AI is not the responsibility of a single stakeholder. There is an impending necessity for a collaborative approach where the different stakeholders are represented along the AI design pipeline, from the definition of the intended use to post-market surveillance after regulatory approval. This work contributes to establishing such multi-stakeholder interaction and the main action points to be taken so that the potential benefits of AI reach real-world ophthalmic settings.
12
Improving Interpretability in Machine Diagnosis. Ophthalmology Science 2021; 1:100038. [PMID: 36247813 PMCID: PMC9559084 DOI: 10.1016/j.xops.2021.100038] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/12/2021] [Revised: 07/02/2021] [Accepted: 07/02/2021] [Indexed: 11/28/2022]
Abstract
Purpose Manually identifying geographic atrophy (GA) presence and location on OCT volume scans can be challenging and time consuming. This study developed a deep learning model simultaneously (1) to perform automated detection of GA presence or absence from OCT volume scans and (2) to provide interpretability by demonstrating which regions of which B-scans show GA. Design Med-XAI-Net, an interpretable deep learning model, was developed to detect GA presence or absence from OCT volume scans using only volume scan labels, as well as to interpret the most relevant B-scans and B-scan regions. Participants One thousand two hundred eighty-four OCT volume scans (each containing 100 B-scans) from 311 participants, including 321 volumes with GA and 963 volumes without GA. Methods Med-XAI-Net simulates the human diagnostic process by using a region-attention module to locate the most relevant region in each B-scan, followed by an image-attention module to select the most relevant B-scans for classifying GA presence or absence in each OCT volume scan. Med-XAI-Net was trained and tested (80% and 20% participants, respectively) using gold standard volume scan labels from human expert graders. Main Outcome Measures Accuracy, area under the receiver operating characteristic (ROC) curve, F1 score, sensitivity, and specificity. Results In the detection of GA presence or absence, Med-XAI-Net obtained superior performance (91.5%, 93.5%, 82.3%, 82.8%, and 94.6% on accuracy, area under the ROC curve, F1 score, sensitivity, and specificity, respectively) to that of 2 other state-of-the-art deep learning methods. The performance of ophthalmologists grading only the 5 B-scans selected by Med-XAI-Net as most relevant (95.7%, 95.4%, 91.2%, and 100%, respectively) was almost identical to that of ophthalmologists grading all volume scans (96.0%, 95.7%, 91.8%, and 100%, respectively). 
Even grading only 1 region in 1 B-scan, the ophthalmologists demonstrated moderately high performance (89.0%, 87.4%, 77.6%, and 100%, respectively). Conclusions Despite using ground truth labels during training at the volume scan level only, Med-XAI-Net was effective in locating GA in B-scans and selecting relevant B-scans within each volume scan for GA diagnosis. These results illustrate the strengths of Med-XAI-Net in interpreting which regions and B-scans contribute to GA detection in the volume scan.
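The image-attention step described in this abstract weights B-scans by relevance and surfaces the most relevant ones for review. The sketch below shows a generic softmax attention over per-B-scan scores with top-k selection. It illustrates the general mechanism only, under the assumption that each B-scan already has a scalar relevance score and a feature vector; the function names are hypothetical and this is not the Med-XAI-Net implementation.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw relevance scores into attention weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_bscans(scores: Sequence[float], k: int = 5) -> List[int]:
    """Indices of the k B-scans with the highest attention weight."""
    weights = softmax(scores)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:k]


def attention_pool(
    features: Sequence[Sequence[float]], scores: Sequence[float]
) -> List[float]:
    """Attention-weighted sum of per-B-scan features into one volume feature."""
    weights = softmax(scores)
    size = len(features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(size)
    ]
```

The default `k=5` mirrors the 5 B-scans that ophthalmologists graded in the study; the pooled feature would feed a volume-level classifier, which is omitted here.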