1
Renu DS, Saji KS. Hybrid deep learning framework for diabetic retinopathy classification with optimized attention AlexNet. Comput Biol Med 2025; 190:110054. [PMID: 40154203; DOI: 10.1016/j.compbiomed.2025.110054]
Abstract
Diabetic retinopathy (DR) is a chronic complication of diabetes that can impair vision and, if left untreated, progress to irreversible blindness. Accurate assessment of DR severity therefore depends on detecting pathological changes in the retina. Manual examination of retinal disorders is complex, time consuming, and error prone because the relevant retinal changes are subtle. In recent years, Deep Learning (DL)-based optimization methods have shown significant promise in improving DR recognition and classification. This work presents an automated DR classification approach built on an advanced DL model with metaheuristic optimization for grading severity in fundus images. The proposed pipeline comprises the following stages: first, pre-processing performs green-channel conversion, CLAHE, and Gaussian filtering (GF); next, fundus lesions are segmented with Fuzzy Possibilistic C Ordered Means (FPCOM); finally, lesions are classified by an Attention AlexNet with an Improved Nutcracker Optimizer (At-AlexNet-ImNO). The ImNO tunes the At-AlexNet's weights and hyperparameters, boosting classification performance. Experiments were performed on two benchmark datasets, APTOS-2019 Blindness Detection and EyePACS. Accuracy, precision, and recall were 99.23%, 98%, and 98.2% on APTOS-2019, and 99.43%, 98.2%, and 98.65% on EyePACS, respectively.
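A minimal sketch of the pre-processing stage described above (green-channel conversion, CLAHE, then Gaussian filtering), using OpenCV; the function name and the parameter values (clip limit, tile grid, kernel size) are illustrative assumptions rather than the authors' reported settings.

```python
import cv2
import numpy as np

def preprocess_fundus(path: str) -> np.ndarray:
    bgr = cv2.imread(path)                         # OpenCV reads images in BGR order
    green = bgr[:, :, 1]                           # green channel gives the best lesion contrast
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(green)                  # contrast-limited adaptive histogram equalization
    return cv2.GaussianBlur(enhanced, (5, 5), 1.0)  # Gaussian filtering (GF) to suppress noise
```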
Affiliation(s)
- Renu D S
- Department of Computer Science and Engineering, Mar Ephraem College of Engineering and Technology, Elavuvilai, Tamilnadu, India.
- K S Saji
- Department of Electrical and Electronics Engineering, Meenakshi Sundararajan Engineering College, Kodambakkam, Chennai, Tamilnadu, India.
2
Cruz-Abrams O, Dodds Rojas R, Abramson DH. Machine learning demonstrates clinical utility in distinguishing retinoblastoma from pseudo retinoblastoma with RetCam images. Ophthalmic Genet 2025; 46:180-185. [PMID: 39834033; DOI: 10.1080/13816810.2025.2455576]
Abstract
BACKGROUND Retinoblastoma is diagnosed and treated without biopsy, based solely on appearance (with the indirect ophthalmoscope and imaging). More than 20 benign ophthalmic disorders resemble retinoblastoma, and diagnostic errors continue to be made worldwide. A better noninvasive method for distinguishing retinoblastoma from pseudo retinoblastoma is needed. METHODS RetCam images of retinoblastoma and pseudo retinoblastoma from the largest retinoblastoma center in the U.S. (Memorial Sloan Kettering Cancer Center, New York, NY) were used for this study. We trained several neural networks (ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152, and a vision transformer, or ViT), using 80% of images for training, 10% for validation, and 10% for testing. RESULTS Two thousand eight hundred eighty-two RetCam images from patients with retinoblastoma at diagnosis, 1,970 images from pseudo retinoblastoma at diagnosis, and 804 normal pediatric fundus images were included. The highest sensitivity (98.6%) was obtained with a ResNet-101 model, as were the highest accuracy and F1 scores of 97.3% and 97.7%. The highest specificity (97.0%) and precision (97.0%) were attained with a ResNet-152 model. CONCLUSION Our machine learning algorithm successfully distinguished retinoblastoma from pseudo retinoblastoma with high specificity and sensitivity and, if implemented worldwide, could prevent hundreds of eyes from being incorrectly removed surgically each year.
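An illustrative PyTorch/torchvision sketch (not the authors' code) of the setup described: fine-tuning an ImageNet-pretrained ResNet-101 on an 80/10/10 split. The dataset path, image size, and hyperparameters are assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, models, transforms

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
full = datasets.ImageFolder("retcam/", transform=tfm)    # hypothetical class-per-folder layout
n = len(full)
n_train, n_val = int(0.8 * n), int(0.1 * n)
train_set, val_set, test_set = random_split(full, [n_train, n_val, n - n_train - n_val])

model = models.resnet101(weights=models.ResNet101_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)            # retinoblastoma vs. pseudo retinoblastoma

opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
for x, y in DataLoader(train_set, batch_size=16, shuffle=True):  # one pass shown for brevity
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
```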
Affiliation(s)
- Owen Cruz-Abrams
- Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- David H Abramson
- Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA
3
Bi Z, Li J, Liu Q, Fang Z. Deep learning-based optical coherence tomography and retinal images for detection of diabetic retinopathy: a systematic and meta analysis. Front Endocrinol (Lausanne) 2025; 16:1485311. [PMID: 40171193; PMCID: PMC11958191; DOI: 10.3389/fendo.2025.1485311]
Abstract
Objective To systematically review and meta-analyze the effectiveness of deep learning algorithms applied to optical coherence tomography (OCT) and retinal images for the detection of diabetic retinopathy (DR). Methods We conducted a comprehensive literature search in multiple databases, including PubMed, the Cochrane Library, Web of Science, Embase, and IEEE Xplore, up to July 2024. Studies that utilized deep learning techniques for the detection of DR using OCT and retinal images were included. Data extraction and quality assessment were performed independently by two reviewers. Meta-analysis was conducted to determine pooled sensitivity, specificity, and diagnostic odds ratios. Results A total of 47 studies were included in the systematic review, 10 of which were meta-analyzed, encompassing a total of 188,268 retinal images and OCT scans. The meta-analysis reported a pooled sensitivity of 1.88 (95% CI: 1.45-2.44) and a pooled specificity of 1.33 (95% CI: 0.97-1.84) for the detection of DR using deep learning models. All deep learning-based OCT outcomes had ORs ≥0.785, indicating that all included studies with artificial intelligence assistance produced good results. Conclusion Deep learning-based approaches show high accuracy in detecting diabetic retinopathy from OCT and retinal images, supporting their potential as reliable tools in clinical settings. Future research should focus on standardizing datasets, improving model interpretability, and validating performance across diverse populations. Systematic Review Registration https://www.crd.york.ac.uk/PROSPERO/, identifier CRD42024575847.
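For readers unfamiliar with how pooled estimates like these are formed, below is a minimal fixed-effect pooling sketch on the logit scale, the usual first step in diagnostic test meta-analysis. The per-study counts are invented placeholders, and the review's actual model (which pools sensitivity and specificity jointly, usually with random effects) is more elaborate.

```python
import numpy as np

def pool_logit(events, totals):
    p = events / totals
    logit = np.log(p / (1 - p))
    var = 1 / events + 1 / (totals - events)     # approximate variance of the logit
    w = 1 / var
    pooled = np.sum(w * logit) / np.sum(w)       # inverse-variance weighted mean
    return 1 / (1 + np.exp(-pooled))             # back-transform to a proportion

tp = np.array([90, 120, 75])                     # true positives per study (placeholder)
pos = np.array([100, 140, 80])                   # diseased cases per study (placeholder)
print(f"pooled sensitivity ~ {pool_logit(tp, pos):.3f}")
```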
Affiliation(s)
- Zheng Bi
- Department of Endocrinology, The First Affiliated Hospital of Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China
- Jinju Li
- First Clinical Medical College, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China
- Qiongyi Liu
- First Clinical Medical College, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China
- Zhaohui Fang
- Department of Endocrinology, The First Affiliated Hospital of Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China
- Xin'an Medical and Chinese Medicine Modernization Research Institute, Hefei Comprehensive National Science Center, Hefei, Anhui, China
4
Phipps B, Hadoux X, Sheng B, Campbell JP, Liu TYA, Keane PA, Cheung CY, Chung TY, Wong TY, van Wijngaarden P. AI image generation technology in ophthalmology: Use, misuse and future applications. Prog Retin Eye Res 2025; 106:101353. [PMID: 40107410; DOI: 10.1016/j.preteyeres.2025.101353]
Abstract
BACKGROUND AI-powered image generation technology holds the potential to reshape medical practice, yet it remains an unfamiliar technology for medical researchers and clinicians alike. Given that the adoption of this technology relies on clinician understanding and acceptance, we sought to demystify its use in ophthalmology. To this end, we present a literature review on image generation technology in ophthalmology, examining both its theoretical applications and its future role in clinical practice. METHODS First, we consider the key model designs used for image synthesis, including generative adversarial networks, autoencoders, and diffusion models. We then survey the literature on image generation technology in ophthalmology prior to September 2024, presenting both the type of model used and its clinical application. Finally, we discuss the limitations of this technology, the risks of its misuse, and future directions of research in this field. RESULTS Applications of this technology include improving AI diagnostic models, inter-modality image transformation, more accurate treatment and disease prognostication, image denoising, and individualised education. Key barriers to its adoption include bias in generative models, risks to patient data security, computational and logistical barriers to development, challenges with model explainability, inconsistent use of validation metrics between studies, and misuse of synthetic images. Looking forward, researchers are placing further emphasis on clinically grounded metrics, the development of image generation foundation models, and methods to ensure data provenance. CONCLUSION Compared with other medical applications of AI, image generation is still in its infancy. Yet it holds the potential to revolutionise ophthalmology across research, education, and clinical practice. This review aims to guide ophthalmic researchers wanting to leverage this technology, while also providing insight for clinicians on how it may change ophthalmic practice in the future.
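As a concrete illustration of the generative adversarial network design the review covers, here is a deliberately minimal PyTorch GAN sketch; the layer sizes, latent dimension, and single training step are illustrative assumptions, not any model from the reviewed literature.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100, img_dim=64 * 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh(),   # pixel values in [-1, 1]
        )
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self, img_dim=64 * 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),                    # real/fake logit
        )
    def forward(self, x):
        return self.net(x)

G, D = Generator(), Discriminator()
loss = nn.BCEWithLogitsLoss()
z = torch.randn(8, 100)
fake = G(z)
d_loss = loss(D(fake.detach()), torch.zeros(8, 1))  # discriminator labels fakes as 0
g_loss = loss(D(fake), torch.ones(8, 1))            # generator tries to be scored as real
```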
Affiliation(s)
- Benjamin Phipps
- Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, 3002, VIC, Australia; Ophthalmology, Department of Surgery, University of Melbourne, Parkville, 3010, VIC, Australia
- Xavier Hadoux
- Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, 3002, VIC, Australia; Ophthalmology, Department of Surgery, University of Melbourne, Parkville, 3010, VIC, Australia
- Bin Sheng
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- J Peter Campbell
- Department of Ophthalmology, Casey Eye Institute, Oregon Health and Science University, Portland, USA
- T Y Alvin Liu
- Retina Division, Wilmer Eye Institute, Johns Hopkins University, Baltimore, MD, 21287, USA
- Pearse A Keane
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
- Carol Y Cheung
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, 999077, China
- Tham Yih Chung
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Eye Academic Clinical Program (Eye ACP), Duke NUS Medical School, Singapore
- Tien Y Wong
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Tsinghua Medicine, Tsinghua University, Beijing, China; Beijing Visual Science and Translational Eye Research Institute, Beijing Tsinghua Changgung Hospital, Beijing, China
- Peter van Wijngaarden
- Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, 3002, VIC, Australia; Ophthalmology, Department of Surgery, University of Melbourne, Parkville, 3010, VIC, Australia; Florey Institute of Neuroscience & Mental Health, Parkville, VIC, Australia
5
Moannaei M, Jadidian F, Doustmohammadi T, Kiapasha AM, Bayani R, Rahmani M, Jahanbazy MR, Sohrabivafa F, Asadi Anar M, Magsudy A, Sadat Rafiei SK, Khakpour Y. Performance and limitation of machine learning algorithms for diabetic retinopathy screening and its application in health management: a meta-analysis. Biomed Eng Online 2025; 24:34. [PMID: 40087776; PMCID: PMC11909973; DOI: 10.1186/s12938-025-01336-1]
Abstract
BACKGROUND In recent years, artificial intelligence and machine learning algorithms have been used increasingly to diagnose diabetic retinopathy and other diseases. Still, the effectiveness of these methods has not been thoroughly investigated. This study aimed to evaluate the performance and limitations of machine learning and deep learning algorithms in detecting diabetic retinopathy. METHODS This study was conducted based on the PRISMA checklist. We searched online databases, including PubMed, Scopus, and Google Scholar, for relevant articles up to September 30, 2023. After title, abstract, and full-text screening, data extraction and quality assessment were done for the included studies. Finally, a meta-analysis was performed. RESULTS We included 76 studies with a total of 1,371,517 retinal images, of which 51 were used for meta-analysis. Our meta-analysis showed a significant pooled sensitivity and specificity of 90.54% (95% CI [90.42, 90.66], P < 0.001) and 78.33% (95% CI [78.21, 78.45], P < 0.001), respectively. However, the AUC (area under the curve) did not differ statistically across studies, with a pooled figure of 0.94 (95% CI [-46.71, 48.60], P = 1). CONCLUSIONS Although machine learning and deep learning algorithms can properly diagnose diabetic retinopathy, their discriminating capacity is limited. However, they could simplify the diagnostic process. Further studies are required to improve the algorithms.
Affiliation(s)
- Mehrsa Moannaei
- School of Medicine, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
- Faezeh Jadidian
- School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Tahereh Doustmohammadi
- Department and Faculty of Health Education and Health Promotion, Student Research Committee, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Amir Mohammad Kiapasha
- Student Research Committee, School of Medicine, Shahid Beheshti University of Medical Science, Tehran, Iran
- Romina Bayani
- Student Research Committee, School of Medicine, Shahid Beheshti University of Medical Science, Tehran, Iran
- Fereshteh Sohrabivafa
- Health Education and Promotion, Department of Community Medicine, School of Medicine, Dezful University of Medical Sciences, Dezful, Iran
- Mahsa Asadi Anar
- Student Research Committee, Shahid Beheshti University of Medical Science, Arabi Ave, Daneshjoo Blvd, Velenjak, Tehran, 19839-63113, Iran
- Amin Magsudy
- Faculty of Medicine, Islamic Azad University Tabriz Branch, Tabriz, Iran
- Seyyed Kiarash Sadat Rafiei
- Student Research Committee, Shahid Beheshti University of Medical Science, Arabi Ave, Daneshjoo Blvd, Velenjak, Tehran, 19839-63113, Iran
- Yaser Khakpour
- Faculty of Medicine, Guilan University of Medical Sciences, Rasht, Iran
6
Wen C, Ye M, Li H, Chen T, Xiao X. Concept-Based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis. IEEE Trans Med Imaging 2025; 44:57-68. [PMID: 39012729; DOI: 10.1109/TMI.2024.3429148]
Abstract
Existing deep learning methods have achieved remarkable results in diagnosing retinal diseases, showcasing the potential of advanced AI in ophthalmology. However, the black-box nature of these methods obscures the decision-making process, compromising their trustworthiness and acceptability. Inspired by the concept-based approaches and recognizing the intrinsic correlation between retinal lesions and diseases, we regard retinal lesions as concepts and propose an inherently interpretable framework designed to enhance both the performance and explainability of diagnostic models. Leveraging the transformer architecture, known for its proficiency in capturing long-range dependencies, our model can effectively identify lesion features. By integrating with image-level annotations, it achieves the alignment of lesion concepts with human cognition under the guidance of a retinal foundation model. Furthermore, to attain interpretability without losing lesion-specific information, our method employs a classifier built on a cross-attention mechanism for disease diagnosis and explanation, where explanations are grounded in the contributions of human-understandable lesion concepts and their visual localization. Notably, due to the structure and inherent interpretability of our model, clinicians can implement concept-level interventions to correct the diagnostic errors by simply adjusting erroneous lesion predictions. Experiments conducted on four fundus image datasets demonstrate that our method achieves favorable performance against state-of-the-art methods while providing faithful explanations and enabling concept-level interventions. Our code is publicly available at https://github.com/Sorades/CLAT.
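A hedged sketch of the kind of cross-attention classification head the abstract describes, in which learnable per-disease queries attend over lesion-concept features so that the attention weights expose concept contributions. All dimensions and names here are assumptions; the authors' actual implementation is at https://github.com/Sorades/CLAT.

```python
import torch
import torch.nn as nn

class CrossAttentionHead(nn.Module):
    def __init__(self, n_concepts=8, n_classes=5, dim=256):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_classes, dim))  # one query per disease
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, concept_feats):              # (B, n_concepts, dim) lesion features
        B = concept_feats.size(0)
        q = self.queries.unsqueeze(0).expand(B, -1, -1)
        out, attn_w = self.attn(q, concept_feats, concept_feats)
        logits = self.score(out).squeeze(-1)       # (B, n_classes) disease logits
        return logits, attn_w                      # attn_w: concept contribution per class

head = CrossAttentionHead()
feats = torch.randn(2, 8, 256)                     # placeholder lesion-concept features
logits, contributions = head(feats)
```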
7
Chen H, Alfred M, Brown AD, Atinga A, Cohen E. Intersection of Performance, Interpretability, and Fairness in Neural Prototype Tree for Chest X-Ray Pathology Detection: Algorithm Development and Validation Study. JMIR Form Res 2024; 8:e59045. [PMID: 39636692; DOI: 10.2196/59045]
Abstract
BACKGROUND While deep learning classifiers have shown remarkable results in detecting chest X-ray (CXR) pathologies, their adoption in clinical settings is often hampered by the lack of transparency. To bridge this gap, this study introduces the neural prototype tree (NPT), an interpretable image classifier that combines the diagnostic capability of deep learning models and the interpretability of the decision tree for CXR pathology detection. OBJECTIVE This study aimed to investigate the utility of the NPT classifier in 3 dimensions, including performance, interpretability, and fairness, and subsequently examined the complex interaction between these dimensions. We highlight both local and global explanations of the NPT classifier and discuss its potential utility in clinical settings. METHODS This study used CXRs from the publicly available Chest X-ray 14, CheXpert, and MIMIC-CXR datasets. We trained 6 separate classifiers for each CXR pathology in all datasets, 1 baseline residual neural network (ResNet)-152, and 5 NPT classifiers with varying levels of interpretability. Performance, interpretability, and fairness were measured using the area under the receiver operating characteristic curve (ROC AUC), interpretation complexity (IC), and mean true positive rate (TPR) disparity, respectively. Linear regression analyses were performed to investigate the relationship between IC and ROC AUC, as well as between IC and mean TPR disparity. RESULTS The performance of the NPT classifier improved as the IC level increased, surpassing that of ResNet-152 at IC level 15 for the Chest X-ray 14 dataset and IC level 31 for the CheXpert and MIMIC-CXR datasets. The NPT classifier at IC level 1 exhibited the highest degree of unfairness, as indicated by the mean TPR disparity. The magnitude of unfairness, as measured by the mean TPR disparity, was more pronounced in groups differentiated by age (chest X-ray 14 0.112, SD 0.015; CheXpert 0.097, SD 0.010; MIMIC 0.093, SD 0.017) compared to sex (chest X-ray 14 0.054 SD 0.012; CheXpert 0.062, SD 0.008; MIMIC 0.066, SD 0.013). A significant positive relationship between interpretability (ie, IC level) and performance (ie, ROC AUC) was observed across all CXR pathologies (P<.001). Furthermore, linear regression analysis revealed a significant negative relationship between interpretability and fairness (ie, mean TPR disparity) across age and sex subgroups (P<.001). CONCLUSIONS By illuminating the intricate relationship between performance, interpretability, and fairness of the NPT classifier, this research offers insightful perspectives that could guide future developments in effective, interpretable, and equitable deep learning classifiers for CXR pathology detection.
Affiliation(s)
- Hongbo Chen
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON, Canada
- Myrtede Alfred
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON, Canada
- Angela Atinga
- Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Eldan Cohen
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON, Canada
8
El-Ateif S, Idri A. Multimodality Fusion Strategies in Eye Disease Diagnosis. J Imaging Inform Med 2024; 37:2524-2558. [PMID: 38639808; PMCID: PMC11522204; DOI: 10.1007/s10278-024-01105-x]
Abstract
Multimodality fusion has gained significance in medical applications, particularly in diagnosing challenging diseases such as eye diseases, notably diabetic eye diseases, which pose risks of vision loss and blindness. Mono-modality eye disease diagnosis proves difficult, often missing crucial disease indicators; in response, researchers advocate multimodality-based approaches to enhance diagnostics. This study uniquely evaluates three multimodality fusion strategies (early, joint, and late) in conjunction with state-of-the-art convolutional neural network models for automated binary eye disease detection across three datasets: fundus fluorescein angiography, macula, and a combination of Digital Retinal Images for Vessel Extraction (DRIVE), Structured Analysis of the Retina (STARE), and High-Resolution Fundus (HRF). Findings reveal the efficacy of each fusion strategy: type 0 early fusion with DenseNet121 achieves an impressive 99.45% average accuracy; InceptionResNetV2 emerges as the top-performing joint fusion architecture with an average accuracy of 99.58%; and late fusion ResNet50V2 achieves a perfect score of 100% across all metrics, surpassing both early and joint fusion. Comparative analysis demonstrates that late fusion ResNet50V2 matches the accuracy of a state-of-the-art feature-level fusion model for multiview learning. In conclusion, this study substantiates late fusion as the optimal strategy for eye disease diagnosis compared to early and joint fusion, showcasing its superiority in leveraging multimodal information.
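A schematic sketch contrasting the early and late fusion strategies compared above. The toy backbones, shapes, and averaging rule are illustrative assumptions; the study itself evaluated DenseNet121, InceptionResNetV2, and ResNet50V2 architectures.

```python
import torch
import torch.nn as nn

# Tiny stand-in feature extractors, one per modality.
cnn_a = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.AdaptiveAvgPool2d(1), nn.Flatten())
cnn_b = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.AdaptiveAvgPool2d(1), nn.Flatten())

x_a = torch.randn(4, 3, 128, 128)                # modality A (e.g., fundus photo)
x_b = torch.randn(4, 3, 128, 128)                # modality B (e.g., angiography)

# Early fusion: concatenate raw inputs on the channel axis, one shared network.
early_net = nn.Sequential(nn.Conv2d(6, 16, 3, padding=1), nn.AdaptiveAvgPool2d(1),
                          nn.Flatten(), nn.Linear(16, 2))
early_logits = early_net(torch.cat([x_a, x_b], dim=1))

# Late fusion: independent networks per modality, decisions combined at the end.
head_a, head_b = nn.Linear(16, 2), nn.Linear(16, 2)
late_logits = (head_a(cnn_a(x_a)) + head_b(cnn_b(x_b))) / 2
```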
Affiliation(s)
- Sara El-Ateif
- Software Project Management Research Team, ENSIAS, Mohammed V University, BP 713, Agdal, Rabat, Morocco
- Ali Idri
- Software Project Management Research Team, ENSIAS, Mohammed V University, BP 713, Agdal, Rabat, Morocco
- Faculty of Medical Sciences, Mohammed VI Polytechnic University, Marrakech-Rhamna, Benguerir, Morocco
9
Shin JY, Son J, Kong ST, Park J, Park B, Park KH, Jung KH, Park SJ. Clinical Utility of Deep Learning Assistance for Detecting Various Abnormal Findings in Color Retinal Fundus Images: A Reader Study. Transl Vis Sci Technol 2024; 13:34. [PMID: 39441571; PMCID: PMC11512572; DOI: 10.1167/tvst.13.10.34]
Abstract
Purpose To evaluate the clinical usefulness of a deep learning-based detection device for multiple abnormal findings on retinal fundus photographs for readers with varying expertise. Methods Fourteen ophthalmologists (six residents, eight specialists) assessed 399 fundus images with respect to 12 major ophthalmologic findings, with or without the assistance of a deep learning algorithm, in two separate reading sessions. Sensitivity, specificity, and reading time per image were compared. Results With algorithmic assistance, readers significantly improved in sensitivity for all 12 findings (P < 0.05) but tended to be less specific (P < 0.05) for hemorrhage, drusen, membrane, and vascular abnormality, more profoundly so in residents. Sensitivity without algorithmic assistance was significantly lower in residents (23.1%∼75.8%) compared to specialists (55.1%∼97.1%) in nine findings, but it improved to similar levels with algorithmic assistance (67.8%∼99.4% in residents, 83.2%∼99.5% in specialists) with only hemorrhage remaining statistically significantly lower. Variances in sensitivity were significantly reduced for all findings. Reading time per image decreased in images with fewer than three findings per image, more profoundly in residents. When simulated based on images acquired from a health screening center, average reading time was estimated to be reduced by 25.9% (from 16.4 seconds to 12.1 seconds per image) for residents, and by 2.0% (from 9.6 seconds to 9.4 seconds) for specialists. Conclusions Deep learning-based computer-assisted detection devices increase sensitivity, reduce inter-reader variance in sensitivity, and reduce reading time in less complicated images. Translational Relevance This study evaluated the influence that algorithmic assistance in detecting abnormal findings on retinal fundus photographs has on clinicians, possibly predicting its influence on clinical application.
Affiliation(s)
- Joo Young Shin
- Department of Ophthalmology, Seoul Metropolitan Government Seoul National University Boramae Medical Centre, Seoul, Republic of Korea
- Kyu Hyung Park
- Department of Ophthalmology, Seoul National University College of Medicine, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
- Kyu-Hwan Jung
- VUNO Inc., Seoul, Republic of Korea
- Department of Medical Device Research and Management, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Seoul, Republic of Korea
- Sang Jun Park
- Department of Ophthalmology, Seoul National University College of Medicine, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
10
Bhati D, Neha F, Amiruzzaman M. A Survey on Explainable Artificial Intelligence (XAI) Techniques for Visualizing Deep Learning Models in Medical Imaging. J Imaging 2024; 10:239. [PMID: 39452402; PMCID: PMC11508748; DOI: 10.3390/jimaging10100239]
Abstract
The combination of medical imaging and deep learning has significantly improved diagnostic and prognostic capabilities in the healthcare domain. Nevertheless, the inherent complexity of deep learning models poses challenges in understanding their decision-making processes. Interpretability and visualization techniques have emerged as crucial tools to unravel the black-box nature of these models, providing insights into their inner workings and enhancing trust in their predictions. This survey paper comprehensively examines various interpretation and visualization techniques applied to deep learning models in medical imaging. The paper reviews methodologies, discusses their applications, and evaluates their effectiveness in enhancing the interpretability, reliability, and clinical relevance of deep learning models in medical image analysis.
Affiliation(s)
- Deepshikha Bhati
- Department of Computer Science, Kent State University, Kent, OH 44242, USA
- Fnu Neha
- Department of Computer Science, Kent State University, Kent, OH 44242, USA
- Md Amiruzzaman
- Department of Computer Science, West Chester University, West Chester, PA 19383, USA
11
Steyvers M, Kumar A. Three Challenges for AI-Assisted Decision-Making. Perspect Psychol Sci 2024.
Abstract
Artificial intelligence (AI) has the potential to improve human decision-making by providing decision recommendations and problem-relevant information to assist human decision-makers. However, the full realization of the potential of human-AI collaboration continues to face several challenges. First, the conditions that support complementarity (i.e., situations in which the performance of a human with AI assistance exceeds the performance of an unassisted human or the AI in isolation) must be understood. This task requires humans to be able to recognize situations in which the AI should be leveraged and to develop new AI systems that can learn to complement the human decision-maker. Second, human mental models of the AI, which contain both expectations of the AI and reliance strategies, must be accurately assessed. Third, the effects of different design choices for human-AI interaction must be understood, including both the timing of AI assistance and the amount of model information that should be presented to the human decision-maker to avoid cognitive overload and ineffective reliance strategies. In response to each of these three challenges, we present an interdisciplinary perspective based on recent empirical and theoretical findings and discuss new research directions.
Affiliation(s)
- Mark Steyvers
- Department of Cognitive Sciences, University of California, Irvine
- Aakriti Kumar
- Department of Cognitive Sciences, University of California, Irvine
12
Wang S, Shen W, Gao Z, Jiang X, Wang Y, Li Y, Ma X, Wang W, Xin S, Ren W, Jin K, Ye J. Enhancing the ophthalmic AI assessment with a fundus image quality classifier using local and global attention mechanisms. Front Med (Lausanne) 2024; 11:1418048. [PMID: 39175821; PMCID: PMC11339790; DOI: 10.3389/fmed.2024.1418048]
Abstract
Background The assessment of image quality (IQA) plays a pivotal role in image-based computer-aided diagnosis, and fundus imaging is the primary method for screening and diagnosing ophthalmic diseases. Conventional studies on fundus IQA tend to rely on simplistic datasets for evaluation and focus predominantly on either local or global information rather than a synthesis of both; moreover, the interpretability of these studies often lacks compelling evidence. To address these issues, this study introduces the Local and Global Attention Aggregated Deep Neural Network (LGAANet), an approach that integrates both local and global information for enhanced analysis. Methods LGAANet was developed and validated using a Multi-Source Heterogeneous Fundus (MSHF) database encompassing a diverse collection of images: 802 color fundus photography (CFP) images (302 from portable cameras) and 500 ultrawide-field (UWF) images from 904 patients with diabetic retinopathy (DR) or glaucoma, as well as healthy individuals. Image quality was meticulously assessed by three ophthalmologists, using the human visual system as a benchmark. Furthermore, the model employs attention mechanisms and saliency maps to bolster its interpretability. Results In testing with the CFP dataset, LGAANet demonstrated remarkable accuracy in three critical dimensions of image quality (illumination, clarity, and contrast, dimensions defined from the characteristics of the human visual system that also indicate where image quality could be improved), recording scores of 0.947, 0.924, and 0.947, respectively. Similarly, when applied to the UWF dataset, the model achieved accuracies of 0.889, 0.913, and 0.923, respectively. These results underscore the efficacy of LGAANet in distinguishing between varying degrees of image quality with high precision. Conclusion To our knowledge, LGAANet is the first algorithm trained on an MSHF dataset specifically for fundus IQA, marking a significant milestone in the advancement of computer-aided diagnosis in ophthalmology. This research offers a novel methodology for the assessment and interpretation of fundus images in the detection and diagnosis of ocular diseases.
Affiliation(s)
- Shengzhan Wang
- The Affiliated People’s Hospital of Ningbo University, Ningbo, Zhejiang, China
- Wenyue Shen
- Eye Center, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China
- Zhiyuan Gao
- Eye Center, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China
- Xiaoyu Jiang
- College of Control Science and Engineering, Zhejiang University, Hangzhou, China
- Yaqi Wang
- College of Media, Communication University of Zhejiang, Hangzhou, China
- Yunxiang Li
- College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
- Xiaoyu Ma
- Institute of Intelligent Media, Communication University of Zhejiang, Hangzhou, China
- Wenhao Wang
- The Affiliated People’s Hospital of Ningbo University, Ningbo, Zhejiang, China
- Shuanghua Xin
- The Affiliated People’s Hospital of Ningbo University, Ningbo, Zhejiang, China
- Weina Ren
- The Affiliated People’s Hospital of Ningbo University, Ningbo, Zhejiang, China
- Kai Jin
- Eye Center, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China
- Juan Ye
- Eye Center, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China
13
Mathieu A, Ajana S, Korobelnik JF, Le Goff M, Gontier B, Rougier MB, Delcourt C, Delyfer MN. DeepAlienorNet: A deep learning model to extract clinical features from colour fundus photography in age-related macular degeneration. Acta Ophthalmol 2024; 102:e823-e830. [PMID: 38345159; DOI: 10.1111/aos.16660]
Abstract
OBJECTIVE This study aimed to develop a deep learning (DL) model, named 'DeepAlienorNet', to automatically extract clinical signs of age-related macular degeneration (AMD) from colour fundus photography (CFP). METHODS AND ANALYSIS The ALIENOR Study is a cohort of French individuals 77 years of age or older. A multi-label DL model was developed to grade the presence of 7 clinical signs: large soft drusen (>125 μm), intermediate soft drusen (63-125 μm), large area of soft drusen (total area >500 μm), presence of central soft drusen (large or intermediate), hyperpigmentation, hypopigmentation, and advanced AMD (defined as neovascular or atrophic AMD). Prediction performances were evaluated using cross-validation, with expert human interpretation of the clinical signs as the ground truth. RESULTS A total of 1178 images were included in the study. Averaging detection performance across the 7 clinical signs, DeepAlienorNet achieved an overall sensitivity, specificity, and AUROC of 0.77, 0.83, and 0.87, respectively. The model demonstrated particularly strong performance in predicting advanced AMD and large areas of soft drusen. It can also generate heatmaps highlighting the image areas relevant to interpretation. CONCLUSION DeepAlienorNet demonstrates promising performance in automatically identifying clinical signs of AMD from CFP, offering several notable advantages. Its high interpretability reduces the black-box effect, addressing ethical concerns. Additionally, the model can be easily integrated to automate well-established and validated AMD progression scores, and its user-friendly interface further enhances usability. The main value of DeepAlienorNet lies in its ability to assist in precise severity scoring for further adapted AMD management, all while preserving interpretability.
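A minimal sketch of the multi-label setup the abstract describes: one sigmoid output per clinical sign, trained with binary cross-entropy. The seven sign labels follow the abstract; the ResNet backbone, input size, and everything else are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

SIGNS = ["large_soft_drusen", "intermediate_soft_drusen", "large_drusen_area",
         "central_soft_drusen", "hyperpigmentation", "hypopigmentation",
         "advanced_amd"]

backbone = models.resnet50(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, len(SIGNS))  # 7 independent logits
loss_fn = nn.BCEWithLogitsLoss()                # one binary task per clinical sign

imgs = torch.randn(2, 3, 224, 224)              # placeholder fundus batch
targets = torch.randint(0, 2, (2, len(SIGNS))).float()
loss = loss_fn(backbone(imgs), targets)
probs = torch.sigmoid(backbone(imgs))           # per-sign presence probabilities
```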
Affiliation(s)
- Alexis Mathieu
- Inserm, Bordeaux Population Health Research Center, UMR 1219, University of Bordeaux, Bordeaux, France
- Service d'Ophtalmologie, Centre Hospitalier Universitaire de Bordeaux, Bordeaux, France
- Soufiane Ajana
- Inserm, Bordeaux Population Health Research Center, UMR 1219, University of Bordeaux, Bordeaux, France
- Jean-François Korobelnik
- Inserm, Bordeaux Population Health Research Center, UMR 1219, University of Bordeaux, Bordeaux, France
- Service d'Ophtalmologie, Centre Hospitalier Universitaire de Bordeaux, Bordeaux, France
- Mélanie Le Goff
- Inserm, Bordeaux Population Health Research Center, UMR 1219, University of Bordeaux, Bordeaux, France
- Brigitte Gontier
- Service d'Ophtalmologie, Centre Hospitalier Universitaire de Bordeaux, Bordeaux, France
- Cécile Delcourt
- Inserm, Bordeaux Population Health Research Center, UMR 1219, University of Bordeaux, Bordeaux, France
- Service d'Ophtalmologie, Centre Hospitalier Universitaire de Bordeaux, Bordeaux, France
- Marie-Noëlle Delyfer
- Inserm, Bordeaux Population Health Research Center, UMR 1219, University of Bordeaux, Bordeaux, France
- Service d'Ophtalmologie, Centre Hospitalier Universitaire de Bordeaux, Bordeaux, France
- FRCRnet/FCRIN Network, Bordeaux, France
14
Yun C, Tang F, Gao Z, Wang W, Bai F, Miller JD, Liu H, Lee Y, Lou Q. Construction of Risk Prediction Model of Type 2 Diabetic Kidney Disease Based on Deep Learning. Diabetes Metab J 2024; 48:771-779. [PMID: 38685670; PMCID: PMC11307115; DOI: 10.4093/dmj.2023.0033]
Abstract
BACKGROUND This study aimed to develop a diabetic kidney disease (DKD) prediction model using a long short-term memory (LSTM) neural network and evaluate its performance using accuracy, precision, recall, and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. METHODS The study identified DKD risk factors through literature review and a physician focus group, and collected 7 years of data from 6,040 type 2 diabetes mellitus patients based on those risk factors. PyTorch was used to build the LSTM neural network, with 70% of the data used for training and the other 30% for testing. Three models were established to examine the impact of glycosylated hemoglobin (HbA1c), systolic blood pressure (SBP), and pulse pressure (PP) variabilities on the model's performance. RESULTS The developed model achieved an accuracy of 83% and an AUC of 0.83. When the risk factor of HbA1c variability, SBP variability, or PP variability was removed one by one, the accuracy of each model was significantly lower than that of the optimal model, at 78% (P<0.001), 79% (P<0.001), and 81% (P<0.001), respectively. The AUC of the ROC was also significantly lower for each model, with values of 0.72 (P<0.001), 0.75 (P<0.001), and 0.77 (P<0.05). CONCLUSION The developed DKD risk prediction model using LSTM neural networks demonstrated high accuracy and AUC. When HbA1c, SBP, and PP variabilities were added to the model as features, the model's performance improved greatly.
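A minimal PyTorch LSTM sketch in the spirit of the described DKD model: a sequence of yearly risk-factor measurements in, a binary DKD risk estimate out. The feature count, hidden size, and shapes are illustrative assumptions, not the study's configuration.

```python
import torch
import torch.nn as nn

class DKDRiskLSTM(nn.Module):
    def __init__(self, n_features=12, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                        # x: (batch, years, n_features)
        _, (h_n, _) = self.lstm(x)               # h_n: final hidden state
        return self.head(h_n[-1]).squeeze(-1)    # logit of DKD risk

model = DKDRiskLSTM()
seq = torch.randn(8, 7, 12)                      # 8 patients x 7 yearly visits x 12 factors
risk = torch.sigmoid(model(seq))                 # predicted probability of DKD
```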
Affiliation(s)
- Chuan Yun
- Department of Endocrinology, The First Affiliated Hospital of Hainan Medical University, Haikou, China
- Fangli Tang
- International School of Nursing, Hainan Medical University, Haikou, China
- Zhenxiu Gao
- School of International Education, Nanjing Medical University, Nanjing, China
- Wenjun Wang
- Department of Endocrinology, The First Affiliated Hospital of Hainan Medical University, Haikou, China
- Fang Bai
- Nursing Department 531, The First Affiliated Hospital of Hainan Medical University, Haikou, China
- Joshua D. Miller
- Department of Medicine, Division of Endocrinology & Metabolism, Renaissance School of Medicine, Stony Brook University, Stony Brook, NY, USA
- Huanhuan Liu
- Department of Endocrinology, Hainan General Hospital, Haikou, China
- Qingqing Lou
- The First Affiliated Hospital of Hainan Medical University, Hainan Clinical Research Center for Metabolic Disease, Haikou, China
15
Wong CYT, Antaki F, Woodward-Court P, Ong AY, Keane PA. The role of saliency maps in enhancing ophthalmologists' trust in artificial intelligence models. Asia Pac J Ophthalmol (Phila) 2024; 13:100087. [PMID: 39069106; DOI: 10.1016/j.apjo.2024.100087]
Abstract
PURPOSE Saliency maps (SMs) allow clinicians to better understand the opaque decision-making process of artificial intelligence (AI) models by visualising the important features responsible for predictions, ultimately improving interpretability and confidence. In this work, we review the use cases for SMs, exploring their impact on clinicians' understanding of and trust in AI models, using the following ophthalmic conditions as examples: (1) glaucoma, (2) myopia, (3) age-related macular degeneration (AMD), and (4) diabetic retinopathy (DR). METHOD A multi-field search on MEDLINE, Embase, and Web of Science was conducted using specific keywords. Only studies on the use of SMs in glaucoma, myopia, AMD, or DR were considered for inclusion. RESULTS Findings reveal that SMs are often used to validate AI models and advocate for their adoption, potentially leading to biased claims. The technical limitations of SMs were frequently overlooked, and assessments of their quality and relevance were often superficial. Uncertainties persist regarding the role of saliency maps in building trust in AI. It is crucial to improve understanding of SMs' technical constraints and to evaluate their quality, impact, and suitability for specific tasks more rigorously. Establishing a standardised framework for selecting and assessing SMs, as well as exploring their relationship with other sources of reliability (e.g. safety and generalisability), is essential for enhancing clinicians' trust in AI. CONCLUSION We conclude that SMs are not beneficial for interpretability and trust-building purposes in their current forms. Instead, SMs may confer benefits for model debugging, model performance enhancement, and hypothesis testing (e.g. novel biomarkers).
Affiliation(s)
- Fares Antaki
- Institute of Ophthalmology, University College London, London, United Kingdom
- Ariel Yuhan Ong
- Institute of Ophthalmology, University College London, London, United Kingdom
- Pearse A Keane
- Institute of Ophthalmology, University College London, London, United Kingdom
16
Chen T, Bai Y, Mao H, Liu S, Xu K, Xiong Z, Ma S, Yang F, Zhao Y. Cross-modality transfer learning with knowledge infusion for diabetic retinopathy grading. Front Med (Lausanne) 2024; 11:1400137. [PMID: 38808141; PMCID: PMC11130363; DOI: 10.3389/fmed.2024.1400137]
Abstract
Background Ultra-wide-field (UWF) fundus photography represents an emerging retinal imaging technique offering a broader field of view, thus enhancing its utility in screening and diagnosing various eye diseases, notably diabetic retinopathy (DR). However, the application of computer-aided diagnosis for DR using UWF images confronts two major challenges. The first challenge arises from the limited availability of labeled UWF data, making it daunting to train diagnostic models due to the high cost associated with manual annotation of medical images. Secondly, existing models' performance requires enhancement due to the absence of prior knowledge to guide the learning process. Purpose By leveraging extensively annotated datasets within the field, which encompass large-scale, high-quality color fundus image datasets annotated at either image-level or pixel-level, our objective is to transfer knowledge from these datasets to our target domain through unsupervised domain adaptation. Methods Our approach presents a robust model for assessing the severity of diabetic retinopathy (DR) by leveraging unsupervised lesion-aware domain adaptation in ultra-wide-field (UWF) images. Furthermore, to harness the wealth of detailed annotations in publicly available color fundus image datasets, we integrate an adversarial lesion map generator. This generator supplements the grading model by incorporating auxiliary lesion information, drawing inspiration from the clinical methodology of evaluating DR severity by identifying and quantifying associated lesions. Results We conducted both quantitative and qualitative evaluations of our proposed method. In particular, among the six representative DR grading methods, our approach achieved an accuracy (ACC) of 68.18% and a precision (pre) of 67.43%. Additionally, we conducted extensive experiments in ablation studies to validate the effectiveness of each component of our proposed method. Conclusion In conclusion, our method not only improves the accuracy of DR grading, but also enhances the interpretability of the results, providing clinicians with a reliable DR grading scheme.
Affiliation(s)
- Tao Chen
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Yanmiao Bai
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Haiting Mao
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Shouyue Liu
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Keyi Xu
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Zhouwei Xiong
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Shaodong Ma
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Fang Yang
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Yitian Zhao
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
17
Song A, Lusk JB, Roh KM, Hsu ST, Valikodath NG, Lad EM, Muir KW, Engelhard MM, Limkakeng AT, Izatt JA, McNabb RP, Kuo AN. RobOCTNet: Robotics and Deep Learning for Referable Posterior Segment Pathology Detection in an Emergency Department Population. Transl Vis Sci Technol 2024; 13:12. [PMID: 38488431; PMCID: PMC10946693; DOI: 10.1167/tvst.13.3.12]
Abstract
Purpose To evaluate the diagnostic performance of a robotically aligned optical coherence tomography (RAOCT) system coupled with a deep learning model in detecting referable posterior segment pathology in OCT images of emergency department patients. Methods A deep learning model, RobOCTNet, was trained and internally tested to classify OCT images as referable versus non-referable for ophthalmology consultation. For external testing, emergency department patients with signs or symptoms warranting evaluation of the posterior segment were imaged with RAOCT. RobOCTNet was used to classify the images. Model performance was evaluated against a reference standard based on clinical diagnosis and retina specialist OCT review. Results We included 90,250 OCT images for training and 1489 images for internal testing. RobOCTNet achieved an area under the curve (AUC) of 1.00 (95% confidence interval [CI], 0.99-1.00) for detection of referable posterior segment pathology in the internal test set. For external testing, RAOCT was used to image 72 eyes of 38 emergency department patients. In this set, RobOCTNet had an AUC of 0.91 (95% CI, 0.82-0.97), a sensitivity of 95% (95% CI, 87%-100%), and a specificity of 76% (95% CI, 62%-91%). The model's performance was comparable to two human experts' performance. Conclusions A robotically aligned OCT coupled with a deep learning model demonstrated high diagnostic performance in detecting referable posterior segment pathology in a cohort of emergency department patients. Translational Relevance Robotically aligned OCT coupled with a deep learning model may have the potential to improve emergency department patient triage for ophthalmology referral.
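For reference, the external-test metrics reported above (AUC, sensitivity, specificity) can be computed as in the following scikit-learn sketch; the score and label arrays here are invented placeholders, not the study's data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])      # 1 = referable posterior segment pathology
y_score = np.array([0.9, 0.2, 0.7, 0.8, 0.4, 0.1, 0.6, 0.55])

auc = roc_auc_score(y_true, y_score)
y_pred = (y_score >= 0.5).astype(int)             # assumed operating threshold
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(f"AUC={auc:.2f}, Se={sensitivity:.0%}, Sp={specificity:.0%}")
```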
Affiliation(s)
- Ailin Song
- Duke University School of Medicine, Durham, NC, USA
- Department of Ophthalmology, Duke University, Durham, NC, USA
- Jay B. Lusk
- Duke University School of Medicine, Durham, NC, USA
- Kyung-Min Roh
- Department of Ophthalmology, Duke University, Durham, NC, USA
- S. Tammy Hsu
- Department of Ophthalmology, Duke University, Durham, NC, USA
- Eleonora M. Lad
- Department of Ophthalmology, Duke University, Durham, NC, USA
- Kelly W. Muir
- Department of Ophthalmology, Duke University, Durham, NC, USA
- Matthew M. Engelhard
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- Joseph A. Izatt
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Ryan P. McNabb
- Department of Ophthalmology, Duke University, Durham, NC, USA
- Anthony N. Kuo
- Department of Ophthalmology, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
18
Jin W, Fatehi M, Guo R, Hamarneh G. Evaluating the clinical utility of artificial intelligence assistance and its explanation on the glioma grading task. Artif Intell Med 2024; 148:102751. [PMID: 38325929; DOI: 10.1016/j.artmed.2023.102751]
Abstract
Clinical evaluation evidence and model explainability are key gatekeepers to ensure the safe, accountable, and effective use of artificial intelligence (AI) in clinical settings. We conducted a clinical user-centered evaluation with 35 neurosurgeons to assess the utility of AI assistance and its explanation on the glioma grading task. Each participant read 25 brain MRI scans of patients with gliomas, and gave their judgment on the glioma grading without and with the assistance of AI prediction and explanation. The AI model was trained on the BraTS dataset with 88.0% accuracy. The AI explanation was generated using the explainable AI algorithm of SmoothGrad, which was selected from 16 algorithms based on the criterion of being truthful to the AI decision process. Results showed that compared to the average accuracy of 82.5±8.7% when physicians performed the task alone, physicians' task performance increased to 87.7±7.3% with statistical significance (p-value = 0.002) when assisted by AI prediction, and remained at almost the same level of 88.5±7.0% (p-value = 0.35) with the additional assistance of AI explanation. Based on quantitative and qualitative results, the observed improvement in physicians' task performance assisted by AI prediction was mainly because physicians' decision patterns converged to be similar to AI, as physicians only switched their decisions when disagreeing with AI. The insignificant change in physicians' performance with the additional assistance of AI explanation was because the AI explanations did not provide explicit reasons, contexts, or descriptions of clinical features to help doctors discern potentially incorrect AI predictions. The evaluation showed the clinical utility of AI to assist physicians on the glioma grading task, and identified the limitations and clinical usage gaps of existing explainable AI techniques for future improvement.
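A compact sketch of SmoothGrad, the explanation algorithm named above: input gradients are averaged over several noisy copies of the image to reduce visual noise in the saliency map. The noise level and sample count are assumptions, and the classifier is any differentiable model returning logits.

```python
import torch

def smoothgrad(model, x, target, n=25, sigma=0.1):
    """x: (1, C, H, W) input; returns an (H, W) saliency map."""
    grads = torch.zeros_like(x)
    for _ in range(n):
        noisy = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        score = model(noisy)[0, target]          # logit of the class being explained
        score.backward()
        grads += noisy.grad                      # accumulate input gradients
    return (grads / n).abs().sum(dim=1).squeeze(0)  # average, then collapse channels
```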
Affiliation(s)
- Weina Jin
- School of Computing Science, Simon Fraser University, Burnaby, Canada
- Mostafa Fatehi
- Division of Neurosurgery, The University of British Columbia, Vancouver, Canada
- Ru Guo
- Division of Neurosurgery, The University of British Columbia, Vancouver, Canada
- Ghassan Hamarneh
- School of Computing Science, Simon Fraser University, Burnaby, Canada
19
Du F, Zhao L, Luo H, Xing Q, Wu J, Zhu Y, Xu W, He W, Wu J. Recognition of eye diseases based on deep neural networks for transfer learning and improved D-S evidence theory. BMC Med Imaging 2024; 24:19. [PMID: 38238662; PMCID: PMC10797809; DOI: 10.1186/s12880-023-01176-2]
Abstract
BACKGROUND Human vision has inspired significant advancements in computer vision, yet the human eye is prone to various silent eye diseases. With the advent of deep learning, computer vision for detecting human eye diseases has gained prominence, but most studies have focused on only a limited number of eye diseases. RESULTS Our model demonstrated a reduction in inherent bias and enhanced robustness. The fused network achieved an Accuracy of 0.9237, Kappa of 0.878, F1 Score of 0.914 (95% CI [0.875-0.954]), Precision of 0.945 (95% CI [0.928-0.963]), Recall of 0.89 (95% CI [0.821-0.958]), and an ROC AUC of 0.987. These metrics are notably higher than those of comparable studies. CONCLUSIONS Our deep neural network-based model exhibited improvements in eye disease recognition metrics over models from peer research, highlighting its potential application in this field. METHODS In deep learning-based eye disease recognition, we train and fine-tune the network by transfer learning to improve learning efficiency. To eliminate the decision bias of individual models and improve the credibility of their decisions, we propose a model decision fusion method based on Dempster-Shafer (D-S) theory. Because classical D-S theory is incomplete and can yield conflicting results, we improve it by eliminating its known paradoxes, propose an improved D-S evidence theory (ID-SET), and apply it to the decision fusion of eye disease recognition models.
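A minimal sketch of classical Dempster combination for fusing two classifiers' mass assignments over singleton disease hypotheses; this is the rule the paper starts from, not its ID-SET refinement, and the mass vectors are invented placeholders.

```python
import numpy as np

def dempster_combine(m1, m2):
    """m1, m2: mass vectors over the same singleton hypotheses (each sums to 1)."""
    joint = np.outer(m1, m2)
    agreement = np.diag(joint).copy()            # mass where both sources agree
    conflict = joint.sum() - agreement.sum()     # K: mass on contradictory pairs
    if np.isclose(conflict, 1.0):
        raise ValueError("total conflict; Dempster's rule is undefined")
    return agreement / (1.0 - conflict)          # renormalise by 1 - K

m_model_a = np.array([0.7, 0.2, 0.1])            # e.g., disease A / disease B / normal
m_model_b = np.array([0.6, 0.3, 0.1])
print(dempster_combine(m_model_a, m_model_b))    # fused beliefs
```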
Affiliation(s)
- Fanyu Du
- School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
- Faculty of Data Science, City University of Macau, Macau, 999078, China
- Guangdong Provincial Key Laboratory of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000, China
| | - Lishuai Zhao
- School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
| | - Hui Luo
- Faculty of Data Science, City University of Macau, Macau, 999078, China
- School of Information and Management, Guangxi Medical University, Nanning, 530021, China
| | - Qijia Xing
- Affiliated Hospital of North Sichuan Medical College, Nanchong, 637000, China
| | - Jun Wu
- School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
| | - Yuanzhong Zhu
- School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
| | - Wansong Xu
- School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
| | - Wenjing He
- School of Medical Imaging, North Sichuan Medical College, Nanchong, 637000, China
| | - Jianfang Wu
- Faculty of Data Science, City University of Macau, Macau, 999078, China.
| |
|
20
|
Yanagawa M, Sato J. Seeing Is Not Always Believing: Discrepancies in Saliency Maps. Radiol Artif Intell 2024; 6:e230488. [PMID: 38166327 PMCID: PMC10831517 DOI: 10.1148/ryai.230488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2023] [Revised: 11/09/2023] [Accepted: 11/17/2023] [Indexed: 01/04/2024]
Affiliation(s)
- Masahiro Yanagawa
- From the Department of Radiology, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita-city, Osaka 565-0871, Japan
| | - Junya Sato
- From the Department of Radiology, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita-city, Osaka 565-0871, Japan
| |
|
21
|
Zhang J, Chao H, Dasegowda G, Wang G, Kalra MK, Yan P. Revisiting the Trustworthiness of Saliency Methods in Radiology AI. Radiol Artif Intell 2024; 6:e220221. [PMID: 38166328 PMCID: PMC10831523 DOI: 10.1148/ryai.220221] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2022] [Revised: 10/04/2023] [Accepted: 10/23/2023] [Indexed: 01/04/2024]
Abstract
Purpose To determine whether saliency maps in radiology artificial intelligence (AI) are vulnerable to subtle perturbations of the input, which could lead to misleading interpretations, using prediction-saliency correlation (PSC) for evaluating the sensitivity and robustness of saliency methods. Materials and Methods In this retrospective study, locally trained deep learning models and a research prototype provided by a commercial vendor were systematically evaluated on 191 229 chest radiographs from the CheXpert dataset and 7022 MR images from a human brain tumor classification dataset. Two radiologists performed a reader study on 270 chest radiograph pairs. A model-agnostic approach for computing the PSC coefficient was used to evaluate the sensitivity and robustness of seven commonly used saliency methods. Results The saliency methods had low sensitivity (maximum PSC, 0.25; 95% CI: 0.12, 0.38) and weak robustness (maximum PSC, 0.12; 95% CI: 0.0, 0.25) on the CheXpert dataset, as demonstrated by leveraging locally trained model parameters. Further evaluation showed that the saliency maps generated from a commercial prototype could be irrelevant to the model output, without knowledge of the model specifics (area under the receiver operating characteristic curve decreased by 8.6% without affecting the saliency map). The human observer studies confirmed that it is difficult for experts to identify the perturbed images; the experts had less than 44.8% correctness. Conclusion Popular saliency methods scored low PSC values on the two datasets of perturbed chest radiographs, indicating weak sensitivity and robustness. The proposed PSC metric provides a valuable quantification tool for validating the trustworthiness of medical AI explainability. Keywords: Saliency Maps, AI Trustworthiness, Dynamic Consistency, Sensitivity, Robustness Supplemental material is available for this article. © RSNA, 2023 See also the commentary by Yanagawa and Sato in this issue.
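The paper defines PSC precisely; the following simplified sketch only illustrates the mechanics: across a set of perturbed inputs, correlate how much the model's output moves with how much the saliency map moves. The model_prob and saliency_map callables are assumed to be supplied by the user and are not part of the paper's code.

```python
import numpy as np

def psc(model_prob, saliency_map, image, perturbed_images):
    # Correlate prediction shifts with saliency-map shifts across perturbations.
    # model_prob: image -> scalar probability; saliency_map: image -> 2D array.
    p0, s0 = model_prob(image), saliency_map(image)
    dp = [abs(model_prob(x) - p0) for x in perturbed_images]
    ds = [np.linalg.norm(saliency_map(x) - s0) for x in perturbed_images]
    return np.corrcoef(dp, ds)[0, 1]  # high PSC: saliency tracks the prediction
```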
Affiliation(s)
- Jiajin Zhang
- From the Department of Biomedical Engineering, Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th St, Biotech 4231, Troy, NY 12180 (J.Z., H.C., G.W., P.Y.); and Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Mass (G.D., M.K.K.)
| | - Hanqing Chao
- From the Department of Biomedical Engineering, Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th St, Biotech 4231, Troy, NY 12180 (J.Z., H.C., G.W., P.Y.); and Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Mass (G.D., M.K.K.)
| | - Giridhar Dasegowda
- From the Department of Biomedical Engineering, Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th St, Biotech 4231, Troy, NY 12180 (J.Z., H.C., G.W., P.Y.); and Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Mass (G.D., M.K.K.)
| | - Ge Wang
- From the Department of Biomedical Engineering, Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th St, Biotech 4231, Troy, NY 12180 (J.Z., H.C., G.W., P.Y.); and Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Mass (G.D., M.K.K.)
| | - Mannudeep K. Kalra
- From the Department of Biomedical Engineering, Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th St, Biotech 4231, Troy, NY 12180 (J.Z., H.C., G.W., P.Y.); and Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Mass (G.D., M.K.K.)
| | - Pingkun Yan
- From the Department of Biomedical Engineering, Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th St, Biotech 4231, Troy, NY 12180 (J.Z., H.C., G.W., P.Y.); and Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Mass (G.D., M.K.K.)
| |
|
22
|
Song A, Borkar DS. Advances in Teleophthalmology Screening for Diabetic Retinopathy. Int Ophthalmol Clin 2024; 64:97-113. [PMID: 38146884 DOI: 10.1097/iio.0000000000000505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2023]
|
23
|
Dhibar S, Jana B. Accurate Prediction of Antifreeze Protein from Sequences through Natural Language Text Processing and Interpretable Machine Learning Approaches. J Phys Chem Lett 2023; 14:10727-10735. [PMID: 38009833 DOI: 10.1021/acs.jpclett.3c02817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2023]
Abstract
Antifreeze proteins (AFPs) bind to growing ice planes owing to their structural complementarity, thereby inhibiting ice-crystal growth through thermal hysteresis. Classifying AFPs from sequence is difficult because of their low sequence similarity, so the usual sequence-similarity algorithms, such as BLAST and PSI-BLAST, are not effective. Here, a method combining n-gram feature vectors and machine learning models is proposed to accelerate the identification of potential AFPs from sequences. All n-gram features are extracted with the K-mer counting method. The comparative analysis reveals that, among different machine learning models, XGBoost performs best at predicting AFPs from sequence when penta-mers are used as the feature vector. When tested on an independent dataset, our method outperformed existing ones, with a sensitivity of 97.50%, recall of 98.30%, and F1 score of 99.10%. Further, we used the SHAP method, which provides important insight into the functional activity of AFPs.
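A minimal sketch of the described pipeline, pairing character five-gram (penta-mer) counting with an XGBoost classifier via scikit-learn's n-gram vectorizer; the toy sequences, labels, and hyperparameters are placeholders, not the paper's data or settings.

```python
from sklearn.feature_extraction.text import CountVectorizer
from xgboost import XGBClassifier

# Toy protein sequences and labels (1 = antifreeze protein); placeholders only.
seqs = ["MKTAYIAKQRQISFVKSHFSRQ", "MALWMRLLPLLALLALWGPDPA",
        "MDTAIKQRQISFVKSHFSRQLE", "MKWVTFISLLFLFSSAYSRGVF"]
labels = [1, 0, 1, 0]

# Character 5-grams over residues = penta-mer counts (K-mer counting).
vectorizer = CountVectorizer(analyzer="char", ngram_range=(5, 5), lowercase=False)
X = vectorizer.fit_transform(seqs)  # sparse penta-mer count matrix

clf = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
clf.fit(X, labels)
print(clf.predict(X))
```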
Affiliation(s)
- Saikat Dhibar
- School of Chemical Sciences, Indian Association for the Cultivation of Science, Jadavpur, Kolkata 700032, India
| | - Biman Jana
- School of Chemical Sciences, Indian Association for the Cultivation of Science, Jadavpur, Kolkata 700032, India
| |
|
24
|
Sultan S, Acharya Y, Zayed O, Elzomour H, Parodi JC, Soliman O, Hynes N. Is the Cardiovascular Specialist Ready For the Fifth Revolution? The Role of Artificial Intelligence, Machine Learning, Big Data Analysis, Intelligent Swarming, and Knowledge-Centered Service on the Future of Global Cardiovascular Healthcare Delivery. J Endovasc Ther 2023; 30:877-884. [PMID: 35695277 PMCID: PMC10637093 DOI: 10.1177/15266028221102660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Affiliation(s)
- Sherif Sultan
- Western Vascular Institute, Department of Vascular and Endovascular Surgery, University Hospital Galway, National University of Ireland, Galway, Galway, Ireland
- Department of Vascular Surgery and Endovascular Surgery, Galway Clinic, Royal College of Surgeons in Ireland and National University of Ireland, Galway Affiliated Hospital, Galway, Ireland
- CORRIB-CÚRAM-Vascular Group, National University of Ireland, Galway, Galway, Ireland
| | - Yogesh Acharya
- Western Vascular Institute, Department of Vascular and Endovascular Surgery, University Hospital Galway, National University of Ireland, Galway, Galway, Ireland
- Department of Vascular Surgery and Endovascular Surgery, Galway Clinic, Royal College of Surgeons in Ireland and National University of Ireland, Galway Affiliated Hospital, Galway, Ireland
| | - Omnia Zayed
- Data Science Institute, National University of Ireland, Galway, Galway, Ireland
| | - Hesham Elzomour
- Discipline of Cardiology, CORRIB-CÚRAM-Vascular Group, National University of Ireland, Galway, Galway, Ireland
| | - Juan Carlos Parodi
- Department of Vascular Surgery and Biomedical Engineering Department, Alma Mater, University of Buenos Aires, and Trinidad Hospital, Buenos Aires, Argentina
| | - Osama Soliman
- Discipline of Cardiology, CORRIB-CÚRAM-Vascular Group, National University of Ireland, Galway, Galway, Ireland
| | - Niamh Hynes
- CORRIB-CÚRAM-Vascular Group, National University of Ireland, Galway, Galway, Ireland
| |
|
25
|
Gao Z, Pan X, Shao J, Jiang X, Su Z, Jin K, Ye J. Automatic interpretation and clinical evaluation for fundus fluorescein angiography images of diabetic retinopathy patients by deep learning. Br J Ophthalmol 2023; 107:1852-1858. [PMID: 36171054 DOI: 10.1136/bjo-2022-321472] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2022] [Accepted: 09/04/2022] [Indexed: 11/03/2022]
Abstract
BACKGROUND/AIMS Fundus fluorescein angiography (FFA) is an important technique for evaluating diabetic retinopathy (DR) and other retinal diseases. Interpreting FFA images is complex and time-consuming, and diagnostic ability varies among ophthalmologists. The aim of this study is to develop a clinically usable multilevel classification deep learning model for FFA images, covering prediagnosis assessment and lesion classification. METHODS A total of 15 599 FFA images of 1558 eyes from 845 patients diagnosed with DR were collected and annotated. Three convolutional neural network (CNN) models were trained to generate labels for image quality, location, laterality of eye, phase and five lesions. Performance of the models was evaluated by accuracy, F1 score, the area under the curve and human-machine comparison. Images with false positive and false negative results were analysed in detail. RESULTS Compared with LeNet-5 and VGG16, ResNet18 achieved the best results, with an accuracy of 80.79%-93.34% for prediagnosis assessment and 63.67%-88.88% for lesion detection. The human-machine comparison showed that the CNN had accuracy similar to that of junior ophthalmologists. The false positive and false negative analysis indicated directions for improvement. CONCLUSION This is the first study to perform automated standardised labelling of FFA images. Our model can be applied in clinical practice and should contribute substantially to the development of intelligent diagnosis of FFA images.
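As a schematic of the transfer-learning setup such models typically use, the sketch below replaces the final layer of an ImageNet-pretrained ResNet18 with a multi-label head; the label count, loss, and optimizer settings are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

num_labels = 5  # e.g. the five lesion types; an assumption for illustration
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_labels)

criterion = nn.BCEWithLogitsLoss()  # multi-label lesion detection
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

x = torch.randn(2, 3, 224, 224)     # placeholder batch standing in for FFA images
loss = criterion(model(x), torch.zeros(2, num_labels))
loss.backward(); optimizer.step()
```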
Affiliation(s)
- Zhiyuan Gao
- Department of Ophthalmology, Zhejiang University School of Medicine Second Affiliated Hospital, Hangzhou, Zhejiang, China
| | - Xiangji Pan
- Department of Ophthalmology, Zhejiang University School of Medicine Second Affiliated Hospital, Hangzhou, Zhejiang, China
| | - Ji Shao
- Department of Ophthalmology, Zhejiang University School of Medicine Second Affiliated Hospital, Hangzhou, Zhejiang, China
| | - Xiaoyu Jiang
- College of Control Science and Engineering, Zhejiang University, Hangzhou, Zhejiang, China
| | - Zhaoan Su
- Department of Ophthalmology, Zhejiang University School of Medicine Second Affiliated Hospital, Hangzhou, Zhejiang, China
| | - Kai Jin
- Department of Ophthalmology, Zhejiang University School of Medicine Second Affiliated Hospital, Hangzhou, Zhejiang, China
| | - Juan Ye
- Department of Ophthalmology, Zhejiang University School of Medicine Second Affiliated Hospital, Hangzhou, Zhejiang, China
| |
|
26
|
Baharlouei Z, Rabbani H, Plonka G. Wavelet scattering transform application in classification of retinal abnormalities using OCT images. Sci Rep 2023; 13:19013. [PMID: 37923770 PMCID: PMC10624695 DOI: 10.1038/s41598-023-46200-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2022] [Accepted: 10/29/2023] [Indexed: 11/06/2023] Open
Abstract
Computer-aided diagnosis plays a significant role in assisting ophthalmologists with diagnosing retinal abnormalities. In this paper, a particular convolutional neural network based on the Wavelet Scattering Transform (WST) is used to detect one to four retinal abnormalities from optical coherence tomography (OCT) images. The predefined wavelet filters in this network decrease the computational complexity and processing time compared with deep learning methods. We use two layers of the WST network to obtain a direct and efficient model. The WST generates a sparse representation of the images that is translation-invariant and stable with respect to local deformations. A Principal Component Analysis-based classifier then classifies the extracted features. We evaluate the model on four publicly available datasets to allow a comprehensive comparison with the literature. The accuracies of classifying the OCT images of the OCTID dataset into two and five classes were [Formula: see text] and [Formula: see text], respectively. We achieved an accuracy of [Formula: see text] in distinguishing Diabetic Macular Edema (DME) from normal images using the TOPCON device-based dataset. The Heidelberg and Duke datasets contain DME, Age-related Macular Degeneration, and normal classes, on which we achieved accuracies of [Formula: see text] and [Formula: see text], respectively. A comparison of our results with state-of-the-art models shows that our model outperforms them on some assessments or achieves nearly the best reported results, while having much smaller computational complexity.
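A minimal sketch of a two-layer wavelet scattering front end using the kymatio library, whose fixed (non-learned) filters yield translation-invariant features ready for dimensionality reduction and classification; the image size and scale parameter J are assumptions for illustration.

```python
import torch
from kymatio.torch import Scattering2D

# Fixed wavelet filters: the feature extractor itself needs no training.
scattering = Scattering2D(J=3, shape=(128, 128), L=8, max_order=2)

x = torch.randn(4, 1, 128, 128)            # a batch of grayscale OCT patches
features = scattering(x)                   # scattering coefficients, downsampled by 2**J
features = features.flatten(start_dim=1)   # flatten for PCA / a linear classifier
print(features.shape)
```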
Affiliation(s)
- Zahra Baharlouei
- Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
| | - Hossein Rabbani
- Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.
| | - Gerlind Plonka
- Institute for Numerical and Applied Mathematics, Georg-August-University of Goettingen, Göttingen, Germany
| |
|
27
|
Zhang A, Wu Z, Wu E, Wu M, Snyder MP, Zou J, Wu JC. Leveraging physiology and artificial intelligence to deliver advancements in health care. Physiol Rev 2023; 103:2423-2450. [PMID: 37104717 PMCID: PMC10390055 DOI: 10.1152/physrev.00033.2022] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2022] [Revised: 03/06/2023] [Accepted: 04/25/2023] [Indexed: 04/29/2023] Open
Abstract
Artificial intelligence in health care has experienced remarkable innovation and progress in the last decade. Significant advancements can be attributed to the utilization of artificial intelligence to transform physiology data to advance health care. In this review, we explore how past work has shaped the field and defined future challenges and directions. In particular, we focus on three areas of development. First, we give an overview of artificial intelligence, with special attention to the most relevant artificial intelligence models. We then detail how physiology data have been harnessed by artificial intelligence to advance the main areas of health care: automating existing health care tasks, increasing access to care, and augmenting health care capabilities. Finally, we discuss emerging concerns surrounding the use of individual physiology data and detail an increasingly important consideration for the field, namely the challenges of deploying artificial intelligence models to achieve meaningful clinical impact.
Affiliation(s)
- Angela Zhang
- Stanford Cardiovascular Institute, School of Medicine, Stanford University, Stanford, California, United States
- Department of Genetics, School of Medicine, Stanford University, Stanford, California, United States
- Greenstone Biosciences, Palo Alto, California, United States
| | - Zhenqin Wu
- Department of Chemistry, Stanford University, Stanford, California, United States
| | - Eric Wu
- Department of Electrical Engineering, Stanford University, Stanford, California, United States
| | - Matthew Wu
- Greenstone Biosciences, Palo Alto, California, United States
| | - Michael P Snyder
- Department of Genetics, School of Medicine, Stanford University, Stanford, California, United States
| | - James Zou
- Department of Biomedical Informatics, School of Medicine, Stanford University, Stanford, California, United States
- Department of Computer Science, Stanford University, Stanford, California, United States
| | - Joseph C Wu
- Stanford Cardiovascular Institute, School of Medicine, Stanford University, Stanford, California, United States
- Greenstone Biosciences, Palo Alto, California, United States
- Division of Cardiovascular Medicine, Department of Medicine, Stanford University, Stanford, California, United States
- Department of Radiology, School of Medicine, Stanford University, Stanford, California, United States
| |
|
28
|
Allgaier J, Mulansky L, Draelos RL, Pryss R. How does the model make predictions? A systematic literature review on the explainability power of machine learning in healthcare. Artif Intell Med 2023; 143:102616. [PMID: 37673561 DOI: 10.1016/j.artmed.2023.102616] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2022] [Revised: 02/22/2023] [Accepted: 05/15/2023] [Indexed: 09/08/2023]
Abstract
BACKGROUND Medical use cases for machine learning (ML) are growing exponentially. The first hospitals are already using ML systems as decision support systems in their daily routine. At the same time, most ML systems are still opaque, and it is not clear how these systems arrive at their predictions. METHODS In this paper, we provide a brief overview of the taxonomy of explainability methods and review popular methods. In addition, we conduct a systematic literature search on PubMed to investigate which explainable artificial intelligence (XAI) methods are used in 450 specific medical supervised ML use cases, how the use of XAI methods has emerged recently, and how the precision of describing ML pipelines has evolved over the past 20 years. RESULTS A large fraction of publications with ML use cases do not use XAI methods at all to explain ML predictions. When XAI methods are used, however, open-source and model-agnostic explanation methods are the most common, with SHapley Additive exPlanations (SHAP) and Gradient Class Activation Mapping (Grad-CAM) leading the way for tabular and image data, respectively. ML pipelines have been described in increasing detail and uniformity in recent years. However, the willingness to share data and code has stagnated at about one-quarter. CONCLUSIONS XAI methods are mainly used when their application requires little effort. The homogenization of reports in ML use cases facilitates the comparability of work and should be advanced in the coming years. Experts who can mediate between the worlds of informatics and medicine will be increasingly in demand as ML systems are used, owing to the high complexity of the domain.
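Since Grad-CAM is one of the two leading methods the review identifies, here is a compact sketch of its core computation: pool the target-class gradients over space, weight the convolutional activations channel-wise, and apply a ReLU. The hook mechanics and layer choice are illustrative and not tied to any model in the review.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, conv_layer, x, class_idx):
    # Grad-CAM: weight a conv layer's activations by the spatially pooled
    # gradient of the target class score, then ReLU and normalise.
    acts, grads = {}, {}
    fwd = conv_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    bwd = conv_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    model.zero_grad()
    model(x)[0, class_idx].backward()
    fwd.remove(); bwd.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # pooled gradients
    cam = F.relu((weights * acts["a"]).sum(dim=1))       # (N, H', W') heatmap
    return cam / (cam.max() + 1e-8)
```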
Affiliation(s)
- Johannes Allgaier
- Institute of Clinical Epidemiology and Biometry, Julius-Maximilians-Universität Würzburg (JMU), Germany.
| | - Lena Mulansky
- Institute of Clinical Epidemiology and Biometry, Julius-Maximilians-Universität Würzburg (JMU), Germany.
| | | | - Rüdiger Pryss
- Institute of Clinical Epidemiology and Biometry, Julius-Maximilians-Universität Würzburg (JMU), Germany.
| |
|
29
|
Tan TF, Dai P, Zhang X, Jin L, Poh S, Hong D, Lim J, Lim G, Teo ZL, Liu N, Ting DSW. Explainable artificial intelligence in ophthalmology. Curr Opin Ophthalmol 2023; 34:422-430. [PMID: 37527200 DOI: 10.1097/icu.0000000000000983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/03/2023]
Abstract
PURPOSE OF REVIEW Despite the growing scope of artificial intelligence (AI) and deep learning (DL) applications in ophthalmology, most have yet to reach clinical adoption. Beyond model performance metrics, there has been increasing emphasis on the need for explainability of proposed DL models. RECENT FINDINGS Several explainable AI (XAI) methods have been proposed and are increasingly applied in ophthalmological DL applications, predominantly in medical imaging analysis tasks. SUMMARY We provide an overview of the key concepts and categorize examples of commonly employed XAI methods. Specific to ophthalmology, we explore XAI from a clinical perspective: enhancing end-user trust, assisting clinical management, and uncovering new insights. We conclude by discussing its limitations and future directions for strengthening XAI for application to clinical practice.
Affiliation(s)
- Ting Fang Tan
- Artificial Intelligence and Digital Innovation Research Group
- Singapore National Eye Centre, Singapore General Hospital
| | - Peilun Dai
- Institute of High Performance Computing, A∗STAR
| | - Xiaoman Zhang
- Duke-National University of Singapore Medical School, Singapore
| | - Liyuan Jin
- Artificial Intelligence and Digital Innovation Research Group
- Duke-National University of Singapore Medical School, Singapore
| | - Stanley Poh
- Singapore National Eye Centre, Singapore General Hospital
| | - Dylan Hong
- Artificial Intelligence and Digital Innovation Research Group
| | - Joshua Lim
- Singapore National Eye Centre, Singapore General Hospital
| | - Gilbert Lim
- Artificial Intelligence and Digital Innovation Research Group
| | - Zhen Ling Teo
- Artificial Intelligence and Digital Innovation Research Group
- Singapore National Eye Centre, Singapore General Hospital
| | - Nan Liu
- Artificial Intelligence and Digital Innovation Research Group
- Duke-National University of Singapore Medical School, Singapore
| | - Daniel Shu Wei Ting
- Artificial Intelligence and Digital Innovation Research Group
- Singapore National Eye Centre, Singapore General Hospital
- Duke-National University of Singapore Medical School, Singapore
- Byers Eye Institute, Stanford University, Stanford, California, USA
| |
|
30
|
Lu K, Tong Y, Yu S, Lin Y, Yang Y, Xu H, Li Y, Yu S. Building a trustworthy AI differential diagnosis application for Crohn's disease and intestinal tuberculosis. BMC Med Inform Decis Mak 2023; 23:160. [PMID: 37582768 PMCID: PMC10426047 DOI: 10.1186/s12911-023-02257-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2022] [Accepted: 08/02/2023] [Indexed: 08/17/2023] Open
Abstract
BACKGROUND Differentiating between Crohn's disease (CD) and intestinal tuberculosis (ITB) with endoscopy is challenging. We aim to enable more accurate endoscopic diagnosis between CD and ITB by building a trustworthy AI differential diagnosis application. METHODS A total of 1271 electronic health record (EHR) patients who had undergone colonoscopies at Peking Union Medical College Hospital (PUMCH) and were clinically diagnosed with CD (n = 875) or ITB (n = 396) were used in this study. We built a workflow to make diagnoses from EHRs and mine differential diagnosis features; this involves fine-tuning pretrained language models, distilling them into a light and efficient TextCNN model, interpreting the neural network to select differential attribution features, and then applying manual feature checking and debiasing training. RESULTS The accuracy of the debiased TextCNN on differential diagnosis between CD and ITB is 0.83 (CD F1: 0.87, ITB F1: 0.77), the best among the baselines. On the noisy validation set, its accuracy was 0.70 (CD F1: 0.87, ITB F1: 0.69), significantly higher than that of models without debiasing. We also find that the debiased model more readily mines diagnostically significant features. The debiased TextCNN unearthed 39 diagnostic features in the form of phrases, 17 of which were key diagnostic features recognized by the guidelines. CONCLUSION We built a trustworthy AI differential diagnosis application for differentiating between CD and ITB, focusing on accuracy, interpretability and robustness. The classifiers perform well, and the features with statistical significance were in agreement with clinical guidelines.
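The distillation step described above, compressing a pretrained language-model teacher into a light TextCNN student, commonly uses a loss like the sketch below, which blends KL divergence to the teacher's temperature-softened outputs with ordinary cross-entropy on gold labels; the temperature and mixing weight are assumptions, and the paper's exact recipe may differ.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL to the teacher's temperature-softened distribution,
    # scaled by T^2; hard targets: standard cross-entropy on gold labels.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```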
Affiliation(s)
- Keming Lu
- Department of Automation, Tsinghua University, Beijing, 100084, China
| | - Yuanren Tong
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China
| | - Si Yu
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China
| | - Yucong Lin
- Center for Statistical Science, Tsinghua University, Beijing, 100084, China
- Department of Industrial Engineering, Tsinghua University, Beijing, 100084, China
| | - Yingyun Yang
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China
| | - Hui Xu
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China
| | - Yue Li
- Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China.
| | - Sheng Yu
- Center for Statistical Science, Tsinghua University, Beijing, 100084, China.
- Department of Industrial Engineering, Tsinghua University, Beijing, 100084, China.
| |
|
31
|
Cleland CR, Rwiza J, Evans JR, Gordon I, MacLeod D, Burton MJ, Bascaran C. Artificial intelligence for diabetic retinopathy in low-income and middle-income countries: a scoping review. BMJ Open Diabetes Res Care 2023; 11:e003424. [PMID: 37532460 PMCID: PMC10401245 DOI: 10.1136/bmjdrc-2023-003424] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/27/2023] [Accepted: 07/11/2023] [Indexed: 08/04/2023] Open
Abstract
Diabetic retinopathy (DR) is a leading cause of blindness globally. There is growing evidence to support the use of artificial intelligence (AI) in diabetic eye care, particularly for screening populations at risk of sight loss from DR in low-income and middle-income countries (LMICs) where resources are most stretched. However, implementation into clinical practice remains limited. We conducted a scoping review to identify what AI tools have been used for DR in LMICs and to report their performance and relevant characteristics. 81 articles were included. The reported sensitivities and specificities were generally high providing evidence to support use in clinical practice. However, the majority of studies focused on sensitivity and specificity only and there was limited information on cost, regulatory approvals and whether the use of AI improved health outcomes. Further research that goes beyond reporting sensitivities and specificities is needed prior to wider implementation.
Affiliation(s)
- Charles R Cleland
- International Centre for Eye Health, Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK
- Eye Department, Kilimanjaro Christian Medical Centre, Moshi, United Republic of Tanzania
| | - Justus Rwiza
- Eye Department, Kilimanjaro Christian Medical Centre, Moshi, United Republic of Tanzania
| | - Jennifer R Evans
- International Centre for Eye Health, Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK
| | - Iris Gordon
- International Centre for Eye Health, Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK
| | - David MacLeod
- Tropical Epidemiology Group, Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
| | - Matthew J Burton
- International Centre for Eye Health, Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK
- National Institute for Health Research Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
| | - Covadonga Bascaran
- International Centre for Eye Health, Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK
| |
|
32
|
Edianto A, Trencher G, Manych N, Matsubae K. Forecasting coal power plant retirement ages and lock-in with random forest regression. PATTERNS (NEW YORK, N.Y.) 2023; 4:100776. [PMID: 37521043 PMCID: PMC10382988 DOI: 10.1016/j.patter.2023.100776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/19/2023] [Revised: 03/24/2023] [Accepted: 05/19/2023] [Indexed: 08/01/2023]
Abstract
Averting dangerous climate change requires expediting the retirement of coal-fired power plants (CFPPs). Given multiple barriers hampering this, here we forecast the future retirement ages of the world's CFPPs. We use supervised machine learning to first learn from the past, determining the factors that influenced historical retirements. We then apply our model to a dataset of 6,541 operating or under-construction units in 66 countries. Based on results, we also forecast associated carbon emissions and the degree to which countries are locked in to coal power. Contrasting with the historical average of roughly 40 years over 2010-2021, our model forecasts earlier retirement for 63% of current CFPP units. This results in 38% less emissions than if assuming historical retirement trends. However, the lock-in index forecasts considerable difficulties to retire CFPPs early in countries with high dependence on coal power, a large capacity or number of units, and young plant ages.
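A minimal sketch of the supervised setup: fit a random forest regressor on historical plant attributes to predict retirement age, then score held-out plants. The synthetic features stand in for covariates such as capacity and plant age; they are illustrative, not the study's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))   # stand-ins for plant covariates (capacity, age, ...)
y = 40 + X @ rng.normal(size=5) + rng.normal(scale=2.0, size=300)  # retirement age

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(X_tr, y_tr)
print(rf.score(X_te, y_te))     # R^2 on held-out plants
```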
Affiliation(s)
- Achmed Edianto
- Graduate School of Environmental Studies, Tohoku University, Miyagi, Japan
| | - Gregory Trencher
- Graduate School of Global Environmental Studies, Kyoto University, Kyoto, Japan
| | - Niccolò Manych
- Mercator Research Institute on Global Commons and Climate Change, Berlin, Germany
- Department Economics of Climate Change, Technische Universität Berlin, Berlin, Germany
| | - Kazuyo Matsubae
- Graduate School of Environmental Studies, Tohoku University, Miyagi, Japan
| |
|
33
|
Ramaekers M, Viviers CGA, Janssen BV, Hellström TAE, Ewals L, van der Wulp K, Nederend J, Jacobs I, Pluyter JR, Mavroeidis D, van der Sommen F, Besselink MG, Luyer MDP. Computer-Aided Detection for Pancreatic Cancer Diagnosis: Radiological Challenges and Future Directions. J Clin Med 2023; 12:4209. [PMID: 37445243 PMCID: PMC10342462 DOI: 10.3390/jcm12134209] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2023] [Revised: 06/08/2023] [Accepted: 06/19/2023] [Indexed: 07/15/2023] Open
Abstract
Radiological imaging plays a crucial role in the detection and treatment of pancreatic ductal adenocarcinoma (PDAC). However, there are several challenges associated with the use of these techniques in daily clinical practice. Determination of the presence or absence of cancer using radiological imaging is difficult and requires specific expertise, especially after neoadjuvant therapy. Early detection and characterization of tumors would potentially increase the number of patients who are eligible for curative treatment. Over the last decades, artificial intelligence (AI)-based computer-aided detection (CAD) has rapidly evolved as a means for improving the radiological detection of cancer and the assessment of the extent of disease. Although the results of AI applications seem promising, widespread adoption in clinical practice has not taken place. This narrative review provides an overview of current radiological CAD systems in pancreatic cancer, highlights challenges that are pertinent to clinical practice, and discusses potential solutions for these challenges.
Affiliation(s)
- Mark Ramaekers
- Department of Surgery, Catharina Cancer Institute, Catharina Hospital Eindhoven, 5623 EJ Eindhoven, The Netherlands;
| | - Christiaan G. A. Viviers
- Department of Electrical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands; (C.G.A.V.); (T.A.E.H.); (F.v.d.S.)
| | - Boris V. Janssen
- Department of Surgery, Amsterdam UMC, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands; (B.V.J.); (M.G.B.)
- Cancer Center Amsterdam, 1081 HV Amsterdam, The Netherlands
| | - Terese A. E. Hellström
- Department of Electrical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands; (C.G.A.V.); (T.A.E.H.); (F.v.d.S.)
| | - Lotte Ewals
- Department of Radiology, Catharina Cancer Institute, Catharina Hospital Eindhoven, 5623 EJ Eindhoven, The Netherlands; (L.E.); (K.v.d.W.); (J.N.)
| | - Kasper van der Wulp
- Department of Radiology, Catharina Cancer Institute, Catharina Hospital Eindhoven, 5623 EJ Eindhoven, The Netherlands; (L.E.); (K.v.d.W.); (J.N.)
| | - Joost Nederend
- Department of Radiology, Catharina Cancer Institute, Catharina Hospital Eindhoven, 5623 EJ Eindhoven, The Netherlands; (L.E.); (K.v.d.W.); (J.N.)
| | - Igor Jacobs
- Department of Hospital Services and Informatics, Philips Research, 5656 AE Eindhoven, The Netherlands;
| | - Jon R. Pluyter
- Department of Experience Design, Philips Design, 5656 AE Eindhoven, The Netherlands;
| | - Dimitrios Mavroeidis
- Department of Data Science, Philips Research, 5656 AE Eindhoven, The Netherlands;
| | - Fons van der Sommen
- Department of Electrical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands; (C.G.A.V.); (T.A.E.H.); (F.v.d.S.)
| | - Marc G. Besselink
- Department of Surgery, Amsterdam UMC, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands; (B.V.J.); (M.G.B.)
- Cancer Center Amsterdam, 1081 HV Amsterdam, The Netherlands
| | - Misha D. P. Luyer
- Department of Surgery, Catharina Cancer Institute, Catharina Hospital Eindhoven, 5623 EJ Eindhoven, The Netherlands;
| |
|
34
|
Chen RJ, Wang JJ, Williamson DFK, Chen TY, Lipkova J, Lu MY, Sahai S, Mahmood F. Algorithmic fairness in artificial intelligence for medicine and healthcare. Nat Biomed Eng 2023; 7:719-742. [PMID: 37380750 PMCID: PMC10632090 DOI: 10.1038/s41551-023-01056-8] [Citation(s) in RCA: 90] [Impact Index Per Article: 45.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2021] [Accepted: 04/13/2023] [Indexed: 06/30/2023]
Abstract
In healthcare, the development and deployment of insufficiently fair systems of artificial intelligence (AI) can undermine the delivery of equitable care. Assessments of AI models stratified across subpopulations have revealed inequalities in how patients are diagnosed, treated and billed. In this Perspective, we outline fairness in machine learning through the lens of healthcare, and discuss how algorithmic biases (in data acquisition, genetic variation and intra-observer labelling variability, in particular) arise in clinical workflows and the resulting healthcare disparities. We also review emerging technology for mitigating biases via disentanglement, federated learning and model explainability, and their role in the development of AI-based software as a medical device.
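The stratified assessments described above amount to computing a performance metric per subpopulation and inspecting the gaps; a minimal sketch, with the metric and group encoding chosen for illustration:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def subgroup_auc(y_true, y_score, groups):
    # Per-subgroup AUC; large gaps between groups flag potential unfairness.
    y_true, y_score, groups = map(np.asarray, (y_true, y_score, groups))
    return {g: roc_auc_score(y_true[groups == g], y_score[groups == g])
            for g in np.unique(groups)}

# e.g. subgroup_auc(labels, model_scores, patient_sex)
```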
Affiliation(s)
- Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Judy J Wang
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Boston University School of Medicine, Boston, MA, USA
| | - Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Tiffany Y Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Jana Lipkova
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Sharifa Sahai
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA.
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA.
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.
| |
|
35
|
Jin K, Gao Z, Jiang X, Wang Y, Ma X, Li Y, Ye J. MSHF: A Multi-Source Heterogeneous Fundus (MSHF) Dataset for Image Quality Assessment. Sci Data 2023; 10:286. [PMID: 37198230 PMCID: PMC10192420 DOI: 10.1038/s41597-023-02188-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2022] [Accepted: 04/27/2023] [Indexed: 05/19/2023] Open
Abstract
Image quality assessment (IQA) is significant for current techniques of image-based computer-aided diagnosis, and fundus imaging is the chief modality for screening and diagnosing ophthalmic diseases. However, most of the existing IQA datasets are single-center datasets, disregarding the type of imaging device, eye condition, and imaging environment. In this paper, we collected a multi-source heterogeneous fundus (MSHF) database. The MSHF dataset consisted of 1302 high-resolution normal and pathologic images from color fundus photography (CFP), images of healthy volunteers taken with a portable camera, and ultrawide-field (UWF) images of diabetic retinopathy patients. Dataset diversity was visualized with a spatial scatter plot. Image quality was determined by three ophthalmologists according to its illumination, clarity, contrast and overall quality. To the best of our knowledge, this is one of the largest fundus IQA datasets and we believe this work will be beneficial to the construction of a standardized medical image database.
Affiliation(s)
- Kai Jin
- Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Zhejiang Provincial Key Laboratory of Ophthalmology, Zhejiang Provincial Clinical Research Center for Eye Diseases, Zhejiang Provincial Engineering Institute on Eye Diseases, Zhejiang, Hangzhou, 310009, China
| | - Zhiyuan Gao
- Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Zhejiang Provincial Key Laboratory of Ophthalmology, Zhejiang Provincial Clinical Research Center for Eye Diseases, Zhejiang Provincial Engineering Institute on Eye Diseases, Zhejiang, Hangzhou, 310009, China
| | - Xiaoyu Jiang
- College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, China
| | - Yaqi Wang
- College of Media, Communication University of Zhejiang, Hangzhou, 310018, China
| | - Xiaoyu Ma
- Institute of Intelligent Media, Communication University of Zhejiang, Hangzhou, 310018, China
| | - Yunxiang Li
- College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
| | - Juan Ye
- Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Zhejiang Provincial Key Laboratory of Ophthalmology, Zhejiang Provincial Clinical Research Center for Eye Diseases, Zhejiang Provincial Engineering Institute on Eye Diseases, Zhejiang, Hangzhou, 310009, China.
| |
|
36
|
Wickstrøm KK, Østmo EA, Radiya K, Mikalsen KØ, Kampffmeyer MC, Jenssen R. A clinically motivated self-supervised approach for content-based image retrieval of CT liver images. Comput Med Imaging Graph 2023; 107:102239. [PMID: 37207397 DOI: 10.1016/j.compmedimag.2023.102239] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2022] [Revised: 05/02/2023] [Accepted: 05/02/2023] [Indexed: 05/21/2023]
Abstract
Deep learning-based approaches for content-based image retrieval (CBIR) of computed tomography (CT) liver images are an active field of research but suffer from some critical limitations. First, they are heavily reliant on labeled data, which can be challenging and costly to acquire. Second, they lack transparency and explainability, which limits the trustworthiness of deep CBIR systems. We address these limitations by (1) proposing a self-supervised learning framework that incorporates domain knowledge into the training procedure and (2) providing the first representation-learning explainability analysis in the context of CBIR of CT liver images. Results demonstrate improved performance compared with the standard self-supervised approach across several metrics, as well as improved generalization across datasets. The explainability analysis reveals new insights into the feature extraction process. Lastly, we perform a case study with cross-examination CBIR that demonstrates the usability of our proposed framework. We believe our proposed framework could play a vital role in creating trustworthy deep CBIR systems that can successfully take advantage of unlabeled data.
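The retrieval step of a CBIR system like this reduces to nearest-neighbour search in the learned embedding space; a minimal sketch, assuming embeddings produced by the self-supervised encoder and cosine similarity as the ranking score:

```python
import torch
import torch.nn.functional as F

def retrieve(query_emb, gallery_embs, k=5):
    # Rank gallery images by cosine similarity to the query embedding,
    # where embeddings come from the (self-supervised) encoder.
    q = F.normalize(query_emb, dim=-1)      # (D,)
    g = F.normalize(gallery_embs, dim=-1)   # (N, D)
    sims = g @ q                            # (N,) cosine similarities
    return sims.topk(k).indices             # indices of the top-k matches
```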
Affiliation(s)
- Kristoffer Knutsen Wickstrøm
- Machine Learning Group at the Department of Physics and Technology, UiT the Arctic University of Norway, Tromsø NO-9037, Norway.
| | - Eirik Agnalt Østmo
- Machine Learning Group at the Department of Physics and Technology, UiT the Arctic University of Norway, Tromsø NO-9037, Norway
| | - Keyur Radiya
- Department of Gastrointestinal Surgery, University Hospital of North Norway (UNN), Tromsø, Norway
| | - Karl Øyvind Mikalsen
- Machine Learning Group at the Department of Physics and Technology, UiT the Arctic University of Norway, Tromsø NO-9037, Norway; Department of Gastrointestinal Surgery, University Hospital of North Norway (UNN), Tromsø, Norway
| | - Michael Christian Kampffmeyer
- Machine Learning Group at the Department of Physics and Technology, UiT the Arctic University of Norway, Tromsø NO-9037, Norway; Norwegian Computing Center, Department SAMBA, P.O. Box 114 Blindern, Oslo NO-0314, Norway
| | - Robert Jenssen
- Machine Learning Group at the Department of Physics and Technology, UiT the Arctic University of Norway, Tromsø NO-9037, Norway; Norwegian Computing Center, Department SAMBA, P.O. Box 114 Blindern, Oslo NO-0314, Norway; Department of Computer Science, University of Copenhagen, Universitetsparken 1, 2100 København Ø, Denmark
| |
|
37
|
Ahmad PN, Shah AM, Lee K. A Review on Electronic Health Record Text-Mining for Biomedical Name Entity Recognition in Healthcare Domain. Healthcare (Basel) 2023; 11:1268. [PMID: 37174810 PMCID: PMC10178605 DOI: 10.3390/healthcare11091268] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2023] [Revised: 04/24/2023] [Accepted: 04/26/2023] [Indexed: 05/15/2023] Open
Abstract
Biomedical named entity recognition (bNER) is critical in biomedical informatics. It identifies biomedical entities with special meanings, such as people, places, and organizations, as predefined semantic types in electronic health records (EHRs). bNER is essential for discovering novel knowledge using computational methods and information technology. Early bNER systems were configured manually with domain-specific features and rules, but such systems were limited in handling the complexity of biomedical text. Recent advances in deep learning (DL) have led to the development of more powerful bNER systems, which can learn the patterns of biomedical text automatically, making them more robust and efficient than traditional rule-based systems. This paper reviews bNER in the healthcare domain, using DL techniques and artificial intelligence on clinical records for mining treatment predictions. bNER-based tools are categorized systematically by how they represent input, context, and tags (encoder/decoder). Furthermore, to create a labeled dataset for our machine learning sentiment analyzer over a set of tweets, we used a manual coding approach and the multi-task learning method to inductively bias the training signals with domain knowledge. To conclude, we discuss the challenges facing bNER systems and future directions in the healthcare field.
Affiliation(s)
- Pir Noman Ahmad
- School of Computer Science, Harbin Institute of Technology, Harbin 150001, China
| | - Adnan Muhammad Shah
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
| | - KangYoon Lee
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
| |
|
38
|
Ayhan MS, Faber H, Kühlewein L, Inhoffen W, Aliyeva G, Ziemssen F, Berens P. Multitask Learning for Activity Detection in Neovascular Age-Related Macular Degeneration. Transl Vis Sci Technol 2023; 12:12. [PMID: 37052912 PMCID: PMC10103736 DOI: 10.1167/tvst.12.4.12] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/14/2023] Open
Abstract
Purpose The purpose of this study was to compare the performance and explainability of a multitask convolutional deep neural network with single-task networks for activity detection in neovascular age-related macular degeneration (nAMD). Methods From 70 patients (46 women and 24 men) who attended the University Eye Hospital Tübingen, 3762 optical coherence tomography B-scans (right eye = 2011 and left eye = 1751) were acquired with Heidelberg Spectralis, Heidelberg, Germany. B-scans were graded by a retina specialist and an ophthalmology resident, and then used to develop a multitask deep learning model to predict disease activity in nAMD along with the presence of sub- and intraretinal fluid. We used performance metrics for comparison to single-task networks and visualized the deep neural network (DNN)-based decisions with t-distributed stochastic neighbor embedding and clinically validated saliency mapping techniques. Results The multitask model surpassed single-task networks in accuracy for activity detection (94.2% vs. 91.2%). The area under the receiver operating characteristic curve was 0.984 for the multitask model versus 0.974 for the single-task model. Furthermore, compared to single-task networks, visualizations via t-distributed stochastic neighbor embedding and saliency maps highlighted that the multitask network's decisions for activity detection in nAMD were highly consistent with the presence of both sub- and intraretinal fluid. Conclusions Multitask learning increases the performance of neural networks for predicting disease activity, while providing clinicians with an easily accessible decision control that resembles human reasoning. Translational Relevance By improving nAMD activity detection performance and the transparency of automated decisions, multitask DNNs can support the translation of machine learning research into clinical decision support systems for nAMD activity detection.
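Architecturally, a multitask model of this kind is typically a shared encoder with one output head per task (activity, subretinal fluid, intraretinal fluid), trained on a summed loss. The backbone choice and equal task weighting below are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

class MultitaskNet(nn.Module):
    # Shared encoder with one binary head per task.
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])
        dim = backbone.fc.in_features
        self.heads = nn.ModuleDict({name: nn.Linear(dim, 1)
                                    for name in ("activity", "srf", "irf")})

    def forward(self, x):
        z = self.encoder(x).flatten(1)
        return {name: head(z) for name, head in self.heads.items()}

# Total loss: sum of per-task binary cross-entropies (equal weights assumed).
net, bce = MultitaskNet(), nn.BCEWithLogitsLoss()
out = net(torch.randn(2, 3, 224, 224))
loss = sum(bce(out[t], torch.zeros(2, 1)) for t in out)
```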
Affiliation(s)
- Murat Seçkin Ayhan
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
| | - Hanna Faber
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- University Eye Clinic, University of Tübingen, Tübingen, Germany
| | - Laura Kühlewein
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- University Eye Clinic, University of Tübingen, Tübingen, Germany
| | - Werner Inhoffen
- University Eye Clinic, University of Tübingen, Tübingen, Germany
| | - Gulnar Aliyeva
- University Eye Clinic, University of Tübingen, Tübingen, Germany
| | - Focke Ziemssen
- University Eye Clinic, University of Tübingen, Tübingen, Germany
- University Eye Clinic, University of Leipzig, Leipzig, Germany
| | - Philipp Berens
- Institute for Ophthalmic Research, University of Tübingen, Tübingen, Germany
- Tübingen AI Center, Tübingen, Germany
- Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany
| |
|
39
|
Nazir S, Dickson DM, Akram MU. Survey of explainable artificial intelligence techniques for biomedical imaging with deep neural networks. Comput Biol Med 2023; 156:106668. [PMID: 36863192 DOI: 10.1016/j.compbiomed.2023.106668] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2022] [Revised: 01/12/2023] [Accepted: 02/10/2023] [Indexed: 02/21/2023]
Abstract
Artificial intelligence (AI) techniques based on deep learning have revolutionized disease diagnosis with their outstanding image classification performance. In spite of these results, the widespread adoption of such techniques in clinical practice is proceeding at a moderate pace. One major hindrance is that a trained deep neural network (DNN) model provides a prediction, but questions about why and how that prediction was made remain unanswered. This link between prediction and reasoning is of utmost importance in the regulated healthcare domain for increasing practitioners', patients' and other stakeholders' trust in automated diagnosis systems. The application of deep learning to medical imaging must be treated with caution owing to health and safety concerns, much as blame attribution must be resolved for accidents involving autonomous cars. The consequences of both false positive and false negative cases are far-reaching for patients' welfare and cannot be ignored. This is exacerbated by the fact that state-of-the-art deep learning algorithms comprise complex interconnected structures with millions of parameters and a 'black box' nature, offering little insight into their inner workings, unlike traditional machine learning algorithms. Explainable AI (XAI) techniques help stakeholders understand model predictions, which builds trust in the system, accelerates disease diagnosis, and supports adherence to regulatory requirements. This survey provides a comprehensive review of the promising field of XAI for biomedical imaging diagnostics. We also provide a categorization of XAI techniques, discuss the open challenges, and suggest future directions for XAI that should interest clinicians, regulators and model developers.
Affiliation(s)
- Sajid Nazir
- Department of Computing, Glasgow Caledonian University, Glasgow, UK.
| | - Diane M Dickson
- Department of Podiatry and Radiography, Research Centre for Health, Glasgow Caledonian University, Glasgow, UK
| | - Muhammad Usman Akram
- Computer and Software Engineering Department, National University of Sciences and Technology, Islamabad, Pakistan
| |
|
40
|
Peeters F, Rommes S, Elen B, Gerrits N, Stalmans I, Jacob J, De Boever P. Artificial Intelligence Software for Diabetic Eye Screening: Diagnostic Performance and Impact of Stratification. J Clin Med 2023; 12:jcm12041408. [PMID: 36835942 PMCID: PMC9967595 DOI: 10.3390/jcm12041408] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2023] [Revised: 01/31/2023] [Accepted: 02/07/2023] [Indexed: 02/12/2023] Open
Abstract
AIM To evaluate the MONA.health artificial intelligence screening software for detecting referable diabetic retinopathy (DR) and diabetic macular edema (DME), including subgroup analysis. METHODS The algorithm's threshold value was fixed at the 90% sensitivity operating point on the receiver operating characteristic curve to perform the disease classification. Diagnostic performance was appraised on a private test set and on publicly available datasets. Stratification analysis was executed on the private test set considering age, ethnicity, sex, insulin dependency, year of examination, camera type, image quality, and dilatation status. RESULTS The software displayed an area under the curve (AUC) of 97.28% for DR and 98.08% for DME on the private test set. The specificity and sensitivity for combined DR and DME predictions were 94.24% and 90.91%, respectively. The AUC ranged from 96.91% to 97.99% on the publicly available datasets for DR. AUC values were above 95% in all subgroups, with lower predictive values found for individuals above the age of 65 (82.51% sensitivity) and Caucasians (84.03% sensitivity). CONCLUSION We report good overall performance of the MONA.health screening software for DR and DME. The software's performance remains stable, with no significant deterioration of the deep learning models in any studied stratum.
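Fixing the operating point at 90% sensitivity, as described, amounts to reading the decision threshold off the ROC curve computed on a labelled tuning set; a minimal sketch, assuming validation labels and scores are available:

```python
import numpy as np
from sklearn.metrics import roc_curve

def threshold_at_sensitivity(y_true, y_score, target_tpr=0.90):
    # roc_curve returns tpr in ascending order; take the first threshold
    # whose sensitivity reaches the target.
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    idx = int(np.searchsorted(tpr, target_tpr))
    return thresholds[min(idx, len(thresholds) - 1)]
```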
Collapse
Affiliation(s)
- Freya Peeters
- Department of Ophthalmology, University Hospitals Leuven, 3000 Leuven, Belgium
- Biomedical Sciences Group, Research Group Ophthalmology, Department of Neurosciences, KU Leuven, 3000 Leuven, Belgium
- Correspondence:
| | - Stef Rommes
- MONA.health, 3060 Bertem, Belgium
- Flemish Institute for Technological Research (VITO), 2400 Mol, Belgium
| | - Bart Elen
- Flemish Institute for Technological Research (VITO), 2400 Mol, Belgium
| | - Nele Gerrits
- Flemish Institute for Technological Research (VITO), 2400 Mol, Belgium
| | - Ingeborg Stalmans
- Department of Ophthalmology, University Hospitals Leuven, 3000 Leuven, Belgium
- Biomedical Sciences Group, Research Group Ophthalmology, Department of Neurosciences, KU Leuven, 3000 Leuven, Belgium
| | - Julie Jacob
- Department of Ophthalmology, University Hospitals Leuven, 3000 Leuven, Belgium
- Biomedical Sciences Group, Research Group Ophthalmology, Department of Neurosciences, KU Leuven, 3000 Leuven, Belgium
| | - Patrick De Boever
- Flemish Institute for Technological Research (VITO), 2400 Mol, Belgium
- Centre for Environmental Sciences, Hasselt University, Diepenbeek, 3500 Hasselt, Belgium
| |
Collapse
|
41
|
Singla S, Eslami M, Pollack B, Wallace S, Batmanghelich K. Explaining the black-box smoothly-A counterfactual approach. Med Image Anal 2023; 84:102721. [PMID: 36571975 PMCID: PMC9835100 DOI: 10.1016/j.media.2022.102721] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2021] [Revised: 11/23/2022] [Accepted: 12/02/2022] [Indexed: 12/15/2022]
Abstract
We propose a BlackBox Counterfactual Explainer, designed to explain image classification models for medical applications. Classical approaches (e.g., saliency maps) that assess feature importance do not explain how imaging features in important anatomical regions are relevant to the classification decision. Such reasoning is crucial for transparent decision-making in healthcare applications. Our framework explains the decision for a target class by gradually exaggerating the semantic effect of the class in a query image. We adopted a Generative Adversarial Network (GAN) to generate a progressive set of perturbations to a query image, such that the classification decision changes from its original class to its negation. Our proposed loss function preserves essential details (e.g., support devices) in the generated images. We used counterfactual explanations from our framework to audit a classifier trained on a chest X-ray dataset with multiple labels. Clinical evaluation of model explanations is a challenging task. We proposed clinically relevant quantitative metrics, such as the cardiothoracic ratio and the score of a healthy costophrenic recess, to evaluate our explanations. We used these metrics to quantify the counterfactual changes between the populations with negative and positive decisions for a diagnosis by the given classifier. We conducted a human-grounded experiment with diagnostic radiology residents to compare different styles of explanations (no explanation, saliency map, cycleGAN explanation, and our counterfactual explanation) by evaluating different aspects of explanations: (1) understandability, (2) classifier's decision justification, (3) visual quality, (4) identity preservation, and (5) overall helpfulness of an explanation to the users. Our results show that our counterfactual explanation was the only explanation method that significantly improved the users' understanding of the classifier's decision compared to the no-explanation baseline. Our metrics established a benchmark for evaluating model explanation methods in medical images. Our explanations revealed that the classifier relied on clinically relevant radiographic features for its diagnostic decisions, thus making its decision-making process more transparent to the end-user.
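The paper's counterfactuals come from a GAN trained to produce progressive perturbations; as a much simpler stand-in, the sketch below optimizes an additive perturbation by plain gradient descent until the classifier's decision moves toward a chosen target class, with an L1 penalty keeping the change small. Model, image, and target class are illustrative placeholders, and this is not the paper's method.

```python
# Gradient-descent counterfactual sketch (simplified stand-in for the GAN approach).
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
image = torch.rand(1, 3, 224, 224)     # stand-in query image
target = torch.tensor([0])             # desired (flipped) class, illustrative

delta = torch.zeros_like(image, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.05)
for _ in range(100):
    opt.zero_grad()
    out = model(image + delta)
    # Push toward the target class while keeping the perturbation sparse.
    loss = F.cross_entropy(out, target) + 0.01 * delta.abs().mean()
    loss.backward()
    opt.step()

counterfactual = (image + delta).clamp(0, 1)  # perturbed image in valid range
```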
Collapse
Affiliation(s)
- Sumedha Singla
- Computer Science Department at the University of Pittsburgh, Pittsburgh, PA, 15206, USA.
| | - Motahhare Eslami
- School of Computer Science, Human-Computer Interaction Institute, Carnegie Mellon University, USA.
| | - Brian Pollack
- Department of Biomedical Informatics, the University of Pittsburgh, Pittsburgh, PA, 15206, USA.
| | - Stephen Wallace
- University of Pittsburgh Medical School, Pittsburgh, PA, 15206, USA.
| | - Kayhan Batmanghelich
- Department of Biomedical Informatics, the University of Pittsburgh, Pittsburgh, PA, 15206, USA.
| |
Collapse
|
42
|
Shickel B, Loftus TJ, Ruppert M, Upchurch GR, Ozrazgat-Baslanti T, Rashidi P, Bihorac A. Dynamic predictions of postoperative complications from explainable, uncertainty-aware, and multi-task deep neural networks. Sci Rep 2023; 13:1224. [PMID: 36681755 PMCID: PMC9867692 DOI: 10.1038/s41598-023-27418-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2022] [Accepted: 01/01/2023] [Indexed: 01/22/2023] Open
Abstract
Accurate prediction of postoperative complications can inform shared decisions regarding prognosis, preoperative risk-reduction, and postoperative resource use. We hypothesized that multi-task deep learning models would outperform conventional machine learning models in predicting postoperative complications, and that integrating high-resolution intraoperative physiological time series would result in more granular and personalized health representations that would improve prognostication compared to preoperative predictions. In a longitudinal cohort study of 56,242 patients undergoing 67,481 inpatient surgical procedures at a university medical center, we compared deep learning models with random forests and XGBoost for predicting nine common postoperative complications using preoperative, intraoperative, and perioperative patient data. Our study indicated several significant results across experimental settings that suggest the utility of deep learning for capturing more precise representations of patient health for augmented surgical decision support. Multi-task learning improved efficiency by reducing computational resources without compromising predictive performance. Integrated gradients interpretability mechanisms identified potentially modifiable risk factors for each complication. Monte Carlo dropout methods provided a quantitative measure of prediction uncertainty that has the potential to enhance clinical trust. Multi-task learning, interpretability mechanisms, and uncertainty metrics demonstrated potential to facilitate effective clinical implementation.
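Of the three mechanisms the abstract names, Monte Carlo dropout is the most self-contained to sketch: dropout is left active at inference time and the spread of repeated stochastic predictions serves as the uncertainty estimate. The toy network and feature dimensions below are placeholders for the paper's multi-task model.

```python
# Monte Carlo dropout sketch: predictive mean plus an uncertainty estimate.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(), nn.Dropout(p=0.3),
    nn.Linear(128, 1), nn.Sigmoid(),
)

def mc_dropout_predict(model, x, n_samples=50):
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    # Mean = point estimate of risk; std = per-patient uncertainty measure.
    return preds.mean(dim=0), preds.std(dim=0)

x = torch.rand(8, 64)  # 8 patients, 64 preoperative features (illustrative)
risk, uncertainty = mc_dropout_predict(model, x)
```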
Collapse
Affiliation(s)
- Benjamin Shickel
- Department of Medicine, University of Florida, Gainesville, FL, 32611, USA
- Intelligent Critical Care Center (IC3), University of Florida, Gainesville, FL, 32611, USA
| | - Tyler J Loftus
- Department of Surgery, University of Florida, Gainesville, FL, 32611, USA
- Intelligent Critical Care Center (IC3), University of Florida, Gainesville, FL, 32611, USA
| | - Matthew Ruppert
- Department of Medicine, University of Florida, Gainesville, FL, 32611, USA
- Precision and Intelligent Systems in Medicine (PRISMAp), University of Florida, Gainesville, FL, 32611, USA
- Intelligent Critical Care Center (IC3), University of Florida, Gainesville, FL, 32611, USA
| | - Gilbert R Upchurch
- Department of Surgery, University of Florida, Gainesville, FL, 32611, USA
| | - Tezcan Ozrazgat-Baslanti
- Department of Medicine, University of Florida, Gainesville, FL, 32611, USA
- Precision and Intelligent Systems in Medicine (PRISMAp), University of Florida, Gainesville, FL, 32611, USA
- Intelligent Critical Care Center (IC3), University of Florida, Gainesville, FL, 32611, USA
| | - Parisa Rashidi
- Department of Medicine, University of Florida, Gainesville, FL, 32611, USA
- Department of Biomedical Engineering, University of Florida, Gainesville, FL, 32611, USA
- Intelligent Health Lab (i-Heal), University of Florida, Gainesville, FL, 32611, USA
- Intelligent Critical Care Center (IC3), University of Florida, Gainesville, FL, 32611, USA
| | - Azra Bihorac
- Department of Medicine, University of Florida, Gainesville, FL, 32611, USA.
- Precision and Intelligent Systems in Medicine (PRISMAp), University of Florida, Gainesville, FL, 32611, USA.
- Intelligent Critical Care Center (IC3), University of Florida, Gainesville, FL, 32611, USA.
| |
Collapse
|
43
|
An empirical study of preprocessing techniques with convolutional neural networks for accurate detection of chronic ocular diseases using fundus images. APPL INTELL 2023; 53:1548-1566. [PMID: 35528131 PMCID: PMC9059700 DOI: 10.1007/s10489-022-03490-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/08/2022] [Indexed: 01/07/2023]
Abstract
Chronic Ocular Diseases (COD) such as myopia, diabetic retinopathy, age-related macular degeneration, glaucoma, and cataract can affect the eye and may even lead to severe vision impairment or blindness. According to a recent World Health Organization (WHO) report on vision, at least 2.2 billion individuals worldwide suffer from vision impairment. Often, overt signs indicative of COD do not manifest until the disease has progressed to an advanced stage. However, if COD is detected early, vision impairment can be avoided by early intervention and cost-effective treatment. Ophthalmologists are trained to detect COD by examining certain minute changes in the retina, such as microaneurysms, macular edema, hemorrhages, and alterations in the blood vessels. The range of eye conditions is diverse, and each of these conditions requires a unique patient-specific treatment. Convolutional neural networks (CNNs) have demonstrated significant potential in multi-disciplinary fields, including the detection of a variety of eye diseases. In this study, we combined several preprocessing approaches with convolutional neural networks to accurately detect COD in eye fundus images. To the best of our knowledge, this is the first work that provides a qualitative analysis of preprocessing approaches for COD classification using CNN models. Experimental results demonstrate that CNNs trained on region-of-interest segmented images outperform models trained on the original input images by a substantial margin. Additionally, an ensemble of three preprocessing techniques outperformed other state-of-the-art approaches by 30% and 3% in terms of Kappa and F1 scores, respectively. The developed prototype has been extensively tested and can be evaluated on more comprehensive COD datasets for deployment in clinical settings.
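The abstract does not spell out the three preprocessing techniques, so the sketch below shows two operations commonly paired with fundus-image CNNs, green-channel extraction and CLAHE, plus a crude circular region-of-interest mask; treat the specific parameters as assumptions rather than the paper's settings.

```python
# Hedged fundus preprocessing sketch: green channel + CLAHE + circular ROI mask.
import cv2
import numpy as np

def preprocess_fundus(path: str) -> np.ndarray:
    bgr = cv2.imread(path)                        # OpenCV loads images as BGR
    green = bgr[:, :, 1]                          # vessels contrast best in green
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(green)
    # Rough ROI: keep the circular retinal disc, zero out the black border.
    h, w = enhanced.shape
    mask = np.zeros((h, w), np.uint8)
    cv2.circle(mask, (w // 2, h // 2), min(h, w) // 2, 255, -1)
    return cv2.bitwise_and(enhanced, enhanced, mask=mask)
```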
Collapse
|
44
|
Deep Learning and Medical Image Processing Techniques for Diabetic Retinopathy: A Survey of Applications, Challenges, and Future Trends. JOURNAL OF HEALTHCARE ENGINEERING 2023; 2023:2728719. [PMID: 36776951 PMCID: PMC9911247 DOI: 10.1155/2023/2728719] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/12/2022] [Revised: 10/28/2022] [Accepted: 11/25/2022] [Indexed: 02/05/2023]
Abstract
Diabetic retinopathy (DR) is a common retinal disease of the eye that is widespread across the world. Depending on the level of severity, it can lead to complete loss of vision. It damages both the retinal blood vessels and the eye's microscopic interior layers. To avoid such outcomes, early detection of DR through routine screening is essential so that mild cases are caught before they progress. However, these manual diagnostic procedures are extremely difficult and expensive. The unique contributions of the study include the following: first, a detailed background of DR and traditional detection techniques is provided. Second, the various imaging techniques and deep learning applications in DR are presented. Third, different use cases and real-life scenarios relevant to DR detection in which deep learning techniques have been implemented are explored. The study finally highlights potential research opportunities for researchers to explore and deliver effective performance results in diabetic retinopathy detection.
Collapse
|
45
|
Pammi M, Aghaeepour N, Neu J. Multiomics, artificial intelligence, and precision medicine in perinatology. Pediatr Res 2023; 93:308-315. [PMID: 35804156 PMCID: PMC9825681 DOI: 10.1038/s41390-022-02181-x] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/22/2022] [Revised: 05/12/2022] [Accepted: 05/30/2022] [Indexed: 01/11/2023]
Abstract
Technological advances in omics evaluation, bioinformatics, and artificial intelligence have made us rethink ways to improve patient outcomes. Collective quantification and characterization of biological data including genomics, epigenomics, metabolomics, and proteomics is now feasible at low cost with rapid turnover. Significant advances in the integration methods of these multiomics data sets by machine learning promise us a holistic view of disease pathogenesis and yield biomarkers for disease diagnosis and prognosis. Using machine learning tools and algorithms, it is possible to integrate multiomics data with clinical information to develop predictive models that identify risk before the condition is clinically apparent, thus facilitating early interventions to improve the health trajectories of the patients. In this review, we intend to update the readers on the recent developments related to the use of artificial intelligence in integrating multiomic and clinical data sets in the field of perinatology, focusing on neonatal intensive care and the opportunities for precision medicine. We intend to briefly discuss the potential negative societal and ethical consequences of using artificial intelligence in healthcare. We are poised for a new era in medicine where computational analysis of biological and clinical data sets will make precision medicine a reality. IMPACT: Biotechnological advances have made multiomic evaluations feasible and integration of multiomics data may provide a holistic view of disease pathophysiology. Artificial Intelligence and machine learning tools are being increasingly used in healthcare for diagnosis, prognostication, and outcome predictions. Leveraging artificial intelligence and machine learning tools for integration of multiomics and clinical data will pave the way for precision medicine in perinatology.
Collapse
Affiliation(s)
- Mohan Pammi
- Section of Neonatology, Department of Pediatrics, Baylor College of Medicine and Texas Children's Hospital, Houston, TX, USA.
| | - Nima Aghaeepour
- Departments of Anesthesiology, Pediatrics, and Biomedical Data Sciences, Stanford University School of Medicine, Stanford, CA, USA
| | - Josef Neu
- Section of Neonatology, Department of Pediatrics, University of Florida, Gainesville, FL, USA
| |
Collapse
|
46
|
Hung KH, Kao YC, Tang YH, Chen YT, Wang CH, Wang YC, Lee OKS. Application of a deep learning system in glaucoma screening and further classification with colour fundus photographs: a case control study. BMC Ophthalmol 2022; 22:483. [PMID: 36510171 PMCID: PMC9743575 DOI: 10.1186/s12886-022-02730-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2022] [Accepted: 12/06/2022] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND To verify the efficacy of automatic screening and classification of glaucoma with a deep learning system (DLS). METHODS A cross-sectional, retrospective study in a tertiary referral hospital. Patients with a healthy optic disc, high-tension glaucoma, or normal-tension glaucoma were enrolled. Complicated non-glaucomatous optic neuropathy was excluded. Colour and red-free fundus images were collected for development of the DLS and comparison of their efficacy. A convolutional neural network with the pre-trained EfficientNet-b0 model was selected for machine learning. Glaucoma screening (binary) and ternary classification, with or without additional demographics (age, gender, high myopia), were evaluated, followed by creating confusion matrices and heatmaps. Area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and F1 score were viewed as the main outcome measures. RESULTS Two hundred and twenty-two cases (421 eyes) were enrolled, with 1851 images in total (1207 normal and 644 glaucomatous discs). The training and test sets comprised 1539 and 312 images, respectively. When demographics were not provided, the AUC, accuracy, precision, sensitivity, F1 score, and specificity of our deep learning system in eye-based glaucoma screening were 0.98, 0.91, 0.86, 0.86, 0.86, and 0.94 in the test set. The same outcome measures in eye-based ternary classification without demographic data were 0.94, 0.87, 0.87, 0.87, 0.87, and 0.94 in our test set, respectively. Adding demographics had no significant impact on efficacy, but establishing a linkage between eyes and images was helpful for better performance. Confusion matrices and heatmaps suggested that retinal lesions and the quality of photographs could affect classification. Colour fundus images played a major role in glaucoma classification compared to red-free fundus images. CONCLUSIONS Promising results with high AUC and specificity were shown in distinguishing normal optic nerves from glaucomatous fundus images and performing further classification.
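Since the backbone is stated plainly (a pre-trained EfficientNet-b0), the transfer-learning setup can be sketched directly; the head replacement below is a standard torchvision pattern, with the class counts taken from the binary and ternary tasks described, and everything else an illustrative assumption.

```python
# Transfer-learning sketch: swap EfficientNet-b0's classifier head.
import torch.nn as nn
import torchvision.models as models

def build_glaucoma_model(num_classes: int) -> nn.Module:
    model = models.efficientnet_b0(weights="IMAGENET1K_V1")
    in_features = model.classifier[1].in_features   # 1280 for EfficientNet-b0
    model.classifier[1] = nn.Linear(in_features, num_classes)
    return model

screening_model = build_glaucoma_model(num_classes=2)  # normal vs glaucoma
grading_model = build_glaucoma_model(num_classes=3)    # normal / NTG / HTG
```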
Collapse
Affiliation(s)
- Kuo-Hsuan Hung
- Department of Ophthalmology, Chang-Gung Memorial Hospital, Linkou, No.5, Fu-Hsing St., Kuei Shan Hsiang, Tao Yuan Hsien, Taiwan
- Chang-Gung University College of Medicine, No.259 Wen-Hwa 1st Road, Kuei Shan Hsiang, Tao Yuan Hsien, Taiwan
- Institute of Clinical Medicine, National Yang Ming Chiao Tung University, No.201, Sec.2, Shih-Pai Rd., Peitou, Taipei 112, Taiwan
| | - Yu-Ching Kao
- Muen Biomedical and Optoelectronics Technologies Inc., Taipei, Taiwan
| | - Yu-Hsuan Tang
- Institute of Clinical Medicine, National Yang Ming Chiao Tung University, No.201, Sec.2, Shih-Pai Rd., Peitou, Taipei 112, Taiwan
| | - Yi-Ting Chen
- Muen Biomedical and Optoelectronics Technologies Inc., Taipei, Taiwan
| | - Chuen-Heng Wang
- Muen Biomedical and Optoelectronics Technologies Inc., Taipei, Taiwan
| | - Yu-Chen Wang
- Muen Biomedical and Optoelectronics Technologies Inc., Taipei, Taiwan
| | - Oscar Kuang-Sheng Lee
- Institute of Clinical Medicine, National Yang Ming Chiao Tung University, No.201, Sec.2, Shih-Pai Rd., Peitou, Taipei 112, Taiwan
- Stem Cell Research Centre, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Orthopedics, China Medical University Hospital, Taichung, Taiwan
| |
Collapse
|
47
|
Features extraction using encoded local binary pattern for detection and grading diabetic retinopathy. Health Inf Sci Syst 2022; 10:14. [PMID: 35782197 PMCID: PMC9243209 DOI: 10.1007/s13755-022-00181-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2022] [Accepted: 06/05/2022] [Indexed: 11/16/2022] Open
Abstract
Introduction Reliable computer diagnosis of diabetic retinopathy (DR) is needed to rescue the many people with diabetes who may be under threat of blindness. This research aims to detect the presence of diabetic retinopathy in fundus images and grade the disease severity without lesion segmentation. Methods To ensure that the fundus images are in a standard state of brightness, a series of preprocessing steps has been applied to the green channel image using histogram matching and a median filter. Then, contrast-limited adaptive histogram equalisation is performed, followed by an unsharp filter. The preprocessed image is divided into small blocks, and each block is processed to extract uniform local binary pattern (LBP) features. The extracted features are encoded, reducing the feature size to 3.5 percent of its original size. Classifiers such as the Support Vector Machine (SVM) and a proposed CNN model were used to classify retinal fundus images, both as normal versus abnormal and by grading the severity of DR. Results Our feature extraction method was tested with a binary classifier and resulted in accuracies of 98.37% and 98.84% on the Messidor2 and EyePACS databases, respectively. The proposed system could grade DR severity into three grades (no DR; mild DR; and a combined grade covering moderate NPDR, severe NPDR, and PDR). It obtains an F1-score of 0.9617 and an accuracy of 95.37% on the EyePACS database, and an F1-score of 0.9860 and an accuracy of 97.57% on the Messidor2 database. The resultant values depend on the selection of (neighbours, radius) pairs during the extraction of LBP features. Conclusions This study's results proved that the preprocessing steps are significant and have a great effect on highlighting image features. The novel method of stacking and encoding the LBP values in the feature vector greatly affects results when using SVM or CNN for classification. The proposed system outperforms the state of the art, and the proposed CNN model performs better than the SVM.
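The pipeline of block-wise uniform LBP histograms feeding an SVM can be sketched as follows; the block size, (neighbours, radius) pair, and histogram normalisation are illustrative choices, and the paper's specific encoding step that shrinks the feature vector to 3.5 percent of its size is not reproduced here.

```python
# Block-wise uniform LBP histogram features feeding an SVM (hedged sketch).
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_feature_vector(gray: np.ndarray, block: int = 64,
                       P: int = 8, R: int = 1) -> np.ndarray:
    n_bins = P + 2                                # 'uniform' method yields P+2 codes
    feats = []
    for y in range(0, gray.shape[0] - block + 1, block):
        for x in range(0, gray.shape[1] - block + 1, block):
            lbp = local_binary_pattern(gray[y:y + block, x:x + block],
                                       P, R, method="uniform")
            hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins))
            feats.append(hist / hist.sum())       # normalised block histogram
    return np.concatenate(feats)

# Usage with synthetic images; real inputs would be preprocessed fundus images.
X = np.stack([
    lbp_feature_vector(np.random.randint(0, 256, (256, 256)).astype(np.uint8))
    for _ in range(10)
])
y = np.random.randint(0, 2, 10)
clf = SVC(kernel="rbf").fit(X, y)
```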
Collapse
|
48
|
Garvey KV, Thomas Craig KJ, Russell R, Novak LL, Moore D, Miller BM. Considering Clinician Competencies for the Implementation of Artificial Intelligence-Based Tools in Health Care: Findings From a Scoping Review. JMIR Med Inform 2022; 10:e37478. [PMID: 36318697 PMCID: PMC9713618 DOI: 10.2196/37478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2022] [Revised: 05/09/2022] [Accepted: 10/25/2022] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND The use of artificial intelligence (AI)-based tools in the care of individual patients and patient populations is rapidly expanding. OBJECTIVE The aim of this paper is to systematically identify research on the provider competencies needed for the use of AI in clinical settings. METHODS A scoping review was conducted to identify articles published between January 1, 2009, and May 1, 2020, from the MEDLINE, CINAHL, and Cochrane Library databases, using search queries for terms related to health care professionals (eg, medical, nursing, and pharmacy) and their professional development in all phases of clinical education, AI-based tools in all settings of clinical practice, and the professional education domains of competencies and performance. Searches were limited to English-language studies on humans, with abstracts, and settings in the United States. RESULTS The searches identified 3476 records, of which 4 met the inclusion criteria. These studies described the use of AI in clinical practice and measured at least one aspect of clinician competence. While many studies measured the performance of the AI-based tool, only 4 measured clinician performance in terms of the knowledge, skills, or attitudes needed to understand and effectively use the new tools being tested. These 4 articles primarily focused on the ability of AI to enhance patient care and clinical decision-making by improving information flow and display, specifically for physicians. CONCLUSIONS While many research studies were identified that investigate the potential effectiveness of using AI technologies in health care, very few address the specific competencies that clinicians need in order to use them effectively. This highlights a critical gap.
Collapse
Affiliation(s)
- Kim V Garvey
- Center for Advanced Mobile Healthcare Learning, Vanderbilt University Medical Center, Nashville, TN, United States
- Department of Anesthesiology, School of Medicine, Vanderbilt University, Nashville, TN, United States
| | - Kelly Jean Thomas Craig
- Center for Artificial Intelligence, Research, and Evaluation, IBM Watson Health, Cambridge, MA, United States
- Clinical Evidence Development, Aetna Medical Affairs, CVS Health, Hartford, CT, United States
| | - Regina Russell
- Department of Medical Education and Administration, School of Medicine, Vanderbilt University, Nashville, TN, United States
| | - Laurie L Novak
- Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN, United States
- Center of Excellence in Applied Artificial Intelligence, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Don Moore
- Department of Medical Education and Administration, School of Medicine, Vanderbilt University, Nashville, TN, United States
| | - Bonnie M Miller
- Center for Advanced Mobile Healthcare Learning, Vanderbilt University Medical Center, Nashville, TN, United States
- Department of Medical Education and Administration, School of Medicine, Vanderbilt University, Nashville, TN, United States
| |
Collapse
|
49
|
Qian X, Jingying H, Xian S, Yuqing Z, Lili W, Baorui C, Wei G, Yefeng Z, Qiang Z, Chunyan C, Cheng B, Kai M, Yi Q. The effectiveness of artificial intelligence-based automated grading and training system in education of manual detection of diabetic retinopathy. Front Public Health 2022; 10:1025271. [PMID: 36419999 PMCID: PMC9678340 DOI: 10.3389/fpubh.2022.1025271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2022] [Accepted: 10/18/2022] [Indexed: 11/09/2022] Open
Abstract
Background The purpose of this study is to develop an artificial intelligence (AI)-based automated diabetic retinopathy (DR) grading and training system from a real-world diabetic dataset from China and, in particular, to investigate its effectiveness as a learning tool for manual DR grading by medical students. Methods We developed an automated DR grading and training system equipped with an AI-driven diagnosis algorithm that highlights regions of the input image highly related to the prognosis. Less experienced prospective physicians received pre- and post-training tests on the AI diagnosis platform. Then, changes in the diagnostic accuracy of the participants were evaluated. Results We randomly selected 8,063 cases diagnosed with DR and 7,925 non-DR fundus images from type 2 diabetes patients. The automated DR grading system we developed achieved accuracy, sensitivity/specificity, and AUC values of 0.965, 0.965/0.966, and 0.980 for moderate or worse DR (95 percent CI: 0.976-0.984). When the graders received assistance from the output of the AI system, the metrics were enhanced to varying degrees. The automated DR grading system helped to improve the accuracy of human graders, i.e., junior residents and medical students, from 0.947 and 0.915 to 0.978 and 0.954, respectively. Conclusion The AI-based system demonstrated high diagnostic accuracy for the detection of DR on fundus images from real-world diabetics and could be utilized as a training aid for trainees lacking formal instruction on DR management.
Collapse
Affiliation(s)
- Xu Qian
- Department of Geriatrics, Qilu Hospital of Shandong University, Jinan, China
- Key Laboratory of Cardiovascular Proteomics of Shandong Province, Jinan, China
- Jinan Clinical Research Center for Geriatric Medicine (202132001), Jinan, China
| | - Han Jingying
- School of Basic Medical Sciences, Shandong University, Jinan, China
| | - Song Xian
- Department of Geriatrics, Qilu Hospital of Shandong University, Jinan, China
| | - Zhao Yuqing
- Department of Geriatrics, Qilu Hospital of Shandong University, Jinan, China
| | - Wu Lili
- Department of Geriatrics, Qilu Hospital of Shandong University, Jinan, China
| | - Chu Baorui
- Department of Geriatrics, Qilu Hospital of Shandong University, Jinan, China
| | - Guo Wei
- Lunan Eye Hospital, Linyi, China
| | | | | | | | | | - Ma Kai
- Tencent Healthcare, Shenzhen, China
| | - Qu Yi
- Department of Geriatrics, Qilu Hospital of Shandong University, Jinan, China
- Key Laboratory of Cardiovascular Proteomics of Shandong Province, Jinan, China
- Jinan Clinical Research Center for Geriatric Medicine (202132001), Jinan, China
- Correspondence: Qu Yi
| |
Collapse
|
50
|
Geraniin ameliorates streptozotocin-induced diabetic retinopathy in rats via modulating retinal inflammation and oxidative stress. ARAB J CHEM 2022. [DOI: 10.1016/j.arabjc.2022.104396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
|